A critical review of ECMAScript 6 quasi-literals
Quasi-literals (update: now formally called “template strings”) are a proposed addition to ECMAScript 6 designed to solve a whole host of problems. The proposal seeks to add new syntax that would allow the creation of domain-specific languages (DSLs)[1] for working with content in a way that is safer than the solutions we have today. The description on the template string-literal strawman page[2] is as follows:
This scheme extends EcmaScript syntax with syntactic sugar to allow libraries to provide DSLs that easily produce, query, and manipulate content from other languages that are immune or resistant to injection attacks such as XSS, SQL Injection, etc.
In reality, though, template strings are ECMAScript’s answer to several ongoing problems. As best I can figure, these are the immediate problems template strings attempt to address:
- Multiline strings – JavaScript has never had a formal concept of multiline strings.
- Basic string formatting – The ability to substitute parts of the string for values contained in variables.
- HTML escaping – The ability to transform a string such that it is safe to insert into HTML.
- Localization of strings – The ability to easily swap out a string in one language for a string in another language.
I’ve been looking at template strings to figure out if they actually solve these problems sufficiently or not. My initial reaction is that template strings solve some of these problems in certain situations but aren’t useful enough to be the sole mechanism of addressing these problems. I decided to take some time and explore template strings to figure out if my reaction was valid or not.
The basics
Before digging into the use cases, it’s important to understand how template strings work. The basic format for template strings is as follows:
`literal${substitution}literal`
This is the simplest form of template string, one that simply does substitutions. The entire template string is enclosed in backticks. In between those backticks can be any number of characters, including white space. The dollar sign ($) indicates an expression that should be substituted. In this example, the template string would replace ${substitution} with the value of the JavaScript variable called substitution that is available in the same scope in which the template string is defined. For example:
var name = "Nicholas",
msg = `Hello, ${name}!`;
console.log(msg); // "Hello, Nicholas!"
In this code, the template string has a single identifier to substitute. The sequence ${name} is replaced by the value of the variable name. You can substitute more complex expressions, such as:
var total = 30,
msg = `The total is ${total} (${total*1.05} with tax)`;
console.log(msg); // "The total is 30 (31.5 with tax)"
This example uses a more complex expression substitution to calculate the price with tax. You can place any expression that returns a value inside the braces of a template string to have that value inserted into the final string.
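To make "any expression" a little more concrete, here's a small sketch that mixes a property lookup, a function call, and a conditional inside one template string. The item object and the withTax() helper are names I made up for illustration, and the sketch assumes an engine that implements template strings as described above.

```javascript
// Hypothetical data and helper, used only for illustration
var item = { name: "Widget", price: 30 };

function withTax(amount) {
    // Same 5% tax rate as the earlier example, formatted to two decimals
    return (amount * 1.05).toFixed(2);
}

// Property access, a function call, and a conditional are all valid substitutions
var summary = `${item.name}: ${withTax(item.price)} (${item.price > 20 ? "premium" : "basic"})`;

console.log(summary);   // "Widget: 31.50 (premium)"
```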
The more advanced format of a template string is as follows:
tag`literal${substitution}literal`
This form includes a tag, which is basically just a function that alters the output of the template string. The template string proposal includes a proposal for several built-in tags to handle common cases (those will be discussed later) but it’s also possible to define your own.
A tag is simply a function that is called with the processed template string data. The function receives data about the template string as individual pieces that the tag must then combine to create the finished value. The first argument the function receives is an array containing the literal strings as they are interpreted by JavaScript. This array is organized such that a substitution should be made between items, so there needs to be a substitution made between the first and the second item, the second and the third item, and so on. This array also has a special property called raw, which is an array containing the literal strings as they appear in the code (so you can tell what was written in the code). Each subsequent argument to the tag after the first is the value of a substitution expression in the template string. For example, this is what would be passed to a tag for the last example:
- Argument 1 = [ "The total is ", " (", " with tax)" ], with .raw = [ "The total is ", " (", " with tax)" ]
- Argument 2 = 30
- Argument 3 = 31.5
Note that the substitution expressions are automatically evaluated, so you just receive the final values. That means the tag is free to manipulate the final value in any way that is appropriate. For example, I can create a tag that behaves the same as the default (when no tag is specified) like this:
function passthru(literals) {
    var result = "",
        i = 0;
    while (i < literals.length) {
        result += literals[i++];
        if (i < arguments.length) {
            result += arguments[i];
        }
    }
    return result;
}
And then you can use it like this:
var total = 30,
msg = passthru`The total is ${total} (${total*1.05} with tax)`;
console.log(msg); // "The total is 30 (31.5 with tax)"
In all of these examples, there has been no difference between raw and cooked because there were no special characters within the template string. Consider a template string like this:
tag`First line\nSecond line`
In this case, the tag would receive:
- Argument 1 (cooked) = [ "First line\nSecond line" ], with .raw = [ "First line\\nSecond line" ]
Note that the first item in raw is an escaped version of the string, effectively the same thing that was written in code. You may not always need that information, but it’s around just in case.
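If you want to see the difference between the two arrays for yourself, a tiny tag that just hands back what it receives is enough. This is only a sketch: inspect is a name I made up, and it assumes an engine that passes the literals array with a raw property as described above.

```javascript
// Hypothetical tag that returns copies of the cooked and raw literal arrays
function inspect(literals) {
    return {
        cooked: literals.slice(),
        raw: literals.raw.slice()
    };
}

var parts = inspect`First line\nSecond line`;

// The cooked string contains a real newline character
console.log(parts.cooked[0] === "First line\nSecond line");   // true

// The raw string contains a backslash followed by the letter n
console.log(parts.raw[0] === "First line\\nSecond line");     // true

// So the raw version is exactly one character longer
console.log(parts.raw[0].length - parts.cooked[0].length);    // 1
```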
Multiline strings
The first problem that template string literals were meant to address is multiline strings. As I’ve mentioned in previous posts, this isn’t a big problem for me, but I know that there are a fair number of people who would like this capability. There has been an unofficial way of doing multiline string literals in JavaScript for years using a backslash followed by a newline, such as this:
var text = "First line\n\
Second line";
This has widely been acknowledged as a mistake and something that is considered a bad practice, although it was blessed as part of ECMAScript 5. Many resort to using arrays in order not to use the unofficial technique:
var text = [
"First line",
"Second line"].join("\n");
However, this is pretty cumbersome if you’re writing a lot of text. It would definitely be easier to have a way to include this directly within the literal. Other languages have had this feature for years.
There are of course heredocs[3], such as what is supported in PHP:
$text = <<<EOF
First line
Second line
EOF;
And Python has its triple quoted strings:
text = """First line
Second line"""
In comparison, template string-literals look cleaner because they use fewer characters:
var text = `First line
Second line`;
So it’s pretty easy to see that template strings solve the problem of multiline strings in JavaScript pretty well. This is undoubtedly a case where new syntax is needed, because both the double quote and single quote are already spoken for (and pretty much are exactly the same).
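As a quick sanity check, the three JavaScript approaches from this section should all produce the same string. The sketch below assumes an engine that supports template strings; the variable names are just for illustration.

```javascript
// Backslash continuation with an explicit \n escape
var viaContinuation = "First line\n\
Second line";

// Array join
var viaArray = [
    "First line",
    "Second line"].join("\n");

// Template string: the line break in the source becomes part of the string
var viaTemplate = `First line
Second line`;

console.log(viaTemplate === viaArray);          // true
console.log(viaTemplate === viaContinuation);   // true
```

Keep in mind that any indentation you put before the second line of a template string ends up in the string too, which is why the second line starts at the left margin here.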
Basic string formatting
The problem of basic string formatting hasn’t been solved in JavaScript yet. When I say basic string formatting, I’m talking about simple substitutions within text. Think of sprintf in C or String.format() in C# or Java. This concept isn’t particularly foreign to JavaScript, finding life in a few corners of development.
First, the console.log() method (and its related methods) support basic string formatting in Internet Explorer 8+, Firefox, Safari, and Chrome (Opera doesn’t support string formatting on the console). On the server, Node.js also supports string formatting for its console.log()[4]. You can include %s to substitute a string, %d or %i to substitute an integer, or %f for floating-point values (Node.js also allows %j for including JSON; Firefox and Chrome allow %o for outputting an object[5]). For example:
console.log("Hello %s", "world"); // "Hello world"
console.log("The count is %d", 5); // "The count is 5"
Various JavaScript libraries have also implemented similar string formatting functions. YUI has the substitute()[6] method, which uses named values for string replacements:
YUI().use("substitute", function(Y) {
    var msg = Y.substitute("Hello, {place}", { place: "world" });
    console.log(msg); // "Hello, world"
});
Dojo has a similar mechanism via dojo.string.substitute()[7], though it can also deal with positional substitutions by passing an array:
var msg = dojo.string.substitute("Hello, ${place}", { place: "world" });
console.log(msg); // "Hello, world"
msg = dojo.string.substitute("Hello, ${0}", [ "world" ]);
console.log(msg); // "Hello, world"
It’s clear that basic string formatting is already alive and well in JavaScript and chances are that many developers have used it at some point in time. Keep in mind, simple string formatting isn’t concerned with escaping of values because it is doing simple string manipulation (HTML escaping is discussed later).
In comparison to the already available string formatting methods, template strings visually appear to be very much the same. Here’s how the previous examples would be written using a template string:
var place = "world",
msg = `Hello, ${place}`;
console.log(msg); // "Hello, world"
Syntactically, one could argue that template strings are easier to read because the variable is placed directly into the literal so you can guess the result more easily. So if you’re going to be converting code using older string formatting methods into template strings, it’s a pretty easy conversion if you are using string literals directly in your JavaScript.
The downside of template strings is the same downside experienced using heredocs: the literal must be defined in a scope that has access to the substitution variables. There are a couple of problems with this. First, if a substitution variable isn’t defined in the scope in which a template string is defined, it will throw an error. For example:
var msg = `Hello, ${place}`; // throws error
Because place isn’t defined in this example, the template string actually throws an error because it’s trying to evaluate a variable that doesn’t exist. That behavior is also the cause of the second major problem with template strings: you cannot externalize strings.
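To make the first problem concrete, here's a sketch that catches the error so you can see what kind it is. The identifier missingPlace is deliberately left undeclared and is purely hypothetical; the sketch assumes substitutions are evaluated like any other JavaScript expression.

```javascript
var caught = null;

try {
    // missingPlace is intentionally never declared anywhere
    var greeting = `Hello, ${missingPlace}`;
} catch (ex) {
    caught = ex;
}

// The substitution is evaluated immediately, so the failure happens
// at the point where the template string is defined
console.log(caught instanceof ReferenceError);   // true
```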
When using simple string formatting, as with console.log(), YUI, or Dojo, you have the ability to keep your strings external from the JavaScript code that uses them. This has the advantage of making string changes easier (because they aren’t buried inside of JavaScript code) and allowing the same strings to be used in multiple places. For example, you can define your strings in one place such as this:
var messages = {
welcome: "Hello, {name}"
};
And use them somewhere else like this:
var msg = Y.substitute(messages.welcome, { name: "Nicholas" });
With template strings, you are limited to using substitution only when the literal is embedded directly in your JavaScript along with variables representing the data to substitute. In effect, format strings have late binding to data values and template strings have early binding to data values. That early binding severely limits the cases where template strings can be used for the purpose of simple substitutions.
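Here's a rough sketch of the difference between the two kinds of binding. The format() function is a hypothetical helper I wrote for this comparison (it is not the YUI or Dojo implementation), and wrapping a template string in a function is just one possible way to delay evaluation, assuming template strings behave as described above.

```javascript
// Hypothetical helper: replaces {key} with data[key], leaving unknown keys alone
function format(template, data) {
    return template.replace(/\{(\w+)\}/g, function(match, key) {
        return Object.prototype.hasOwnProperty.call(data, key) ? String(data[key]) : match;
    });
}

// Late binding: the string is plain data and could be loaded from anywhere
var messages = { welcome: "Hello, {name}" };
console.log(format(messages.welcome, { name: "Nicholas" }));   // "Hello, Nicholas"

// Early binding: the template string is evaluated where it is written,
// so the only way to defer it is to wrap it in a function
var templates = {
    welcome: function(name) {
        return `Hello, ${name}`;
    }
};
console.log(templates.welcome("Nicholas"));                     // "Hello, Nicholas"
```

The function wrapper does delay the substitution, but the text still has to live in a JavaScript file, so it doesn't really give you externalized strings.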
So while template strings solve the problem of simple string formatting when you want to embed literals in your JavaScript code, they do not solve the problem when you want to externalize strings. For this reason, I believe that even with the addition of template strings, some basic string formatting capabilities need to be added to ECMAScript.
Localization of strings
Closely related to simple string formatting is localization of strings. Localization is a complex problem encompassing all aspects of a web application, but localization of strings is what template strings are supposed to help with. The basic idea is that you should be able to define a string with placeholders in one language and be able to easily translate the strings into another language that makes use of the same substitutions.
The way this works in most systems today is that strings are externalized into a separate file or data structure. Both YUI[9] and Dojo[10] support externalized resource bundles for internationalization. Fundamentally, these work the same way as simple string formatting does, where each of the strings is a separate property in an object that can be used in any number of places. The strings can also contain placeholders for substitutions by the library’s method for doing so. For example:
// YUI
var messages = Y.Intl.get("messages");
console.log(Y.substitute(messages.welcome, { name: "Nicholas" }));
Since the placeholder in the string never changes regardless of language, the JavaScript code is kept pretty clean and doesn’t need to take into account things like different order of words and substitutions in different languages.
The approach that template strings seem to be recommending is more one of a tool-based process. The strawman proposal talks about a special msg tag that is capable of working with localized strings. The purpose of msg is only to make sure that the substitutions themselves are being formatted correctly for the current locale (which is up to the developer to define). Other than that, it appears to only do basic string substitution. The intent seems to be to allow static analysis of the JavaScript file such that a new JavaScript file can be produced that correctly replaces the contents of the template string with text that is appropriate for the locale. The first example given is of translating English into French assuming that you already have the translation data somewhere:
// Before
alert(msg`Hello, ${world}!`);
// After
alert(msg`Bonjour ${world}!`);
The intent is that the first line is translated to the second line by some as-yet-to-be-defined tool. For those who don’t want to use this tool, the proposal suggests including the message bundle in line such that the msg tag looks up its data in that bundle in order to do the appropriate replacement. Here is that example:
// Before
alert(msg`Hello, ${world}!`);
// After
var messageBundle_fr = { // Maps message text and disambiguation meta-data to replacement.
'Hello, {0}!': 'Bonjour {0}!'
};
alert(msg`Hello, ${world}!`);
The intent is that the first line is translated into the several lines after it before going to production. You’ll note that in order to make this work, the message bundle is using format strings. The msg tag is then written as:
function msg(parts) {
    var key = ...; // 'Hello, {0}!' given ['Hello, ', world, '!']
    var translation = myMessageBundle[key];
    return (translation || key).replace(/\{(\d+)\}/g, function (_, index) {
        // not shown: proper formatting of substitutions
        return parts[(index << 1) | 1];
    });
}
So it seems that in an effort to avoid format strings, template strings are only made to work for localization purposes by implementing their own simple string formatting.
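To show how much of this is really just format strings in disguise, here's a runnable sketch of a msg-style tag. It is my own simplified take, not the code from the proposal: the way the lookup key is built, the contents of the bundle, and the value of world are all assumptions made for illustration, and locale-aware formatting of the substitutions is left out entirely.

```javascript
// Hypothetical bundle, keyed by the English text with {n} placeholders
var myMessageBundle = {
    "Hello, {0}!": "Bonjour {0}!"
};

function msg(literals) {
    // Substitution values are every argument after the literals array
    var values = Array.prototype.slice.call(arguments, 1);

    // Rebuild a format-string key such as "Hello, {0}!" from the literal pieces
    var key = literals.reduce(function(soFar, literal, i) {
        return soFar + "{" + (i - 1) + "}" + literal;
    });

    // Fall back to the original text when there is no translation
    var translation = myMessageBundle[key] || key;

    // Plain format-string substitution on the translated text
    return translation.replace(/\{(\d+)\}/g, function(match, index) {
        var position = Number(index);
        return position < values.length ? String(values[position]) : match;
    });
}

var world = "monde";
console.log(msg`Hello, ${world}!`);      // "Bonjour monde!"
console.log(msg`Goodbye, ${world}!`);    // "Goodbye, monde!" (no translation in the bundle)
```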
For this problem, it seems like I’m actually comparing apples to oranges. The way that YUI and Dojo deal with localized strings and resource bundles is very much catered towards developers. The template string approach is very much catered towards tools and is therefore not very useful for people who don’t want to go through the hassle of integrating an additional tool into their build system. I’m not convinced that the localization scheme in the proposal represents a big advantage over what developers have already been doing.
HTML escaping
This is perhaps the biggest problem that template strings are meant to address. Whenever I talk to people on TC-39 about template strings, the conversation always seems to come back to secure escaping for insertion into HTML. The proposal itself starts out by talking about cross-site scripting attacks and how template strings help to mitigate them. Without a doubt, proper HTML escaping is important for any web application, both on the client and on the server. Fortunately, we have seen some logic-less templating languages pop up, such as Mustache, which automatically escape output by default.
When talking about HTML escaping, it’s important to understand that there are two distinct classes of data. The first class of data is controlled. Controlled data is data that is generated by the server without any user interaction. That is to say, the data was programmed in by the developer and was not entered by the user. The other class of data is uncontrolled, which is the very thing that template strings were intended to deal with. Uncontrolled data is data that comes from the user, and you can therefore make no assumptions about its content. One of the big arguments against format strings is the threat of uncontrolled format strings[11] and the damage they can cause. This happens when uncontrolled data is passed into a format string and isn’t properly escaped along the way. For example:
// YUI
var html = Y.substitute("<p>Welcome, {name}</p>", { name: username });
In this code, the HTML generated could potentially have a security issue if username hasn’t been sanitized prior to this point. It’s possible that username could contain HTML code, most notably JavaScript, that could compromise the page into which the string was inserted. This may not be as big of an issue on the browser, where script tags are innocuous when inserted via innerHTML, but on the server this is certainly a major issue. YUI has Y.Escape.html() to escape HTML that could be used to help:
// YUI
YUI().use("substitute", "escape", function(Y) {
    var escapedUsername = Y.Escape.html(username),
        html = Y.substitute("<p>Welcome, {name}</p>", { name: escapedUsername });
});
After HTML escaping, the username is a bit more sanitized before being inserted into the string. That provides you with a basic level of protection against uncontrolled data. The problems can get a little more complicated than that, especially when you’re dealing with values that are being inserted into HTML attributes, but essentially escaping HTML before inserting into an HTML string is the minimum you should do to sanitize data.
Template strings aim to solve the problem of HTML escaping plus a couple of other problems. The proposal talks about a tag called safehtml, which would not only perform HTML escaping, but would also look for other attack patterns and replace them with innocuous values. The example from the proposal is:
url = "http://example.com/";
message = query = "Hello & Goodbye";
color = "red";
safehtml`<a href="${url}?q=${query}" onclick=alert(${message}) style="color: ${color}">${message}</a>`
In this instance, there are a couple of potential security issues in the HTML literal. The URL itself could end up being a JavaScript URL that does something bad, the query string could also end up being something bad, and the CSS value could end up being a CSS expression in older versions of Internet Explorer. For instance:
url = "javascript:alert(1337)";
color = "expression(alert(1337))";
Inserting these values into a string using simple HTML escaping, as in the previous example, would not prevent the resulting HTML from containing dangerous code. An HTML-escaped JavaScript URL still executes JavaScript. The intent of safehtml is to not only deal with HTML escaping but also to deal with these attack scenarios, where a value is dangerous regardless of it being escaped or not.
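It's easy to verify that claim. The sketch below uses a bare-bones escaping function (basicEscape is a name I made up, and it only handles four characters) to show that a JavaScript URL passes through HTML escaping completely untouched, because it contains none of the characters being escaped.

```javascript
// Hypothetical minimal escaper covering <, >, " and &
function basicEscape(text) {
    var entities = { "<": "&lt;", ">": "&gt;", "\"": "&quot;", "&": "&amp;" };
    return String(text).replace(/[<>"&]/g, function(c) {
        return entities[c];
    });
}

var url = "javascript:alert(1337)";
var link = "<a href=\"" + basicEscape(url) + "\">Click me</a>";

// Nothing in the URL needed escaping, so it comes out unchanged
console.log(basicEscape(url) === url);   // true
console.log(link);   // <a href="javascript:alert(1337)">Click me</a>
```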
The template string proposal claims that in a case such as with JavaScript URLs, the values will be replaced with something completely innocuous and therefore prevent any harm. What it doesn’t cover is how the tag will know whether a “dangerous” value is actually controlled data and intentionally being inserted versus uncontrolled data that should always be changed. My hunch from reading the proposal is that it always assumes dangerous values to be dangerous and it’s up to the developer to jump through hoops to include some code that might appear dangerous to the tag. That’s not necessarily a bad thing.
So do template strings solve the HTML escaping problem? As with simple string formatting, the answer is yes, but only if you are embedding your HTML right into JavaScript where the substitution variables exist. Embedding HTML directly into JavaScript is something that I warned people not to do because it becomes hard to maintain. With templating solutions such as Mustache, the templates are often read in at runtime from someplace or else precompiled into functions that are executed directly. It seems that the intended audience for the safehtml tag might actually be the template libraries. I could definitely see this being useful when templates are compiled. Instead of compiling into complicated functions, the templates could be compiled into template strings using the safehtml tag. That would eliminate some of the complexity of template languages, though I’m sure not all.
Short of using a tool to generate template strings from strings or templates, I’m having a hard time believing that developers would go through the hassle of using them when creating a simple HTML escape function is so easy. Here’s the one that I tend to use:
function escapeHTML(text) {
    return text.replace(/[<>"&]/g, function(c) {
        switch (c) {
            case "<": return "&lt;";
            case ">": return "&gt;";
            case "\"": return "&quot;";
            case "&": return "&amp;";
        }
    });
}
I recognize that doing basic HTML escaping isn’t enough to completely secure an HTML string against all threats. However, if the template string-based HTML handling must be done directly within JavaScript code, I think many developers will still end up using basic HTML escaping instead. Libraries are already providing this functionality to developers; it would be great if we could just have a standard version that everyone can rely on so we can stop shipping the same thing with every library. As with the msg tag, which needs simple string formatting to work correctly, I could also see safehtml needing basic HTML escaping in order to work correctly. They seem to go hand in hand.
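To illustrate how the two could go hand in hand, here's a sketch of a tag built on top of a basic escaping function. To be clear, this is not the safehtml tag from the proposal: the html tag name is my own, it escapes every substitution the same way regardless of where it appears, and it does nothing about JavaScript URLs or CSS expressions.

```javascript
// Basic escaper, with String() added so non-string values are accepted
function escapeHTML(text) {
    var entities = { "<": "&lt;", ">": "&gt;", "\"": "&quot;", "&": "&amp;" };
    return String(text).replace(/[<>"&]/g, function(c) {
        return entities[c];
    });
}

// Hypothetical tag: literals are trusted as written, substitutions are escaped
function html(literals) {
    var result = literals[0];
    for (var i = 1; i < literals.length; i++) {
        result += escapeHTML(arguments[i]) + literals[i];
    }
    return result;
}

var username = "<script>alert(1)</script>";
console.log(html`<p>Welcome, ${username}</p>`);
// "<p>Welcome, &lt;script&gt;alert(1)&lt;/script&gt;</p>"
```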
Conclusion
Template strings definitely address all four of the problems I outlined at the beginning of this post. They are most successful in addressing the need to have multiline string literals in JavaScript. The solution is arguably the most elegant one available and does the job well.
When it comes to simple string formatting, template strings solve the problem in the same way that heredocs solve the problem. It’s great if you are going to be embedding your strings directly into the code near where the substitution variables exist. If you need to externalize your strings, then template strings aren’t solving the problem for you. Given that many developers externalize strings into resource bundles that are included with their applications, I’m pessimistic about the potential for template strings to solve the string formatting needs of many developers. I believe that a format string-based solution, such as the one that Crockford proposed[12], still needs to be part of ECMAScript for it to be complete and for this problem to be completely solved.
I’m not at all convinced that template strings solve a localization use case. It seems like this use case was shoehorned in and that current solutions require a lot less work to implement. Of course, the part that I found most interesting about the template strings solution for localization is that it made use of format strings. To me, that’s a telling sign that simple string formatting is definitely needed in ECMAScript. Template strings seem like the most heavy-handed solution to the localization problem, even with the as-yet-to-be-created tools that the proposal talks about.
Template strings definitely solve the HTML escaping problem, but once again, only in the same way that simple string formatting is solved. With the requirement of embedding your HTML inside of the JavaScript and having all variables present within that scope, the safehtml tag seems to only be useful from the perspective of templating tools. It doesn’t seem like something that developers will use by hand since many are using externalized templates. If templating libraries that precompile templates are the target audience for this feature, then it has a shot at being successful. I don’t think, however, that it serves the needs of other developers. I still believe that HTML escaping, as error-prone as it might be, is something that needs to be available as a low-level method in ECMAScript.
Note: I know there are a large number of people who believe HTML escaping shouldn’t necessarily be part of ECMAScript. Some say it should be a browser API, part of the DOM, or something else. I disagree with that sentiment because JavaScript is used quite frequently, on both the client and server, to manipulate HTML. As such, I believe that it’s important for ECMAScript to support HTML escaping along with URL escaping (which it has supported for a very long time).
Overall, template strings are an interesting concept that I think has potential. Right off the bat, they solve the problem of having multiline strings and heredocs-like functionality in JavaScript. They also appear to be an interesting solution as a generation target for tools. I don’t think that they are a suitable replacement for simple string formatting or low-level HTML escaping for JavaScript developers, both of which could be useful within tags. I’m not calling for template strings to be ripped out of ECMAScript, but I do think that they don’t solve enough of the problems for JavaScript developers that they should preclude other additions for string formatting and escaping.
Update (01-August-2012) – Updated article to mention that braces are always required in template strings. Also, addressed some of the feedback from Allen’s comment by changing “quasi-literals” to “template strings” and “quasi handlers” to “tags”. Updated description of slash-trailed multiline strings.
Update (02-August-2012) – Fixed YUI method name based on Ryan’s comment. Fixed escapeHTML() function encoding issue per Jakub’s comment.
References
- Domain-specific language (Wikipedia)
- ECMAScript quasi-literals (ECMAScript Wiki)
- Here-Documents (Wikipedia)
- The built-in console module by Charlie McConnell (Nodejitsu)
- Outputting text to the console (Mozilla Developer Network)
- YUI substitute method (YUILibrary)
- dojo.string.substitute() (Dojo Toolkit)
- YUI Internationalization (YUILibrary)
- Translatable Resource Bundles by Adam Peller
- Uncontrolled format string (Wikipedia)
- String.prototype.format by Douglas Crockford (ECMAScript Wiki)
Disclaimer: Any viewpoints and opinions expressed in this article are those of Nicholas C. Zakas and do not, in any way, reflect those of my employer, my colleagues, Wrox Publishing, O'Reilly Publishing, or anyone else. I speak only for myself, not for them.
26 Comments
This article is based on the ES6 strawman wiki pages. There are a number of differences between quasi-literals as described in this article and what is actually specified in the current ES6 draft. A summary (PDF) of the differences between the wiki and the spec draft is available.
Also, as of the most recent TC-39 meeting we are now calling this language feature “template strings” instead of “quasi-literals”.
BTW, string literal line continuations (backslash newline in a string literal) were standardized as part of ES5.
As you noted from the quasi-literal strawman, they are designed to support embedded domain-specific languages. Designing an appropriate DSL is an important part of using them. The examples on the wiki are neither mature nor even good DSLs, but just examples given to show the flavor of what is possible. More sophisticated DSLs are needed to adequately solve problems like localization and HTML embedding.
For example, you might imagine a DSL for constructing localized strings from separately stored localized string resources. You might come up with something that would be used like this:
Localize(bundleURL, currentLocale) `msg1%date:${expr1}msg3${expr2}msg3%$zzzn.00:${expr4}`
The Localize call produces the template string tag function. The template string is expressed in a DSL that consists of resources identifiers (relative to the bundle) and format strings (perhaps based on the ECMAScript Internationalization API). The format strings describe how to process the substitution values.
ES6 template strings are just building blocks for higher-level solutions. The real power comes from the DSLs and associated tag functions that will be defined. Feedback from actual experiments designing and implementing such DSLs will be particularly appreciated by TC-39.
Allen Wirfs-Brock on August 1st, 2012 at 4:12 pm
@Allen – thanks for all the updates, I was having a hard time jumping around between strawman, REPLs, and such.
Nicholas C. Zakas on August 1st, 2012 at 4:20 pm
Note that
var text = "First line\
Second line";
is a misnomer because that’s not actually creating a multi-line string, it’s defining a single-line string over multiple lines of code. It’s equivalent to:
["First line","Second line"].join("");
You could instead actually create a multi-line string by actually inserting the new lines:
var text = "First line\nSecond line";
var text = "First line\n\
Second line";
By contrast, the multi-line quasis seem to actually automatically insert those new-lines automatically (haven’t tested THAT to confirm for myself or not).
I bring this up to counter the implication that quasis are a better \-line-terminated string. I don’t think they’re the same thing at all in that respect. Sometimes you’ll want the automatic new-line’ing, and sometimes (more often for me) you will just want a single-line string which has better source code formatting without any unintended side-effects on the contents of the string.
Kyle Simpson on August 1st, 2012 at 5:45 pm
@Kyle
Note that line continuations can be used in template strings, just like they are used in string literals. This allows a template string to be broken across multiple lines without inserting new lines into the actual template string text.
let str1 = `this is a \
template string that spans multiple lines\
with no internal new lines`;
let str2 = `this is a
template string that spans multiple lines
and has internal new lines`;
let str3 = `this is also a \n\
template string that spans multiple lines \n\
and has internal new lines.\n\
But it uses explicit new lines escapes and line continuations`;
Allen Wirfs-Brock on August 1st, 2012 at 6:02 pm
@Allen
Thanks for the clarification. That further reinforces my point, which is that for line continuations itself, the “template strings” are not “better”.
In fact, personally, I would call them a little “worse”, because I would find the semantics that you’ve now confirmed confusing/surprising. In a template string, if I span a string across lines and don’t add a \, I get new-lines inserted, but if I do add a \, I *don’t* get them added? It’s confusing to omit something and get stuff added. That to me seems like a potentially kinda surprising behavior.
I’m sure this boat has already sailed, but it’d sure be nice if there was a way to flag (kinda like regexes) this behavior to control it. If I could flag a template string as not wanting automatically-inserted new-lines, then I could use multi-line strings in an intuitive way without the unintended side effect.
As it currently stands, template strings are slightly worse for my typical use case, because unlike string literals (which would fail quickly with syntax errors), if I omit the line-continuation, I get this (often undesired) side effect.
Just my 2 cents.
Kyle Simpson on August 1st, 2012 at 6:36 pm
@Kyle – I think it’s easy enough to not use line continuations if you don’t want to, isn’t it? No one is going to force you to do it, and I’m not sure why you would end up using that pattern if that wasn’t your intent. Keep in mind, you can always change the results of the template string by creating your own tag. You could very easily create one that strips out all new lines, for example.
Nicholas C. Zakas on August 1st, 2012 at 7:13 pm
You mentioned HTML escaping is trivial and I’d also like to point out that basic string substitution is similarly trivial:
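function sub(template, context) {
    // Replace each ${key} with the escaped value from context;
    // leave the placeholder untouched when the key is missing.
    return template.replace(/\$\{([^}]+)\}/g, function (match, key) {
        var found = Object.prototype.hasOwnProperty.call(context, key);
        return found ? escapeHTML(String(context[key])) : match;
    });
}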
Karl G on August 1st, 2012 at 7:25 pm
@Nicholas – my point was the opposite. That to get only a line continuation, and not an implicit new-line, you have to keep doing the \, whereas if you accidentally leave one off, you’ll get no syntax error or warning, you’ll just silently get the extra side-effect of getting a new-line inserted.
If you want new-lines inserted as a regular use-case, that will probably be welcomed relief from all those \n’s. But if, like me, you use “multiple lines for strings” purely to keep code formatting cleaner, and you don’t want new-lines implied by how you style your code, you’ll have to take extra care inside of template strings, because the silent footgun is always lurking.
I just don’t know that many real use-cases for why you want a string literal to have explicit new-lines internally. And the fewer number of actual use-cases for it, I’d say being explicit with \n is far clearer in code than relying on implied auto-inserted new-lines. But then again, I also hate implied auto-inserted semi-colons, too. Call me crazy, but I like explicit code.
Again, if I could add a flag to the template string (like i can for regex literals) to turn off THAT behavior, then I’d fully agree that template strings replace our current “multi-line strings”. But as it stands, I think they’re “worse” for *that* use-case.
Kyle Simpson on August 1st, 2012 at 8:36 pm
Minor nit: YUI’s HTML escaping function is Y.Escape.html(), not Y.Escape.escapeHTML().
Ryan Grove on August 1st, 2012 at 8:52 pm
The example code defining the escapeHTML function is incorrectly formatted: for example, for > it should return &gt;, not > (double escaping is needed to correctly format code).
As to heredoc syntax: Perl also supports it in the form of
$subst = <<"EOF";
foo $bar baz
EOF
$literal = <<'EOF';
foo $bar baz
EOF
Nb. POSIX shell supports second format (no variable substitution in heredoc) via \EOF.
Do template strings support advanced formatting, like e.g. %.2f or %02h?
Jakub Narebski on August 2nd, 2012 at 3:25 am
@Kyle
One of the principles applied to this design is that a “cooked” template string without any substitutions produces the same value as the equivalent string literal. This requires that line continuations work exactly the same for both template strings and string literals.
However, the “raw” value of a template string includes the actual line continuation characters. You can define your own “flag” for ignoring line continuations and/or literal embedded line terminators by defining a template handler function (a “tag”). For example, you could then write things like this:
noNL`this
will have \
no embedded new lines`
The major complication of processing the raw value is that you will also have to do all the escape sequence expansions for the string yourself within the handler function.
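A rough sketch of such a handler follows. It assumes the handler receives the raw chunks on strings.raw (which is how tagged templates ended up working in ES6) and that only a handful of escapes need expanding. The helper names expandRaw and SIMPLE_ESCAPES are made up for illustration. Because the raw text still shows the difference between a real line break and a written-out \n escape, the handler can drop the former and keep the latter.

```javascript
// Hypothetical tag function: the name noNL comes from the example above,
// the implementation is only an illustrative sketch.
const SIMPLE_ESCAPES = new Map([
  ["n", "\n"],
  ["t", "\t"],
  ["\\", "\\"],
  ["`", "`"],
  ["$", "$"],
]);

// Expand one raw chunk: drop literal line breaks and line continuations,
// but keep explicit escapes such as \n.
function expandRaw(raw) {
  return raw.replace(/\\([\s\S])|\n/g, (match, escaped) => {
    if (match === "\n" || escaped === "\n") {
      return ""; // a literal line break, or a backslash + line break
    }
    // Only a few escapes are handled here; anything else is left untouched.
    return SIMPLE_ESCAPES.has(escaped) ? SIMPLE_ESCAPES.get(escaped) : match;
  });
}

function noNL(strings, ...values) {
  return strings.raw.reduce(
    (out, raw, i) =>
      out + expandRaw(raw) + (i < values.length ? String(values[i]) : ""),
    ""
  );
}

// The line continuation and the bare line break both vanish;
// only the explicit \n escape produces a new line in the result.
const text = noNL`this \
will have \n one explicit
 new line`;
console.log(JSON.stringify(text));
```

A complete handler would also need to deal with \x, \u and the remaining escape sequences, which is the complication mentioned above.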
Allen Wirfs-Brock on August 2nd, 2012 at 8:42 am
@Allen-
I’m not saying that *line-continuations* should work differently. I agree they should work the same.
What I’m (slightly) objecting to is that template strings have this extra magical behavior of inserting new lines when a line-continuation is not present. I think that’s a different/orthogonal complaint to the design principle you are asserting.
You might leave out the line-continuation intentionally, or you might accidentally omit it. The fact that you get magical side-effect behavior when you accidentally omit it, that’s my concern.
It’s similar to how ASI in a return statement can give you “unexpected” results if the return expression is on the next line from the `return` statement. That’s a “footgun” that forces people to understand and not accidentally forget/omit the ASI rules.
Yes, I’m sure I could create a `noNL` template processing function, but I’m not sure how it could possibly distinguish between explicit \n’s that I may choose to include, and implicitly added ones based on this magical behavior. What I was asking for in a flag was essentially a way to turn off the magical behavior (opt-out), so that my common use-case of strings-defined-on-multiple-lines would be as robust in template strings as it currently is in string literals. (by robust, I mean less susceptible to unseen errors).
Without a way to effectively opt-out (as I understand it, the `noNL` approach would NOT suffice), since template strings remove the syntax warning complaint that string literals give me if I forget a line-continuation, I think that makes string literals weaker for the specific use-case of strings-defined-on-multiple-lines.
Kyle Simpson on August 2nd, 2012 at 9:12 am
This is an honest question:
With regard to multiline strings, you say: “This is undoubtedly a case where new syntax is needed, because both the double quote and single quote are already spoken for (and pretty much are exactly the same).”
But if there’s a new syntax added to handle all these new features, why isn’t it possible to state that string literals can include line breaks?
There’s no real need to keep backward compatibility, as any script that uses those backticks can’t work correctly in older browsers anyway. But if the new parser is so smart as to understand that a string that starts with a backtick ends only when the next one is found, why isn’t it possible to adjust that same behavior for single or double quoted strings?
Alfonso on August 2nd, 2012 at 10:03 am
@Kyle – I can’t help but disagree with your assertion that this behavior is a magical side effect when, in fact, treating actual new lines as new lines within the template string was one of the design goals. It’s not a side effect, it’s an intentional decision. To me this is as simple as using the right tool for the right job. If you are using PHP, you would use strings in one case and heredocs in another. Template strings are a different animal and so they should be allowed to behave differently. I don’t hear a lot of people arguing for requiring you to write “\n” for new lines; to me, that’s a big headache.
Nicholas C. Zakas on August 2nd, 2012 at 10:34 am
@Nicholas-
You made the assertion in this article that template strings are better at multi-line strings than string literals are, but as I pointed out earlier, you actually confused the concept of strings-defined-on-multiple-lines and strings-that-have-multiple-lines-in-them.
All I’ve been trying to counter is that for strings-defined-on-multiple-lines, template strings are, in fact, more magical and thus more surprising. If you *want* new-lines in your content, they’re great. If you don’t want new-lines, they’re worse than just using string literals.
Moreover, I really can’t come up with very many use-cases for *wanting* new-lines automatically added to your string content. Can you elaborate on which use-cases you think are now so much better because of auto-new-lining?
I can see that creating content which you intend to inject into a `textarea` content would be nice to have the new-lines automatically added. And maybe similarly `contenteditable`. And also maybe content in alert() boxes.
But beyond that, I’m just not sure I see why this magical behavior is awesome? It seems the vast majority of use-cases where you want to create multi-line content, you have to do so with markup (like `p` or `div` or `br` separators).
In what other places will it be awesome to have magically added new-lines? I don’t get it.
Kyle Simpson on August 2nd, 2012 at 10:50 am
@Kyle – This is a feature that is available in numerous languages, and it’s there for a reason: if you need to write formatted text in your code, it’s a pain in the ass to do that using regular strings. There are a bunch of use cases for this (including the several that you mentioned), I would argue more than for the line continuation pattern. If you want a simple example, consider a help screen for a command line program (our Node.js friend!). In that case, I just want a block of text formatted the way that I want it. Otherwise, I end up using something horrible like an array of strings as I showed earlier in this article. Anyplace you’d use heredocs in another language, you would use template strings in JavaScript. (You’ll note that I prefer externalizing strings when possible, but I do see times when it is nice to be able to do this.)
I don’t think that the new syntax or this behavior is confusing at all, especially given my experience with similar features in other languages.
Nicholas C. Zakas on August 2nd, 2012 at 11:03 am
@Alfonso
It would be possible to extend the syntax of string literals to allow them to span multiple lines without using explicit line continuations. However, that wouldn’t necessarily be a good idea. The reason such strings are currently a syntax error is that leaving out the trailing quote is a common mistake. It’s desirable to catch errors as close to their source as possible. We don’t really want to change that aspect of JS.
This is in conflict with the need to occasionally express multiple line strings and a desire to keep such literals readable by avoiding escape sequences or excessively long lines. The back-tick literals provide a way to explicitly opt in to expressing a multiple line string. Because they are new syntax, they also provide a place to introduce other new syntax, such as ${} substitutions, that cannot compatibly be added to existing literal forms.
Arguably, those are independent use cases that don’t need to be combined. But we want to be economical in introducing new syntax into the language and this leads to combining multiple use cases into a single construct. Making such decisions is where language design becomes more an art than a science. Particularly, when evolving an existing language. However, it also leads to disagreements about esthetics.
Allen Wirfs-Brock on August 2nd, 2012 at 11:05 am
I see you corrected the article to be clearer about the distinction of strings-on-multiple-lines and multiple-lines-in-strings. One final nit on what you wrote:
“There has been an unofficial way of doing multiline string literals and JavaScript for years using a backslash followed by a newline, such as this:”
should read:
“There has been an unofficial way of doing multiline string literals and JavaScript for years using a newline followed by a backslash, such as this:”
————-
BTW, I’ve literally never once used (or needed) HEREDOC syntax in any other language I’ve worked in (perl, php, c/c++, pascal, basic, etc). I’m sure this is why I see this stuff as foreign to JavaScript. In almost every case where I can see it being useful in JavaScript, I’d actually get that string content from a data source (variable, etc), and the “literal” syntax would be moot. As such, it seems like, at best, a toy feature, IMHO.
Kyle Simpson on August 2nd, 2012 at 11:18 am
Found this really informative. Thanks!
Minor suggestion for the escape function you mention: Single quotes need escaping too, and IE8 and below will treat a backtick like a quote for HTML attribute purposes. I prefer an html escape that makes that an entity as well.
function escapeHTML(text) {
    return text.replace(/[<>"'&`]/g, function(c) {
        switch (c) {
            case "<": return "&lt;";
            case ">": return "&gt;";
            case "\"": return "&quot;";
            case "'": return "&#39;";
            case "&": return "&amp;";
            case "`": return "&#96;";
        }
    });
}
Adam Ahmed on August 4th, 2012 at 6:21 am
@Allen
So the only reason is that you don’t want to change that syntax.
Everyone makes little mistakes while hand coding, so for me a lint tool is essential, instead of waiting for the browser to load and execute the code to find out that I made a little mistake. And it’s as easy to make mistakes with the new markers as with the old ones.
And people now will try to use this new syntax to get easy multiline strings even if they don’t care about the rest of features (and forces the js engine to waste time parsing the string)
Another question:
Is it possible to turn a normal string into these “template strings”?
As the article points out, at least at the moment this format is not suitable to use as a translation helper. In those cases (at least in some projects, I’m not gonna claim that this is the only or the best way) an object stores all the strings indexed by keys, and according to the user language only the file with those translations is loaded.
But it’s not possible to store template strings as a reference in an object because it seems that they are converted as soon as they are defined, so if I want to store a string to format the file size we can’t do
translation.size = `Size: ${size} Kb.`
We have to keep using
translation.size = "Size: %0 Kb."
because the size variable isn’t defined when the translation file is loaded (and of course, it really needs to be replaced for each file that we want to print)
This is the kind of problem that String.format(translation.size, size) would allow, or even better translation.size.format(size)
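For illustration, a helper like that could be sketched as follows. Both format and the sizeTemplate property are hypothetical names, not built-ins. The second entry shows another option under the same assumptions: wrapping a template string in a function delays its evaluation until the value actually exists.

```javascript
// Hypothetical helper, not a built-in: replaces %0, %1, ... with the
// corresponding argument and leaves unmatched placeholders untouched.
function format(template, ...args) {
  return template.replace(/%(\d+)/g, (match, index) =>
    Number(index) < args.length ? String(args[Number(index)]) : match
  );
}

const translation = {
  // Plain string: nothing is substituted until format() is called.
  size: "Size: %0 Kb.",
  // Template string wrapped in a function: evaluated on every call,
  // so `size` does not need to exist when the translations are loaded.
  sizeTemplate: (size) => `Size: ${size} Kb.`,
};

console.log(format(translation.size, 42));
console.log(translation.sizeTemplate(42));
```

The plain-string form can still be stored in a separate translation file as data, while the function form requires the translation file to be code.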
Alfonso on August 4th, 2012 at 8:47 am
Hi,
A quick note: the references inside the article seem to be a bit messed up, the index seems to be off by one starting from [8].
Harri on August 5th, 2012 at 2:08 pm
I just wonder, for multi-line strings, what would actually be the problem with just “extending” the syntax so that
var someString = 'foo
bar';
// resp.
var someString = "foo
bar";
would be considered valid? (So basically the same way PHP does with normal strings.)
ChrisB on August 6th, 2012 at 7:42 am
@Chris – Allen already answered that in his comment.
Nicholas C. Zakas on August 6th, 2012 at 9:42 am
@Adam – Thanks, great tip!
Nicholas C. Zakas on August 6th, 2012 at 9:42 am
@Harri – Sorry about that, fixed now!
Nicholas C. Zakas on August 6th, 2012 at 9:44 am
And what exactly would be the difference between leaving out the trailing ” of a doublequote terminated multi line text literal compared to leaving out the trailing ` of a backtick terminated multi line text literal …?
ChrisB on August 8th, 2012 at 2:12 am