Gist: <https://gist.github.com/brentdax/c580bae68990b160645c030b2d0d1a8f>
Multiline string literals
* Proposal: SE-NNNN
<https://github.com/apple/swift-evolution/blob/master/proposals/NNNN-name.md>
* Author(s): Brent Royal-Gordon <https://github.com/brentdax>
* Status: First Draft
* Review manager: TBD
Introduction
In Swift 2.2, the only means to insert a newline into a string literal is
the |\n| escape. String literals specified in this way are generally ugly
and unreadable. We propose a multiline string feature inspired by English
punctuation which is a straightforward extension of our existing string
literals.
Swift-evolution thread: multi-line string literals.
<https://lists.swift.org/pipermail/swift-evolution/Week-of-Mon-20160418/015500.html>
Draft Notes
* This draft differs from the prototypes being thrown around on the list
  in that it specifies that comments should be treated as whitespace, and
  that whitespace-only lines in the middle of a multiline string should
  be ignored. I'm not sure if this is feasible from a parsing standpoint,
  and I'd like feedback from implementers on this point.
* This draft also specifies diagnostics which should be included.
  Feedback on whether these are good choices would be welcome.
* I am considering allowing you to put a backslash before the newline to
  indicate it should /not/ be included in the literal. In other words,
  this code:
  print("foo\
        "bar")
  would print |foobar|. However, I think this should probably be
  proposed separately, because there may be a better way to do it.
* I've listed only myself as an author because I don't want to put anyone
  else's name to a document they haven't seen, but there are others who
  deserve to be listed (John Holdsworth at least). Let me know if you
  think you should be included.
Motivation
As Swift begins to move into roles beyond app development, code which needs
to generate text becomes a more important use case. Consider, for instance,
generating even a small XML string:
let xml = "<?xml version=\"1.0\"?>\n<catalog>\n\t<book id=\"bk101\" empty=\"\">\n\t\t<author>\(author)</author>\n\t</book>\n</catalog>"
The string is practically unreadable, its structure drowned in escapes and
run-together characters; it looks like little more than line noise. We can
improve its readability somewhat by concatenating separate strings for each
line and using real tabs instead of |\t| escapes:
let xml = "<?xml version=\"1.0\"?>\n" +
          "<catalog>\n" +
          "	<book id=\"bk101\" empty=\"\">\n" +
          "		<author>\(author)</author>\n" +
          "	</book>\n" +
          "</catalog>"
However, this creates a more complex expression for the type checker, and
there's still far more punctuation than ought to be necessary. If the most
important goal of Swift is making code readable, this kind of code falls
far short of that goal.
Proposed solution
We propose that, when Swift is parsing a string literal, if it reaches the
end of the line without encountering an end quote, it should look at the
next line. If it sees a quote mark there (a "continuation quote"), the
string literal contains a newline and then continues on that line.
Otherwise, the string literal is unterminated and syntactically invalid.
Our sample above could thus be written as:
let xml = "<?xml version=\"1.0\"?>
          "<catalog>
          "	<book id=\"bk101\" empty=\"\">
          "		<author>\(author)</author>
          "	</book>
          "</catalog>"
(Note that GitHub is applying incorrect syntax highlighting to this code
sample, because it's applying Swift 2 rules.)
This format's unbalanced quotes might strike some programmers as strange,
but it attempts to mimic the way multiple lines are quoted in English
prose. As an English Stack Exchange answer illustrates
<http://english.stackexchange.com/a/96613/64636>:
“That seems like an odd way to use punctuation,” Tom said. “What harm
would there be in using quotation marks at the end of every paragraph?”
“Oh, that’s not all that complicated,” J.R. answered. “If you closed
quotes at the end of every paragraph, then you would need to reidentify
the speaker with every subsequent paragraph.
“Say a narrative was describing two or three people engaged in a
lengthy conversation. If you closed the quotation marks in the previous
paragraph, then a reader wouldn’t be able to easily tell if the
previous speaker was extending his point, or if someone else in the
room had picked up the conversation. By leaving the previous
paragraph’s quote unclosed, the reader knows that the previous speaker
is still the one talking.”
“Oh, that makes sense. Thanks!”
Similarly, omitting the ending quotation mark tells the code's reader (and
compiler) that the literal continues on the next line, while including the
continuation quote reminds the reader (and compiler) that this line is part
of a string literal.
Benefits of continuation quotes
It would be simpler to not require continuation quotes, so why are they
required by this proposal? There are three reasons:
1. *They help the compiler pinpoint errors in string literal
   delimiting.* If continuation quotes were not required, then a missing
   end quote would be interpreted as a multiline string literal. This
   string literal would continue until the compiler encountered either
   another quote mark (perhaps at the site of another string literal or in
   a comment) or the end of the file. In either case, the compiler could at
   best only indicate the start of the runaway string literal; in
   pathological cases (for instance, if the next string literal
   was |"+"|), it might not even be able to do that properly.
   With continuation quotes required, if you forget to include an end
   quote, the compiler can tell that you did not intend to create a
   multiline string and flag the line that actually has the problem. It
   can also provide immediately actionable fix-it assistance. The fact
   that there is a redundant indication on each line of the programmer's
   intent to include that line in a multiline quote allows the compiler to
   guess the meaning of the code.
2. *They separate indentation from the string's contents.* Without
   continuation quotes, there would be no obvious indication of whether
   whitespace at the start of the line was intended to indent the string
   literal so it matched the surrounding code, or whether that whitespace
   was actually meant to be included in the resulting string. Multiline
   string literals would either have to put subsequent lines against the
   left margin, or apply error-prone heuristics to try to guess which
   whitespace was indentation and which was string literal content.
3. *They improve the ability to quickly recognize the literal.* The |"| on
   each line serves as an immediately obvious indication that the line is
   part of a string literal, not code, and the row of |"| characters in a
   well-formatted file allows you to quickly scan up and down the file to
   see the extent of the literal.
Detailed design
When Swift is parsing a string literal and reaches the end of a line
without finding a closing quote, it examines the next line, applying the
following rules:
1. If the next line is all whitespace, it is ignored; Swift moves on to
   the line afterward, applying these rules again.
2. If the next line begins with whitespace followed by a continuation
   quote, then the string literal contains a newline followed by the
   contents of the string literal starting on that line. (This line may
   itself have no closing quote, in which case the same rules apply to the
   line which follows.)
3. If the next line contains anything else, Swift raises a syntax error
   for an unterminated string literal. This syntax error should offer two
   fix-its: one to close the string literal at the end of the current
   line, and one to include the next line in the string literal by
   inserting a continuation quote.
Rules 1 and 2 should treat comments as though they are whitespace; this
allows you to comment out individual lines in a multiline string literal.
(However, commenting out the last line of the string literal will still
make it unterminated, so you don't have a completely free hand in
commenting.)
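The three rules above can be sketched as a small line-by-line scanner. This
is only an illustration under simplifying assumptions, not the compiler's
lexer: the names |LiteralError|, |scanSegment| and |parseMultilineLiteral|
are hypothetical, the first line is assumed to begin with the opening quote,
only whole-line |//| comments are treated as whitespace, and escapes are
copied through rather than interpreted.

```swift
// Hypothetical sketch of the continuation-quote rules; names are illustrative.
enum LiteralError: Error, Equatable {
    // Index of the line whose segment was left without a closing quote.
    case unterminated(line: Int)
}

/// Reads the text that follows a quote mark. Returns everything up to an
/// unescaped closing quote and whether such a quote was found.
/// Escape sequences are copied verbatim, not interpreted.
func scanSegment(_ text: Substring) -> (content: String, closed: Bool) {
    var content = ""
    var escaped = false
    for ch in text {
        if escaped {
            content.append(ch)
            escaped = false
        } else if ch == "\\" {
            content.append(ch)
            escaped = true
        } else if ch == "\"" {
            return (content, true)
        } else {
            content.append(ch)
        }
    }
    return (content, false)
}

/// Applies the three rules to `lines`, where the first line is assumed to
/// start (after optional whitespace) with the opening quote.
func parseMultilineLiteral(_ lines: [String]) throws -> String {
    var pieces: [String] = []
    var openLine = 0
    for (index, line) in lines.enumerated() {
        let trimmed = line.drop(while: { $0 == " " || $0 == "\t" })
        // Rule 1: skip whitespace-only lines (and whole-line comments).
        if index > 0 && (trimmed.isEmpty || trimmed.starts(with: "//")) {
            continue
        }
        // Rule 3: anything other than a quote means the literal is unterminated.
        guard trimmed.first == "\"" else {
            throw LiteralError.unterminated(line: openLine)
        }
        // Rule 2: the line contributes its content, joined by a newline.
        let (content, closed) = scanSegment(trimmed.dropFirst())
        pieces.append(content)
        if closed {
            return pieces.joined(separator: "\n")
        }
        openLine = index
    }
    // Ran out of lines without seeing a closing quote.
    throw LiteralError.unterminated(line: openLine)
}

// Usage: a commented-out line and a blank line are both skipped.
let sampleLines = [
    "\"<catalog>",
    "    // \"<unused/>",
    "",
    "    \"  <book/>",
    "    \"</catalog>\"",
]
let sampleValue = try! parseMultilineLiteral(sampleLines)
print(sampleValue)
```

Because the error carries the index of the line that lacks a closing quote,
a diagnostic built on it can point at that line, which is where the two
fix-its described in rule 3 would apply.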
Impact on existing code
Failing to close a string literal before the end of the line is currently a
syntax error, so no valid Swift code should be affected by this change.
Alternatives considered
Requiring no continuation character
The main alternative is to not require a continuation quote, and simply
extend the string literal from the starting quote to the ending quote,
including all newlines between them. For example:
let xml = "<?xml version=\"1.0\"?>
<catalog>
	<book id=\"bk101\" empty=\"\">
		<author>\(author)</author>
	</book>
</catalog>"
This has several advantages:
1. It is simpler.
2. It is less offensive to programmers' sensibilities (since there are no
   unmatched |"| characters).
3. It does not require that you edit the string literal to insert a
   continuation quote in each line.
Balanced against the advantages, however, is the loss of the improved
diagnostics, code formatting, and visual affordances mentioned in the
"Benefits of continuation quotes" section above.
In practice, we believe that editor support (such as "Paste as String
Literal" or "Convert to String Literal" commands) can make adding
continuation quotes less burdensome, while also providing other
conveniences like automatic escaping. We believe the other two factors are
outweighed by the benefits of continuation quotes.
Use a different delimiter for multiline strings
The initial suggestion was that multiline strings should use a different
delimiter, |"""|, at the beginning and end of the string, with no
continuation characters between. This solution was rejected because it has
the same issues as the "no continuation character" solution, and because it
was mixing two orthogonal issues (multiline strings and alternate
delimiters).
Another suggestion was to support a heredoc syntax, which would allow you
to specify a placeholder string literal on one line whose content begins on
the next line, running until some arbitrary delimiter. For instance, if
Swift adopted Perl 5's syntax, it might support code like:
connection.sendString(<<"END")
<?xml version="1.0"?>
<catalog>
	<book id="bk101" empty="">
		<author>\(author)</author>
	</book>
</catalog>
END
In addition to the issues with the |"""| syntax, heredocs are complicated
both to explain and to parse, and are not a natural extension of Swift's
current string syntax.
Both of these suggestions address interesting issues with string literals,
solving compelling use cases. They're just not that good at fixing the
specific issue at hand. We might consider them in the future to address
those problems to which they are better suited.
Fixing other string literal readability issues
This proposal is narrowly aimed at multiline strings. It intentionally
doesn't tackle several other problems with string literals:
* Reducing the amount of double-backslashing needed when working with
  regular expression libraries, Windows paths, source code generation,
  and other tasks where backslashes are part of the data.
* Alternate delimiters or other strategies for writing strings
  with |"| characters in them.
* String literals consisting of very long pieces of text which are best
  represented completely verbatim.
These are likely to be subjects of future proposals, though not necessarily
during Swift 3.
This proposal also does not attempt to address regular expression literals.
The members of the core team who are interested in regular expression
support have ambitions for that feature which put it out of scope for
Swift 3.
--
Brent Royal-Gordon
Architechies