> I’ve raised a speculative PR against the Swift Lexer to support multi-line 
> string literals

Wow, it's pretty cool that this change is so surgical.

> I’m trying to avoid more advanced features such as the handling of indenting 
> which for me complicates something that if kept simple can be documented very 
> easily.

I don't think you can tackle multiline strings without worrying about 
indenting. Indentation may fundamentally change the approach you choose.

I continue to believe that we're actually looking at three orthogonal features 
here:

* Multiline string literals
* Alternative string literal delimiters
* Disabling escapes in string literals

The way I would prefer to tackle these is:

* Multiline literals: If the closing quote of a string is not present, look at 
the next line. If it consists of (optional) indentation followed by a matching 
opening quote, the string has a newline and then continues after the quote on 
the next line. (The handling of comments is an open question here.)

        let xml: String = "<?xml version=\"1.0\"?>
                                "<catalog>
                                "\t<book id=\"bk101\" empty=\"\">
                                "\t\t<author>\(author)</author>
                                "\t</book>
                                "</catalog>"

The cool things about this are that (a) the compiler can tell you really do 
mean this to be part of the literal and you haven't just forgotten to close the 
string, and (b) there's no guesswork about how indentation should be handled. 
The uncool thing is that you need to insert the quote at the beginning of each 
line, so you can't just blindly paste into a multiline literal. Editors can 
help make that easier, though—a "paste as string literal" feature would be a 
nice addition to Xcode, and not just for multiline strings or just for Swift.
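
To make the continuation rule concrete, here is a rough sketch of how the pieces of such a literal might be joined. The function name `joinContinuationLiteral` is my own invention for illustration, it takes the source lines starting at the opening quote, and it deliberately ignores escapes, interpolation, and comments.

```swift
// Hypothetical helper, not part of the PR or of the Swift lexer.
// Joins the source lines of one continuation-quote literal into its value.
// Escapes, interpolation, and comments are ignored in this sketch.
func joinContinuationLiteral(_ sourceLines: [String]) -> String? {
    guard !sourceLines.isEmpty else { return nil }
    var pieces: [String] = []
    for (index, line) in sourceLines.enumerated() {
        // Optional indentation is allowed before the opening quote.
        var body = line.drop(while: { $0 == " " || $0 == "\t" })
        // Every line of the literal must begin with a quote; otherwise
        // the previous line was probably just missing its closing quote.
        guard body.first == "\"" else { return nil }
        body = body.dropFirst()
        if index == sourceLines.count - 1 {
            // Only the final line carries the closing quote.
            guard body.last == "\"" else { return nil }
            body = body.dropLast()
        }
        pieces.append(String(body))
    }
    // Each continuation contributes exactly one newline.
    return pieces.joined(separator: "\n")
}
```

Because the indentation before each quote is thrown away, the value of the literal does not depend on how far the continuation lines are indented.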

* Alternative delimiters: If a string literal starts with three, five, seven, 
or any larger odd number of quotes, that run is the delimiter, and fewer quotes 
than that in a row are simply literal quote marks. Four, six, or any larger even 
number of quotes is read as a literal quote mark abutting the end of the literal.

        let xml: String = """<?xml version="1.0"?>
                                """<catalog>
                                """\t<book id="bk101" empty="">
                                """\t\t<author>\(author)</author>
                                """\t</book>
                                """</catalog>"""

You can't use this syntax to express an empty string, or a string consisting 
entirely of quote marks, but `""` handles empty strings adequately, and 
escaping can help with quote marks. (An alternative would be to remove the 
abutting rule and permit `""""""` to mean "empty string", but abutting quotes 
seem more useful than long-delimiter empty strings.)
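
The odd/even rule can be modeled roughly like this. The type and function names are hypothetical, and I'm assuming the abutting rule applies to the opening run the same way it applies to the closing one.

```swift
// Hypothetical model of the odd/even quote-run rule; names are illustrative.
struct OpeningQuotes: Equatable {
    var delimiterLength: Int   // quotes that form the delimiter
    var literalQuotes: Int     // quote marks that belong to the content
}

func classifyOpeningRun(ofLength count: Int) -> OpeningQuotes? {
    switch count {
    case 1:
        // An ordinary single-quote-mark delimiter.
        return OpeningQuotes(delimiterLength: 1, literalQuotes: 0)
    case 2:
        // `""` is the ordinary empty string, not an opening run.
        return nil
    case let n where n >= 3 && n % 2 == 1:
        // Odd run: the whole run is the delimiter.
        return OpeningQuotes(delimiterLength: n, literalQuotes: 0)
    case let n where n >= 4:
        // Even run: an odd delimiter plus one abutting literal quote.
        return OpeningQuotes(delimiterLength: n - 1, literalQuotes: 1)
    default:
        return nil
    }
}
```

Under this reading, six quotes in a row are a five-quote delimiter followed by one literal quote mark, never an empty three-quote string.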

* Disabling escapes: If you use single quotes instead of double quotes, 
backslash escapes are disabled. (There is no escaping at all, not even \\ or 
\'. If you need to include the delimiter, use a delimiter with more quote 
marks. I'm not sure if this should disable interpolation; for now, I'm assuming 
it should. If it doesn't disable interpolation, the only way to get a \( into 
the string would be by interpolating it in, not by escaping it somehow.)

        let xml: String = '''<?xml version="1.0"?>
                                '''<catalog>
                                '''     <book id="bk101" empty="">
                                '''             <author>''' + author + '''</author>
                                '''     </book>
                                '''</catalog>'''

I'm not sure if single quotes should allow interpolation. Options are:

* No, just concatenate (as shown above).
* Yes, with the ordinary syntax: '''            <author>\(author)</author>
* Yes, with a number of backslashes matching the number of quotes, which allows 
you to insert literal \( text: '''              <author>\\\(author)</author>
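
For the third option, the check a lexer would make might look something like the sketch below. `startsInterpolation` is a made-up name, and treating a run of more backslashes than quotes as plain text is my assumption, since I haven't specified that case.

```swift
// Hypothetical check for the "backslashes match quotes" option.
// Returns true only when `text` begins with exactly `quoteCount`
// backslashes followed by an opening parenthesis.
func startsInterpolation(_ text: String, quoteCount: Int) -> Bool {
    let backslashes = text.prefix(while: { $0 == "\\" })
    let rest = text.dropFirst(backslashes.count)
    // Assumption: a longer or shorter run of backslashes is literal text.
    return backslashes.count == quoteCount && rest.first == "("
}
```

So in a `'''` literal, three backslashes and a parenthesis would start an interpolation, while a single backslash and a parenthesis would stay in the string as written.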

Note that you can use these features in any combination. I've shown a few 
combinations above, but here are some others.

A single-line literal with an alternate delimiter:
        """     <book id="bk101" empty="">"""

The same thing, but no-escaping:
        '''     <book id='bk101' empty=''>'''

A no-escaping multiline literal with a normal delimiter:
        '<?xml version="1.0"?>
        '<catalog />'

* * *

Notes on alternatives:

1. If you didn't want to provide no-escaping strings, an alternative would be to 
say that *all* escapes require as many backslashes as there are quotes in the 
string delimiter. Thus, a newline escape in a `"""` string would be `\\\n`. 
This would in practice give you the same flexibility to write a literal without 
worrying (much) about escaping.

2. However, it's not entirely clear to me that we really need escapes other 
than interpolations at all. You could write "\(.newline)" or "\(.doubleQuote)" 
or "\(.backslash)" to get those characters. (These might be static members 
required by StringInterpolationConvertible.) Plain backslashes would have no 
special meaning at all; only "\(" would be special.

3. It might be useful to make multiline `"` strings trim trailing whitespace 
and comments like Perl's `/x` regex modifier does. That would allow you to 
document things in literals. Then you would want `'` again so that you could 
turn that smartness off. (Of course, the big problem here is that a naïve 
implementation would consider "http://" to have a comment at the end of it.)
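
Here is a quick sketch of the problem in note 3. `naiveTrim` is a hypothetical name, and it is exactly the naïve approach: cut each line of the literal at the first `//`, then drop trailing whitespace.

```swift
import Foundation

// Hypothetical naive trimmer, for illustration only: it cuts a literal's
// line at the first "//" and then removes trailing spaces and tabs.
func naiveTrim(_ line: String) -> String {
    var kept = Substring(line)
    if let comment = line.range(of: "//") {
        // Everything from the first "//" onward is treated as a comment.
        kept = line[..<comment.lowerBound]
    }
    while let last = kept.last, last == " " || last == "\t" {
        kept.removeLast()
    }
    return String(kept)
}
```

This does the right thing for a line like `<catalog>   // root element`, but it also chops a URL down to `http:`, which is why a real design would need some smarter rule for where a comment can begin.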

* * *

Finally, a brief aside:

> For example, a regular expression that detects a line might be written "\N*\n". If 
> escaping is enabled, then the compiler changes "\n" into line feed, which 
> does not have the same meaning to the regular expression engine as "\n".

There is a special place in Hell reserved for authors of languages which use 
`\` as an escape character, provide no regex literals or non-escaping string 
literals, and ship with regex libraries which use `\` as a metacharacter. It's 
in the outer circles—Satan has some sense of perspective—but it's definitely 
there.

Sorry if that's not very constructive, but *man*, that burns my biscuits.

-- 
Brent Royal-Gordon
Architechies
