Yes, I figured R followed by a non-alphabetical character could serve
the same purpose as Ruby's %<char>.

martin

On Thu, Sep 19, 2013 at 2:37 PM, Kevin Ballard <[email protected]> wrote:
> I didn't look at Ruby's syntax, but what you just described sounds a little 
> too free-form to me. I believe Ruby at least requires a % as part of the 
> syntax, e.g. %q{test}. But I don't think %R{test} is a good idea for Rust, as 
> it would conflict with the % operator. I don't think other punctuation would 
> work well either.
>
> -Kevin
>
> On Sep 19, 2013, at 2:10 PM, Martin DeMello <[email protected]> wrote:
>
>> How complicated would it be to use R"" but with arbitrary paired
>> delimiters (the way, for instance, Ruby does it)? It's very handy to
>> pick a delimiter you know does not appear in the string, e.g. if you
>> had a string containing ')' you could use R{this is a string with a )
>> in it} or R|this is a string with a ) in it|.
>>
>> martin
>>
>> On Thu, Sep 19, 2013 at 1:36 PM, Kevin Ballard <[email protected]> wrote:
>>> One feature common to many programming languages that Rust lacks is "raw" 
>>> string literals. Specifically, these are string literals that don't 
>>> interpret backslash-escapes. There are three obvious applications at the 
>>> moment: regular expressions, Windows file paths, and format!() strings that 
>>> want to embed { and } chars. I'm sure there are more as well, such as large 
>>> string literals that contain things like HTML text.
>>>
>>> I took a look at 3 programming languages to see what solutions they had: D, 
>>> C++11, and Python. I've reproduced their syntax below, plus one more custom 
>>> syntax, along with pros & cons. I'm hoping we can come up with a syntax 
>>> that makes sense for Rust.
>>>
>>> ## Python syntax:
>>>
>>> Python supports an "r" or "R" prefix on any string literal (both "short" 
>>> strings, delimited with one quote character, and "long" strings, delimited 
>>> with 3 quotes). The "r" or "R" prefix denotes a "raw string", and has the 
>>> effect of disabling backslash-escapes within the string, for the most part. It 
>>> actually gets a bit weird: if a sequence of backslashes of an odd length 
>>> occurs prior to a quote (of the appropriate quote type for the string), 
>>> then the quote is considered to be escaped, but the backslashes are left in 
>>> the string. This means r"foo\"" evaluates to the string `foo\"`, and 
>>> similarly r"foo\\\"" is `foo\\\"`, but r"foo\\" is merely the string 
>>> `foo\\`.
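The three literals above can be checked directly in Python. This is only a small demonstration; the dictionary keys are labels made up for this example.

```python
# The raw string literals from the paragraph above, keyed by made-up labels.
cases = {
    "one_backslash": r"foo\"",       # backslash stays, quote does not terminate
    "three_backslashes": r"foo\\\"", # odd run: quote is still part of the string
    "two_backslashes": r"foo\\",     # even run: the quote closes the string
}

for label, value in cases.items():
    print(label, len(value), value)

# The same strings written as ordinary (escaped) literals.
assert cases["one_backslash"] == "foo\\\""
assert cases["three_backslashes"] == "foo\\\\\\\""
assert cases["two_backslashes"] == "foo\\\\"
```

Note that r"foo\" on its own is rejected by Python, since the quote counts as escaped and the string never terminates.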
>>>
>>> Pros:
>>> * Simple syntax
>>> * Allows for embedding the closing quote character in the raw string
>>>
>>> Cons:
>>> * Handling of backslashes is very bizarre, and the closing quote character 
>>> can only be embedded if you want to have a backslash before it.
>>>
>>> ## C++11 syntax:
>>>
>>> C++11 allows for raw strings using a sequence of the form R"seq(raw 
>>> text)seq". In this construct, `seq` is any sequence of zero to 16 
>>> characters except for: space, (, ), \, \t, \v, \f, \n, \r. The simplest form 
>>> looks like R"(raw text)", which allows for anything in the raw text except 
>>> for the sequence `)"`. The addition of the delimiter sequence allows for 
>>> constructing a raw string containing any sequence at all (as the delimiter 
>>> sequence can be adjusted based on the represented text).
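To make the delimiter rule concrete, here is a rough Python sketch of how a lexer might pull the text out of a C++11-style literal. The function name parse_cpp_raw is hypothetical, and the sketch assumes it is handed exactly one complete literal; it is not how any real compiler does it.

```python
def parse_cpp_raw(literal):
    """Return the raw text of a literal shaped like R"seq(text)seq"."""
    if not literal.startswith('R"'):
        raise ValueError('expected an R" prefix')
    open_paren = literal.find("(", 2)
    if open_paren == -1:
        raise ValueError("missing ( after the delimiter sequence")
    # Everything between the opening quote and the first ( is the delimiter.
    seq = literal[2:open_paren]
    if any(c in ' )\\\t\v\f\n\r' for c in seq):
        raise ValueError("illegal character in delimiter sequence")
    # The literal ends at the first occurrence of )seq" after the opener.
    closer = ")" + seq + '"'
    end = literal.find(closer, open_paren + 1)
    if end == -1:
        raise ValueError("unterminated raw string")
    if end + len(closer) != len(literal):
        raise ValueError("unexpected characters after the literal")
    return literal[open_paren + 1:end]


print(parse_cpp_raw('R"(raw text)"'))
print(parse_cpp_raw('R"x(a)"b)x"'))
```

The second call shows why the delimiter sequence matters: with `x` as the delimiter, the text may contain `)"` without ending the literal.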
>>>
>>> Pros:
>>> * Allows for embedding any character at all (representable in the source 
>>> file encoding), including the closing quote.
>>> * Reasonably straightforward
>>>
>>> Cons:
>>> * Syntax is slightly complicated
>>>
>>> ## D syntax:
>>>
>>> D supports three different forms of raw strings. The first two are similar, 
>>> being r"raw text" and `raw text`. Besides the choice of delimiters, they 
>>> behave identically, in that the raw text may contain anything except for 
>>> the appropriate quote character. The third syntax is a slightly more 
>>> complicated form of C++11's syntax, and is called a delimited string. It 
>>> takes two forms.
>>>
>>> The first looks like q"(raw text)" where the ( may be any non-identifier 
>>> non-whitespace character. If the character is one of [(<{ then it is a 
>>> "nesting delimiter", and the close delimiter must be the matching ])>} 
>>> character, otherwise the close delimiter is the same as the open. 
>>> Furthermore, nesting delimiters do exactly what their name says: they nest. 
>>> If the nesting delimiter is (), then any ( in the raw text must be balanced 
>>> with a ) in the raw text. In other words, q"(foo(bar))" evaluates to 
>>> "foo(bar)", but q"(foo(bar)" and q"(foobar))" are both illegal.
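The nesting rule is easier to see in code. Below is a simplified Python sketch, not D's actual lexer: parse_d_delimited and NESTING are names invented here, identifier delimiters are not handled, and for a non-nesting delimiter the sketch simply stops at the first repeat of that character.

```python
# Opening nesting delimiters mapped to their closing partners.
NESTING = {"(": ")", "[": "]", "<": ">", "{": "}"}


def parse_d_delimited(literal):
    """Return the text of a literal shaped like q"(text)" or q"|text|"."""
    if len(literal) < 4 or not literal.startswith('q"'):
        raise ValueError('expected q" followed by a delimiter')
    opener = literal[2]
    if opener.isspace() or opener.isalnum() or opener == "_":
        raise ValueError("identifier delimiters are not handled here")
    closer = NESTING.get(opener, opener)
    depth = 1
    i = 3
    while i < len(literal):
        c = literal[i]
        if opener in NESTING and c == opener:
            depth += 1          # a nested opener must be balanced later
        elif c == closer:
            depth -= 1
            if depth == 0:
                break           # this closer matches the outermost opener
        i += 1
    else:
        raise ValueError("unbalanced or unterminated delimited string")
    if literal[i + 1:] != '"':
        raise ValueError("close delimiter must be followed by the final quote")
    return literal[3:i]


print(parse_d_delimited('q"(foo(bar))"'))
```

With this sketch, q"(foo(bar)" fails because the outer ( is never closed, and q"(foobar))" fails because the first ) closes the string and is not followed by the quote.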
>>>
>>> The second uses any identifier as the delimiter. In this case, the 
>>> identifier must immediately be followed by a newline, and in order to close 
>>> the string, the close delimiter must be preceded by a newline. This looks 
>>> like
>>>
>>> q"delim
>>> this is some raw text
>>> delim"
>>>
>>> It's essentially a heredoc. Note that the first newline is not part of the 
>>> string, but the final newline is, so this evaluates to "this is some raw 
>>> text\n".
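A small Python sketch of the heredoc form, mainly to show which newline ends up in the result. parse_d_heredoc is a made-up name, and the check on the identifier uses Python's notion of an identifier, which is only an approximation of D's.

```python
def parse_d_heredoc(literal):
    """Return the text of a literal shaped like q"delim\\n...\\ndelim"."""
    if not literal.startswith('q"'):
        raise ValueError('expected a q" prefix')
    first_nl = literal.find("\n")
    if first_nl == -1:
        raise ValueError("the identifier must be followed by a newline")
    delim = literal[2:first_nl]
    if not delim.isidentifier():
        raise ValueError("delimiter is not an identifier")
    # The closing delimiter has to sit right after a newline.
    closer = "\n" + delim + '"'
    end = literal.find(closer, first_nl)
    if end == -1:
        raise ValueError("unterminated heredoc")
    if end + len(closer) != len(literal):
        raise ValueError("unexpected characters after the literal")
    # Skip the first newline, keep the one just before the close delimiter.
    return literal[first_nl + 1:end + 1]


source = 'q"delim\nthis is some raw text\ndelim"'
print(repr(parse_d_heredoc(source)))
```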
>>>
>>> Pros:
>>> * Flexible
>>> * Allows for constructing a raw string that contains any desired sequence 
>>> of characters (representable in the source file's encoding)
>>>
>>> Cons:
>>> * Overly complicated
>>>
>>> ## Custom syntax
>>>
>>> There's another approach that none of these three languages take, which is 
>>> to merely allow for doubling up the quote character in order to embed a 
>>> quote. This would look like R"raw string literal ""with embedded 
>>> quotes"".", which becomes `raw string literal "with embedded quotes".`
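Decoding the doubled-quote form takes only a short loop. The Python sketch below is one possible reading of the idea, with a hypothetical function name (parse_doubled_quote_raw) and the assumption that the input is exactly one literal.

```python
def parse_doubled_quote_raw(literal):
    """Decode R"..." where a doubled quote stands for one quote character."""
    if not literal.startswith('R"'):
        raise ValueError('expected an R" prefix')
    out = []
    i = 2
    while i < len(literal):
        c = literal[i]
        if c != '"':
            out.append(c)       # ordinary character, backslashes included
            i += 1
        elif literal[i + 1:i + 2] == '"':
            out.append('"')     # "" inside the literal is one embedded quote
            i += 2
        else:
            # A lone quote closes the literal and must be the last character.
            if i + 1 != len(literal):
                raise ValueError("unexpected characters after the literal")
            return "".join(out)
    raise ValueError("unterminated raw string")


print(parse_doubled_quote_raw('R"raw string literal ""with embedded quotes""."'))
```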
>>>
>>> Pros:
>>> * Very simple
>>> * Allows for embedding the close quote character, and therefore, any 
>>> character (representable in the source file encoding)
>>>
>>> Cons:
>>> * Slightly odd to read
>>>
>>> ## Conclusion
>>>
>>> Of the three existing syntaxes examined here, I think C++11's is the best. 
>>> It ties with D's syntax for being the most powerful, but is simpler than 
>>> D's. The custom syntax is just as powerful though. The benefit of the C++11 
>>> syntax over the custom syntax is that it's slightly easier to read, as the 
>>> raw text has a one-to-one mapping with the resulting string. 
>>> The custom syntax is a bit more confusing to read, especially if you want 
>>> to add multiple quotes. As a pathological case, let's try representing a 
>>> Python triple-quoted docstring using both syntaxes:
>>>
>>> C++11: R"("""this is a python docstring""")"
>>> Custom: R"""""""this is a python docstring"""""""
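The two encodings above can be generated mechanically, which makes the readability difference easy to reproduce. This Python sketch uses invented helper names (encode_cpp, encode_doubled); the delimiter choice in encode_cpp (repeating `x` until no clash remains) is just an assumption for illustration and ignores any length limit on the delimiter.

```python
def encode_doubled(text):
    """Custom syntax: wrap in R"..." and double every embedded quote."""
    return 'R"' + text.replace('"', '""') + '"'


def encode_cpp(text):
    """C++11-style syntax: the text is copied verbatim between the delimiters."""
    seq = ""
    # Grow the delimiter until )seq" no longer appears inside the text.
    while ")" + seq + '"' in text:
        seq += "x"
    return 'R"' + seq + "(" + text + ")" + seq + '"'


docstring = '"""this is a python docstring"""'
print("C++11: ", encode_cpp(docstring))
print("Custom:", encode_doubled(docstring))
```

Only the custom encoding changes the text itself; the C++11-style encoding changes just the wrapper.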
>>>
>>> Based on this examination, I'm leaning towards saying Rust should support 
>>> C++11's raw string literal syntax.
>>>
>>> I welcome any comments, criticisms, or suggestions.
>>>
>>> -Kevin
>>> _______________________________________________
>>> Rust-dev mailing list
>>> [email protected]
>>> https://mail.mozilla.org/listinfo/rust-dev
>