This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Single quotes don't interpolate \' and \\

=head1 VERSION

  Maintainer: Nicholas Clark <[EMAIL PROTECTED]>
  Date: 28 Sep 2000
  Last Updated: 29 Sep 2000
  Mailing List: [EMAIL PROTECTED]
  Number: 328
  Version: 2
  Status: Developing

=head1 CHANGES

Reissued on [EMAIL PROTECTED] - I goofed the list.

Added discussion section

Clarified the description slightly; by single quoted string I mean '' and q()

=head1 DISCUSSION

Limited discussion so far because I wrongly issued the RFC to
[EMAIL PROTECTED] The only responses were from two people both
of whom valued their ability to use single quotes to make strings including
making strings containing single quotes. Here docs already provide a means to
get "quote nothing, everything is literal". One argued that

        Single-quoted strings need to be able to contain single quotes,
        which means some escape mechanism is required.

which I did not agree with. My view is that single quotes are a means to an
end, a way to create string constants. String constants need to contain C<'>
(and every other character). There's more than just '' to make a string.

Although consensus so far is against the change, views were from B<existing>
perl users [who do you expect as the majority on perl6 lists? :-)]. The
change would penalise existing perl users, but benefit new perl users (and
presumably people teaching perl).

=head1 ABSTRACT

Remove all interpolation within single quotes and the C<q()> operator, to
make single quotes 100% shell-like. C<\> rather than C<\\> gives a single
backslash; use double quotes or C<q()> if you need a single quote in your
string.

=head1 DESCRIPTION

Camel III (page 7) says "Double quotation marks (double quotes) do
I<variable interpolation> and I<backslash interpolation> while single quotes
suppress interpolation."  Page 60 qualifies this with "except for C<\'> and
C<\\>".

In perl single quotes are used to generate strings. Double quotes also generate
strings.

In C single quotes are used to make character constants. Double quotes are
used to make string constants. Backslash interpretation is performed in
single quotes in C. While multi-character constants are allowed by C, they
are strongly discouraged as they are non-portable, and a character constant
in C is a type distinct from a string constant. Hence double quotes and
single quotes signify different things.

In shell, single quotes are used to make strings. Double quotes also make
strings. Within single quotes backslashes are ordinary characters, and do
not quote anything. As one can't quote a C<'> with a C<\> there is no way to
interpolate a single quote within a single quoted string, but a workaround
such as C<'don'\''t'> relying on the concatenation of C<'don'> C<\'> and
C<'t'> achieves the desired results.

Hence perl's single quoted strings are analogous to shell's single quoted
strings, not C's. However, they're not identical, as perl allows C<\\> to
mean an embedded C<\>, C<\'> to mean an embedded C<'>.

This RFC argues that the exception is confusing and proposes to remove it.
This makes perl more regular in shell terms, and slightly more easy to learn
for the shell programmer.

It also makes perl internally simpler more regular. Currently the behaviour
for C<q()> strings is that C<\(> C<\)> and C<\\> map to 1 character,
C<\>I<?> for all other I<?> maps to 2 characters. C<qq()> differs as
C<\>I<?> maps to 1 character both when I<?> is recognised as a backslash
escapes, B<and> when it is unrecognised. A further irregularity is that
currently single quoted here docs don't interpolate C<\\> or C<\'>. The
consequence of this is that currently

        'foo\\bar' eq 'foo\bar'

which sure looks odd.

With this RFC it is proposed that in a single quoted string and the q()
operator C<\> is not special. Hence C<\>I<?> always maps to 2 characters
(C<\> then I<?>) unless I<?> is the closing terminator, in which case the
string terminates with that C<\> . Single quoted strings behave like single
quoted here docs, and like shell single quoted strings.

You don't lose any functionality, as

   'don\'t implement this RFC, the benefits don\'t outweigh the confusion'

can still be written

   q(don't implement this RFC, the benefits don't outweigh the confusion)

which is actually less typing.

=head1 IMPLEMENTATION

Modify the tokeniser/lexer not to treat C<\> as special, hence the first end
delimiter ends the string.

For 5.7's toke.c this doesn't appear that simple. it looks like
modifications would be needed to C<S_tokeq>, C<scan_str> and C<Perl_yylex>
(for a quoted string at the start of curlies). There are probably more; the
code that makes single quoted strings interpolate C<\'> and C<\\> appear to
be deeply ingrained into the core.

The perl5 to perl6 convertor would need to convert single quoted strings and
C<q()> operators containing C<\'> to the shortest (clearest?) equivalent of:

=over 4

=item *

double quoted string with C<\Q>...C<\E> or  C<\> escapes

=item *

single quoted string(s) and explicit concatenation of C<"'">

=item *

C<q()> operator with delimiters not in the string

=back

Single quoted strings containing C<\\> but no quoting of delimiters would
need to have C<\\> converted to C<\>

=head1 REFERENCES

Camel III - Programming Perl (3rd Edition)

perlop manpage for interpolation

RFC 226: Selective interpolation in single quotish context.

I believe that Larry Wall once made a comment about \' and \\ in single
quoted strings being a mistake, but I can't find any reference. The idea
certainly isn't mine, but I feel it worthy of consideration, even if
considered opinion is that any gain is to small to outweigh the upheaval.

Reply via email to