I've integrated most of the proposed suggestions, as well as a section
on vstrings and a winged section on hash interpolation. So that leaves
these known issues:

- Reference stringification
- Default Object Stringification
(.AS_STRING needs to be added to the doc as well, but I figure it
is still getting hammered out)
- Does <<>> mess up here-docs?
(I'm inclined to say that <<>> is more trouble than it is worth,
and to ditch <<>>, simply sticking with qw())
Also, would any sort of diff be helpful with these document revisions?
There's Text::ParagraphDiff, but that doesn't work too well with pod,
since pod is line-oriented rather than paragraph oriented. Regular
diffs aren't that helpful on text either. However, either one is
better than nothing, so if you'd like one, let me know.


Joseph. F Ryan
[EMAIL PROTECTED]



=pod

=head1 Strings

A string is formed when text is enclosed by a quoting operator.
There are two types of quoting operators: interpolating and
non-interpolating. In interpolating constructs, the value of a
variable is substituted for the variable name within the string
and certain characters have special meaning when preceded by a
backslash (C<\>). In non-interpolating constructs, a variable
name that appears within the string is used as-is. The simplest
examples of these two types of quoting operators are strings
delimited by double (interpolating) and single quotes
(non-interpolating). For example:

'The quick brown $animal'
"The quick brown $animal"

In the first string, perl will take each character literally and
perform no special processing. In the second string, the value
of the variable $animal is inserted within the string at that
location. If $animal had had the value "fox", then the second
string would have become "The quick brown fox".

More on the various quoting operators below.

=head2 Non-Interpolating Constructs

Non-interpolating constructs are strings in which expressions do not
interpolate, or expand. The one exception is that the backslash
character, \, always escapes the character that immediately
follows it.

The base form for a non-interpolating string is the single-quoted
string: 'string'. However, non-interpolating strings can also be formed
with the q() operator. The q() operator allows strings to be made with
any non-space, non-letter, non-digit character as the delimiter instead
of '. In addition, if the starting delimiter is part of a paired
set, such as (, [, <, or {, then the closing delimiter may be the
matching member of the set. The reverse also holds: a delimiter that
is the tail end of a pair may use the starting member of the pair as
the closing delimiter.

=over 3

Examples:

    $string = 'string'     # $string = 'string'
    $string = q|string|    # $string = 'string'
    $string = q(string)    # $string = 'string'
    $string = q]string[    # $string = 'string'

=back
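
For comparison, here is a minimal Perl 5 sketch of the same delimiter
rules. It is only an approximation of the behavior described above: the
reversed-pair form C<q]string[> is specific to this draft and is not
valid Perl 5, so it is left out. The variable names are illustrative.

    use strict;
    use warnings;

    my $plain   = 'string';      # single quotes
    my $piped   = q|string|;     # a non-alphanumeric, non-space delimiter
    my $parens  = q(string);     # a paired delimiter closes with its partner
    my $nested  = q{a {b} c};    # balanced pairs may nest without escaping
    my $escaped = q(it\)s);      # an unbalanced closer must be backslashed

    print "$_\n" for $plain, $piped, $parens, $nested, $escaped;
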

There are a few special cases for delimiters; specifically : and #.
: is not allowed because it might be used by custom-defined quoting
operators to apply a property; # is allowed, but there cannot be a
space between the operator and the #. In addition, comments are not
allowed within #-delimited expressions (for obvious reasons).

=head3 Embedding Interpolated Strings

It is also possible to embed an interpolating string within a
non-interpolating string by the use of the \qq{} construct. A string
inside a \qq{} construct acts exactly as if it were an interpolated
string. Note that any end-brackets, "}", must be escaped within the
\qq{} construct so that the parser can read it correctly.

=over 3

Examples (assuming C<< $var="two" >>):

    $string = 'one \qq{$var} three'       # $string = 'one two three'
    $string = 'one\qq{ {$var\} }three'    # $string = 'one {two} three'

=back

=head3 <<>>; expanding a string as a list.

A set of angle brackets is a special op that evaluates to the list of
words it contains, using whitespace as the delimiter. It is similar to
qw() from perl5, and can be thought of as roughly equivalent to:
C<< "STRING".split(' ') >>

=over 3

Examples:

    @array = <one two three>;    # @array = ('one', 'two', 'three');
    @array = <one <\> three>;    # @array = ('one', '<>', 'three');

=back
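
The rough equivalence to a whitespace split can be checked in Perl 5,
where qw() and C<split ' '> already behave this way. This is a sketch
of the analogy only; it does not exercise the angle-bracket syntax
proposed here.

    use strict;
    use warnings;

    my @from_qw    = qw(one two three);
    my @from_split = split ' ', 'one two three';

    # Both lists hold the same three words.
    print join(',', @from_qw), "\n";
    print join(',', @from_split), "\n";
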

=head2 Interpolating Constructs

Interpolating constructs are strings in which embedded variables are
expanded into their values at runtime. Interpolated strings are formed
using the double quote: "string". In addition, qq() is a synonym for
"", just as q() is a synonym for ''. The rules for interpolation are
as follows:

=head3 Interpolation Rules

=over 3

=item Scalars: C<"$scalar">, C<"$(expression)">
Non-Reference scalars will simply interpolate as their value. $()
forces its expression into scalar context, which is then handled as
either a scalar or a reference, depending on how expression evaluates.

=item Lists: C<"@list">, C<"@(expression)">
Arrays and lists are interpolated by joining their list elements by the
list's separator property, which is by default a space. Therefore, the
following two expressions are equivalent:

=over 3

    print "@list";
    print "" ~ @list.join(@list.separator) ~ "";

=back

=item Hashes: C<"%hash">, C<"%(expression)">
Hashes interpolate by joining their pairs on the hash's .separator
property, which by default is a newline. Pairs stringify by joining the
key and value with the hash's .pairsep property, which by default is a
space. Note that hashes are unordered, and so the output will be
unordered. Therefore, the following two expressions are equivalent:

=over 3

    print "%hash";
    print "" ~
        join ( %hash.separator,
               map { $_ ~ %hash.pairsep ~ %hash{$_} } %hash.keys )
        ~ "";

=back

=item Subroutines and Methods: C<"&sub($a1,$a2)">, C<"$obj.meth($a)">
Subroutines and methods interpolate their return value into the
string, where it is handled according to its type. Note that parens
B<are> required during interpolation so that the parser can
disambiguate between object methods and object members.

=item References C<"$ref">
# Behavior not defined

=item Default Object Stringification C<"$obj">
# Behavior not defined

=item Escaped Characters
# Basically the same as Perl5; also, how are locale semantics handled?

    \t          tab
    \n          newline
    \r          return
    \f          form feed
    \b          backspace
    \a          alarm (bell)
    \e          escape
    \b10        binary char
    \o33        octal char
    \x1b        hex char
    \x{263a}    wide hex char
    \c[         control char
    \N{name}    named Unicode character

=item Modifiers: C<\Q{}>, C<\L{}>, C<\U{}>

Modifiers apply a modification to text which they enclose; they can be
embedded within interpolated strings.

    \L{}        Lowercase all characters within brackets
    \U{}        Uppercase all characters within brackets
    \Q{}        Escape all characters that need escaping
                within brackets (except "}")

=back
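
The list and hash rules above can be approximated in Perl 5 with
explicit joins. This is a sketch under assumptions: the variables
C<$list_separator>, C<$hash_separator> and C<$pairsep> are hypothetical
stand-ins for the .separator and .pairsep properties, which Perl 5 does
not have.

    use strict;
    use warnings;

    my @list = (1, 2, 3);
    my %hash = (a => 1, b => 2);

    # Hypothetical stand-ins for the .separator and .pairsep properties.
    my $list_separator = ' ';
    my $hash_separator = "\n";
    my $pairsep        = ' ';

    # A list is joined on its separator.
    my $list_string = join $list_separator, @list;

    # Each pair is joined on the pair separator, then the pairs are
    # joined on the hash separator. Keys are sorted only to make the
    # output repeatable; the draft says hash output is unordered.
    my $hash_string = join $hash_separator,
        map { $_ . $pairsep . $hash{$_} } sort keys %hash;

    print "$list_string\n";
    print "$hash_string\n";
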

=head3 Stopping Interpolation (\Q)

Within an interpolated string, interpolation of expressions can be
stopped by \Q.

=over 3

Example:

    @list = (1,2);
    print "@list\Q[0]";    # prints '1 2[0]'

=back

=head3 Embedding non-interpolated constructs: C<\q{}>

Similar to embedding an interpolated string within a non-interpolated
string, it is possible to embed a non-interpolated string within an
interpolated string with \q{}. Any characters within a \q{} construct
are treated as if they were in a non-interpolated string.

=over 3

Example:

    "string \q{$variable}"    # $variable will not be interpolated

=back

=head3 C<qx()>, backticks (C<``>)

A string which is (possibly) interpolated and then executed as a system
command with /bin/sh or its equivalent. Shell wildcards, pipes, and
redirections will be honored. The collected standard output of the
command is returned; standard error is unaffected. In scalar context,
it comes back as a single (potentially multi-line) string, or undef if
the command failed. In list context, it returns a list of lines split
on the input record separator, or an empty list if the command
failed.
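
The two contexts can be seen in Perl 5, whose backticks behave as
described. This sketch assumes a POSIX-style shell in which C<echo> is
available; the command itself is only an illustration.

    use strict;
    use warnings;

    # Scalar context: the whole output comes back as one string.
    my $output = `echo one; echo two`;

    # List context: one element per line, with newlines kept.
    my @lines = `echo one; echo two`;

    print "scalar: $output";
    print "lines:  ", scalar(@lines), "\n";
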

=head2 Special Quoting

=head3 Here-Docs

A line-oriented form of quoting is based on the shell "here-document"
syntax. Following a << you specify a string to terminate the quoted
material, and all lines following the current line down to the
terminating string are the value of the item. The terminating string
may be either an identifier (a word), or some quoted text. If quoted,
the type of quotes you use determines the treatment of the text, just
as in regular quoting. An unquoted identifier works like double quotes.
The terminating string must appear by itself, and any preceding or
following whitespace on the terminating line is discarded.

=over 3

Examples:

    print << EOF;
    The price is $Price.
    EOF

    print << "EOF";    # same as above
    The price is $Price.
    EOF

    print << `EOC`;    # execute commands
    echo hi there
    echo lo there
    EOC

    print <<"foo", <<"bar";    # you can stack them
    I said foo.
    foo
    I said bar.
    bar

    myfunc(<< "THIS", 23, <<'THAT');
    Here's a line
    or two.
    THIS
    and here's another.
    THAT

=back

Don't forget that you have to put a semicolon on the end to finish the
statement, as Perl doesn't know you're not going to try to do this:

=over 3

    print <<ABC
    179231
    ABC
    + 20;

=back

If you want your here-docs to be indented with the rest of the code,
you'll need to remove leading whitespace from each line manually:

=over 3

    ($quote = <<'FINIS') =~ s/^\s+//gm;
        The Road goes ever on and on,
        down from the door where it began.
    FINIS

=back

If you use a here-doc within a delimited construct, such as in s///eg,
the quoted material must come on the lines following the final
delimiter. So instead of:

=over 3

    s/this/<<E . 'that'
    the other
    E
    . 'more '/eg;

=back

you have to write

=over 3

    s/this/<<E . 'that'
    . 'more '/eg;
    the other
    E

=back

Also note that in single-quoted here-docs, backslashes are not
special and are taken as literal backslashes, a behavior that
differs from normal single-quoted strings.
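
The difference is visible in Perl 5, where the same rule applies. A
small sketch, with illustrative variable names:

    use strict;
    use warnings;

    # In a single-quoted string, a doubled backslash collapses to one.
    my $quoted = 'a\\b';

    # In a single-quoted here-doc, both backslashes are kept.
    # (Perl 5 needs the END line to start in the first column, so
    # remove the display indentation before running this.)
    my $heredoc = <<'END';
    a\\b
    END
    chomp $heredoc;    # drop the here-doc's trailing newline

    print length($quoted), "\n";
    print length($heredoc), "\n";

Run with the indentation removed, the here-doc version is one
character longer, because its second backslash survives.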

=head3 V-Strings

V-Strings are formed when three or more integers are joined by decimal
points, with an optional leading v. The resulting item is then treated
like a string, rather than a number.

=over 3

Examples:

    $var = v5.8.0;         # $var = "5.8.0";
    $var = 192.168.0.1;    # $var = "192.168.0.1";

=back

=head2 Gory Details of parsing quoted constructs

No string section would be complete without a "Gory details of parsing
quoted constructs"; however, since the current implementation in P6C
doesn't have support for \Q, \Q{}, \L{}, \U{}, \N{name}, or \x{}, the
implementation may have to change. If you really need your blood and
guts, please see P6C/Tree/String.pm for the current string-parsing
semantics.

=cut
