First of all, apologies for sending through Hotmail; I'm home for the
weekend, and have no access to email. However, I figured I should
send this out as soon as I finished so that people have something to
think about besides numbers :)

Anyways, here's a first draft of the string documentation. I've put
comments in as:

# comment.

Known issues:

- Larry was never specific as to how hashes interpolate; anyone have
any ideas?
- References and Object stringification haven't been defined.
- If References interpolate in some sort of readable way, how do
multi-leveled references interpolate, and how do self-referring
data structures interpolate?
- I assumed escaping would be similar to perl5's, so I just used the
table from perl5's perlop.


Joseph F. Ryan
[EMAIL PROTECTED]



=pod

=head1 Strings

A string is a literal value that represents a sequence of characters.
A string object is formed when a sequence of characters is enclosed in
one of the quoting operators, of which there are three types:
Interpolating, Non-Interpolating, and Here-Docs. Each is explained
below.

=head2 Non-Interpolating Constructs

Non-Interpolating constructs are strings in which expressions do not
interpolate, or expand. The one exception is that the backslash
character, \, always escapes the character that immediately follows
it.

The base form for a non-interpolating string is the single-quoted
string: 'string'. However, non-interpolating strings can also be formed
with the q() operator. The q() operator allows strings to be made with
any non-space, non-letter, non-digit character as the delimiter instead
of '. If the starting delimiter is part of a paired set, such as (, [,
<, or {, then the closing delimiter may be the matching member of the
set. The reverse also holds: a delimiter that is the tail end of a pair
may use the starting member of that pair as the closing delimiter.

=over 3
Examples:

$string = 'string' # $string = 'string'
$string = q|string| # $string = 'string'
$string = q(string) # $string = 'string'
$string = q]string[ # $string = 'string'
=back
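
For comparison, the delimiter forms that already exist in Perl 5 can be
checked directly. This is only a sketch in Perl 5, not Perl 6: it
leaves out the reversed-pair form (q]string[), which Perl 5 does not
accept, and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Perl 5 sketch: the same literal written with different q() delimiters.
my @forms = (
    'string',      # plain single quotes
    q|string|,     # a non-alphanumeric, non-space delimiter
    q(string),     # paired delimiters close with the matching character
    q{string},
    q<string>,
);

# A backslash escapes the delimiter; variables and \n stay literal.
my $escaped = q(a \) b $not_interpolated \n);

print "$_\n" for @forms;
print "$escaped\n";
```

Every element of @forms holds the same six characters, and $escaped
keeps the dollar sign and the backslash-n as literal text.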


=head3 Embedding Interpolated Strings

It is also possible to embed an interpolating string within a
non-interpolating string by the use of the \qq{} construct. A string
inside a \qq{} construct acts exactly as if it were an interpolated
string. Note that any end-brackets, "}", must be escaped within the
\qq{} construct so that the parser can read it correctly.

=over 3
Examples ( assuming C<< $var="two" >> ):

$string = 'one \qq{$var} three' # $string = 'one two three'
$string = 'one \qq{{$var\}} three' # $string = 'one {two} three'
=back

=head3 <>; expanding a string as a list.

A set of angle brackets is a special op that evaluates into the list of
words contained, using whitespace as the delimiter. It is similar to
qw() from perl5, and can be thought of as roughly equivalent to:
C<< "STRING".split(' ') >>

=over 3
Examples:

@array = <one two three>; # @array = ('one', 'two', 'three');
@array = <one <\> three>; # @array = ('one', '<>', 'three');
=back
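
The perl5 behavior that this op is compared to can be seen in a short
Perl 5 sketch. It only shows qw() next to split(' ', ...); it says
nothing definite about how the Perl 6 <> op will be implemented.

```perl
use strict;
use warnings;

# Perl 5 sketch: qw() and split(' ', ...) produce the same word list.
# split with a single-space string skips leading whitespace and splits
# on any run of whitespace.
my @from_qw    = qw(one two three);
my @from_split = split ' ', "  one two\tthree\n";

print join(',', @from_qw), "\n";
print join(',', @from_split), "\n";
```

Both lines print one,two,three.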

=head2 Interpolating Constructs

Interpolating constructs are another form of string in which variables
that are embedded into the string are expanded into their value at
runtime. Interpolated strings are formed using the double quote:
"string". In addition, qq() is a synonym for "", just as q() is a
synonym for ''. The rules for interpolation are as follows:

=head3 Interpolation Rules

=over 3

=item Scalars: C<"$scalar">, C<"$(expression)">
Non-Reference scalars will simply interpolate as their value. $()
forces its expression into scalar context; the result is then handled
as either a scalar or a reference, depending on how the expression
evaluates.

=item Lists: C<"@list">, C<"@(expression)">
Arrays and lists are interpolated by joining their list elements by the
list's separator property, which is by default a space. Therefore, the
following two expressions are equivalent:

=over 3
print "@list";
print "" ~ @list.join(@list.separator) ~ "";
=back

=item Hashes: C<"%hash">, C<"%(expression)">
# RFC 237 proposes: join( $/, map { qq($_$"$hash{$_}) } keys %hash )
# However, Larry never made any definite decision.
# Well, maybe he did, and just didn't tell anyone :)

=item Subroutines and Methods: C<"&sub($a1,$a2)">, C<"$obj.meth($a)">
Subroutines and methods interpolate their return value into the string;
the value is then handled according to its type (scalar, list, hash,
and so on). Note that parens B<are> required during interpolation so
that the parser can disambiguate between object methods and object
members.

=item References C<"$ref">
# Behavior not defined

=item Default Object Stringification C<"$obj">
# Behavior not defined

=item Escaped Characters
# Basically the same as Perl5; also, how are locale semantics handled?

\t tab
\n newline
\r return
\f form feed
\b backspace
\a alarm (bell)
\e escape
\b10 binary char
\o33 octal char
\x1b hex char
\x{263a} wide hex char
\c[ control char
\N{name} named Unicode character

=item Modifiers: C<\Q{}>, C<\L{}>, C<\U{}>

Modifiers apply a modification to text which they enclose; they can be
embedded within interpolated strings.

\L{} Lowercase all characters within brackets
\U{} Uppercase all characters within brackets
\Q{} Escape all characters that need escaping
within the current string (except "}")

=back
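
The list rule, and the RFC 237 proposal quoted under the hash item, can
be tried out in Perl 5. This is a sketch under assumptions:
interpolate_hash is a hypothetical helper, not part of any proposal; the
keys are sorted only so that the output is repeatable (the RFC uses
plain keys); and nothing here settles what Perl 6 will actually do with
"%hash".

```perl
use strict;
use warnings;

# Perl 5 sketch of the list and hash rules above.
my @list = (1, 2, 3);

# Lists: "@list" joins the elements with the list separator $",
# which is a single space by default.
my $default = "@list";
my $custom  = do { local $" = '-'; "@list" };

# Hashes: the RFC 237 proposal as a helper (hypothetical name).
# Each pair is joined with $" and the pairs are joined with the
# record separator that the caller passes in.
sub interpolate_hash {
    my ($hash, $record_sep) = @_;
    my $field_sep = $";
    return join $record_sep,
        map { $_ . $field_sep . $hash->{$_} } sort keys %$hash;
}

my %hash = (a => 1, b => 2);
my $hash_text = interpolate_hash(\%hash, $/);   # $/ is "\n" by default

print "$default\n$custom\n$hash_text\n";
```

Passing the record separator as an argument keeps the sketch independent
of whatever value $/ has when the helper is called.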

=head3 Stopping Interpolation (\Q)

Within an interpolated string, interpolation of expressions can be
stopped by \Q.

=over 3
Example:
@list = (1,2);
print "@list\Q[0]"; # prints '1 2[0]'
=back

=head3 Embedding non-interpolated constructs: C<\q{}>

Similar to embedding an interpolated string within a non-interpolated
string, it is possible to embed a non-interpolated string within an
interpolated string with \q{}. Any characters within a \q{} construct
are treated as if they were in a non-interpolated string.

=over 3
Example:
"string \q{$variable}" # $variable will not be interpolated
=back

# from perl5's perlop
=head3 C<qx()>, backticks (C<``>)

A string which is (possibly) interpolated and then executed as a system
command with /bin/sh or its equivalent. Shell wildcards, pipes, and
redirections will be honored. The collected standard output of the
command is returned; standard error is unaffected. In scalar context,
it comes back as a single (potentially multi-line) string, or undef if
the command failed. In list context, returns a list of lines (however
you've defined lines with $/ or $INPUT_RECORD_SEPARATOR), or an empty
list if the command failed.
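
The difference between the two contexts is easiest to see side by side.
The following is a Perl 5 sketch, since the text above comes from
perl5's perlop; it assumes a POSIX-like shell with an echo command, and
the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Perl 5 sketch: the same command captured in scalar and list context.
# Assumes a POSIX-like shell that provides "echo".
my $as_string = qx(echo one; echo two);   # one multi-line string
my @as_lines  = qx(echo one; echo two);   # one element per line

chomp(my @clean = @as_lines);             # drop the trailing newlines
print "scalar: $as_string";
print "list:   @clean\n";
```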

=head2 Here Docs

# modified from perl5's perlop
A line-oriented form of quoting is based on the shell "here-document"
syntax. Following a << you specify a string to terminate the quoted
material, and all lines following the current line down to the
terminating string are the value of the item. The terminating string
may be either an identifier (a word), or some quoted text. If quoted,
the type of quotes you use determines the treatment of the text, just
as in regular quoting. An unquoted identifier works like double quotes.
The terminating string must appear by itself, and any preceding or
following whitespace on the terminating line is discarded.

=over 3
Examples:

print << EOF;
The price is $Price.
EOF

print << "EOF"; # same as above
The price is $Price.
EOF

print << `EOC`; # execute commands
echo hi there
echo lo there
EOC

print <<"foo", <<"bar"; # you can stack them
I said foo.
foo
I said bar.
bar

myfunc(<< "THIS", 23, <<'THAT');
Here's a line
or two.
THIS
and here's another.
THAT

=back

Don't forget that you have to put a semicolon on the end to finish the
statement, as Perl doesn't know you're not going to try to do this:

=over 3
print <<ABC
179231
ABC
+ 20;
=back

If you want your here-docs to be indented with the rest of the code,
you'll need to remove leading whitespace from each line manually:

=over 3
($quote = <<'FINIS') =~ s/^\s+//gm;
The Road goes ever on and on,
down from the door where it began.
FINIS
=back
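
A complete, runnable version of that idiom in Perl 5 might look like the
sketch below. It assumes perl5 here-doc syntax, as in the perlop text
this section is adapted from; Perl 6 may end up handling indentation
differently.

```perl
use strict;
use warnings;

# Perl 5 sketch: strip leading whitespace from an indented here-doc.
# The terminator line itself is still written in column one.
(my $quote = <<'FINIS') =~ s/^\s+//gm;
    The Road goes ever on and on,
    down from the door where it began.
FINIS

print $quote;
```

Because \s also matches newlines, this substitution removes blank lines
inside the here-doc as well, which may or may not be what is wanted.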

If you use a here-doc within a delimited construct, such as in s///eg,
the quoted material must come on the lines following the final
delimiter. So instead of:

=over 3
s/this/<<E . 'that'
the other
E
. 'more '/eg;
=back

you have to write

=over 3
s/this/<<E . 'that'
. 'more '/eg;
the other
E
=back

Also note that with single-quoted here-docs, backslashes are not
special, and are taken for a literal backslash, a behavior that is
different from normal single-quoted strings.

=head2 Gory Details of parsing quoted constructs

No string section would be complete without a "Gory details of parsing
quoted constructs"; however, since the current implementation in P6C
doesn't have support for \Q, \Q{}, \L{}, \U{}, \N{name}, or \x{}, the
implementation may have to change. If you really need your blood and
guts, please see P6C/Tree/String.pm for the current string-parsing
semantics.

=cut
