Just a few more nits to pick... On 12/02/2002 6:58 AM, Joseph F. Ryan wrote:
We need to decide if this is a user doc or a developer doc/language specification. If it's the latter, we need a rigorous definition of what a pair is.

The q() operator allows strings to be made with any non-space, non-letter, non-digit character as the delimiter instead of '. In addition, if the starting delimiter is part of a paired set, such as (, [, <, or {, then the closing delimiter may be the matching member of the set. The reverse also holds: a delimiter that is the tail end of a pair may use the starting item as the closing delimiter.
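To show the kind of rigor I mean, here is a rough sketch of the delimiter rule as the draft currently states it. It is written in Python only so that it runs today; the function name and the pair table are my own assumptions, and I read "may be the matching member" as "is the matching member", which is exactly the sort of thing the spec should pin down.

```python
# Assumed pair table: the draft names (, [, < and { as openers of a paired set.
PAIRS = {"(": ")", "[": "]", "<": ">", "{": "}"}
# The draft also allows the tail end of a pair to open, closed by its partner.
REVERSED = {closer: opener for opener, closer in PAIRS.items()}


def closing_delimiter(opener):
    """Return the character assumed to close a q() string begun with opener."""
    if len(opener) != 1:
        raise ValueError("a delimiter is a single character")
    if opener.isalnum() or opener.isspace():
        raise ValueError("delimiter must not be a letter, digit, or space")
    if opener == ":":
        # Reserved by the draft for custom-defined quoting operators.
        raise ValueError("':' is not allowed as a delimiter")
    if opener in PAIRS:
        return PAIRS[opener]
    if opener in REVERSED:
        return REVERSED[opener]
    # Any other character closes itself, as ' does.
    return opener


print(closing_delimiter("("), closing_delimiter("]"), closing_delimiter("!"))
```

Whatever definition of "pair" we settle on would replace the hard-coded table.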
Are comments ever allowed within q() constructs? If not, ditch the statement about comments not being allowed in q## constructs.

There are a few special cases for delimiters; specifically : and #. : is not allowed because it might be used by custom-defined quoting operators to apply a property; # is allowed, but there cannot be a space between the operator and the #. In addition, comments are not allowed within # delimited expressions (for obvious reasons).
A doubled set of angle brackets (<<text here>>) or a set of double-angle quotation marks (guillemets, «text here»).

Are we getting rid of qw()? I assumed that we were keeping it as a longhand form of <<>>/guillemets, just like qq() is the longhand form of "".

=head3 <<>>; expanding a string as a list.

A set of braces is a special op that evaluates into the list of words contained, using whitespace as the delimiter. It is similar to qw() from perl5, and can be thought of as roughly equivalent to:
I'd be more explicit here, and say C<<"STRING".split(/\s+/)>>. (The two are equivalent, but only because of special-casing; the second is more explicit.)

C<< "STRING".split(' ') >>
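The special-casing matters once the string has leading whitespace. As a rough illustration (Python rather than Perl, so only an analogy; the helper names are made up), Python's no-argument split() behaves like Perl 5's split(' '), while a regex split on \s+ keeps an empty leading field, as Perl 5's split(/\s+/) does:

```python
import re


def words_special_cased(text):
    """Analogue of split(' '): skip leading whitespace, collapse runs."""
    return text.split()


def words_regex(text):
    """Analogue of split(/\\s+/): leading whitespace yields an empty first field."""
    return re.split(r"\s+", text)


sample = "  foo  bar baz"
print(words_special_cased(sample))  # ['foo', 'bar', 'baz']
print(words_regex(sample))          # ['', 'foo', 'bar', 'baz']
```

So whichever form the doc shows, it should say what happens to leading whitespace inside <<>>.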
=head2 Interpolating Constructs

Interpolating constructs are another form of string in which variables that are embedded into the string are expanded into their value at runtime. Interpolated strings are formed using the double quote:
...using double quotes, as in "string".
"string". In addition, qq() is a synonym for "", which is similar to
q() being a synonym for ''.
...similarly to...
Have these defaults been defined somewhere? I'd rather see them be ', ' and '=>' by default...

=item Hashes: C<"%hash">, C<"%(expression)">

Hashes interpolate by joining their pairs on the hash's .separator property, which by default is a newline. Pairs stringify by joining the key and value with the hash's .pairsep property, which by default is a space.
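To compare the two sets of defaults side by side, here is a small Python model of the rule. The function name is hypothetical; separator and pairsep mirror the draft's .separator and .pairsep properties. Note that a Python dict keeps insertion order, which a Perl hash would not.

```python
def interpolate_hash(pairs, separator="\n", pairsep=" "):
    """Join each key/value with pairsep, then join the pairs with separator."""
    return separator.join(f"{key}{pairsep}{value}" for key, value in pairs.items())


ages = {"alice": 30, "bob": 25}

# Draft defaults: newline between pairs, space inside a pair.
print(interpolate_hash(ages))

# The defaults I'd prefer: ', ' between pairs and '=>' inside a pair.
print(interpolate_hash(ages, separator=", ", pairsep="=>"))
```

The second form stays on one line, which seems friendlier inside a larger interpolated string.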
Get rid of the therefore; it seems to refer to the preceding sentence, which has nothing to do with the example.

Note that hashes are unordered, and so the output will be unordered. Therefore, the following two expressions are equivalent:
Has this been vetted? $(...)/etc seem to cover this case, and & being a qq() metachar makes using qq() strings to print HTML/XML difficult.

=item Subroutines and Methods: C<"&sub($a1,$a2)">, C<"$obj.meth($a)">

Subroutines and Methods will interpolate their return value into the string, which will be handled in whichever type the return value is. Same for object methods. Note that parens B<are> required during interpolation so that the parser can disambiguate between object methods and object members.
Can we get some rigor here? Also, is \n the same everywhere, or do we play the same tricks we did with it in p5? (I think it should be the same everywhere, a CR char, "\cM". Disciplines, or encodings, or whatever we're calling them, can take care of it on IO.) Oh, and it might be nice for \0 to be NUL. (This used to be implicit with \0 as octal, but since \0 isn't octal anymore...)

=item Escaped Characters

    # Basically the same as Perl5; also, how are locale semantics handled?
    \t          tab
    \n          newline
    \r          return
    \f          form feed
    \b          backspace
    \a          alarm (bell)
    \e          escape
Numeric Literals, take 3 (http:[EMAIL PROTECTED]/msg00462.html), in the "*** Bin/Hex/Oct shorthands" section, gives 0c123 as the shorthand form of octal numbers, so it doesn't make much sense for octal character constants to be \o123. Do we want to change shorthand octal literal numbers to 0o123 (I don't like this, it's hard to read), change octal chars to \c123 (can't do this without getting rid of, or changing, \c for control-character), get rid of octal chars entirely, or something else? (Barring a good "something else", I vote for killing octal chars.)

    \b10        binary char
    \o33        octal char
Exactly two digits after the \x? Perl5 attempts to do the right thing either way, but this can be confusing too -- "\xA" eq chr(0xA), "\xABar" eq chr(0xAB)."ar", "\xAQux" eq chr(0xA)."Qux".

    \x1b        hex char
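For reference, the Perl 5 behaviour I'm describing amounts to "take one or two hex digits, greedily". A rough Python model of just that rule (the function name is made up, and it deliberately ignores the \x{...} form):

```python
import re

# One or two hex digits after \x, matched greedily, as in the examples above.
_HEX_ESCAPE = re.compile(r"\\x([0-9A-Fa-f]{1,2})")


def expand_hex_escapes(source):
    """Replace each \\xH or \\xHH escape in source with the character it names."""
    return _HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), source)


print(repr(expand_hex_escapes(r"\xA")))     # '\n'
print(repr(expand_hex_escapes(r"\xABar")))  # '«ar': B is taken as a hex digit
print(repr(expand_hex_escapes(r"\xAQux")))  # '\nQux': Q ends the escape
```

Requiring exactly two digits would turn the first and third cases into errors, which may be the less surprising choice.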
Rigor? What is \c~? perl5 thinks it's >, should perl6 agree? How about \c\x{1000} (that's invalid, but you get the point), is that equivalent to \x{ff9c}? What about \cé (e with an acute accent): does that capitalize, then subtract 64, or just subtract?

    \x{263a}    wide hex char
    \c[         control char
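As far as I know, Perl 5 computes \cX for ASCII by uppercasing X and flipping bit 6 (XOR with 64), which is where \c~ eq ">" comes from. A Python sketch of that rule, with a made-up function name; it refuses non-ASCII input because that case is exactly what the spec has yet to define:

```python
def control_char(char):
    """Model of \\cX for ASCII: uppercase X, then XOR its code point with 64."""
    if len(char) != 1 or ord(char) > 127:
        # Behaviour outside ASCII is the open question, so do not guess.
        raise ValueError("only single ASCII characters are modelled here")
    return chr(ord(char.upper()) ^ 64)


print(repr(control_char("[")))  # '\x1b', escape
print(repr(control_char("m")))  # '\r', same as \cM
print(repr(control_char("~")))  # '>'
```

If we want \c~ to be an error instead, the rule would need to restrict X to @, A-Z, [, \, ], ^, _ and ?.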
Reference to charnames pragmata, or however we end up defining the exact semantics of \N. (Since we don't know yet, just put in a FIXME, I suppose.)

    \N{name}    named Unicode character
Is there any way to give the ordinal in decimal, like "\d192"? (I'm not sure how useful this would be, but it would be nice parallelism. OTOH, you can use chr() easily enough.)
Rigor: escape all non-alphanumerics.

=item Modifiers: C<\Q{}>, C<\L{}>, C<\U{}>

Modifiers apply a modification to text which they enclose; they can be embedded within interpolated strings.

    \L{}        Lowercase all characters within brackets
    \U{}        Uppercase all characters within brackets
    \Q{}        Escape all characters that need escaping within brackets (except "}")
Do we still have the other modifiers that p5 supports, \l and \u? Do we want a new titlecase modifier, \T{james mastros} eq "James Mastros", doing the Right Thing for other languages, where it isn't so simple? (There are complicated cases for this, but IIRC Unicode defines a robust algorithm to do this.) I'll check on the Unicode stuff if anybody thinks it's a good idea... I'm uncertain, myself; I never liked the qq() case-modifiers, so I don't use them.
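On the \Q{} wording: "escape all non-alphanumerics" could be modelled roughly as below. This is a Python sketch with a hypothetical function name; it treats underscore as a word character the way Perl 5's quotemeta does for ASCII, and makes no claim about how non-ASCII text should be handled.

```python
import re


def quote_meta(text):
    """Backslash-escape every character that is not an ASCII letter, digit, or _."""
    return re.sub(r"[^A-Za-z0-9_]", lambda m: "\\" + m.group(0), text)


print(quote_meta("a.b*c"))     # a\.b\*c
print(quote_meta("foo_bar1"))  # foo_bar1, nothing to escape
```

Stated this way, the (except "}") carve-out in the draft becomes a separate parsing question rather than part of the escaping rule.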
This whole section is very unix-centric, but I'm not certain what to do about that -- the functionality is very system-specific. Also, I suspect we're going to want to rewrite it anyway when we hammer out iterators, files, and context.

A string which is (possibly) interpolated and then executed as a system command with /bin/sh or its equivalent. Shell wildcards, pipes, and redirections will be honored. The collected standard output of the command is returned; standard error is unaffected. In scalar context, it comes back as a single (potentially multi-line) string, or undef if the command failed. In list context, returns a list of lines split on the standard input separator, or an empty list if the command failed.
A line-oriented form of quoting is based on the shell "here-document"
s/shell/Unix Bourne shell/
I could have sworn that Larry recently put something out about the edge cases between << heredoc and << beginning-of-qw. I /think/ he said that qw("Foo" bar) must be written as << "Foo" bar>>, because otherwise it would be interpreted as a here-doc ending with Foo with double-quote interpolation. Can anybody find this, or is Larry watching?

syntax. Following a << you specify a string to terminate the quoted material, and all lines following the current line down to the terminating string are the value of the item. The terminating string may be either an identifier (a word), or some quoted text. If quoted, the type of quotes you use determines the treatment of the text, just as in regular quoting. An unquoted identifier works like double quotes. The terminating string must appear by itself, and any preceding or following whitespace on the terminating line is discarded.
Are \qq()s still special, even in <<'noninterpolating's? Either way, it should be explicitly noted.

Also note that with single quoted here-docs, backslashes are not special, and are taken for a literal backslash, a behavior that is different from normal single-quoted strings.
Note that the v is non-optional for two-character v-strings.

V-Strings are formed when 3 or more digits are joined by decimal points, with a possible leading v. The resulting item is then treated like a string, rather than a number.

=over 3

Examples:

    $var = v5.8.0;      # $var = "5.8.0";
    $var = 192.168.0.1; # $var = "192.168.0.1";

=back
I'd say something like:
V-strings are actually strings that just happen to look like numbers. Each dot-separated number is transformed into the character with that Unicode ordinal, and the resulting characters are concatenated together.
(The transformation from normal string to v-string looks like C<<$vstring='v' ~ join '.', map {ord} split //, $instring>>; the transformation from v-string to normal string looks like
C<<print join '', map {chr} split /\./, $vstring>>;
(Where vstring cannot begin with a leading 'v', for purposes of illustration.))
Thus, C<<80.101.114.108.32.54.33 eq 'Perl 6!'>>
Also, your examples are misleading at best. v5.8.0 eq "\x05\x08\x00".
192.168.0.1 eq chr(192)~chr(168)~chr(0)~chr(1).
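The two transformations are easy to check mechanically. Here is a Python sketch of them; the function names are mine, and this models only the ordinal-to-character mapping described above, not any parsing rules:

```python
def vstring_to_str(vstring):
    """Turn 'v5.8.0' or '192.168.0.1' into the string of those code points."""
    if vstring.startswith("v"):
        vstring = vstring[1:]
    return "".join(chr(int(part)) for part in vstring.split("."))


def str_to_vstring(text):
    """Inverse mapping: spell each character's ordinal, joined by dots."""
    return "v" + ".".join(str(ord(char)) for char in text)


print(vstring_to_str("80.101.114.108.32.54.33"))  # Perl 6!
print(repr(vstring_to_str("v5.8.0")))             # '\x05\x08\x00'
print(str_to_vstring("Perl 6!"))                  # v80.101.114.108.32.54.33
```

This is why v5.8.0 is three control characters rather than the text "5.8.0".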
-=- James Mastros