On Fri, Jan 23, 2004 at 06:43:04PM -0800, Dave Whipp wrote:
: "Larry Wall" <[EMAIL PROTECTED]> wrote in message
: news:[EMAIL PROTECTED]
: > That is, suppose you have:
: >
: >     macro leach () { return "»" }
: >     macro reach () { return "«" }
: >
: > You could unambiguously write
: >
: >     leach+reach
: >
: > but (assuming spaces not allowed within distributed operators) you can't
: > write
: >
: >     leacheqreach
: 
: But, presumably, you could write a macro that has a whitespace-eater encoded
: somehow. That is,
: 
: macro leach() { chomp_trailing_whitespace; return "»" }
: macro reach () { chomp_leading_whitespace; return "«" }
: 
: then the macro magic would expand "leach eq reach" as "»eq«" (which,
: hopefully, it then re-parses as a single token^Woperator).

Unfortunately, it wouldn't.  The second one would only reparse the text
provided by the second macro.  It would have to be written as a syntax
tree munging macro.  Or we'd have to have

    leach(eq)
    reach(eq)
    each(eq)
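
A rough Python sketch of why the pieces fail to fuse, under stated assumptions: the `MACROS` table, the toy `TOKEN` grammar, and both function names are hypothetical, and ">>" and "<<" are ASCII stand-ins for whatever leach and reach return.  It only contrasts reparsing each macro's text in isolation with reparsing the whole expanded string once.

```python
import re

# Hypothetical text macros; ">>" and "<<" stand in for the returned strings.
MACROS = {"leach": ">>", "reach": "<<"}

# Toy grammar: a distributed operator is a single token only when its
# three parts are adjacent in the text being tokenized.
TOKEN = re.compile(r">>(?:\w+|\+)<<|>>|<<|\w+|\+")


def tokenize(text):
    return TOKEN.findall(text)


def expand_per_macro(words):
    """Reparse each macro's output on its own, so the pieces never fuse."""
    tokens = []
    for word in words:
        tokens.extend(tokenize(MACROS.get(word, word)))
    return tokens


def expand_then_reparse(words):
    """Substitute everything first, then reparse the joined text once."""
    return tokenize("".join(MACROS.get(word, word) for word in words))
```

With the words leach, eq, reach, the first function yields three separate tokens, while the second yields one fused operator token, which is the behavior a plain text macro does not get.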

: This doesn't
: solve the generalized problem of disambiguating, though I could see a "_"
: operator defined as a macro that eats all its surrounding whitespace.

That has interesting possibilities.  I've always wanted to extend the rule
that says a lonesome right curly implies a trailing semicolon.  I'd
like the rule to be that a final curly on *any* line implies a semicolon
after it.  Then your _ could be used to "eat" the whitespace and extend
the line that happens to end with curly accidentally:

    map { $_ + 1 }_
        1,2,3,4,5,6,7,8,9,10;

Given that the usual reason for needing extra whitespace is that you
need a linebreak, I suspect that _ would want to eat comments as well:

    map { $_ + 1 }_ # increment by one
        1,2,3,4,5,6,7,8,9,10;
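
The two rules together can be mocked up as a line-based preprocessor.  This is only a minimal Python sketch of the idea, not a proposal for how a real parser would do it: the function name `imply_semicolons` and both regular expressions are hypothetical, and the comment detection is naive (it would misfire on a "#" inside a string).

```python
import re

# Hypothetical continuation marker: "_" right after a closing curly,
# followed only by blanks and an optional "#" comment up to end of line.
CONTINUATION = re.compile(r"\}_[ \t]*(#.*)?$")
# Naive trailing comment: everything from the first "#" onward.
COMMENT = re.compile(r"[ \t]*#.*$")


def imply_semicolons(source):
    out = []
    pending = ""  # text carried over from a line that ended in "}_"
    for line in source.splitlines():
        if pending:
            # The "_" also eats the next line's leading whitespace.
            line = pending + line.lstrip()
            pending = ""
        if CONTINUATION.search(line):
            # Drop the "_", any comment, and (implicitly) the newline.
            pending = CONTINUATION.sub("}", line)
            continue
        code = COMMENT.sub("", line).rstrip()
        if code.endswith("}"):
            # A final curly on the line implies a semicolon after it.
            line = code + ";" + line[len(code):]
        out.append(line)
    if pending:
        out.append(pending)
    return "\n".join(out)
```

A line ending in a bare curly gets a semicolon, while a line ending in "}_" is joined to the next one with the comment and the line break eaten.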

A _ would also be useful for gluing postfix operators to their
preceding token in the cases where there's also a conflicting infix
operator and the parser is trying to use whitespace to disambiguate.
Note: we're trying not to define Perl's grammar in those terms, but
we want to allow for the fact that someone might define their own
infix:++ operator, and be willing to differentiate based on whitespace:

    $a++ + $b
    $a ++ $b

In such a case, your _ would come in handy, so that either of

    $a _ ++ 
    $a _++

means

    $a++
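
A toy Python sketch of that whitespace-sensitive reading, assuming a user-defined infix:++ exists alongside the postfix one.  The `classify` function, the `GLUE` rule, and the tiny token set are hypothetical simplifications; a real grammar would not work on raw text like this.

```python
import re

# Hypothetical glue rule: a standalone "_" with whitespace on its left
# swallows the whitespace on both sides of it.
GLUE = re.compile(r"\s+_(?!\w)\s*")
# Toy token set: scalar variables, "++", and "+", with leading whitespace
# captured so it can drive the postfix/infix decision.
TOKEN = re.compile(r"(\s*)(\$\w+|\+\+|\+)")


def classify(source):
    source = GLUE.sub("", source)
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            if source[pos:].strip():
                raise ValueError(f"cannot tokenize: {source[pos:]!r}")
            break  # only trailing whitespace is left
        space, text = match.groups()
        if text.startswith("$"):
            kind = "term"
        elif text == "++" and not space:
            kind = "postfix"  # glued directly to the preceding term
        else:
            kind = "infix"    # separated from the term by whitespace
        tokens.append((kind, text))
        pos = match.end()
    return tokens
```

Here "$a++ + $b" classifies the "++" as postfix and "$a ++ $b" classifies it as infix, while both "$a _ ++" and "$a _++" collapse to the postfix form.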

Note however that it wouldn't be the same as

    $a_++

unless we disallowed _ on the end of an identifier, which seems a bit
callous.  Likewise you couldn't say

    leach_eq_reach

but would have to say

    leach _ eq _ reach

or rely on _ within the macro definition to work correctly, which
might be tricky to implement if some grammar rule has already claimed
the whitespace.  And the _ might have the untoward effect of turning
it back into the single token:

    leach_eq_reach

So we might need to differentiate a token that glues tokens from one
that doesn't.  Maybe __ glues tokens.  Shades of the C preprocessor...

Given all that, though, I'm not sure that >>_+<<_<< would successfully
disambiguate things unless we view >>op<< as three tokens where the
third token is a postfix operator distinguished by whitespace from
something like >>op <<qwlist>>.  I said we were trying to avoid that
distinction in Perl's grammar, but maybe we need it here.

Anyway, if we do use _ for that, the people who want to warp Perl
into Prolog will have to use something else for unnamed bindings.  :-)

Larry
