On Fri, Jan 23, 2004 at 06:43:04PM -0800, Dave Whipp wrote:
: "Larry Wall" <[EMAIL PROTECTED]> wrote in message
: news:[EMAIL PROTECTED]
: > That is, suppose you have:
: >
: >     macro leach () { return "»" }
: >     macro reach () { return "«" }
: >
: > You could unambiguously write
: >
: >     leach+reach
: >
: > but (assuming spaces not allowed within distributed operators) you can't
: > write
: >
: >     leacheqreach
:
: But, presumably, you could write a macro that has a whitespace-eater encoded
: somehow. That is,
:
: macro leach() { chomp_trailing_whitespace; return "»" }
: macro reach () { chomp_leading_whitespace; return "«" }
:
: then the macro magic would expand "leach eq reach" as "»eq«" (which,
: hopefully, it then re-parses as a single token^Woperator).

Unfortunately, it wouldn't.  The second one would only reparse the text
provided by the second macro.  It would have to be written as a syntax
tree munging macro.  Or we'd have to have

    leach(eq)
    reach(eq)
    each(eq)

: This doesn't
: solve the generalized problem of disambiguating, though I could see a "_"
: operator defined as a macro that eats all its surrounding whitespace.

That has interesting possibilities.  I've always wanted to extend the
rule that says a lonesome right curly implies a trailing semicolon.
I'd like the rule to be that a final curly on *any* line implies a
semicolon after it.  Then your _ could be used to "eat" the whitespace
and extend the line that happens to end with curly accidentally:

    map { $_ + 1 }_
        1,2,3,4,5,6,7,8,9,10;

Given that the usual reason for needing extra whitespace is that you
need a linebreak, I suspect that _ would want to eat comments as well:

    map { $_ + 1 }_   # increment by one
        1,2,3,4,5,6,7,8,9,10;

A _ would also be useful for gluing postfix operators to their preceding
token in the cases where there's also a conflicting infix operator and
the parser is trying to use whitespace to disambiguate.  Note: we're
trying not to define Perl's grammar in those terms, but we want to allow
for the fact that someone might define their own infix:++ operator, and
be willing to differentiate based on whitespace:

    $a++ + $b
    $a ++ $b

In such a case, your _ would come in handy, so that either of

    $a _ ++
    $a _++

means

    $a++

Note however that it wouldn't be the same as

    $a_++

unless we disallowed _ on the end of an identifier, which seems a bit
callous.  Likewise you couldn't say

    lreach_eq_rreach

but would have to say

    lreach _ eq _ rreach

or rely on _ within the macro definition to work correctly, which might
be tricky to implement if some grammar rule has already claimed the
whitespace.
And the _ might have the untoward effect of turning it back into the
single token:

    lreach_eq_rreach

So we might need to differentiate a token that glues tokens from one
that doesn't.  Maybe __ glues tokens.  Shades of the C preprocessor...

Given all that, though, I'm not sure that

    >>_+<<_<<

would successfully disambiguate things unless we view >>op<< as three
tokens where the third token is a postfix operator distinguished by
whitespace from something like >>op <<qwlist>>.  I said we were trying
to avoid that distinction in Perl's grammar, but maybe we need it here.

Anyway, if we do use _ for that, the people who want to warp Perl into
Prolog will have to use something else for unnamed bindings.  :-)

Larry
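
For readers who want to poke at the two rules discussed in this message (a
final curly on any line implies a semicolon, and a standalone _ eats the
whitespace and comments around it), here is a toy model in Python rather
than Perl. It is only a sketch under stated assumptions: the names
apply_glue and imply_semicolons are hypothetical, the rules are applied as
plain text substitutions instead of inside a real grammar, "#" is naively
treated as a comment start everywhere, and a _ that directly follows a
sigil or an identifier character is never treated as glue (so $_ and $a_
are left alone).

```python
import re

# A standalone "_": not preceded by a word character or a sigil (so "$_" and
# "$a_" are untouched) and not followed by a word character. The match also
# swallows whitespace before it, and whitespace plus "#" comments after it.
# Assumption: "#" always starts a comment (string literals are not handled).
GLUE = re.compile(r"\s*(?<![\w$@%&])_(?!\w)(?:\s|#[^\n]*)*")

# A "}" that is the last thing on a line, ignoring trailing blanks.
FINAL_CURLY = re.compile(r"\}[ \t]*$", re.MULTILINE)


def apply_glue(src):
    """Delete each standalone "_" along with the whitespace/comments it eats."""
    return GLUE.sub("", src)


def imply_semicolons(src):
    """Append ";" after any "}" that ends a line (the proposed extended rule)."""
    return FINAL_CURLY.sub("};", src)


if __name__ == "__main__":
    broken = "map { $_ + 1 }\n    1,2,3;"
    glued = "map { $_ + 1 }_   # increment by one\n    1,2,3;"
    print(imply_semicolons(broken))             # the curly ends the statement
    print(imply_semicolons(apply_glue(glued)))  # the _ extends the line
    print(apply_glue("$a _ ++"), apply_glue("$a_++"))
```

Running the glue step first mirrors the idea that _ extends a line before
the line-final-curly rule gets a chance to fire. Note that the sketch also
shows the token problem raised above: "lreach _ eq _ rreach" collapses to a
single run of text, and whether that run reparses as one token or three is
exactly what a text-level rule cannot decide.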