On Sat, Apr 16, 2005 at 11:30:49AM -0700, Larry Wall wrote:
: The basic rule of thumb is that we pretend we're a top-down parser
: even if we aren't, and we only look for the trailing delimiter when
: we're not trying to parse something embedded that would naturally
: slurp up the trailing delimiter as part of the internal construct.
: Certainly any kind of bracketing structure hides anything inside it
: from the delimiter scanner, but so do tokens like identifiers.

I think I have to clarify what I mean by that last phrase. Trailing delimiters are hidden inside any token that has already been started, but not at the start of a token (where token is taken to be fairly restrictive). Therefore these are errors:

    qq. $foo.bar() .
    qq: @foo::bar[] :

However

    qq/ &foobar( $a / $b ) /

is just fine, since (...) is looking for its own termination. Basically we don't have to keep track of sets of terminators (unless we want to use that info after a syntax error to make hypotheses and explore alternate realities in the service of better error messages).

Given our plan of a hybrid parser with a bottom-up operator precedence parser sandwiched between top-down parsers, and assuming that "." is the tightest operator that the bottom-up expression parser treats as an operator, it more or less comes down to this: anything the expression parser pulls in as a single term is treated as a construct that ignores any outer delimiters, because at that point it's calling out to a lower-level top-down parser to parse the term in question.

Hmm, I guess there's still a little ambiguity in there in the case of lookahead. And the fact is, a construct like

    qq. $foo.bar() .

either has to do some lookahead or some backtracking to determine that the entire interpolated expression ends with a bracketed construct, since we've said that " $foo.bar() " interpolates $foo.bar(), while " $foo.bar " interpolates only $foo. (With similar constraints on array and hash interpolation.) So it's possible that

    qq. $foo.bar() .

could parse okay if we treat the () as a terminator that some grammatical construct is looking ahead for. But given that $foo is the one interpolator that doesn't require trailing brackets, it seems like it's terribly ambiguous in this case. However, only dot has that problem, and with

    qq: @foo::bar[] :

you know it requires the [] to interpolate at all. So I guess this is one of those we can argue both ways.
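
The bracket-hiding part of the rule above can be sketched roughly as follows. This is a minimal illustration in Python, not a description of any actual Perl 6 parser: the function name `find_trailing_delimiter` is hypothetical, and it makes the simplifying assumption that every (, [ or { opens a nested construct, whereas a real parser would only do so for brackets that belong to an interpolated term.

```python
# Illustrative sketch only; not how any Perl 6 parser is implemented.
# Simplifying assumption: every (, [ or { opens a nested construct.
PAIRS = {"(": ")", "[": "]", "{": "}"}


def find_trailing_delimiter(text, delim):
    """Return the index of the first `delim` outside all brackets, or -1."""
    closers = []  # closers still owed by the currently open brackets
    for i, ch in enumerate(text):
        if closers:
            # Inside a bracket: only that bracket's own terminator matters,
            # so the outer delimiter is never looked for here.
            if ch == closers[-1]:
                closers.pop()
            elif ch in PAIRS:
                closers.append(PAIRS[ch])
        elif ch == delim:
            return i
        elif ch in PAIRS:
            closers.append(PAIRS[ch])
    return -1  # runaway string: no trailing delimiter found


body = " &foobar( $a / $b ) / trailing"
end = find_trailing_delimiter(body, "/")
print(repr(body[:end]))
```

With the body of the qq/.../ example, the scan skips the / inside the parentheses and stops at the one after the closing parenthesis. Note that only the innermost open bracket's closer is consulted at any point, which matches the remark that no set of outer terminators needs to be tracked.
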
The chance of someone writing qq:@foo::bar[] when they mean qq:@foo: :bar[] seems fairly remote. So my best guess at this point is that we should let the interpolative lookahead hide the trailing delimiter also, and that is probably what the user expects in any event, since when they were writing the expression, the nearby context is the preceding term, but the distant context is the delimiter, which they've probably just forgotten is potentially ambiguous. So let's just resolve it that way without telling them.

I guess this is the one place we're requiring arbitrarily long lookahead to figure things out, since we interpolate

    " @foo::bar::baz::fee::fie::foe[] "

but not

    " @foo::bar::baz::fee::fie::foe "

under the current rules. I think the lookahead doesn't have to parse past the [ (or other opener), though. All it has to decide is whether the next : (or dot) is to be treated as part of the interpolation. So this is a syntax error (of the runaway "" variety, presumably):

    " @foo::bar::baz::fee::fie::foe[ "

Larry
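
The array lookahead described in this message might be sketched as below. This is a hedged illustration in Python rather than Perl 6: the name `array_interpolates`, the `NAME` pattern for identifiers, and the default opener set of just "[" are all assumptions made for the example, not anything the message specifies.

```python
import re

# Hypothetical identifier shape, chosen for illustration only.
NAME = re.compile(r"[A-Za-z_]\w*")


def array_interpolates(text, pos, openers="["):
    """Decide whether the '@' at text[pos] starts an interpolation.

    Scans NAME ('::' NAME)* and then checks only that an opener follows.
    It deliberately does not parse past the opener.
    """
    if not text.startswith("@", pos):
        return False
    m = NAME.match(text, pos + 1)
    if m is None:
        return False
    i = m.end()
    # Arbitrarily long lookahead: keep consuming '::name' segments.
    while text.startswith("::", i):
        m = NAME.match(text, i + 2)
        if m is None:
            break  # '::' without a following name is not part of the variable
        i = m.end()
    return i < len(text) and text[i] in openers


print(array_interpolates(" @foo::bar::baz::fee::fie::foe[] ", 1))
print(array_interpolates(" @foo::bar::baz::fee::fie::foe ", 1))
```

Because the check stops at the opener, the unterminated " @foo::bar::baz::fee::fie::foe[ " case also commits to interpolation in this sketch, and the missing ] would then surface later as the runaway-string error rather than being silently treated as plain text.
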