On 02/07/2012 02:51 PM, Mark S. Miller wrote:
On Tue, Feb 7, 2012 at 1:52 PM, Waldemar Horwat wrote:
[...]

    That's going back to the previous approach of treating the whole quasi as a 
single token.  This doesn't work because it's not possible to specify the 
BalancedCurlySequence production as a lexical grammar.  You're confusing the 
lexical with the syntactic grammars here.


Hi Waldemar, I am first of all trying to make clear what we're actually 
proposing, and to resolve any genuine ambiguity. As for how we phrase this 
proposal so that it fits with the rest of our spec language, what do you 
suggest?


    Examples of why BalancedCurlySequence doesn't work:

    {/[{]/}
    (interior parses as five single-character tokens but no matching closing 
bracket)


Yes, and therefore a program consisting of

     `{/[{]/}`

fails to lex and fails to parse. That seems like the correct outcome.

Why?  It's just a regexp.

    {ainb}
    (interior parses as three tokens: a in b)

Why doesn't it parse as one token: ainb ?

The point is that a in b is one valid parse.  I don't need to show that there 
are no other valid parses.  In fact, there are lots of other valid parses 
because the grammar is very ambiguous.

    {3.toString()}
    (interior parses as 3 . toString ( ))

Why? That's not what the JS lexer does anywhere else.

That's the problem with the rule you gave.
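For comparison, here is a small sketch of how today's lexer treats this input. 
It assumes an engine such as Node where the Function constructor is available, 
and the helper name parses is made up for illustration. The lexer greedily 
takes "3." as a numeric literal, so "3.toString()" never becomes the token 
sequence 3 . toString ( ):

```javascript
// Hypothetical helper: reports whether src parses as a function body.
// Relies on the standard Function constructor, which throws SyntaxError
// on source text that fails to lex or parse.
function parses(src) {
  try {
    new Function(src);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// "3." is consumed as one numeric literal, and an identifier may not
// directly follow a numeric literal, so this is rejected.
const greedy = parses("return 3.toString()");

// A space (or a second dot) ends the numeric literal early, so the
// following "." is a member access.
const spaced = parses("return 3 .toString()");
const dotted = parses("return 3..toString()");

console.log(greedy, spaced, dotted); // false true true
```

A rule that admits 3 . toString ( ) inside a quasi therefore accepts text that 
the ordinary lexer rejects everywhere else.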

I don't at all see how you arrived at your conclusions. Is it actually unclear 
what I am trying to say, or are you simply taking issue with how I'm saying it? 
If you find Erik's way of specifying ok, let's just use that. As I just said in 
reply to him, it does capture my actual intent more directly.

The bug is in what you're trying to say, not in how you're saying it.  You're 
confusing the lexical and syntactic grammars.  Due to this confusion you're 
trying to write lexical productions such as

BalancedCurlySequence ::
    Token *but not one of { or }*
    { Spacing* (BalancedCurlySequence Spacing*)* }

To illustrate the problem, consider a simpler lexer rule:

TokenSequence ::
  Token*

This will lex ainb as many things, including for example a in b.  The existing 
lexer resolves this by always taking the longest sequence of characters that 
can form the next lexical token.  Once it accepts a token, it doesn't backtrack 
if it later finds an alternative parse for that token that would have made 
future tokens work better.  On the other hand, if you allow productions such as 
a TokenSequence inside a lexical token, then you get full backtracking and 
ambiguity across the Tokens that make up the TokenSequence, because they are 
all part of one lexical token.
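
The difference can be sketched with a toy lexer.  This is an illustration 
only: the token set is reduced to identifier-like words (an assumption; real 
Tokens also include numbers, punctuators and so on), and the function names 
maximalMunch and allTokenizations are made up for the example.

```javascript
// Toy token set: identifier-like words only (keywords such as `in`
// match the same pattern). This is an assumption for illustration.
const WORD = /^[A-Za-z_$][A-Za-z0-9_$]*/;

// Longest-match lexing: take the longest token at each position and
// never revisit that choice.
function maximalMunch(src) {
  const tokens = [];
  let rest = src;
  while (rest.length > 0) {
    const m = WORD.exec(rest);
    if (m === null) throw new SyntaxError("unexpected character: " + rest[0]);
    tokens.push(m[0]);
    rest = rest.slice(m[0].length);
  }
  return tokens;
}

// Unconstrained Token*: enumerate every way of splitting src into
// words, which is what a single production "Token*" permits.
function allTokenizations(src) {
  if (src === "") return [[]];
  const results = [];
  for (let end = 1; end <= src.length; end++) {
    const head = src.slice(0, end);
    const m = WORD.exec(head);
    if (m !== null && m[0].length === head.length) {
      for (const tail of allTokenizations(src.slice(end))) {
        results.push([head, ...tail]);
      }
    }
  }
  return results;
}

console.log(maximalMunch("ainb"));            // [ 'ainb' ]
console.log(allTokenizations("ainb").length); // 8
```

Longest-match lexing yields the single token ainb, while the unconstrained 
rule admits eight splits of the same four characters, a in b among them.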

I was favorable to splitting up a quasi into multiple tokens, where this 
problem for the most part doesn't arise.  If you want to make the whole quasi 
into one token, then you'll need to solve this problem.

    Waldemar
_______________________________________________
es-discuss mailing list
[email protected]
https://mail.mozilla.org/listinfo/es-discuss
