On 02/07/2012 02:51 PM, Mark S. Miller wrote:
On Tue, Feb 7, 2012 at 1:52 PM, Waldemar Horwat wrote:
[...]
That's going back to the previous approach of treating the whole quasi as a
single token. This doesn't work because it's not possible to specify the
BalancedCurlySequence production as a lexical grammar. You're confusing the
lexical with the syntactic grammars here.
Hi Waldemar, I am first of all trying to make clear what we're actually
proposing, and to resolve any genuine ambiguity. As for how we phrase this
proposal so that it fits with the rest of our spec language, what do you
suggest?
Examples of why BalancedCurlySequence doesn't work:
{/[{]/}
(interior parses as five single-character tokens but no matching closing
bracket)
Yes, and therefore a program consisting of
`{/[{]/}`
fails to lex and fails to parse. That seems like the correct outcome.
Why? It's just a regexp.
{ainb}
(interior parses as three tokens: a in b)
Why doesn't it parse as one token: ainb ?
The point is that a in b is one valid parse. I don't need to show that there
are no other valid parses. In fact, there are lots of other valid parses
because the grammar is very ambiguous.
{3.toString()}
(interior parses as 3 . toString ( ))
Why? That's not what the JS lexer does anywhere else?
That's the problem with the rule you gave.
I don't at all see how you arrived at your conclusions. Is it actually unclear
what I am trying to say, or are you simply taking issue with how I'm saying it?
If you find Erik's way of specifying ok, let's just use that. As I just said in
reply to him, it does capture my actual intent more directly.
The bug is in what you're trying to say, not in how you're saying it. You're
confusing the lexical and syntactic grammars. Because of this confusion, you're
proposing lexical productions such as
BalancedCurlySequence ::
    Token *but not one of { or }*
    { Spacing* (BalancedCurlySequence Spacing*)* }
To illustrate the problem, consider a simpler lexer rule:
TokenSequence ::
    Token*
This will lex ainb in many ways, including, for example, a in b. The existing
lexer resolves this by always taking the longest sequence of characters that
forms a valid next lexical token. Once it accepts a token, it doesn't backtrack,
even if it later finds that a different choice for that token would have made
later tokens work better. On the other hand, if you allow a production such as
TokenSequence inside a lexical token, then you get full backtracking and
ambiguity across the Tokens that make up the TokenSequence, because they are all
part of one lexical token.
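
A minimal sketch of that difference, assuming a toy three-rule token grammar.
The patterns and the names lexLongestMatch and allTokenizations are made up for
illustration; they are not from the spec or the proposal, and the real
ECMAScript lexical grammar is far larger.

```javascript
// Toy token rules (hypothetical subset of the real lexical grammar).
const TOKEN_SOURCES = [
  "[A-Za-z_$][A-Za-z0-9_$]*", // identifier or keyword
  "[0-9]+(?:\\.[0-9]*)?",     // numeric literal, simplified
  "[.()]",                    // a few punctuators
];
const PREFIX = TOKEN_SOURCES.map((s) => new RegExp("^(?:" + s + ")"));
const WHOLE = TOKEN_SOURCES.map((s) => new RegExp("^(?:" + s + ")$"));

// Longest-match lexing: at each position commit to the longest token
// and never revisit that choice.
function lexLongestMatch(src) {
  const tokens = [];
  let rest = src;
  while (rest.length > 0) {
    let best = "";
    for (const pattern of PREFIX) {
      const m = pattern.exec(rest);
      if (m && m[0].length > best.length) best = m[0];
    }
    if (best === "") throw new SyntaxError("cannot lex: " + rest);
    tokens.push(best);
    rest = rest.slice(best.length);
  }
  return tokens;
}

// What a rule like `TokenSequence :: Token*` admits when it sits inside a
// single lexical token: every split of the input into tokens is a derivation.
function allTokenizations(src) {
  if (src.length === 0) return [[]];
  const results = [];
  for (let end = 1; end <= src.length; end++) {
    const head = src.slice(0, end);
    if (WHOLE.some((pattern) => pattern.test(head))) {
      for (const tail of allTokenizations(src.slice(end))) {
        results.push([head, ...tail]);
      }
    }
  }
  return results;
}

console.log(lexLongestMatch("ainb"));          // [ 'ainb' ]
console.log(allTokenizations("ainb").length);  // 8, one of which is a, in, b
console.log(lexLongestMatch("3.toString()"));  // [ '3.', 'toString', '(', ')' ]
```

With longest match, ainb is a single identifier and 3. is consumed as a numeric
literal. Under the Token* rule, the same inputs also derive as a in b and as
3 . toString ( ), which are the readings in the examples above.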
I was favorable to splitting up a quasi into multiple tokens, where this
problem for the most part doesn't arise. If you want to make the whole quasi
into one token, then you'll need to solve this problem.
Waldemar