On Thu, Apr 20, 2006 at 09:24:09AM -0500, Patrick R. Michaud wrote:
: First, let me say I really like the changes to S05.  Good work
: once again.
: 
: Here are my questions and comments.
: 
: On Thu, Apr 20, 2006 at 02:07:51AM -0700, [EMAIL PROTECTED] wrote:
: > -(To get rule interpolation use an assertion - see below)
: > +However, if C<$var> contains a rule object, rather than attempting to
: > +convert it to a string, it is called as if you said C<< <$var> >>.
: 
: Does this mean it's a capturing rule?  Or is it called as
: if one had said  C<< <?var> >>?   (I would prefer it default
: to non-capturing.)

I'd say the intent is non-capturing.  In fact, it seems like a mechanism
for stealth rule injection.  It falls just a wee bit short of a security
hole, though, I think, since an interloper would have to be in the same
process to compile the rule.  We probably shouldn't try to run a tainted
rule, on the theory that the interloper tricked some other code into
compiling the stealth rule.

: > +If it is a string, it is matched literally, starting after where the
: > +key left off matching.
: > ..
: > +If it is a rule object, it is executed as a subrule, with an initial
: > +position after the matched key.
: > ..
: > +If it has the value 1, nothing special happens except that the key match
: > +succeeds.
: > ..
: > +Any other value causes the match to fail.  In particular, shorter keys
: > +are not tried if a longer one matches and fails.
: 
: Is there a way to say to continue with the next shortest key?

Yeah, use <@rules> rather than <%tokens>.  :)

Actually, how about we say that '' just succeeds, and a number says to
retry ignoring keys longer than the number?
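To make that proposal concrete, here is a minimal sketch in Python rather than Perl 6. It assumes the semantics floated above: '' just succeeds, a number means retry ignoring keys longer than that number, any other string is matched literally after the key, and a rule object (modelled here as a plain function) runs as a subrule. The names match_hash and digits are hypothetical, and so is the guard against a number that would not actually shorten the search.

```python
def match_hash(tokens, text, pos=0):
    """Return the end position of the match, or None if it fails."""
    limit = None  # longest key length still eligible; None means no limit
    while True:
        keys = [k for k in tokens
                if text.startswith(k, pos)
                and (limit is None or len(k) <= limit)]
        if not keys:
            return None
        key = max(keys, key=len)          # longest key wins
        after = pos + len(key)
        value = tokens[key]
        if isinstance(value, str):
            if value == "":               # '' just succeeds
                return after
            # any other string is matched literally after the key
            return after + len(value) if text.startswith(value, after) else None
        if isinstance(value, int):
            # A number says: retry, ignoring keys longer than the number.
            if value >= len(key):         # assumption: would loop forever, so fail
                return None
            limit = value
            continue
        if callable(value):               # stand-in for a rule object (subrule)
            return value(text, after)
        return None                       # any other value fails


def digits(text, pos):
    """Hypothetical subrule: one or more digits starting at pos."""
    end = pos
    while end < len(text) and text[end].isdigit():
        end += 1
    return end if end > pos else None
```

With tokens = {"if": "", "ifx": 2, "foo": "bar", "num": digits}, matching "ifx" first picks the key "ifx", sees the number 2, and retries with "if", which succeeds.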

: > +As with bare hash, the longest key matches according to the longest token
: > +rule, but in addition, you may combine multiple hashes under the same
: > +longest-token consideration like this:
: > +
: > +    <%statement|%prefix|%term>
: 
: This will be interesting from an implementation perspective.  :-)

Has to be done somewhere anyway.  I'd rather the rule syntax grok the
notion than to sluff it off to some kind of magical hash constructor.
This way the rule knows exactly which hashes it has to track and cache.
It's also plain to the reader of the rule which syntactic categories
are being lumped together at this state in the parse.
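As a rough illustration of what the rule has to track, the following Python sketch finds the longest key across several hashes at once. The function name longest_key and the sample tables are hypothetical, and breaking ties in favor of the first hash listed is an assumption, not something the spec says.

```python
def longest_key(hashes, text, pos=0):
    """Return (category, key) for the longest key matching at pos, or None.

    `hashes` maps a category name to its token table, standing in for
    something like <%statement|%prefix|%term>.
    """
    best = None
    for category, table in hashes.items():
        for key in table:
            if text.startswith(key, pos) and (best is None or len(key) > len(best[1])):
                best = (category, key)   # strictly longer, so earlier hashes win ties
    return best
```

Because the categories are named in the call, a real implementation would know exactly which tables to watch for changes and could cache the combined lookup.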

: > +It is a syntax error to use an unbalanced C<< <( >> or C<< )> >>.
: 
: On #perl6 I think it was discussed that C<< <( >> and C<< )> >>
: could be unbalanced -- that the first simply set the "from"
: position and the second set the "to/pos" position.  I think I
: would prefer this.
: 
: Assuming we require the balance, what do we do with things like...?
: 
:     / aaa <( bbb { return 0; } ccc )> ddd /
: 
: And are we excluding the possibility of:
: 
:     / aaa <( [ bbb )> ccc 
:              | dd ee )> ff 
:              ]
:     /
: 
: (The last example might be the anti-use case that shows that
: <( and )> ought to be properly nested and balanced.)

Lemme think about that some more.  I was worrying about accidental )>,
and not thinking about alternation.  Certainly your example could
be rewritten as

     / aaa [
           | <( bbb )> ccc 
           | <( dd ee )> ff 
           ]
     /

but there are obviously cases where it wouldn't work.  On the other
hand, there's perhaps some mental efficiency by lumping in <(...)>
with all the other <...> constructs, none of which can be unbalanced.
I'm inclined to say that the conservative thing is to require balance.
We could relax it later, I suppose.
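For readers who want to see the balanced form in action, here is a loose Python analogue of the rewritten rule. Python's re module has no <( ... )>, so this sketch uses named groups to stand in for the "from" and "to" positions; the group names and the result_span helper are hypothetical. Whitespace in the Perl 6 rule is not significant, so "dd ee" is written as "ddee".

```python
import re

# Each alternative carries its own balanced "result" region, as in the
# rewritten rule: the match must see aaa ... ccc (or ff), but only the
# inner region is reported.
PATTERN = re.compile(r"aaa(?:(?P<first>bbb)ccc|(?P<second>ddee)ff)")


def result_span(text):
    """Return the (from, to) span of the inner region, or None if no match."""
    m = PATTERN.search(text)
    if m is None:
        return None
    return m.span(m.lastgroup)   # span of whichever alternative matched
```

The unbalanced original has no direct equivalent here, since a Python group has to open and close within the same alternative.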

: > +Conjecture: Multiple opening angles are matched by a corresponding
: > +number of closing angles, and otherwise function as single angles.
: > +This can be used to visually isolate unmatched angles inside:
: > +
: > +    <<<Ccode: a >> 1>>>
: 
: Does this eliminate the possibility of ever using french angles
: as a possible rule syntax character?  (It's okay if it does, 
: I simply wanted to make the observation.)

Probably, unless we treat <<...>> as French angles specially, for which
there is something to be said.  I was just trying to make <<<...>>> consistent
with our other q<<<...>>> mechanisms, which recently switched to the [[[...]]]
policy that POD has always had.
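A small Python sketch of the conjectured delimiter matching, assuming the closer is simply the first run of as many closing angles as there were opening ones. The function name angle_body is hypothetical, and the sketch ignores the case where the body itself ends in a closing angle.

```python
def angle_body(text):
    """Return the text between N opening angles and the matching N closers."""
    n = len(text) - len(text.lstrip("<"))   # count the leading opening angles
    if n == 0:
        return None
    end = text.find(">" * n, n)             # first run of n closing angles
    if end < 0:
        return None
    return text[n:end]
```

Given the example from the conjecture, the doubled ">>" inside is left alone because only a run of three closes the construct.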

: > +Just as C<rx> has variants, so does the C<rule> declarator.
: > +In particular, there are two special variants for use in grammars:
: > +C<token> and C<parse>.
: 
: I agree with Audrey that C<parse> is probably too useful in other
: contexts.  C<token:w> works fine for me.

Aesthetically, I hate :w, actually...and the whole point of naming "token"
is that it is *not* a normal parser rule, but a lexer rule.

But I agree that "parse" is probably the wrong word.  Earlier versions
had "prod" (short for "production") or "words".  Even earlier
versions made ordinary "rule" have these semantics, but then it was
too confusing to talk about rules in general.  I was very happy when
I thought of splitting the concepts yesterday.

I will think about that some more today.  Consider "parse" a placeholder
for the concept of a plain old ordinary BNF rule.

: > +With C<:global> or C<:overlap> or C<:exhaustive> the boolean is
: > +allowed to return true on the first match.  
: 
: Nice, nice, nice!  Makes things *much* simpler for PGE.

I don't see much point in not having rules be as lazy as possible.
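By way of analogy only, the same idea in Python: re.finditer hands back matches lazily, so a boolean test can stop at the first one instead of collecting them all. The any_match helper is a hypothetical name, and this says nothing about how PGE would actually do it.

```python
import re


def any_match(pattern, text):
    """True as soon as the first match is found; later matches are never computed."""
    return next(re.finditer(pattern, text), None) is not None


def first_few(pattern, text, count):
    """Pull only `count` matches from the lazy iterator."""
    found = []
    for m in re.finditer(pattern, text):
        if len(found) == count:
            break
        found.append(m.group())
    return found
```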

Larry
