Moritz Lenz wrote:
On 27.04.2010 06:31, Stéphane Payrard wrote:
When analysing a sample parse tree, I note that it is
cluttered by the reduction of optional subrules
that generate a zero-length parse subtree. That is, rules with a '?'
quantifier matching zero times.

Currently the ? quantifier is just syntactic sugar for the general quantifier ** 0..1.
Would that behave the same?
Are you proposing that all quantifiers that match zero times should not appear in the parse tree? Then <twigil>+ could either lead to no capture at all, a single value or a list - not very nice.
Or rather special-casing the 0..1 quantifier?

There's also another problem with your approach: If you have
<twigil>?
in your regex, and it matches the empty string, it is still a successful match - yet with your proposal, it's impossible to distinguish a successful zero-width match from an unsuccessful match (which can happen in alternations).
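
The same distinction shows up in other regex engines. As a rough analogy only (this uses Python's `re` module, not Perl 6 rules, and the pattern is made up for illustration), a group that matched zero characters is reported as an empty string, while a group sitting in an alternative that was never taken is reported as `None`:

```python
import re

# Hypothetical pattern: an optional twigil-like "^" captured inside the
# first branch of an alternation; the second branch has no capture at all.
pattern = re.compile(r"(?:(\^?)x|y)")

taken = pattern.fullmatch("x")      # first branch taken; group 1 matched zero characters
not_taken = pattern.fullmatch("y")  # second branch taken; group 1 never participated

zero_width = taken.group(1)   # successful zero-width match
absent = not_taken.group(1)   # no match for the group at all
```

If both cases were dropped from the result, code walking the tree could no longer tell them apart.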

Seems to me that it's possible to have your cake and eat it. If you look at this as an optimization problem, and note that it's expensive to have these array objects littered around, then perhaps the answer is to use a new class (that still "does Array") whose length is guaranteed to be either zero or one. That's probably cheaper than a full (zero-one-many) array object, yet the user doesn't necessarily see any difference.
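
A minimal sketch of that idea, written in Python rather than Perl 6 and under the assumption that "does Array" maps roughly onto Python's `Sequence` interface. The class name `ZeroOrOne` and its layout are hypothetical, not anything from an existing implementation:

```python
from collections.abc import Sequence


class ZeroOrOne(Sequence):
    """Hypothetical sequence that holds either no item or exactly one."""

    # Two fixed slots instead of a growable backing store.
    __slots__ = ("_item", "_present")

    def __init__(self, *items):
        if len(items) > 1:
            raise ValueError("ZeroOrOne holds at most one item")
        # A separate flag is kept because the stored item may itself be None.
        self._present = len(items) == 1
        self._item = items[0] if items else None

    def __len__(self):
        return 1 if self._present else 0

    def __getitem__(self, index):
        # Delegate to a tuple so negative indices, slices and IndexError
        # behave the way callers expect from any sequence.
        items = (self._item,) if self._present else ()
        return items[index]


empty = ZeroOrOne()       # the quantifier matched zero times
single = ZeroOrOne("^")   # the quantifier matched once
```

Callers can still iterate, index and take the length as with any list, so code written against the general array interface keeps working.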

Yet, at the same time, Stéphane's usability issue can be addressed by allowing that the node associated with the ? quantifier in the parse tree does, in fact, have some additional capabilities beyond a basic **0..1 quantifier. I see no reason to be dogmatically bound to the idea that ? is nothing more than syntactic sugar.
