Patrick R. Michaud wrote:

On Fri, Dec 10, 2004 at 01:34:03PM -0500, James deBoer wrote:


Currently, the split opcode is declared as 'split(out PMC, in STR, in STR)' where $2 is a regex.

PGE, however, currently supports three types of regular expressions, and more are likely to be added. So which type of regular expression should split use?
[...]
A solution:


Declare split as 'split(out PMC, in PMC, in STR)' where $2 would be a compiled PGE::Match object. This lets you pick what kind of regular expression you want to use.



Slight correction: Thus far, a "PGE::Match object" is the result of performing a match between a rule and a target string, not the compiled form of the rule. At present a rule is just a subroutine that returns PGE::Match objects. Eventually we may have a PGE::Rule class for representing compiled rule objects, but we're not there yet. So, $2 would need to be a rule subroutine.


Going beyond that, we might want to just have a "split" method for PGE::Rule objects, and leave the split opcode to do fast separation of strings based on constant strings. But I'm not entirely familiar with Parrot's opcode/MMD semantics, so I'll follow others' leads on this one...


Pm



I would even go further than that and say that if we went with PGE::Rule's "split", the split opcode should be obsoleted. I can't think of a place where splitting on constant strings is not a special case of splitting on a regular expression. Evaluating a very simple regular expression (i.e. a constant string) should be fast enough that it is not worth the effort to determine if a pattern can be sent through the split opcode instead of PGE::Rule."split"().
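To illustrate the "special case" claim, here is a minimal sketch in Python rather than Parrot code. The function names are made up for this example and nothing here reflects PGE's actual interface. It only shows that splitting on a constant string gives the same pieces as splitting on a regular expression that matches that string literally (for a non-empty separator).

```python
import re


def split_constant(target, sep):
    # Plain constant-string split, standing in for the existing opcode.
    return target.split(sep)


def split_regex(target, sep):
    # The same separator, escaped so the regex engine treats it literally.
    return re.split(re.escape(sep), target)


if __name__ == "__main__":
    for text in ["a,b,,c", ",a", "no separator here", ""]:
        print(repr(text), split_constant(text, ","), split_regex(text, ","))
```

Whether the regex route is "fast enough" is a separate question that this sketch does not measure.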

However, a split opcode that accepts a match subroutine has the advantage that PGE is not strictly required. It would be possible to write your own subroutines if speed or code size were issues, or if you had some other crazy requirements.
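A rough sketch of that idea, again in Python and with hypothetical names (split_with, constant_matcher, regex_matcher): the split routine knows nothing about any regex engine. It assumes only that the subroutine it is handed takes a target string and a start position and returns the (start, end) offsets of the next separator, or None. That calling convention is an assumption made for this example, not the convention of PGE rule subroutines.

```python
import re


def split_with(matcher, target):
    """Split target using any matcher(target, pos) -> (start, end) or None."""
    pieces = []
    pos = 0
    while pos <= len(target):
        found = matcher(target, pos)
        if found is None:
            break
        start, end = found
        if end <= start:
            # Zero-width separator: stop rather than loop forever.
            break
        pieces.append(target[pos:start])
        pos = end
    pieces.append(target[pos:])
    return pieces


def constant_matcher(sep):
    # Hand-written matcher with no regex engine involved.
    def match(target, pos):
        start = target.find(sep, pos)
        if start < 0:
            return None
        return (start, start + len(sep))
    return match


def regex_matcher(pattern):
    # Matcher backed by a regex engine (Python's re, standing in for PGE).
    compiled = re.compile(pattern)

    def match(target, pos):
        m = compiled.search(target, pos)
        return None if m is None else m.span()
    return match


if __name__ == "__main__":
    print(split_with(constant_matcher(", "), "a, b, c"))
    print(split_with(regex_matcher(r"\s*,\s*"), "a ,b,  c"))
```

Both calls go through the same split routine, so which engine does the matching is entirely the caller's choice.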

This raises the question: How far do we want to let the PGE into our everyday lives?

- James


