(proto RFC possibly, and some generalised ramblings)

Given that expansion of regexes could include (+...) and (*...) I have been
thinking about providing a general purpose way of adding functionality. 

I propose that the entire (+...) syntax is kept free from formal
specification for this and is available for pluggable (module) expansion. 
(+ = addition).

A module or anything that wants to support some enhanced syntax
registers something that handles "regex enhancements".

At regex compile time, whenever perl finds (+foo) it calls each of the
registered regex enhancements in turn.  For each one:

1) Are passed the foo string as a parameter exactly as is.  (There is
an issue of actually finding the end of the foo.)

2) The regex enhancement can either recognise the content or not.

3) If not, it returns undef and perl goes on to the next regex enhancement.
(Does it handle the enhancements as a stack (last checked first) or a list
(first checked first)?  How are they scoped?  Job here for the OO fanatics.)

4) If perl runs out of regex enhancements it reports an error.  

5) If an enhancement recognises the content it could do either of:

a) Return a replacement regex, expanded to use only existing capabilities;
perl will then pass this back through the regex compiler.

b) Return a coderef that is called at run time when the regex gets to this
point.  The referenced code needs enough access to the regex internals to
be able to see the current sub-expression, request more characters, and
have access to the relevant flags and visibility of greediness.  It may
also need a second coderef that is similarly called when the regex is being
unwound as it backtracks.
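
To make steps 1 to 5a concrete, here is a rough sketch of the dispatch
loop.  It is written in Python purely as neutral pseudocode; every name in
it (register_enhancement, find_group_end, expand, the "octet" and "ipv4"
handlers) is made up for illustration and is not a proposed perl API.  It
assumes stack order (last registered, first checked), finds the end of foo
by counting nested parentheses and skipping backslash escapes, and only
covers case a.  It does not try to cope with parentheses inside character
classes, which is part of the "finding the end of the foo" issue.

```python
import re

# Hypothetical registry of "regex enhancements" (illustrative only).
_enhancements = []

def register_enhancement(handler):
    """Register a handler; the newest handler is checked first (stack order)."""
    _enhancements.append(handler)

def find_group_end(pattern, start):
    """Index of the ')' closing the '(' at pattern[start].

    Counts nested parentheses and skips backslash escapes.
    Assumption: no unescaped parentheses inside character classes.
    """
    depth = 0
    i = start
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2
            continue
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return i
        i += 1
    raise ValueError("unterminated (+...) group")

def expand(pattern):
    """Replace every (+foo) with whatever the first willing handler returns."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern[i] == "\\":
            out.append(pattern[i:i + 2])   # copy escapes untouched
            i += 2
        elif pattern.startswith("(+", i):
            end = find_group_end(pattern, i)
            body = pattern[i + 2:end]      # step 1: foo, exactly as written
            for handler in reversed(_enhancements):
                replacement = handler(body)
                if replacement is not None:            # step 2: recognised
                    out.append(expand(replacement))    # step 5a: recompile
                    break
            else:
                # step 4: ran out of enhancements
                raise ValueError("no enhancement recognises (+%s)" % body)
            i = end + 1
        else:
            out.append(pattern[i])
            i += 1
    return "".join(out)

def octet(body):
    # Step 3: return None (undef) for anything we do not recognise.
    return r"(?:25[0-5]|2[0-4]\d|1?\d?\d)" if body == "octet" else None

def ipv4(body):
    # The replacement may itself use (+...); it is expanded again.
    return r"(+octet)(?:\.(+octet)){3}" if body == "ipv4" else None

register_enhancement(octet)
register_enhancement(ipv4)

dotted_quad = re.compile(expand(r"^(+ipv4)$"))
```

Because the replacement goes back through expand(), one enhancement can be
built on top of another, as ipv4 is built on octet here.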


Thinking from that, the last case should be generalised (it is sort of
like my (?*{...}) from RFC 198).  If so, cases a and b are the same:
b is just a case of returning (?*{...}).

Following on, if (?{...}) and similar code is evaluated on a forward match,
it would be a good idea to likewise support a code block that is ignored on
the forward match but is executed when the match is unwound due to
backtracking.  Thus (?{ foo })(?\{ bar }) could be defined to execute foo in
the forward case and bar if it unwinds.

For example, think about foo putting something on a stack (e.g. the
bracket to match [RFC 145]) and bar taking it off.
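
A toy illustration of the forward/unwind pairing, again in Python as
pseudocode.  This is a deliberately tiny backtracking matcher, not how the
perl regex engine works, and the names (match, push, the "lit"/"alt"/"code"
tuples) are invented for the sketch.  A "code" item runs its forward half
when the match passes over it and its unwind half only if the rest of the
match fails and the matcher backs out through that point.

```python
def match(items, text, pos, stack):
    """True if items match text[pos:] to the end, backtracking as needed."""
    if not items:
        return pos == len(text)
    head, rest = items[0], items[1:]
    kind = head[0]
    if kind == "lit":
        s = head[1]
        return text.startswith(s, pos) and match(rest, text, pos + len(s), stack)
    if kind == "alt":
        # Try each branch in order; a failed branch has already been unwound.
        return any(match(branch + rest, text, pos, stack) for branch in head[1])
    if kind == "code":
        _, forward, unwind = head
        forward(stack)                 # the (?{ foo }) half
        if match(rest, text, pos, stack):
            return True
        unwind(stack)                  # the (?\{ bar }) half
        return False
    raise ValueError("unknown item kind: %r" % (kind,))

def push(value):
    """A code item whose forward half pushes value and whose unwind half pops it."""
    return ("code", lambda st: st.append(value), lambda st: st.pop())

# Roughly (?: a <push A> | ab <push B> ) c
pattern = [
    ("alt", [
        [("lit", "a"), push("A")],
        [("lit", "ab"), push("B")],
    ]),
    ("lit", "c"),
]

stack = []
matched = match(pattern, "abc", 0, stack)
# The first branch pushes "A", fails to find "c", and pops "A" on the way
# back out, so only "B" is left on the stack after the successful match.
```

Without the unwind half the stack would end up holding both "A" and "B",
which is exactly the stale state the paired block is meant to prevent.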

I don't care at the moment what the syntax is - what about the concepts?

Richard