Hello everyone,

Thank you all for your comments! Those are some very useful ideas.

I think I'll try roger's (private) and ChrisK's suggestion first: using the match groups. I'm not sure if the match groups inside the individual regexes will cause much trouble, but we'll see. I imagine I'll have to count parentheses, except when it's followed by a \, except when that \ follows another \, etc. There's probably other situations where a () doesn't count as a group, perhaps when it's followed by a * or +. I'll look into that.

If that doesn't work out I'll go for Neil's (from an algorithmic POV beautiful) suggestion.

While I understand that some of you suggest I use parsec (or some other mature parser library) I'm pretty sure that's not what I want here. The patterns will almost always be very simple and regular expressions offer an extremely concise way of expressing when a hook should fire. Forcing the user to use full parsers would cause the programs to become much more verbose. Still, Yogurt is flexible enough to allow the user to use parsec if he or she so chooses.

Thanks again,

Martijn.



Mitchell, Neil wrote:
Hi Martijn,

It's not that tricky if you do a regular expression state machine
yourself, but that's probably a bit too much work. One way to get a
speed up might be to take the regular expressions a,b,c,d and
generate a regex a+b+c+d, and one a+b. You can then check any string
s against a+b+c+d, if that matches check a+b, if that matches check
a. At each stage you eliminate half the regular expressions, which
means a match will take log n, where n is the number of regular
expressions.

This assumes the underlying regular expression engine constructs a
finite state machine, making it O(m) where m is the length of the
string to match.

Thanks

Neil
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to