Hi everybody,

I've been studying parts of the WWWOFFLE source code for the last few weeks. I
greatly admire the overall design of WWWOFFLE, but I think some of the details
could be improved.
One detail I'd like to discuss here is wildcard pattern matching.
I was a bit annoyed by the 2 '*' limit imposed by WWWOFFLE, so I had a look at
the code to see if it could be improved. I thought up an algorithm that was more
general (the number of *s allowed is practically unlimited), more compact and
somewhat faster. Actually using this rewritten pattern matching function
was a bit more involved than I had anticipated, because I used a
different representation of wildcard patterns (not one simple string but a
sequence of strings). I had to make changes in all the places where wildcard
patterns are used, but I eventually managed to pull it off.
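
To illustrate the "sequence of strings" idea, here is a minimal sketch in C. It
is not the code from the patch and not WWWOFFLE's own code; the names
WildPattern, wild_compile, wild_match and wild_free are made up for this
example. The sketch assumes that '*' is the only wildcard character, that the
first piece must match at the start of the string, that the last piece must
match at the end, and that each piece in between is searched for from left to
right.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical representation: a wildcard pattern split at every '*'
   into literal pieces, e.g. "a*b*c" becomes {"a","b","c"} with n = 3. */
typedef struct {
    char **piece;   /* the literal pieces between the '*'s */
    size_t n;       /* number of pieces = number of '*'s + 1 */
} WildPattern;

/* Free a pattern created by wild_compile(). */
void wild_free(WildPattern *wp)
{
    size_t i;
    if (!wp) return;
    for (i = 0; i < wp->n; i++) free(wp->piece[i]);
    free(wp->piece);
    free(wp);
}

/* Split pat at each '*'. Returns NULL if memory runs out. */
WildPattern *wild_compile(const char *pat)
{
    WildPattern *wp = malloc(sizeof *wp);
    size_t i, stars = 0;
    const char *p;

    if (!wp) return NULL;
    for (p = pat; *p; p++)
        if (*p == '*') stars++;
    wp->n = stars + 1;
    wp->piece = malloc(wp->n * sizeof *wp->piece);
    if (!wp->piece) { free(wp); return NULL; }

    for (i = 0; i < wp->n; i++) {
        size_t len = strcspn(pat, "*");   /* length of the next piece */
        wp->piece[i] = malloc(len + 1);
        if (!wp->piece[i]) { wp->n = i; wild_free(wp); return NULL; }
        memcpy(wp->piece[i], pat, len);
        wp->piece[i][len] = '\0';
        pat += len;
        if (*pat == '*') pat++;           /* skip the separator */
    }
    return wp;
}

/* Return 1 if s matches the pattern, 0 otherwise. */
int wild_match(const WildPattern *wp, const char *s)
{
    size_t i, len, slen;

    /* The first piece is anchored at the start of the string. */
    len = strlen(wp->piece[0]);
    if (strncmp(s, wp->piece[0], len) != 0) return 0;
    if (wp->n == 1) return s[len] == '\0';   /* no '*' at all: exact match */
    s += len;

    /* Each middle piece is taken at its leftmost occurrence. */
    for (i = 1; i + 1 < wp->n; i++) {
        const char *hit = strstr(s, wp->piece[i]);
        if (!hit) return 0;
        s = hit + strlen(wp->piece[i]);
    }

    /* The last piece is anchored at the end of what remains. */
    len = strlen(wp->piece[wp->n - 1]);
    slen = strlen(s);
    return slen >= len && strcmp(s + slen - len, wp->piece[wp->n - 1]) == 0;
}
```

A caller would compile each pattern once when the configuration file is read,
call wild_match() for every URL to be tested, and release the pattern with
wild_free() afterwards. Taking the leftmost occurrence of each middle piece
leaves the largest possible remainder for the following pieces, so no
backtracking is needed, and the number of '*'s is limited only by memory.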

My version still tries every pattern in sequence, just as the original WWWOFFLE
code does. To do truly fast pattern matching you would probably need to translate a
sequence of patterns into a finite state machine, but I doubt this is worth the
effort.

Nevertheless, I think my version will still be interesting for people with a
large number of patterns in their configuration file (particularly the DontGet
section), or who find the 2 '*' limit annoying.

If someone would like to try out my version, just ask me nicely and I will send
you a patch file.


Paul A. Rombouts <[EMAIL PROTECTED]>
Vincent van Goghlaan 27
5246 GA  Rosmalen
Netherlands
