> From: Sascha Schumann [mailto:[EMAIL PROTECTED]]
> On Wed, 5 Sep 2001, Sander Striker wrote:
>
>>> I'm not totally sure I'm sold on this approach being better. But,
>>> I'm not sure that it is any worse either. Don't have time to
>>> benchmark this right now. I'm going to throw it to the wolves and
>>> see what you think.
>>
>> Me neither. Rabin-Karp introduces a lot of * and %.
>> I'll try Boyer-Moore with precalced tables for '<!--#' and '--->'.
>
> Well, there are more advanced algorithms than BM available
> today which are even easier to implement that the original BM
> algo.
Easier to implement? BM is so simple, I can't imagine something
being simpler.
> I'd suggest looking at BNDM which combines the advantages of
> bit-parallelism (shift-and/-or algorithms) and suffix
> automata (BM-style).
>
> http://citeseer.nj.nec.com/navarro01fast.html
>
> To give you an idea on how a bndm implementation looks like,
> I'm appending an unpolished implementation I did some time
> ago which includes a test-case for locating '<!--#'.
Indeed the LOC is low. Very cool! Considering that we can
precal a bndm_t for '<!--#' and '--->'.
It is certainly more advanced.
> - Sascha Experience IRCG
> http://schumann.cx/ http://schumann.cx/ircg
Ian, since you were doing the mod_include.c optimizations anyway,
can you check this out?
Btw, do we have any string matching code in apr or apr-util? I
can see the applicability of decent string matchers in other parts
than mod_include.
Sander