> From: Sascha Schumann [mailto:[EMAIL PROTECTED] > On Wed, 5 Sep 2001, Sander Striker wrote: > >>> I'm not totally sure I'm sold on this approach being better. But, >>> I'm not sure that it is any worse either. Don't have time to >>> benchmark this right now. I'm going to throw it to the wolves and >>> see what you think. >> >> Me neither. Rabin-Karp introduces a lot of * and %. >> I'll try Boyer-Moore with precalced tables for '<!--#' and '--->'. > > Well, there are more advanced algorithms than BM available > today which are even easier to implement that the original BM > algo.
Easier to implement? BM is so simple, I can't imagine something being simpler. > I'd suggest looking at BNDM which combines the advantages of > bit-parallelism (shift-and/-or algorithms) and suffix > automata (BM-style). > > http://citeseer.nj.nec.com/navarro01fast.html > > To give you an idea on how a bndm implementation looks like, > I'm appending an unpolished implementation I did some time > ago which includes a test-case for locating '<!--#'. Indeed the LOC is low. Very cool! Considering that we can precal a bndm_t for '<!--#' and '--->'. It is certainly more advanced. > - Sascha Experience IRCG > http://schumann.cx/ http://schumann.cx/ircg Ian, since you were doing the mod_include.c optimizations anyway, can you check this out? Btw, do we have any string matching code in apr or apr-util? I can see the applicability of decent string matchers in other parts than mod_include. Sander