> By the way, what CPU did you test on?
A 64-bit dual-core Core 2 CPU with 3 MB of cache (per CPU).
So it's a fairly modern CPU, and gcc is compiling for 64-bit
targets.
Simply reading the whole string one long at a time takes 1.2 seconds
(about 6 times faster).
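
For reference, a pure read pass of that kind could look roughly like the sketch below. This is not the code that was timed; the name xor_words and the XOR accumulator are assumptions made here so that the compiler cannot discard the loads.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: touch every byte of buf by loading one
 * unsigned long at a time. The words are XOR-ed together only so
 * that the loads have an observable result. */
unsigned long xor_words(const char *buf, size_t len)
{
    unsigned long acc = 0, word;
    size_t i = 0;

    /* memcpy keeps the load well-defined for unaligned buffers;
     * compilers typically turn it into a single load. */
    for (; i + sizeof word <= len; i += sizeof word) {
        memcpy(&word, buf + i, sizeof word);
        acc ^= word;
    }
    /* Fold in the tail bytes that do not fill a whole word. */
    for (; i < len; i++)
        acc ^= (unsigned char)buf[i];
    return acc;
}
```

A loop like this does no comparisons at all, so it only gives a rough lower bound on how fast any search over the same buffer can run.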
So there are optimization possibilities. But Pike's default search is
not really one of them.
memmem is just as fast as my simple loop, but shorter. :-)
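/* i indexes the haystack, j counts needle bytes matched so far. */
for( i=0,j=0; i<hlen; i++ )
{
  /* Tell gcc that a matching byte is the unlikely case. */
  if( __builtin_expect(haystack[i] == needle[j], 0) ) {
    j++;
    if( j == nlen ) break;  /* whole needle matched, ending at i */
  }
  else
    j = 0;  /* NB: restarts without backing up i, so a match that
               overlaps a failed partial match is missed */
}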
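
One caveat with the loop as posted: on a mismatch it resets j but keeps i where it is, so a match that starts inside a failed partial match is skipped (searching for "ab" in "aab" finds nothing). That is fine for a rough timing, but not for real use. A minimal corrected sketch is below. The name naive_search and the pointer return value are assumptions made here; the argument order just mirrors memmem. __builtin_expect is a gcc/clang builtin.

```c
#include <stddef.h>

/* Hypothetical name. Returns a pointer to the first occurrence of
 * needle in haystack, or NULL if there is none. */
const char *naive_search(const char *haystack, size_t hlen,
                         const char *needle, size_t nlen)
{
    size_t i, j;

    if (nlen == 0)
        return haystack;   /* empty needle matches at the start */
    if (nlen > hlen)
        return NULL;

    for (i = 0, j = 0; i < hlen; i++) {
        if (__builtin_expect(haystack[i] == needle[j], 0)) {
            if (++j == nlen)
                return haystack + (i + 1 - nlen);
        } else {
            i -= j;   /* rewind to the start of the failed attempt; */
            j = 0;    /* the loop's i++ then moves one byte past it */
        }
    }
    return NULL;
}
```

The rewind makes the worst case O(hlen * nlen), which is the usual price of the naive approach. On glibc, memmem takes the same four arguments in the same order (it needs _GNU_SOURCE defined before including string.h), so the two can be swapped in a benchmark without other changes.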