Re: faster splitter

qznc via Digitalmars-d Tue, 24 May 2016 00:57:12 -0700

On Monday, 23 May 2016 at 22:19:18 UTC, Andrei Alexandrescu wrote:

On 05/23/2016 03:11 PM, qznc wrote:
Actually, std find should be faster, since it could use the Boyer-Moore
algorithm instead of naive string matching.
Conventional wisdom has it that find() is brute force and that's that, but probably it's time to destroy. Selectively using advanced searching algorithms for the appropriate inputs is very DbI-ish.
There are a few nice precedents of blend algorithms, see e.g. http://effbot.org/zone/stringlib.htm.
Writing a generic subsequence search blend algorithm, one that chooses the right algorithm based on a combination of static and dynamic decisions, is quite a fun and challenging project. Who wanna?

I observed that Boyer-Moore from std is still slower. My guess is that this is due to the relatively short strings in my benchmark and the need to allocate for the tables. Knuth-Morris-Pratt might be worthwhile, because it only requires a smaller table, which could be allocated on the stack.
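
To illustrate the stack-allocated table idea, here is a minimal sketch of Knuth-Morris-Pratt in D. It is not a Phobos API: the names kmpFind and maxNeedle are hypothetical, the cap of 64 on the needle length is an arbitrary assumption, and it works on code units of char arrays only rather than on generic ranges.

```d
// Hypothetical sketch, not part of Phobos: Knuth-Morris-Pratt search whose
// failure table lives in a fixed-size stack buffer, so nothing is allocated.
enum maxNeedle = 64; // assumed cap on the needle length for the stack table

const(char)[] kmpFind(const(char)[] haystack, const(char)[] needle)
{
    assert(needle.length <= maxNeedle, "needle too long for the stack table");
    if (needle.length == 0)
        return haystack;

    // fail[i] = length of the longest proper prefix of needle[0 .. i + 1]
    // that is also a suffix of it.
    size_t[maxNeedle] fail; // static array on the stack, initialized to 0
    size_t k = 0;
    foreach (i; 1 .. needle.length)
    {
        while (k > 0 && needle[i] != needle[k])
            k = fail[k - 1];
        if (needle[i] == needle[k])
            ++k;
        fail[i] = k;
    }

    k = 0; // number of needle characters currently matched
    foreach (i; 0 .. haystack.length)
    {
        while (k > 0 && haystack[i] != needle[k])
            k = fail[k - 1];
        if (haystack[i] == needle[k])
            ++k;
        if (k == needle.length)
            return haystack[i + 1 - k .. $]; // slice starting at the match
    }
    return haystack[$ .. $]; // empty slice: no match
}
```

Like std find, this returns the remainder of the haystack starting at the first match, or an empty slice if there is none. A real implementation would need a fallback for needles longer than the stack buffer.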

The overall sentiment seems to be that KMP vs BM depends on the input. This means an uninformed find function could only use heuristics. Anyway, Phobos could use a KnuthMorrisPrattFinder implementation next to the BoyerMooreFinder. I filed an issue [0].

Apart from advanced algorithms, find should not be slower than a naive nested-for-loop implementation.
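
For reference, the baseline meant here could look roughly like the following sketch. The name naiveFind is hypothetical, and it only handles char arrays by code unit, which is an assumption made to keep the example short.

```d
// Hypothetical baseline, not part of Phobos: brute-force subsequence search
// with two nested loops. Returns the slice of haystack starting at the first
// match, or an empty slice if needle does not occur.
const(char)[] naiveFind(const(char)[] haystack, const(char)[] needle)
{
    if (needle.length > haystack.length)
        return haystack[$ .. $];

    // Last start position at which the needle still fits.
    immutable last = haystack.length - needle.length;

    outer: foreach (i; 0 .. last + 1)
    {
        foreach (j; 0 .. needle.length)
        {
            if (haystack[i + j] != needle[j])
                continue outer; // mismatch: try the next start position
        }
        return haystack[i .. $]; // every needle character matched
    }
    return haystack[$ .. $];
}
```

Its worst case is O(n*m), but for short needles there is no setup cost at all, which is why it makes a fair lower bar for find.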


[0] https://issues.dlang.org/show_bug.cgi?id=16066
