Hello bearophile,
BCS:
Have you compared it to a decisition tree or lex style state mechine?
I have not, I am sorry :-) But more comparisons can be added.
I know what decision trees are in data mining, but I presume you mean
some kind of ternary tree (3 possible results of the string cmp for
each char compared).
Bye,
bearophile
I'm not sure how I'd set it up but I expect it would amount to a hard coded
radex sort:
- do a normal integer switch on string length
- for each length group the strings based on common prefixes (if you have
aaab, aaac, aabb, bbbb the grouping would be: [[aaab, aaac], aabb], [bbbb])
- walk down the tree using if statements.
I guess it's not much different than a lex DFA but for a small list of static
strings it might be faster. Also it has the option of using strcmp for tails
and long sub spans.