On Friday, 9 January 2015 at 14:03:21 UTC, ketmar via
Digitalmars-d-learn wrote:
On Fri, 09 Jan 2015 13:54:00 +0000
Robert burner Schadek via Digitalmars-d-learn
<digitalmars-d-learn@puremagic.com> wrote:
On Friday, 9 January 2015 at 13:25:17 UTC, ketmar via
Digitalmars-d-learn wrote:
> if you are *really* concerned with speed here, you'd better consider
> using regular expressions, as a regular expression can be precompiled
> and then search for multiple words with only one pass over the source
> string. i believe that std.regex will use a variation of Thompson's
> algorithm for regular expressions when it is able to do so.
IMO that is not sound advice. Creating the state machine and
running it will be more costly than using canFind or indexOf, which
basically only compare char by char.
If speed is really needed, use strstr and check whether it uses SSE to
compare multiple chars at a time. Anyway, benchmark and then
benchmark some more.
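For reference, a minimal sketch of what calling strstr from D could look like. The helper name containsViaStrstr is made up for illustration, and the toStringz conversions may allocate a NUL-terminated copy, which can eat into whatever the C routine gains. Whether a given libc's strstr uses SSE is not something this sketch shows.

```d
import core.stdc.string : strstr;
import std.string : toStringz;

// Hypothetical helper (not from the thread): true if needle occurs
// in haystack, delegating the search to C's strstr.
bool containsViaStrstr(string haystack, string needle)
{
    // toStringz yields NUL-terminated C strings and may allocate a copy.
    // Strings with embedded NUL characters would be cut short here.
    return strstr(haystack.toStringz, needle.toStringz) !is null;
}

void main()
{
    assert(containsViaStrstr("He is in the sea.", "sea"));
    assert(!containsViaStrstr("He is in the sea.", "plane"));
}
```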
std.regex can use CTFE to compile regular expressions (yet it is
sometimes slower than the non-CTFE variant), and i mean that we compile
the regexp before doing a lot of searches, not before each single
search. if you have a lot of words to match or a lot of strings to
check, regexp can give a huge boost.
sure, it all depends on code patterns.
import std.regex;
auto ctr = ctRegex!(`(home|office|sea|plane)`);
auto c2 = !matchFirst("He is in the sea.", ctr).empty;
----------------------------------------------------------
Tested with: auto r = benchmark!(f0, f1, f2, f3, f4, f5)(10_0000);
Results:
filter is 42ms 85us
findAmong is 37ms 268us
foreach indexOf is 37ms 841us
canFind is 13ms
canFind indexOf is 39ms 455us
ctRegex is 138ms
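
The functions f0 to f5 are not shown above, so here is a rough sketch of how two of the variants might be timed. Everything in it is an assumption: the names viaCanFind and viaCtRegex, the word list, and the use of std.datetime.stopwatch (the current home of benchmark; in 2015 it lived in std.datetime). It is not the code that produced the numbers above.

```d
import std.algorithm.searching : canFind;
import std.datetime.stopwatch : benchmark;
import std.regex : ctRegex, matchFirst;
import std.stdio : writefln;

// Hypothetical inputs, modelled on the regex example above.
string[] words = ["home", "office", "sea", "plane"];
string sentence = "He is in the sea.";

// Plain substring search: one canFind call per word.
bool viaCanFind()
{
    foreach (w; words)
        if (sentence.canFind(w))
            return true;
    return false;
}

// Single pass over the sentence with a regex built at compile time.
bool viaCtRegex()
{
    auto re = ctRegex!(`home|office|sea|plane`);
    return !matchFirst(sentence, re).empty;
}

void main()
{
    // Both approaches must agree before the timings mean anything.
    assert(viaCanFind() && viaCtRegex());

    // Run each function 100_000 times and print the total duration.
    auto r = benchmark!(viaCanFind, viaCtRegex)(100_000);
    writefln("canFind: %s", r[0]);
    writefln("ctRegex: %s", r[1]);
}
```

Timings from a sketch like this depend heavily on compiler flags and input size. With a single short sentence the regex setup overhead dominates, so it says little about the many-words, many-strings case discussed earlier in the thread.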