On Fri, 09 Jan 2015 15:36:21 +0000 FrankLike via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com> wrote:

> On Friday, 9 January 2015 at 14:03:21 UTC, ketmar via
> Digitalmars-d-learn wrote:
> > On Fri, 09 Jan 2015 13:54:00 +0000
> > Robert burner Schadek via Digitalmars-d-learn
> > <digitalmars-d-learn@puremagic.com> wrote:
> >
> >> On Friday, 9 January 2015 at 13:25:17 UTC, ketmar via
> >> Digitalmars-d-learn wrote:
> >> > if you *really* concerned with speed here, you'd better
> >> > consider using regular expressions. as regular expression can
> >> > be precompiled and then search for multiple words with only
> >> > one pass over the source string. i believe that std.regex will
> >> > use variation of Thomson algorithm for regular expressions
> >> > when it is able to do so.
> >>
> >> IMO that is not sound advice. Creating the state machine and
> >> running will be more costly than using canFind or indexOf how
> >> basically only compare char by char.
> >>
> >> If speed is really need use strstr and look if it uses sse to
> >> compare multiple chars at a time. Anyway benchmark and then
> >> benchmark some more.
> > std.regex can use CTFE to compile regular expressions (yet it
> > sometimes slower than non-CTFE variant), and i mean that we
> > compile regexp before doing alot of searches, not before each
> > single search. if you have alot of words to match or alot of
> > strings to check, regexp can give a huge boost.
> >
> > sure, it all depends of code patterns.
> import std.regex;
> auto ctr = ctRegex!(`(home|office|sea|plane)`);
> auto c2 = !matchFirst("He is in the sea.", ctr).empty;
> ----------------------------------------------------------
> Test by auto r = benchmark!(f0,f1, f2, f3,f4,f5)(10_0000);
>
> Result is :
> filter is 42ms 85us
> findAmong is 37ms 268us
> foreach indexOf is 37ms 841us
> canFind is 13ms
> canFind indexOf is 39ms 455us
> ctRegex is 138ms

1. stop doing captures in the regexp, this will speed up the
   comparison.
2. your sample is very artificial. i was talking about a lot more
   keywords and a lot longer strings. sorry, i didn't make that clear
   enough.
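
to illustrate point 1, here is a minimal sketch of the same pattern with a non-capturing group, assuming the four keywords from the quoted sample. the helper name `mentionsPlace` and the second test string are made up for illustration and are not part of the original benchmark, and no timing is claimed for it:

```d
import std.regex;
import std.stdio;

// Hypothetical helper: true if any of the four keywords occurs in `text`.
bool mentionsPlace(string text)
{
    // (?:...) groups the alternatives without recording a capture.
    // ctRegex builds the matcher at compile time, so the pattern is
    // not recompiled on each call.
    auto re = ctRegex!(`(?:home|office|sea|plane)`);
    return !matchFirst(text, re).empty;
}

void main()
{
    writeln(mentionsPlace("He is in the sea."));
    writeln(mentionsPlace("He is in the garden."));
}
```

whether this closes the gap to canFind on such a short string is something only a benchmark can tell; the precompiled-pattern approach is aimed at many keywords and long inputs.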