Re: regex issue

Dmitry Olshansky Tue, 20 Mar 2012 03:29:57 -0700

On 19.03.2012 23:24, Jay Norwood wrote:

On Monday, 19 March 2012 at 13:55:39 UTC, Dmitry Olshansky wrote:

That's right, however counting is completely separate from regex,
you'd want to use std.algorithm count:
count(match(....,"\n"));


or more unicode-friendly:
count(match(...., regex("$","m"))); //note the multi-line flag

Ehm, forgot "g" flag myself, so it would be

count(match(...., regex("$","gm")));

and

count(match(...., regex("\n","g")));
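
As a rough, self-contained sketch of the "g" variant above (the function name countNewlineMatches is made up for illustration, and the explicit std.algorithm.searching import assumes a reasonably recent Phobos layout):

```d
import std.algorithm.searching : count;
import std.regex : match, regex;

// Hypothetical helper: counts the matches of "\n" in the input.
// The "g" flag makes match() yield every match rather than
// stopping after the first one.
size_t countNewlineMatches(string input)
{
    return count(match(input, regex("\n", "g")));
}

void main()
{
    assert(countNewlineMatches("a\nb\nc\n") == 3);
    assert(countNewlineMatches("") == 0);
}
```

Dropping the "g" flag from the sketch should reproduce the behaviour described in this thread, where only the first match is counted.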

Note that if your task is to split the buffer on exactly the '\n' byte, then a loop with memchr is about as fast as it gets; no amount of magic compiler optimizations would make other generic ways better (even theoretically). What they *could* do is bring the difference lower.
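
A minimal sketch of such a memchr loop, assuming the buffer is already in memory and that only the raw '\n' byte matters (countNewlines is a hypothetical name, not something from the thread):

```d
import core.stdc.string : memchr;

// Hypothetical helper: counts '\n' bytes by calling C's memchr in a loop.
// Each call jumps straight to the next newline, so no per-character
// decoding happens on the D side.
size_t countNewlines(const(char)[] buf)
{
    size_t n = 0;
    const(char)* p = buf.ptr;
    const(char)* end = p + buf.length;

    while (p < end)
    {
        auto hit = cast(const(char)*) memchr(p, '\n', cast(size_t)(end - p));
        if (hit is null)
            break;      // no more newlines in the remaining bytes
        ++n;
        p = hit + 1;    // resume just past the newline that was found
    }
    return n;
}

void main()
{
    assert(countNewlines("a\nb\nc\n") == 3);
    assert(countNewlines("no newline") == 0);
}
```

This only looks for the single byte 0x0A, so it is not a substitute for the Unicode-aware "$" approach.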

This only sets l_cnt to 1

void wcp_cnt_match1 (string fn)
{
    string input = cast(string)std.file.read(fn);
    enum ctr = ctRegex!("$","m");
    ulong l_cnt = std.algorithm.count(match(input,ctr));
}

This works ok, but though concise it is not very fast

void wcp (string fn)
{
    string input = cast(string)std.file.read(fn);
    ulong l_cnt = std.algorithm.count(input,"\n");
}

BTW I suggest separating I/O from the actual work or, better yet, timing both separately via std.datetime.StopWatch.
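
One possible way to time the two phases separately is sketched below. It assumes a current Phobos, where StopWatch lives in std.datetime.stopwatch rather than directly in std.datetime; the Timing struct and timedLineCount are hypothetical names used only for this example.

```d
import core.time : Duration;
import std.algorithm.searching : count;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.file : read;

// Hypothetical result type: line count plus the time spent in each phase.
struct Timing
{
    size_t lines;
    Duration io;
    Duration work;
}

// Hypothetical helper: reads the file, then counts '\n', timing each step.
Timing timedLineCount(string fn)
{
    auto sw = StopWatch(AutoStart.yes);
    auto input = cast(string) read(fn);   // phase 1: I/O only
    immutable ioTime = sw.peek();

    sw.reset();                           // restart from zero, still running
    immutable lines = input.count('\n');  // phase 2: the actual counting
    immutable workTime = sw.peek();

    return Timing(lines, ioTime, workTime);
}
```

Printing t.io.total!"usecs" and t.work.total!"usecs" for a returned Timing t would then show how much of the run is file reading and how much is counting.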

This fails to build, so I'd guess is missing \p

void wcp (string fn)
{
    enum ctr = ctRegex!("\p{WhiteSpace}","m");
}

------ Build started: Project: a7, Configuration: Release Win32 ------
Building Release\a7.exe...
a7.d(210): undefined escape sequence \p


Not a bug: \p is being read as a compiler escape sequence in the string literal, before the regex engine ever sees it.
How do you think \n works in your non-regex examples ? ;)
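
To illustrate the point, here is a small sketch of how D's string literal forms treat the backslash. It checks only the literals themselves and does not compile the pattern as a regex:

```d
// In a double-quoted literal the compiler interprets escapes itself:
// "\n" becomes one newline character, and an unknown escape such as
// "\p" is rejected at compile time.
static assert("\n".length == 1);

// WYSIWYG (raw) literals leave the backslash alone, so the text
// reaches the regex engine intact.
enum raw      = r"\p{WhiteSpace}";
enum backtick = `\p{WhiteSpace}`;

// Doubling the backslash in a normal literal gives the same text.
enum escaped  = "\\p{WhiteSpace}";

static assert(raw == escaped);
static assert(backtick == escaped);
static assert(r"\n".length == 2);   // backslash + 'n', not a newline

void main() {}
```

Any of the three spellings could then be handed to regex or ctRegex in place of the literal that failed to build.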


--
Dmitry Olshansky
