On 19.03.2012 23:24, Jay Norwood wrote:
On Monday, 19 March 2012 at 13:55:39 UTC, Dmitry Olshansky wrote:
That's right, however counting is completely separate from regex,
you'd want to use std.algorithm count:
count(match(....,"\n"));

or more unicode-friendly:
count(match(...., regex("$","m"))); //note the multi-line flag

Ehm, I forgot the "g" flag myself, so it should be

count(match(...., regex("$","gm")));

and

count(match(...., regex("\n","g")));
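
To see the effect of the "g" flag in isolation, here is a minimal sketch. The helper name countMatches is made up for illustration and is not from the thread; it only wraps the count(match(...)) call shown above.

```d
import std.algorithm : count;
import std.regex : match, regex;

// Hypothetical helper (not from the thread): counts the matches of
// `pattern` in `input`, passing the regex flags through unchanged.
size_t countMatches(string input, string pattern, string flags)
{
    // With "g", match() yields every match in the input; without it,
    // the range stops after the first match, so the count tops out at 1.
    return count(match(input, regex(pattern, flags)));
}
```

Calling countMatches(text, "\n", "g") should count every newline in text, while dropping the "g" reproduces the "only sets l_cnt to 1" symptom.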

Note that if your task is to split the buffer on exactly the '\n' byte, then a loop with memchr is about as fast as it gets; no amount of magic compiler optimization would make other generic approaches better (even theoretically). What optimizations *could* do is narrow the difference.
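
A rough sketch of such a memchr loop, assuming the whole file is already in memory as a char slice. The function name countNewlines is hypothetical; memchr itself comes from the C runtime binding in core.stdc.string.

```d
import core.stdc.string : memchr;

// Hypothetical helper: counts '\n' bytes by letting memchr jump
// from one newline to the next instead of testing each byte in D code.
size_t countNewlines(const(char)[] buf)
{
    size_t count = 0;
    auto p = buf.ptr;
    auto end = buf.ptr + buf.length;
    while (p < end)
    {
        // Search the remaining bytes [p, end) for the next '\n'.
        auto hit = cast(const(char)*) memchr(p, '\n', end - p);
        if (hit is null)
            break;          // no more newlines
        ++count;
        p = hit + 1;        // continue just past the newline found
    }
    return count;
}
```

This only looks for the single byte '\n', so it says nothing about other Unicode line separators; that is the trade-off against the regex("$","gm") variant.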

This only sets l_cnt to 1

void wcp_cnt_match1 (string fn)
{
    string input = cast(string)std.file.read(fn);
    enum ctr = ctRegex!("$","m");
    ulong l_cnt = std.algorithm.count(match(input,ctr));
}

This works OK, but although it is concise, it is not very fast

void wcp (string fn)
{
    string input = cast(string)std.file.read(fn);
    ulong l_cnt = std.algorithm.count(input,"\n");
}



BTW I suggest separating the I/O from the actual work or, better yet, timing both separately via std.datetime.StopWatch.
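
Something along these lines would time the two phases separately. This is only a sketch: it assumes a newer Phobos where StopWatch lives in std.datetime.stopwatch (the std.datetime.StopWatch named above is the older API), and the function name timedLineCount is made up.

```d
import std.algorithm : count;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.file : read;
import std.stdio : writefln;

// Hypothetical helper: reads the file, counts "\n", and reports
// how long the I/O and the counting each took.
size_t timedLineCount(string fn)
{
    auto sw = StopWatch(AutoStart.yes);
    auto input = cast(string) read(fn);   // phase 1: I/O only
    auto ioTime = sw.peek();

    sw.reset();                           // restart timing; the watch keeps running
    auto lines = count(input, "\n");      // phase 2: the actual work
    auto workTime = sw.peek();

    writefln("lines: %s, read: %s, count: %s", lines, ioTime, workTime);
    return lines;
}
```

With the read time split out, it is easier to tell whether a slow run comes from the disk or from the counting method.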

This fails to build, so I'd guess support for \p is missing

void wcp (string fn)
{
    enum ctr = ctRegex!("\p{WhiteSpace}","m");
}

------ Build started: Project: a7, Configuration: Release Win32 ------
Building Release\a7.exe...
a7.d(210): undefined escape sequence \p


Not a bug: the compiler treats \p as an escape sequence in the string literal, before the regex engine ever sees the pattern.
How do you think \n works in your non-regex examples? ;)
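
To make the point concrete, here is a small sketch of how the same pattern text can be spelled so that the backslash survives the compiler. The function name wsPattern is hypothetical; the string literal forms are standard D.

```d
// In a double-quoted literal the compiler interprets backslash escapes:
// "\n" is a single newline character, and "\p" is rejected as unknown.
static assert("\n".length == 1);

// A WYSIWYG (raw) literal passes the backslash through untouched.
static assert(r"\n".length == 2);

// Three equivalent spellings of the pattern text from the failing example.
static assert(r"\p{WhiteSpace}" == "\\p{WhiteSpace}");
static assert(`\p{WhiteSpace}` == "\\p{WhiteSpace}");

// Hypothetical helper: returns the pattern with its backslash intact,
// in a form that can then be handed on to regex or ctRegex.
string wsPattern()
{
    return r"\p{WhiteSpace}";
}
```

So in the non-regex examples "\n" is already a newline by the time count sees it, while a regex pattern that needs a literal backslash has to be written raw or with the backslash doubled.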


--
Dmitry Olshansky
