On Tuesday, 11 October 2016 at 08:17:52 UTC, Stefan Koch wrote:
On Tuesday, 11 October 2016 at 08:03:40 UTC, Stefan Koch wrote:
On Tuesday, 11 October 2016 at 07:30:26 UTC, Matthias Bentrup wrote:

A branch-free version:

void popFront4(ref char[] s) @trusted pure nothrow {
  immutable c = s[0];
  // UTF-8 lead bytes: 192..223 start 2 bytes, 224..239 start 3, 240..247 start 4.
  uint char_length = 1 + (c >= 192) + (c >= 224) + (c >= 240);
  s = s.ptr[char_length .. s.length];
}

Theoretically the char_length could be computed with three sub and addc instructions, but no compiler is smart enough to detect that.

You still need to special-case c < 128, as well as the follow chars (the continuation bytes, 128..191).

Also, smaller values of c are more common than the bigger ones, which makes the branching version faster on average.

Also, the code produces conditional-set instructions, which have higher latency and worse throughput.
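
For comparison, here is a minimal sketch of the branching shape argued for above. It is not code from this thread: the name popFrontBranchy is hypothetical, it assumes s is non-empty, and it simply advances by one byte on a stray continuation byte or an invalid lead byte.

```d
// Hypothetical sketch, not from the thread. Assumes s.length > 0.
void popFrontBranchy(ref char[] s) @trusted pure nothrow
{
    immutable uint c = s[0];
    if (c < 0x80)
    {
        // ASCII fast path: the common case costs one well-predicted branch.
        s = s.ptr[1 .. s.length];
        return;
    }
    // Lead bytes: 0xC0..0xDF -> 2, 0xE0..0xEF -> 3, 0xF0..0xF7 -> 4.
    // Continuation bytes (0x80..0xBF) and bytes >= 0xF8 advance by 1.
    size_t n = 1;
    if (c >= 0xC0 && c < 0xF8)
        n = 2 + (c >= 0xE0) + (c >= 0xF0);
    // Clamp so a truncated sequence never slices past the end.
    if (n > s.length)
        n = s.length;
    s = s.ptr[n .. s.length];
}
```

Whether this actually beats the branch-free version depends on how much of the input is ASCII, so it would need measuring on representative data.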

void popFront1(ref char[] s) @trusted pure nothrow
{
  import core.bitop, std.algorithm;
  auto v = bsr(~s[0] | 1);
  s = s[clamp(v, 1, v > 6 ? 1 : $)..$];
}

Seems to be less, if I'm not wrong.
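
For reference, a sketch of how bsr can turn a lead byte into a sequence length. This is not the code as posted above: leadByteLength is a hypothetical helper, and the explicit mask is an assumption made so that the result does not depend on whether ~ promotes a char to int.

```d
import core.bitop : bsr;

// Hypothetical helper, not from the thread.
uint leadByteLength(char c) @safe pure nothrow @nogc
{
    // Invert within 8 bits; the mask discards any bits above the byte.
    immutable uint inv = ~cast(uint) c & 0xFF;
    // OR with 1 keeps bsr's argument non-zero when c == 0xFF.
    // 7 - (index of highest set bit) is the number of leading one bits in c.
    immutable uint ones = 7 - bsr(inv | 1);
    // ones == 0: ASCII; 1: continuation byte; 2..4: lead-byte patterns;
    // more than 4: not a valid UTF-8 lead byte. Everything outside 2..4 maps to 1.
    return (ones >= 2 && ones <= 4) ? ones : 1;
}
```

The final select still compiles to a compare plus conditional move or branch on most targets, so this only illustrates the bit trick, not a claim about which variant is fastest.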
