On Tuesday, 11 October 2016 at 04:05:47 UTC, Stefan Koch wrote:
On Tuesday, 11 October 2016 at 03:58:59 UTC, Andrei Alexandrescu wrote:
On 10/10/16 11:00 PM, Stefan Koch wrote:
On Tuesday, 11 October 2016 at 02:48:22 UTC, Andrei Alexandrescu wrote:
[...]

If you want to skip a byte it's easy to do as well.

void popFront3(ref char[] s) @trusted pure nothrow {
   immutable c = s[0];
   uint char_length = 1;
   if (c < 128)
   {
   Lend :
     s = s.ptr[char_length .. s.length];
   } else {
     if ((c & 0b1100_0000) == 0b1000_0000)
     {
       // Just skip one byte in case this is not the beginning of a code point.
       goto Lend;
     }
     if (c < 224)
     {
       char_length = 2;
       goto Lend;
     }
     if (c < 240)
     {
       char_length = 3;
       goto Lend;
     }
     if (c < 248)
     {
       char_length = 4;
       goto Lend;
     }
   }
 }


Affirmative. That's identical to the code in "[ ... ]" :o). Generated code still does a jmp forward though. -- Andrei

It was not identical.
((c & 0b1100_0000) == 0b1000_0000)
can be true in all three of the following cases.
If we do not jump to the return at this point, we cannot guarantee that we will not skip over the next valid char, thereby corrupting already corrupt strings even more.
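
To make the point concrete, here is a minimal sketch of the continuation-byte test on a slice that starts in the middle of a code point. The helper name isContinuation and the sample string are assumptions made for illustration; they are not from the code above.

```d
// Hypothetical helper: true for bytes of the form 10xx_xxxx,
// i.e. bytes that can never start a code point.
bool isContinuation(char c) pure nothrow @nogc @safe
{
    return (c & 0b1100_0000) == 0b1000_0000;
}

void main()
{
    // "\u00E9b" is encoded as C3 A9 62. Dropping the first byte leaves
    // a slice that begins with the stray continuation byte A9.
    string s = "\u00E9b"[1 .. $];
    assert(isContinuation(s[0]));

    // Skipping exactly one byte resynchronises on the next valid char.
    assert(s[1 .. $] == "b");

    // Treating A9 as the start of a 2-byte sequence would swallow 'b' too.
    assert(s[2 .. $] == "");

    // ASCII bytes and lead bytes are not continuation bytes.
    assert(!isContinuation('b'));
    assert(!isContinuation(cast(char) 0xC3));
}
```

Since every continuation byte lies in the range 128 to 191, it also compares below each of the later thresholds, which is why the test has to exit early rather than fall through.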

For best performance we need to leave the gotos in there.

A branch-free version:

void popFront4(ref char[] s) @trusted pure nothrow {
  immutable c = s[0];
  uint char_length = 1 + (c >= 192) + (c >= 224) + (c >= 240);
  s = s.ptr[char_length .. s.length];
}

Theoretically the char_length could be computed with three sub and addc instructions, but no compiler is smart enough to detect that.
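
As a rough sanity check of the branch-free sum, the sketch below compares it against std.utf.stride for one sample of each sequence length. The helper name charLength is hypothetical, and the boundaries 192, 224 and 240 are the standard UTF-8 lead-byte ranges, assumed here rather than taken from the thread.

```d
import std.utf : stride;

// Hypothetical helper: each comparison contributes 0 or 1, so the sum
// is the sequence length implied by the lead byte.
uint charLength(char c) pure nothrow @nogc @safe
{
    return 1 + (c >= 192) + (c >= 224) + (c >= 240);
}

void main()
{
    // One sample per length: 'a' (1 byte), U+00E9 (2 bytes),
    // U+20AC (3 bytes), U+1F600 (4 bytes).
    foreach (s; ["a", "\u00E9", "\u20AC", "\U0001F600"])
    {
        assert(charLength(s[0]) == s.length);
        assert(charLength(s[0]) == stride(s, 0));
    }

    // A stray continuation byte is skipped on its own.
    assert(charLength(cast(char) 0xA9) == 1);
}
```

Note that invalid lead bytes (248 and above) come out as length 4 in this sketch; whether that is acceptable depends on how already corrupt input should be handled.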
