On Tuesday, 11 October 2016 at 03:58:59 UTC, Andrei Alexandrescu wrote:
On 10/10/16 11:00 PM, Stefan Koch wrote:
On Tuesday, 11 October 2016 at 02:48:22 UTC, Andrei Alexandrescu wrote:
That looks good. I'm just worried about the jump forward - ideally the case c < 127 would simply entail a quick return. I tried a fix, but it didn't do what I wanted in ldc. We shouldn't assert(0) if wrong - just skip one byte. Also, are we right to not worry about 5- and 6-byte sequences? The docs keep on threatening with them, and then immediately mention those are not valid.

[ ... ]

Andrei


If you want to skip a byte it's easy to do as well.

void popFront3(ref char[] s) @trusted pure nothrow {
   immutable c = s[0];
   uint char_length = 1;
   if (c < 127)
   {
   Lend :
     s = s.ptr[char_length .. s.length];
   } else {
     if ((c & 0b1100_0000) == 0b1000_0000)
     {
       // just skip one byte in case this is not the beginning of a code point
       goto Lend;
     }
     if (c < 224) // lead bytes 0xC0 .. 0xDF start a 2-byte sequence
     {
       char_length = 2;
       goto Lend;
     }
     if (c < 240)
     {
       char_length = 3;
       goto Lend;
     }
     if (c < 248)
     {
       char_length = 4;
       goto Lend;
     }
   }
 }


Affirmative. That's identical to the code in "[ ... ]" :o). Generated code still does a jmp forward though. -- Andrei

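It was not identical.
((c & 0b1100_0000) == 0b1000_0000)
can be true in all three of the cases that follow it, because a continuation byte is also below every threshold tested in those cases.
If we do not jump straight to the return here, one of the later checks would assign the byte a length greater than 1, and we could not guarantee that we do not skip over the next valid char, corrupting an already corrupt string even more.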

For best performance we need to leave the gotos in there.
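
To make the point about stray continuation bytes concrete, here is a minimal sketch. It is not the code proposed above: the name popFrontSkip, the const(char)[] parameter, and the 0xE0 / 0xF0 / 0xF8 lead-byte thresholds (the usual UTF-8 boundaries) are assumptions made for illustration, and it uses plain if/else rather than gotos, so it says nothing about the generated jumps. It only shows the behaviour being argued for: a continuation byte in lead position advances by exactly one byte, so the valid code point after it survives.

```d
// Sketch only: popFrontSkip is a hypothetical name, not part of Phobos
// or of the code posted in this thread. Precondition: s is not empty.
void popFrontSkip(ref const(char)[] s) @safe pure nothrow
{
    immutable c = s[0];
    size_t len = 1; // ASCII, stray continuation byte, or invalid lead byte

    // Only a real lead byte (not 10xx_xxxx) may claim more than one byte.
    if (c >= 0x80 && (c & 0b1100_0000) != 0b1000_0000)
    {
        if (c < 0xE0)      len = 2; // 110x_xxxx
        else if (c < 0xF0) len = 3; // 1110_xxxx
        else if (c < 0xF8) len = 4; // 1111_0xxx
        // 0xF8 and above never start a valid sequence: keep len == 1
    }

    // A truncated sequence at the end of the string must not overrun it.
    if (len > s.length) len = s.length;
    s = s[len .. $];
}

unittest
{
    // 0x80 is a stray continuation byte; 0xC3 0xA9 is a valid 2-byte char.
    const(char)[] s = [cast(char)0x80, cast(char)0xC3, cast(char)0xA9, 'x'];

    popFrontSkip(s); // skips only the stray byte
    assert(s.length == 3 && s[0] == 0xC3);

    popFrontSkip(s); // consumes the whole 2-byte sequence
    assert(s.length == 1 && s[0] == 'x');
}
```

Had the stray 0x80 been given a length of 2 instead, the first call would also have consumed 0xC3, leaving the string positioned on 0xA9 in the middle of a character.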
