On Sunday, 16 October 2016 at 10:05:37 UTC, Patrick Schluter wrote:
On Sunday, 16 October 2016 at 08:43:23 UTC, Uplink_Coder wrote:
On Sunday, 16 October 2016 at 07:59:16 UTC, Patrick Schluter wrote:

This looks quite slow.
We already have a correct version in utf.decodeImpl.
The goal here was to find a small and fast alternative.

I know, but it has to be correct before being fast.
The code is simple and the checks can easily be removed. Here is the version without the overlong, invalid-sequence and codepoint checks.

 dchar myFront3(ref char[] str)
 {
   dchar c0 = str.ptr[0];
   if(c0 < 0x80) {
     return c0;
   }
   else if(str.length > 1) {
     dchar c1 = str.ptr[1];
     if(c0 < 0xE0) {
       return ((c0 & 0x1F) << 6)|(c1 & 0x3F);
     }
     else if(str.length > 2) {
       dchar c2 = str.ptr[2];
       if(c0 < 0xF0) {
         return ((c0 & 0x0F) << 12)|((c1 & 0x3F) << 6)|(c2 & 0x3F);
       }
       else if(str.length > 3) {
         dchar c3 = str.ptr[3];
         if(c0 < 0xF5) {
           return((c0 & 0x07) << 16)|((c1 & 0x3F) << 12)|((c2 & 0x3F) << 6)|(c3 & 0x3F);

Of course, this line is wrong: the lead byte's bits sit above three 6-bit continuation groups, so it should shift by 18, not 16:
           return ((c0 & 0x07) << 18)|((c1 & 0x3F) << 12)|((c2 & 0x3F) << 6)|(c3 & 0x3F);

         }
       }
     }
   }
   Linvalid:
      throw new Exception("yadayada");
 }

The next step will be to loop over the continuation bytes for lengths 2, 3 and 4, with or without your table.
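
For what it's worth, a loop version without a table might look roughly like the sketch below. This is only an illustration under the same assumptions as the code above (no overlong, invalid-sequence or codepoint checks); the name myFrontLoop and the exception messages are made up here, and it takes a const slice rather than a ref char[]. It has not been benchmarked.

```d
// Hypothetical loop variant; the name is not from the thread.
// Like the version above, it does NOT reject overlong forms,
// stray continuation bytes or out-of-range codepoints.
dchar myFrontLoop(const(char)[] str)
{
    if (str.length == 0)
        throw new Exception("empty input");

    uint c0 = str[0];
    if (c0 < 0x80)
        return cast(dchar) c0;      // 1-byte sequence (ASCII)

    // Sequence length from the lead byte, same thresholds as above.
    size_t len = c0 < 0xE0 ? 2 : c0 < 0xF0 ? 3 : c0 < 0xF5 ? 4 : 0;
    if (len == 0 || str.length < len)
        throw new Exception("invalid or truncated sequence");

    // Lead byte payload: 5, 4 or 3 bits for length 2, 3 or 4.
    uint c = c0 & (0x7F >> len);

    // Each continuation byte contributes its low 6 bits.
    foreach (i; 1 .. len)
        c = (c << 6) | (str[i] & 0x3F);

    return cast(dchar) c;
}
```

The mask 0x7F >> len gives 0x1F, 0x0F and 0x07 for lengths 2, 3 and 4, so the shifts by 6, 12 and 18 fall out of the loop instead of being written by hand.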

