Re: Reducing the cost of autodecoding

Patrick Schluter via Digitalmars-d Sun, 16 Oct 2016 14:19:04 -0700

On Sunday, 16 October 2016 at 10:05:37 UTC, Patrick Schluter wrote:

On Sunday, 16 October 2016 at 08:43:23 UTC, Uplink_Coder wrote:
On Sunday, 16 October 2016 at 07:59:16 UTC, Patrick Schluter wrote:
This looks quite slow.
We already have a correct version in utf.decodeImpl.
The goal here was to find a small and fast alternative.
I know but it has to be correct before being fast.
The code is simple and the checks can easily be removed. Here is the version without the overlong, invalid sequence and code point checks.
 dchar myFront3(ref char[] str)
 {
   dchar c0 = str.ptr[0];
   if(c0 < 0x80) {
     return c0;
   }
   else if(str.length > 1) {
     dchar c1 = str.ptr[1];
     if(c0 < 0xE0) {
       return ((c0 & 0x1F) << 6)|(c1 & 0x3F);
     }
     else if(str.length > 2) {
       dchar c2 = str.ptr[2];
       if(c0 < 0xF0) {
         return ((c0 & 0x0F) << 12)|((c1 & 0x3F) << 6)|(c2 & 0x3F);
       }
       else if(str.length > 3) {
         dchar c3 = str.ptr[3];
         if(c0 < 0xF5) {
           return((c0 & 0x07) << 16)|((c1 & 0x3F) << 12)|((c2 & 0x3F) << 6)|(c3 & 0x3F);


Of course, this line is wrong; it should shift by 18, not 16:
           return((c0 & 0x07) << 18)|((c1 & 0x3F) << 12)|((c2 & 0x3F) << 6)|(c3 & 0x3F);

         }
       }
     }
   }
   Linvalid:
      throw new Exception("yadayada");
 }
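
To see why the lead byte of a 4-byte sequence has to go to bit 18: the three continuation bytes carry 6 payload bits each, so they fill bits 0 to 17. Shifting the lead bits by 16 makes them collide with the top two bits of c1. The mistake is easy to miss because lead byte 0xF0 has zero payload bits, so everything below U+40000 decodes the same either way. A small sketch of that arithmetic follows; combine4 and its leadShift parameter are made-up names for illustration, not part of the code above.

```d
// Illustrative helper (hypothetical, not from the thread): combine the
// payload bits of a 4-byte UTF-8 sequence, placing the lead byte's three
// payload bits at the given shift.
dchar combine4(ubyte c0, ubyte c1, ubyte c2, ubyte c3, uint leadShift)
    pure nothrow @nogc @safe
{
    return cast(dchar)(((c0 & 0x07) << leadShift)
        | ((c1 & 0x3F) << 12)
        | ((c2 & 0x3F) << 6)
        | (c3 & 0x3F));
}

unittest
{
    // U+10FFFF is encoded as F4 8F BF BF.
    assert(combine4(0xF4, 0x8F, 0xBF, 0xBF, 18) == 0x10FFFF);
    // With a shift of 16 the lead bits land inside c1's field.
    assert(combine4(0xF4, 0x8F, 0xBF, 0xBF, 16) == 0x4FFFF);
    // U+1F600 is F0 9F 98 80: the lead payload is zero, so both shifts agree.
    assert(combine4(0xF0, 0x9F, 0x98, 0x80, 18) == 0x1F600);
    assert(combine4(0xF0, 0x9F, 0x98, 0x80, 16) == 0x1F600);
}
```

So a test with only "common" astral characters would not have caught it; it takes something at or above U+40000.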
Next step will be to loop for length 2,3,4, with or without your table.
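
A rough idea of what such a loop could look like, without a table. This is only a sketch under my own assumptions, not the version that was eventually benchmarked: frontLoop is a made-up name, the length is derived from the same thresholds as in myFront3, and like myFront3 it does no overlong, continuation byte or code point range checking. Unlike myFront3 it also rejects an empty slice.

```d
// Hypothetical loop-based variant of myFront3 (names are illustrative).
// It performs the same minimal checks: length and lead byte < 0xF5 only.
dchar frontLoop(const(char)[] str)
{
    if (str.length == 0)
        throw new Exception("empty input");

    immutable uint c0 = str[0];
    if (c0 < 0x80)
        return cast(dchar) c0;            // ASCII fast path

    // Sequence length from the lead byte, same thresholds as myFront3.
    immutable size_t len = c0 < 0xE0 ? 2 : c0 < 0xF0 ? 3 : 4;
    if (c0 >= 0xF5 || str.length < len)
        throw new Exception("invalid or truncated sequence");

    // Lead byte payload: low 5, 4 or 3 bits for lengths 2, 3 or 4.
    uint cp = c0 & (0x7F >> len);

    // Each continuation byte contributes its low 6 bits.
    foreach (i; 1 .. len)
        cp = (cp << 6) | (str[i] & 0x3F);

    return cast(dchar) cp;
}

unittest
{
    assert(frontLoop("a") == 'a');
    assert(frontLoop("\u00E9") == 0xE9);          // 2 bytes
    assert(frontLoop("\u20AC") == 0x20AC);        // 3 bytes
    assert(frontLoop("\U0010FFFF") == 0x10FFFF);  // 4 bytes
}
```

Because the accumulator is shifted left by 6 on every step, the lead bits end up at 6, 12 or 18 automatically, which avoids the kind of hand-written shift mistake corrected above. Whether the loop beats the unrolled version would have to be measured.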
