On Mon, Dec 27, 2021 at 3:41 PM Juha Manninen via lazarus
<lazarus@lists.lazarus-ide.org> wrote:

> It must be a Big endian / Little endian issue. IIRC it can be adjusted in ARM 
> CPUs.
> Why do MacOS and Linux use a different setting there? I have no idea.

On second thought: if the function returns garbage for just a single
'€', the code for that should not enter the part where it handles
blocks of size PtrInt and does masking with EIGHTYMASK etc. (the part
of the code that might be endianness dependent).
It should go to one of the 2 loops that simply do:  Result += (pn8^
shr 7) and ((not pn8^) shr 6);
That part should not depend on endianness at all.

On Win32 a single '€' will result in something like this:

pn8^              =11100010   //first byte
(pn8^ shr 7)      =11111111  //<<-- I would have expected that to be 00000001 ?
(not pn8^)        =00011101
(not pn8^) shr 6  =00000000
Add: (pn8^ shr 7) and ((not pn8^) shr 6)=0

pn8^              =10000010   //second byte
(pn8^ shr 7)      =11111111
(not pn8^)        =01111101
(not pn8^) shr 6  =00000001
Add: (pn8^ shr 7) and ((not pn8^) shr 6)=1

pn8^              =10101100   //third and last byte of '€'
(pn8^ shr 7)      =11111111
(not pn8^)        =01010011
(not pn8^) shr 6  =00000001
Add: (pn8^ shr 7) and ((not pn8^) shr 6)=1
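
For comparison, the per-byte step traced above can be sketched as a
small standalone program. This is only a minimal sketch and not the
LazUTF8 code: CountCodepoints and its Byte array parameter are
hypothetical names, and it assumes the loop counts continuation bytes
(bit pattern 10xxxxxx) and subtracts that count from the byte count.
It works on unsigned Byte values, so (b shr 7) can only be 0 or 1.

```pascal
program ContinuationCountSketch;
{$mode objfpc}

// Hypothetical helper for illustration, not the LazUTF8 implementation.
// Assumption: codepoints = total bytes - continuation bytes (10xxxxxx).
function CountCodepoints(const Bytes: array of Byte): SizeInt;
var
  i: SizeInt;
  b: Byte;
  Continuations: SizeInt;
begin
  Continuations := 0;
  for i := 0 to High(Bytes) do
  begin
    b := Bytes[i];
    // (b shr 7) is 1 when bit 7 is set.
    // (b xor $FF) inverts the 8 bits; after shr 6 its lowest bit
    // is 1 when bit 6 of b is clear.
    // The "and" therefore yields 1 only for bytes of the form 10xxxxxx.
    Continuations := Continuations + ((b shr 7) and ((b xor $FF) shr 6));
  end;
  Result := Length(Bytes) - Continuations;
end;

const
  // UTF-8 encoding of the euro sign: 11100010 10000010 10101100
  Euro: array[0..2] of Byte = ($E2, $82, $AC);
begin
  WriteLn(CountCodepoints(Euro)); // prints 1
end.
```

The three bytes contribute 0, 1 and 1 to the continuation count, the
same sums as in the trace above, so the result is 3 - 2 = 1.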

B.t.w.
I find the code in Utf8LengthFast difficult to read.
Personally I dislike the C-isms += and >> (even more so if both >>
and shr are used).

-- 
Bart