On Mon, 3 Jul 2023 09:34:10 +0700
Hairy Pixels via fpc-pascal <[email protected]> wrote:
>[...]
> Ok, today I just tried to ask ChatGPT and got an answer. I must have
> asked the wrong thing yesterday but it got it right today (with one
> syntax error using an inline "var" in the code section for some
> reason).
>
> How does this look?
>
> procedure SplitUTF8Bytes(unicodeScalar: Integer; var bytes: array of
> Byte);
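The open array ("array of Byte") is useless here: a UTF-8 sequence is
at most 4 bytes, so a fixed-size array would do.
And it does not return the byte count, so the caller cannot tell how
many bytes were written.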
> var
> i: Integer;
> byteCount: Integer;
> begin
> // Number of bytes required to represent the Unicode scalar
> if unicodeScalar < $80 then
> byteCount := 1
> else if unicodeScalar < $800 then
> byteCount := 2
> else if unicodeScalar < $10000 then
> byteCount := 3
> else if unicodeScalar < $110000 then
> byteCount := 4
> else
> raise Exception.Create('Invalid Unicode scalar');
>
> // Extract the individual bytes using bitwise operations
> for i := byteCount - 1 downto 0 do
> begin
> bytes[i] := $80 or (unicodeScalar and $3F);
Wrong for byteCount=1: a single-byte (ASCII) character must be stored
unchanged, but this sets the high bit ($80) on it.
> unicodeScalar := unicodeScalar shr 6;
> end;
>
> // Set the leading bits of each byte
> case byteCount of
> 2:
> bytes[0] := $C0 or bytes[0];
> 3:
> bytes[0] := $E0 or bytes[0];
> 4:
> bytes[0] := $F0 or bytes[0];
> end;
> end;
Well, it got the basic idea of UTF-8 multibytes right and it compiles,
so maybe half the points?
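
For comparison, here is a minimal sketch of how the points above could be
addressed. It is one possible version, not a reference implementation. The
names EncodeUTF8 and TUTF8Bytes are made up for this example, the parameter
is a Cardinal rather than an Integer so negative input cannot slip through,
and the surrogate check is an extra that the original did not attempt.

```pascal
program Utf8EncodeSketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

type
  // Hypothetical helper type: a UTF-8 sequence is at most 4 bytes long.
  TUTF8Bytes = array[0..3] of Byte;

// Encodes one Unicode scalar value as UTF-8.
// Returns the number of bytes written to Bytes (1..4).
function EncodeUTF8(Scalar: Cardinal; out Bytes: TUTF8Bytes): Integer;
var
  i: Integer;
begin
  // Number of bytes required to represent the scalar
  if Scalar < $80 then
    Result := 1
  else if Scalar < $800 then
    Result := 2
  else if Scalar < $10000 then
  begin
    // Added assumption: reject surrogates, which are not scalar values
    if (Scalar >= $D800) and (Scalar <= $DFFF) then
      raise Exception.Create('Surrogates are not Unicode scalar values');
    Result := 3;
  end
  else if Scalar < $110000 then
    Result := 4
  else
    raise Exception.Create('Invalid Unicode scalar');

  // Single byte: store the value as is, without the $80 marker
  if Result = 1 then
  begin
    Bytes[0] := Byte(Scalar);
    Exit;
  end;

  // Continuation bytes: 10xxxxxx, filled from the last byte backwards
  for i := Result - 1 downto 1 do
  begin
    Bytes[i] := Byte($80 or (Scalar and $3F));
    Scalar := Scalar shr 6;
  end;

  // Lead byte: length marker plus the remaining high bits
  case Result of
    2: Bytes[0] := Byte($C0 or Scalar);
    3: Bytes[0] := Byte($E0 or Scalar);
    4: Bytes[0] := Byte($F0 or Scalar);
  end;
end;

var
  Buf: TUTF8Bytes;
  n, k: Integer;
begin
  // U+20AC (euro sign) encodes to the three bytes E2 82 AC
  n := EncodeUTF8($20AC, Buf);
  for k := 0 to n - 1 do
    Write(IntToHex(Buf[k], 2), ' ');
  WriteLn;
end.
```

Returning the count as the function result lets the caller use only the
first n bytes of the buffer, and the fixed-size array avoids the open array
parameter.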
Mattias
_______________________________________________
fpc-pascal maillist - [email protected]
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal