Re: [fpc-pascal] Parse unicode scalar

Hairy Pixels via fpc-pascal Mon, 03 Jul 2023 18:03:39 -0700

> On Jul 4, 2023, at 1:15 AM, Mattias Gaertner via fpc-pascal 
> <fpc-pascal@lists.freepascal.org> wrote:
> 
> function ReadUTF8(p: PChar; ByteCount: PtrInt): PtrInt;
> // returns the number of codepoints
> var
>  CodePointLen: longint;
>  CodePoint: longword;
> begin
>  Result:=0;
>  while (ByteCount>0) do begin
>    inc(Result);
>    CodePoint:=UTF8CodepointToUnicode(p,CodePointLen);
>    ...do something with the CodePoint...
>    inc(p,CodePointLen);
>    dec(ByteCount,CodePointLen);
>  end;
> end;

Thanks, this looks right. I guess this is how we need to iterate over unicode 
now.

Btw, why isn't there a for-loop we can use over unicode strings? seems like 
that should be supported out of the box. I had this same problem in Swift also 
where it's extremely confusing to merely iterate over a string and look at each 
character. Replacing characters will be tricky also so we need some good 
library functions.

Swift is especially terrible because there's NO ANSII string so even a 1 byte 
sequence needs all these confusing as hell functions to do any work with 
strings at all. Terrible experience and slooooow.

Regards,
Ryan Joseph

_______________________________________________
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal

Re: [fpc-pascal] Parse unicode scalar

Reply via email to