Re: [fpc-devel] Unicode in the RTL (my ideas)

Hans-Peter Diettrich Tue, 21 Aug 2012 04:20:17 -0700

Martin Schreiber wrote:

All "access a char by index into a string" code I have seen works in a
sequential manner 99.99% of the time. For that reason there is no
speed difference between using a UTF-16 or UTF-8 encoded string. Both
can be coded equally efficiently.
Graeme, this is simply not true. When searching for known German characters in a UnicodeString, the program can use the simple approach of indexing by character (code unit). This is even possible for known Chinese symbols of the BMP. And a simple "if" for surrogate pairs is more efficient than a 4-stage "case" for UTF-8.
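To make the comparison concrete, here is a minimal Python sketch of the two sequential decoding loops: one `if` for surrogate pairs in UTF-16, and a four-way branch on the lead byte in UTF-8. It is not FPC RTL code. The function names are invented for this illustration, both decoders assume well-formed input, and the sketch only shows the shape of each loop without measuring speed.

```python
def utf16_units(s):
    """Return the UTF-16 code units of s as a list of ints (helper for this sketch)."""
    raw = s.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def utf16_code_points(units):
    """Yield code points from a sequence of UTF-16 code units."""
    i = 0
    n = len(units)
    while i < n:
        u = units[i]
        # The single "if": a high surrogate followed by a low surrogate.
        if 0xD800 <= u <= 0xDBFF and i + 1 < n and 0xDC00 <= units[i + 1] <= 0xDFFF:
            yield 0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            i += 2
        else:
            # BMP character: one code unit is one code point.
            yield u
            i += 1


def utf8_code_points(data):
    """Yield code points from well-formed UTF-8 bytes (no validation)."""
    i = 0
    n = len(data)
    while i < n:
        b = data[i]
        # The four-way "case" on the lead byte.
        if b < 0x80:        # 0xxxxxxx: 1 byte
            cp, extra = b, 0
        elif b < 0xE0:      # 110xxxxx: 2 bytes
            cp, extra = b & 0x1F, 1
        elif b < 0xF0:      # 1110xxxx: 3 bytes
            cp, extra = b & 0x0F, 2
        else:               # 11110xxx: 4 bytes
            cp, extra = b & 0x07, 3
        # Fold in the continuation bytes (10xxxxxx), six bits each.
        for k in range(1, extra + 1):
            cp = (cp << 6) | (data[i + k] & 0x3F)
        yield cp
        i += extra + 1
```

Both generators produce the same code points for the same text. For text known to lie entirely in the BMP, the UTF-16 loop never takes the surrogate branch, which is the case described above where plain code-unit indexing is enough.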

The good ole Pos() can do that, why search for more complicated implementations?

You still try to use old coding patterns which are simply inappropriate for dealing with Unicode strings. Why make a distinction between searching for a single character or multiple characters, when it's known that one character can require multiple bytes or words in UTF-8/16?
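A small sketch of that point, in Python rather than Pascal: one substring search over code units handles "single character" and "multiple character" needles alike, in either encoding. The `pos` function below is a hypothetical stand-in that only mimics the 1-based, 0-if-absent convention of Pascal's Pos(). The helper names are made up for this example, and returning 0 for an empty needle is an assumption of the sketch.

```python
def utf8_units(s):
    """UTF-8 code units (bytes) of s as a list of ints."""
    return list(s.encode("utf-8"))


def utf16_units(s):
    """UTF-16 code units of s as a list of ints."""
    raw = s.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def pos(needle, haystack):
    """1-based index of the first occurrence of needle in haystack, 0 if absent.

    Both arguments are sequences of code units. The search does not care
    whether the needle is one user-visible character or several.
    """
    n, m = len(needle), len(haystack)
    if n == 0:
        return 0  # assumption of this sketch
    for i in range(m - n + 1):
        if haystack[i:i + n] == needle:
            return i + 1
    return 0


text = "Stra\u00dfe \U0001F600 ok"   # contains U+00DF and U+1F600

# U+00DF is 1 code unit in UTF-16 and 2 in UTF-8;
# U+1F600 is 2 code units in UTF-16 and 4 in UTF-8.
in_utf8 = pos(utf8_units("\U0001F600"), utf8_units(text))
in_utf16 = pos(utf16_units("\U0001F600"), utf16_units(text))
```

The returned indexes count code units, not characters, so the same needle is found at different positions in the two encodings. Because both encodings are self-synchronizing, a match of a well-formed needle in well-formed text always starts on a character boundary.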


DoDi

_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
