Sergei Gorelkin wrote:

Well, with the exception of the "class helper for TStrings" (notably, they call it a hack themselves :) the design looks rather clean. Since each string stores its element size, both ANSI and Unicode strings are probably handled with a common set of procedures, avoiding RTL size bloat.


I also like the design, since it is flexible enough to let the programmer work with different encodings.

And they explain why there is no compiler option for switching back and forth.

Unfortunately, the article does not provide information about how things like Pos() and Copy() work with UTF-8 strings.
Here ( http://www.jacobthurman.com/?p=30 , see the comments) there's an explanation of those functions. Basically, they operate on code units, not code points (characters).
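
To make the code unit vs. code point distinction concrete, here is a small sketch. It is Python, not Delphi or FPC, and it does not call the real Pos() or Copy(); the byte-based search and slice below are only my stand-ins for "a function that counts in code units", and the variable names are made up for the illustration.

```python
# Illustration only: Python stand-ins, not the Delphi/FPC RTL routines.
text = "h\u00e9llo"  # "héllo": 5 code points, "é" is U+00E9

utf8_units = text.encode("utf-8")  # UTF-8: one code unit is one byte

code_point_count = len(text)       # counts code points
utf8_unit_count = len(utf8_units)  # counts bytes; "é" occupies two of them


def utf16_unit_count(s: str) -> int:
    """Number of UTF-16 code units (2 bytes each) needed for s."""
    return len(s.encode("utf-16-le")) // 2


# A Pos()-like search, 1-based as in Pascal, counted two different ways.
pos_in_code_units = utf8_units.find("l".encode("utf-8")) + 1
pos_in_code_points = text.find("l") + 1

# A Copy()-like slice of the first two code units splits "é" in the middle,
# so the result is no longer valid UTF-8 on its own.
first_two_units = utf8_units[:2]
decoded = first_two_units.decode("utf-8", errors="replace")

# UTF-16 has the same issue outside the BMP: U+1D11E needs a surrogate pair.
clef_units = utf16_unit_count("\U0001D11E")
```

With these inputs the search for "l" lands at position 4 when counting UTF-8 code units but at position 3 when counting code points, and the two-unit slice ends on a dangling lead byte. That is the kind of difference to expect from element-based string routines.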

However, one may understand the words "utf-8 support is more limited than utf-16" to mean that they continue to work with elements (bytes).

Yes. IMO this is also a good decision.

Luiz
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
