Sergei Gorelkin wrote:
Well, with the exception of the "class helper for TStrings" (notably,
they call it a hack themselves :) the design looks rather clean.
Since each string stores its element size, both ansi and unicode
strings are probably handled with a common set of procedures, avoiding
RTL size bloat.
I also like the design since it is flexible enough to allow the programmer
to work with different encodings.
And they explain why there is no compiler option for switching back
and forth.
Unfortunately, the article does not provide information about how
things like Pos() and Copy() work with UTF-8 strings.
There is an explanation of those functions here
(http://www.jacobthurman.com/?p=30, see the comments). Basically, they
operate on code units, not code points (characters).
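To make the code unit / code point distinction concrete, here is a small sketch. It is written in Python purely as an illustration and is not the Delphi or FPC implementation: the helper names pos_units, pos_points and copy_units are made up for this example, and the 1-based indexing only mimics the Pascal convention. A bytes object stands in for a UTF-8 string indexed by code units.

```python
# Illustration only: Python stands in for Pascal here.
# bytes models a UTF-8 string indexed by code units (bytes);
# str models the same text indexed by code points.
# All three helpers are hypothetical names, not RTL functions.

def pos_units(sub: str, s: str) -> int:
    """1-based position of sub in s, counted in UTF-8 code units (0 if absent)."""
    return s.encode("utf-8").find(sub.encode("utf-8")) + 1

def pos_points(sub: str, s: str) -> int:
    """1-based position of sub in s, counted in code points (0 if absent)."""
    return s.find(sub) + 1

def copy_units(s: str, index: int, count: int) -> bytes:
    """Copy count code units starting at 1-based index.

    Because it counts bytes, it can cut a multi-byte character in half.
    """
    data = s.encode("utf-8")
    return data[index - 1:index - 1 + count]

text = "na\u00efve"  # U+00EF takes two bytes in UTF-8: C3 AF

print(pos_units("v", text))    # 5: the two-byte character shifts the index
print(pos_points("v", text))   # 4
print(copy_units(text, 1, 3))  # b'na\xc3': ends in the middle of a character
```

The two positions differ as soon as a multi-byte character precedes the match, and a Copy-like operation counted in code units can return a sequence that is no longer valid UTF-8. Whoever calls such functions has to keep indexes on character boundaries.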
However, one may understand the words "utf-8 support is more limited than
utf-16" as meaning that they continue to work with elements (bytes).
Yes. IMO this is also a good decision.
Luiz
_______________________________________________
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel