On 11/27/2014 07:29 PM, Hans-Peter Diettrich wrote:
> Michael Schnell wrote:
>> E.g. there are (at least) two "Code pages" for UTF-16 ("LE" and
>> "BE") that would be worth supporting.
> You are confusing codepages and encodings :-(
That is why I put quotation marks around "Code pages". I used this wording
because fpc (and Delphi?) abbreviates it as "CP" in the constant
names "CP_UTF8", "CP_UTF16" and "CP_UTF16BE" [see Jonas' post:
"CP_UTF16 and CP_UTF16BE can be returned by StringCodePage() when called
on a unicodestring, and that's it."]
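
As an aside on the "LE"/"BE" point, here is a minimal sketch (in Python, purely for illustration; nothing in it is fpc or Delphi API, and the sample text is an arbitrary assumption). It shows that the two UTF-16 byte orders carry the same characters and differ only in how each 16-bit unit is laid out:

```python
# Illustration only: the same text in both UTF-16 byte orders.
text = "A\u00e9"  # "A" followed by e-acute, an arbitrary sample

le = text.encode("utf-16-le")  # low byte of each 16-bit unit first
be = text.encode("utf-16-be")  # high byte of each 16-bit unit first

print(le.hex(" "))  # 41 00 e9 00
print(be.hex(" "))  # 00 41 00 e9

# The byte sequences differ, but both decode to the same characters.
assert le != be
assert le.decode("utf-16-le") == be.decode("utf-16-be") == text
```

Whether one calls these two "code pages" or two encoding schemes, a string type that carries such a tag has to distinguish them before it can interpret the bytes.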
> See it as a multi-level protocol for text processing. ....
Yep. I see that it is workable and I understand the (supposedly mostly
historical) reasons. But IMHO it is not a good (i.e. crafted from the
ground up) concept.
> It's known that the Delphi AnsiString implementation is flawed,...
And hence it's frustrating to see that fpc needs to follow it for
compatibility reasons. That is why I suggested an improved
implementation (see ->
http://wiki.freepascal.org/not_Delphi_compatible_enhancement_for_Unicode_Support).
The seriously flawed Delphi-compatible use of the dynamic
encoding-brand (and bytes-per-element) information (only implemented
with RawByteString) can be left as it is, while a decent implementation
with a new DynamicString type (CP_ANY) should be crafted.
> I see no problem in using the same names and values. Delphi documents
> clearly state: ...
I fear that there will be code that relies on the "flawed" behavior of
RawByteString ("it's a feature, not a bug"), and using the same name with
different behavior would break that code. And a really usable
DynamicString would not adhere to that description.
-Michael
_______________________________________________
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel