On 11/26/2014 06:37 PM, Hans-Peter Diettrich wrote:

An AnsiString consists of AnsiChar's. The *meaning* of these char's (bytes) depends on their encoding, regardless of whether the used encoding is or is not stored with the string.
I understand that the implementation (in Delphi) seems to be driven more by the wording ("ANSI") than by the logical paradigm the language syntax suggests. The language syntax and the string header fields suggest that both the element-size and the code-ID-number need to be adhered to (be it statically or dynamically, depending on the usage instance). E.g. there are at least two "code pages" for UTF-16 ("LE" and "BE") that would be worth supporting.
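
To make the UTF-16 point concrete, here is a small sketch in Python (chosen only as a neutral illustration, not as anything FPC or Delphi does internally). It assumes that Python's "utf-16-le" and "utf-16-be" codecs can stand in for the two code pages, which Windows numbers 1200 and 1201:

```python
# Illustration only: Python codec names stand in for code page IDs.
text = "A\u00e9"  # 'A' followed by 'é'

le = text.encode("utf-16-le")  # code page 1200 on Windows
be = text.encode("utf-16-be")  # code page 1201 on Windows

# Same element size (2 bytes per element), same logical text,
# but a different byte order inside each element.
print(le)  # b'A\x00\xe9\x00'
print(be)  # b'\x00A\x00\xe9'

# Reading the LE payload with the BE code page does not fail,
# it silently yields different characters.
misread = le.decode("utf-16-be")
print([hex(ord(c)) for c in misread])
```

So the element size alone does not identify the content; the code-ID-number is needed as well.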

It's essential to distinguish between low-level (physical) AnsiChar values, and *logical* characters possibly consisting of multiple AnsiChars.
I do see now that the implementation follows this concept. But the language syntax and the string header fields suggest a more versatile paradigm: a universal reference-counted "element string" type.
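
The distinction between physical byte values and logical characters can be sketched as follows. This is Python used purely for illustration; the byte values are an arbitrary example, and "cp1252" is just one possible single-byte encoding to compare against:

```python
# Two physical bytes (two "AnsiChars" in Pascal terms).
raw = bytes([0xC3, 0xA9])

# Interpreted as UTF-8, both bytes together form ONE logical character.
as_utf8 = raw.decode("utf-8")

# Interpreted as Windows-1252, each byte is its own character.
as_cp1252 = raw.decode("cp1252")

print(len(raw), len(as_utf8), len(as_cp1252))
print(as_utf8, as_cp1252)
```

The bytes are identical in both cases; only the encoding attached to them (statically or dynamically) decides how many logical characters they represent and which ones.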

That's why I wonder *when* exactly the result of such an expression *is* converted (implicitly) into the static encoding of the target variable, and when *not*.
I understand that the idea is to use the static encoding information provided by the type definition whenever possible. If no RawByteString is involved in the operation, the static encoding information is sufficient, and hence any calls to the dedicated conversion library functions can be determined completely at compile time.

In Delphi the use of the dynamic encoding information seems to be very rare (and the implementation does not make much sense to me).


The entire mess results from the bad interpretation of RawByteString assignments, which IMO was well thought out by the Delphi language architects, but not understood by the Delphi compiler coders.

I fully agree with you.

I suppose the original idea was to create an (additional) fully dynamic type brand, for which, whenever it is used, the compiler needs to read the dynamic encoding information (both element-size and encoding-ID-number) and act appropriately. With that decently implemented, TStrings and similar classes could in fact use this type for universal handling of all String type brands.
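
A rough model of such a fully dynamic type, again in Python and only as a sketch: the names ElementString, CODECS, make, convert and concat are hypothetical and do not correspond to any FPC or Delphi API, and the small code page table is an assumption made for the example. Every operation reads the header fields at run time instead of relying on a static type:

```python
from dataclasses import dataclass

# Hypothetical table: code page ID -> (Python codec, element size in bytes).
CODECS = {
    1200: ("utf-16-le", 2),
    1201: ("utf-16-be", 2),
    1252: ("cp1252", 1),
    65001: ("utf-8", 1),
}

@dataclass(frozen=True)
class ElementString:
    payload: bytes      # raw elements
    code_page: int      # dynamic encoding ID from the "header"
    element_size: int   # dynamic element size from the "header"

def make(text: str, code_page: int) -> ElementString:
    codec, size = CODECS[code_page]
    return ElementString(text.encode(codec), code_page, size)

def convert(s: ElementString, target_cp: int) -> ElementString:
    # Decide at run time, from the header, whether a conversion is needed.
    if s.code_page == target_cp:
        return s
    codec, _ = CODECS[s.code_page]
    return make(s.payload.decode(codec), target_cp)

def concat(a: ElementString, b: ElementString) -> ElementString:
    # The result takes the encoding of the left operand (an arbitrary choice).
    b2 = convert(b, a.code_page)
    return ElementString(a.payload + b2.payload, a.code_page, a.element_size)

a = make("Gr", 65001)          # UTF-8, 1-byte elements
b = make("\u00f6\u00dfe", 1200)  # UTF-16LE, 2-byte elements
c = concat(a, b)
print(c.code_page, c.element_size, c.payload.decode("utf-8"))
```

A container in the style of TStrings could hold such values without caring which string type brand each one came from, because the information travels with the data.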

My hope was that fpc might be able to correct this error of the Delphi compiler coders. But of course, for Delphi compatibility, the type name RawByteString and the code-ID-number $FFFF can't be used for this any more; a new name and ID number would need to be invented. IMHO this is in fact possible and viable (see wiki page for details).

-Michael

