...: the new ansistring type has a hidden "element size" field (in addition to the reference count, length and codepage), and from what I can see at page 10 of http://edn.embarcadero.com/article/images/38980/Delphi_and_Unicode.pdf, Delphi 2009's unicodestring is simply an ansistring(1200).
So it seems that if we had a "GenericString" with the properties "reference count", "size", "character width" and "codepage", all other string types could be based on it. The other string types would then only be "shortcuts" that internally use the same structure:
AnsiString = GenericString(with the actual system ANSI code page (0) ... or ... without any explicit codepage ($ffff))
UTF8String = GenericString(with UTF-8 encoding)
UnicodeString = GenericString(with UTF-16 encoding)
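
To make the idea concrete, here is a minimal sketch of such a shared header. It is written in C purely to illustrate the layout; it is not the actual FPC or Delphi RTL structure. All names (GenericStringHeader, gs_header, gs_ansi, gs_utf8, gs_utf16, gs_payload_bytes, the GS_CP_* constants) and the field order are assumptions made for this example. The numeric values are the code page identifiers mentioned above (0, 1200, $ffff) plus 65001, the Windows identifier for UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constant names; the values are Windows code page identifiers. */
enum {
    GS_CP_ACP   = 0,      /* actual system ANSI code page */
    GS_CP_UTF16 = 1200,   /* UTF-16 (little endian) */
    GS_CP_UTF8  = 65001,  /* UTF-8 */
    GS_CP_NONE  = 0xFFFF  /* no explicit code page */
};

/* Assumed layout of the hidden header shared by every string type.
   The field order is illustrative only. */
typedef struct {
    uint16_t code_page;    /* encoding of the payload */
    uint16_t element_size; /* "character width": bytes per code unit */
    int32_t  ref_count;    /* reference count */
    int32_t  length;       /* length in code units, not bytes */
} GenericStringHeader;

/* Build a header for a freshly allocated string with one reference. */
GenericStringHeader gs_header(uint16_t code_page, uint16_t element_size,
                              int32_t length)
{
    assert(element_size == 1 || element_size == 2 || element_size == 4);
    assert(length >= 0);
    GenericStringHeader h = { code_page, element_size, 1, length };
    return h;
}

/* The "shortcut" types differ only in the values stored in the header. */
GenericStringHeader gs_ansi(int32_t length)
{
    return gs_header(GS_CP_ACP, 1, length);
}

GenericStringHeader gs_utf8(int32_t length)
{
    return gs_header(GS_CP_UTF8, 1, length);
}

GenericStringHeader gs_utf16(int32_t length)
{
    return gs_header(GS_CP_UTF16, 2, length);
}

/* Payload size in bytes: code units times bytes per code unit. */
size_t gs_payload_bytes(const GenericStringHeader *h)
{
    return (size_t)h->length * h->element_size;
}
```

With this layout, code that only needs the length or the payload size can treat every string type the same way, and a conversion only has to compare the code_page fields of source and destination to decide whether any work is needed.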

So it seems to me that there is agreement on adding "character width" and "codepage" to the internal "string" record structure and providing conversions where needed, isn't there? (More or less the same approach as in Delphi.)

Where there is no agreement is on what the default string encoding should be (AnsiString($ffff), UTF-8, UTF-16 or UTF-32).

So, to return to my original question ... is there any agreement on some points related to the "future of the String type"?

P.S. I still do not understand how things can work correctly if the LCL expects all AnsiStrings (String) to be UTF8Strings, but the RTL/FCL does not strictly follow this (at least on Windows).

-Laco.
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
