Jonas Maebe wrote:
And we have to deal with Windows, where the default is UTF16.
... since Delphi 2009 uses (unicode)string everywhere, we need at least also
unicode versions.
Since the generic Delphi "string" type can now be any Unicode encoding,
it would IMO be legal for FPC to use UTF-8 or UTF-32 for it internally.
Some code that expects UCS-2/BMP text only may become a bit slower,
because indexed access to chars then requires a conversion, but no other
*implicit* conversions will ever occur. Likewise the generic "char" type
could become a 32-bit type, so that it can hold *every* Unicode code point.
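
To make the point about the char size concrete, here is a rough sketch in Python (not FPC code) that counts how many code units the same text needs in each encoding. The sample text and the helper name code_units are my own assumptions, chosen only for illustration.

```python
# Illustration only: the sample text and the helper name are assumptions.
def code_units(text: str, encoding: str, unit_size: int) -> int:
    """Count the fixed-size code units needed to store text in encoding."""
    return len(text.encode(encoding)) // unit_size


# 'a', 'e' with acute accent, euro sign, musical G clef (outside the BMP).
sample = "a\u00e9\u20ac\U0001D11E"

code_points = len(sample)                         # 4 code points
utf8_units = code_units(sample, "utf-8", 1)       # 1 + 2 + 3 + 4 bytes
utf16_units = code_units(sample, "utf-16-le", 2)  # the last char needs a surrogate pair
utf32_units = code_units(sample, "utf-32-le", 4)  # one unit per code point

print(code_points, utf8_units, utf16_units, utf32_units)
```

Only with 32-bit units does the number of units equal the number of code points, which is why a 16-bit char cannot hold every code point by itself.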
For both "string" and "array of char" the "packed" keyword could be used
to distinguish between different element sizes and encodings, where unpacked
types contain UTF-32 chars. Unpacked types would speed up user code with
indexed access, in contrast to both the UTF-8 and UTF-16 encodings, and the
choice would allow users to optimize their code for either speed or size.
Indexed access to packed types could simply be disallowed, without breaking
anything, since the default is "not packed".
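
A rough sketch of the speed versus size trade-off, again in Python rather than FPC: indexed access into a UTF-32 buffer is a direct offset calculation, while a UTF-8 buffer has to be scanned from the start. The function names are hypothetical, and the UTF-8 walker assumes well-formed input.

```python
def char_at_utf32(buf: bytes, index: int) -> str:
    """O(1) access: every code point occupies exactly 4 bytes (little-endian)."""
    start = index * 4
    if index < 0 or start + 4 > len(buf):
        raise IndexError(index)
    return chr(int.from_bytes(buf[start:start + 4], "little"))


def char_at_utf8(buf: bytes, index: int) -> str:
    """O(n) access: walk the lead bytes until the index-th code point.

    Assumes buf holds well-formed UTF-8; continuation bytes are not validated.
    """
    if index < 0:
        raise IndexError(index)
    pos = 0
    remaining = index
    while pos < len(buf):
        lead = buf[pos]
        if lead < 0x80:
            length = 1          # 0xxxxxxx
        elif lead >> 5 == 0b110:
            length = 2          # 110xxxxx
        elif lead >> 4 == 0b1110:
            length = 3          # 1110xxxx
        elif lead >> 3 == 0b11110:
            length = 4          # 11110xxx
        else:
            raise ValueError("invalid UTF-8 lead byte")
        if remaining == 0:
            return buf[pos:pos + length].decode("utf-8")
        remaining -= 1
        pos += length
    raise IndexError(index)


text = "a\u00e9\u20ac\U0001D11E"
compact = text.encode("utf-8")    # smaller, but indexing needs a scan
wide = text.encode("utf-32-le")   # larger, but indexing is a multiplication

print(len(compact), len(wide))
print(char_at_utf8(compact, 3) == char_at_utf32(wide, 3))
```

The compact form is smaller for this text but pays for every indexed access, which is the reason for simply disallowing indexing on the packed types.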
Just some more ideas...
DoDi
_______________________________________________
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel