Jonas Maebe wrote:

And we have to deal with Windows, where the default is UTF16.

... since Delphi 2009 uses (unicode)string everywhere, we need at least also unicode versions.

Since the generic Delphi "string" type can now use any Unicode encoding, IMO it would be legal for FPC to use UTF-8 or UTF-32 for it internally. Some code that expects UCS-2/BMP text only may become a bit slower because of the conversions required for indexed access to chars, but no other *implicit* conversions would ever occur. Likewise, the generic "char" type could become a 32-bit type, so that it can hold *every* Unicode code point.
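To illustrate why a 16-bit char cannot hold every code point while a 32-bit one can, here is a small sketch. It is written in Python rather than Pascal, purely for illustration, and nothing in it is FPC-specific; the helper name code_units is made up for this example.

```python
def code_units(text: str, encoding: str, unit_size: int) -> int:
    """Return how many fixed-size code units `text` occupies in `encoding`."""
    encoded = text.encode(encoding)
    return len(encoded) // unit_size


bmp_char = "\u00e4"         # U+00E4, inside the BMP
astral_char = "\U0001D11E"  # U+1D11E, outside the BMP

for ch in (bmp_char, astral_char):
    print(
        f"U+{ord(ch):04X}:",
        code_units(ch, "utf-8", 1), "UTF-8 bytes,",
        code_units(ch, "utf-16-le", 2), "UTF-16 units,",
        code_units(ch, "utf-32-le", 4), "UTF-32 units",
    )
```

The code point outside the BMP needs a surrogate pair (two 16-bit units) in UTF-16, but always exactly one 32-bit unit in UTF-32.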

For both "string" and "array of char", the "packed" keyword could be used to distinguish between different byte counts and encodings, with unpacked types containing UTF-32 chars. Compared with the UTF-8 and UTF-16 encodings, this would speed up user code that relies on indexed access, and it would let users optimize their code for either speed or size. Indexed access to packed types could simply be disallowed, without breaking anything, since the default is "not packed".
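The speed difference for indexed access can be sketched roughly as follows. This is Python for illustration only, not a proposal for the actual RTL code; the function names char_at_utf32 and char_at_utf8 are hypothetical, and the UTF-8 variant assumes the buffer holds valid UTF-8.

```python
def char_at_utf32(buf: bytes, index: int) -> str:
    """Constant time: every code point occupies exactly 4 bytes."""
    start = index * 4
    if index < 0 or start + 4 > len(buf):
        raise IndexError(index)
    return chr(int.from_bytes(buf[start:start + 4], "little"))


def char_at_utf8(buf: bytes, index: int) -> str:
    """Linear time: code point starts must be counted from the beginning."""
    if index < 0:
        raise IndexError(index)
    seen = -1      # index of the most recently started code point
    start = None   # byte offset where the requested code point begins
    for pos, byte in enumerate(buf):
        if byte & 0xC0 != 0x80:  # not a continuation byte: a code point starts here
            seen += 1
            if seen == index:
                start = pos
            elif seen == index + 1:
                return buf[start:pos].decode("utf-8")
    if start is None:
        raise IndexError(index)
    return buf[start:].decode("utf-8")  # requested code point was the last one


text = "a\u00e9\U0001D11E"
utf8 = text.encode("utf-8")
utf32 = text.encode("utf-32-le")
for i, expected in enumerate(text):
    assert char_at_utf32(utf32, i) == expected
    assert char_at_utf8(utf8, i) == expected
```

The UTF-32 lookup is a single offset computation, while the UTF-8 lookup has to walk the buffer. That is the trade-off between the unpacked (fast) and packed (small) variants.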

Just some more ideas...

DoDi

_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
