To support characters outside of the Unicode base plane, I'd like to add a new sal_UCS4 type to OpenOffice.
The currently used sal_Unicode type is not sufficient for these characters. The use of sal_Unicode is also ambiguous in OOo: in interfaces with a scalar sal_Unicode it means a character encoded in UCS-2, while in interfaces with an array of sal_Unicodes it means UTF-16 encoded characters. Interfaces that currently take only a single sal_Unicode, in the UCS-2 sense, are broken by design with regard to Unicode surrogates.

At least for the internal interfaces, the easiest fix is to change their signatures to use UCS-4 instead of scalar sal_Unicodes. For the external interfaces with the design bug mentioned above, new methods that can handle Unicode characters outside the base plane should be added. Of course the interfaces could be changed to something like sal_uInt32, but then a lot of useful type information would be lost.

Though the first step of adding a sal_UCS4 type to sal/types.h seemed uncontroversial, there was significant opposition to the idea. So I'd like to collect the arguments against it:

- sal_uInt32 as an alternative is good enough
- a typedef to sal_uInt32 is not good enough
- Unicode values beyond 2^32 are not unthinkable

Did I miss any important arguments against adding a sal_UCS4 type?

--
Herbert
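P.S. To make the scalar-versus-array problem concrete, here is a minimal sketch of reading one full code point out of a UTF-16 buffer. The two typedefs are stand-ins for this example only (the real sal_Unicode lives in sal/types.h and sal_UCS4 is just the proposed name), and nextCodePoint is a hypothetical helper, not an existing OOo function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-ins for this sketch only; not the actual sal/types.h definitions.
typedef std::uint16_t sal_Unicode;  // one UTF-16 code unit
typedef std::uint32_t sal_UCS4;     // proposed type: one complete code point

// Hypothetical helper: reads one code point from a UTF-16 buffer starting at
// rIndex and advances rIndex past the code units it consumed (one or two).
// An unpaired surrogate is returned unchanged as a single code unit.
sal_UCS4 nextCodePoint(const sal_Unicode* pStr, std::size_t nLen, std::size_t& rIndex)
{
    assert(rIndex < nLen);
    const sal_UCS4 cHigh = pStr[rIndex++];
    if (cHigh >= 0xD800u && cHigh <= 0xDBFFu && rIndex < nLen)
    {
        const sal_UCS4 cLow = pStr[rIndex];
        if (cLow >= 0xDC00u && cLow <= 0xDFFFu)
        {
            ++rIndex;
            // Combine the high and low surrogate into a code point >= 0x10000.
            return 0x10000u + ((cHigh - 0xD800u) << 10) + (cLow - 0xDC00u);
        }
    }
    return cHigh;
}
```

The return value needs more than 16 bits as soon as a surrogate pair is involved, which is why an interface taking a single sal_Unicode cannot carry such a character.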
