Stephan Bergmann wrote:
> Herbert Duerr wrote:
>> To support characters outside of the unicode base plane I'd like to add a
>> new sal_UCS4 type to OpenOffice.
>> [...]
> See the thread at
> <http://www.openoffice.org/servlets/ReadMsg?listName=dev&msgNo=18462>
> for ideas how to change interfaces (sometimes it is better to replace
> sal_Unicode with rtl::OUString etc.).

Of course the complete string context is preferable to the current
16-bit sal_Unicode interface, but the rework cost of changing all this is
overkill in a lot of situations, e.g. to get a character's directionality,
to get the character attributes needed for vertical layout, to get the
mirrored character, to get the spacing attribute (needed for "word
underline"), to get a digit character's localized equivalent, etc.

> UCS-2 and UCS-4 are not Unicode (<www.unicode.org>) terms.

Yup, they are ISO-10646 terms.

> sal_Unicode represents a UTF-16 code unit (without any ambiguity).

I hope we can agree that an interface taking a single UTF-16 code unit is
a broken design regarding characters outside the Unicode base plane.

>> Of course the interfaces could be changed to something like sal_uInt32,
>> but then a lot of interesting type information would be lost.
> [...]
> Typedefs in C++ are, well, strange beasts. As a client you often have
> to be aware of exactly what other type the typedef aliases (e.g., when
> declaring overloaded functions, when using varargs, printf, when
> determining whether there is an appropriate streaming operator <<, when
> building expressions on integer types).

At least the typedef makes the meaning clearer, and if there ever is a
problem with ambiguity, then changing the simple typedef to a more
explicit type with rigorously defined conversions from/to other types is
possible.

>> - unicode values beyond 2^32 are not unthinkable
>
> How do you come to think that? ;)

If your pragmatic approach of using sal_uInt32 is taken, then finding all
the places that would need to be adjusted would be much more costly than
simply finding and checking uses of sal_UCS4.

>> Did I miss any important issues against adding a sal_UCS4 type?
>
> Would that be "sal_UCS4" or "sal_Ucs4"?

I'd prefer sal_UCS4 because I've never seen the abbreviation for
"Universal Character Set" spelled in non-caps.
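
To make the point about single code units concrete, here is a minimal
sketch of what a per-character interface has to do once characters
outside the base plane are involved. The two typedefs are hypothetical
stand-ins written for this example (they are not taken from the SAL
headers), and nextCodePoint is an illustrative helper name, not an
existing OpenOffice function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: stand-ins for illustration only, not the real SAL definitions.
typedef std::uint16_t sal_Unicode;  // one UTF-16 code unit
typedef std::uint32_t sal_UCS4;     // proposed: one complete code point

inline bool isHighSurrogate(sal_Unicode c) { return c >= 0xD800 && c <= 0xDBFF; }
inline bool isLowSurrogate(sal_Unicode c)  { return c >= 0xDC00 && c <= 0xDFFF; }

// Hypothetical helper: reads the code point that starts at str[index] and
// advances index past it. Precondition: index < len.
// An unpaired surrogate is returned unchanged as its own value.
sal_UCS4 nextCodePoint(const sal_Unicode* str, std::size_t len, std::size_t& index)
{
    const sal_Unicode hi = str[index++];
    if (isHighSurrogate(hi) && index < len && isLowSurrogate(str[index]))
    {
        const sal_Unicode lo = str[index++];
        // Combine the two 10-bit halves and add the supplementary-plane offset.
        return 0x10000
             + ((static_cast<sal_UCS4>(hi) - 0xD800) << 10)
             + (static_cast<sal_UCS4>(lo) - 0xDC00);
    }
    return hi;
}
```

A character such as U+1D504 occupies two sal_Unicode values (0xD835,
0xDD04), so a function that receives only one of them cannot look up its
directionality or mirroring; one that receives a sal_UCS4 can.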

Anyway, I'm already more than overloaded with work and don't have much
time for discussions about different tastes. I'll use sal_uInt32... :-(

--
Herbert
