At 10:49 PM 4/7/2004, Peter Constable wrote:
>, and the length it reports
> is the number of code units, not the number of characters or graphemes in
> the string.
True; that is documented.
However, that's very common; many APIs relating to UTF-8 would report
the number of bytes, not the number of
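
To make the code-unit point concrete, here is a minimal sketch in Python (not the .Net API under discussion) that reports three different "lengths" for one string. The helper name length_report and the sample string are assumptions made purely for illustration. Grapheme counts are not shown, since they need segmentation rules beyond the standard library.

```python
def length_report(text: str) -> dict:
    """Return several notions of 'length' for the same string."""
    return {
        # Python's len() counts Unicode code points.
        "code_points": len(text),
        # Each UTF-16 code unit is 2 bytes; "utf-16-le" writes no BOM.
        "utf16_code_units": len(text.encode("utf-16-le")) // 2,
        # UTF-8 uses 1 to 4 bytes per code point.
        "utf8_bytes": len(text.encode("utf-8")),
    }


# 'a' followed by U+10400 (DESERET CAPITAL LETTER LONG I), a supplementary-plane character.
sample = "a\U00010400"
print(length_report(sample))
```

An API that reports length in UTF-16 code units would give 3 for this sample, while a UTF-8 API that reports bytes would give 5, although the string holds only 2 characters.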
> > In the .Net Framework, the string class (System
> > namespace) and the System.Globalization and System.Text
> > namespaces *are* designed to be aware of supplementary plane
> > characters.
> IMHO, that's a bit misleading. The String class itself does not appear to be
> aware of SMP characte
There's some discussion of support for supplementary-plane characters in
WinXP at
http://www.microsoft.com/globaldev/DrIntl/columns/018/default.mspx.
There are also slide decks from some relevant presentations at
http://www.microsoft.com/globaldev/reference/presentations/23rd_Unicode_Conf.mspx
The article shows how to enable OS support for surrogates in fonts and
IMEs, but it is helpful to bear in mind that applications tend not to
care. For instance, SQL Server does not correctly sort surrogates --
although it doesn't split or truncate them either (which is an
improvement over the com
Benjamin,
> Versions up until Windows 2000 use UCS-2 internally. 2000 and XP use
> UTF-16, although applications tend to have differing levels of awareness
> about surrogates.
You can enable Win2K surrogate support
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_1
Versions up until Windows 2000 use UCS-2 internally. 2000 and XP use
UTF-16, although applications tend to have differing levels of awareness
about surrogates.
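
For readers unsure what the UCS-2 versus UTF-16 distinction means in practice, the following is a rough sketch of the surrogate-pair arithmetic defined by UTF-16. It is written in Python for illustration only and says nothing about how Windows implements this internally; the function name to_surrogate_pair is an assumption. UCS-2 has no such mechanism, so it cannot represent code points above U+FFFF at all.

```python
from typing import Tuple


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a supplementary-plane code point into UTF-16 high and low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points U+10000..U+10FFFF need a surrogate pair")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


high, low = to_surrogate_pair(0x10400)
print(f"U+10400 -> {high:04X} {low:04X}")

# Cross-check against Python's own UTF-16 encoder (big-endian, no BOM).
encoded = "\U00010400".encode("utf-16-be")
assert encoded == high.to_bytes(2, "big") + low.to_bytes(2, "big")
```

An application that treats each 16-bit unit as a whole character will see two "characters" here, which is the differing level of surrogate awareness mentioned above.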
Regardless of whether UCS-2 or UTF-16 is used, Microsoft documentation
always refers to any Unicode encoding as 'Unicode'. I attribu
Dan Smith said on Fri, Apr 02, 2004 at 03:04:22PM -0500,:
> 1) The documentation we've found for Unicode support in Windows seems vague on
> how Unicode is implemented. A good deal of it seems to imply that a character
> is always represented by exactly two bytes, no more, no less, under all
>