RE: Newbie questions: 1) Surrogates in WinXP? 2) Unicode in PostScript?

2004-04-08 Thread Asmus Freytag
At 10:49 PM 4/7/2004, Peter Constable wrote: > , and the length it reports is the number of code units, not the number of characters or graphemes in the string. True; that is documented. However, that's very common; many APIs relating to UTF-8 would report the number of bytes, not the number of
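
To make the distinction in the message above concrete, here is a minimal sketch in Python (not the .NET API under discussion) that counts the same string three ways. The helper name `length_report` and the sample string are assumptions made for illustration only.

```python
def length_report(text: str) -> dict:
    """Count one string in code points, UTF-16 code units, and UTF-8 bytes."""
    return {
        # Python's len() counts code points, not code units.
        "code_points": len(text),
        # Each UTF-16 code unit is 2 bytes; "utf-16-le" adds no BOM.
        "utf16_code_units": len(text.encode("utf-16-le")) // 2,
        # A byte-oriented UTF-8 API would report this figure.
        "utf8_bytes": len(text.encode("utf-8")),
    }


# 'a' followed by U+10400, a supplementary-plane character.
sample = "a\U00010400"
print(length_report(sample))
```

A UTF-16 API that reports code units would give 3 for this two-character string, and a UTF-8 API that reports bytes would give 5. None of these counts graphemes, which would need segmentation rules on top.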

RE: Newbie questions: 1) Surrogates in WinXP? 2) Unicode in PostScript?

2004-04-07 Thread Peter Constable
> > In the .Net Framework, the string class (System namespace) and the System.Globalization and System.Text namespaces *are* designed to be aware of supplementary plane characters. > IMHO, that's a bit misleading. The String class itself does not appear to be aware of SMP characte

RE: Newbie questions: 1) Surrogates in WinXP? 2) Unicode in PostScript?

2004-04-07 Thread Peter Constable
There's some discussion of support for supplementary-plane characters in WinXP at http://www.microsoft.com/globaldev/DrIntl/columns/018/default.mspx. There are also slide decks from some relevant presentations at http://www.microsoft.com/globaldev/reference/presentations/23rd_Unicode_Conf.mspx

RE: Newbie questions: 1) Surrogates in WinXP? 2) Unicode in PostScript?

2004-04-05 Thread Benjamin Peterson
The article shows how to enable OS support for surrogates in fonts and IMEs, but it is helpful to bear in mind that applications tend not to care. For instance, SQL Server does not correctly sort surrogates -- although it doesn't split or truncate them either (which is an improvement over the com
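
One way sorting can go wrong for supplementary characters, sketched in Python: comparing raw UTF-16 code units gives a different order from comparing code points, because lead surrogates (D800..DBFF) sort below BMP characters in the E000..FFFF range. This only illustrates the general effect; it is not a description of how SQL Server actually compares strings, and the helper name `utf16_binary_key` is an assumption.

```python
def utf16_binary_key(text: str) -> bytes:
    """Sort key that compares strings by their big-endian UTF-16 code units."""
    return text.encode("utf-16-be")


bmp_char = "\uFF21"        # U+FF21, a BMP character above the surrogate range
smp_char = "\U00010400"    # U+10400, encoded in UTF-16 as D801 DC00

# Python compares str values by code point, so U+FF21 sorts first.
by_code_point = sorted([smp_char, bmp_char])

# By UTF-16 code unit, D801 < FF21, so U+10400 sorts first.
by_code_unit = sorted([smp_char, bmp_char], key=utf16_binary_key)

print([hex(ord(c)) for c in by_code_point])
print([hex(ord(c)) for c in by_code_unit])
```

Neither order is a linguistic collation; the point is only that code that treats each 16-bit unit as a character can order supplementary characters differently from code that works on whole code points.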

RE: Newbie questions: 1) Surrogates in WinXP? 2) Unicode in PostScript?

2004-04-05 Thread Carl W. Brown
Benjamin, > Versions up until Windows 2000 use UCS-2 internally. 2000 and XP use UTF-16, although applications tend to have differing levels of awareness about surrogates. You can enable Win2K surrogate support http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_1

Re: Newbie questions: 1) Surrogates in WinXP? 2) Unicode in PostScript?

2004-04-05 Thread Benjamin Peterson
Versions up until Windows 2000 use UCS-2 internally. 2000 and XP use UTF-16, although applications tend to have differing levels of awareness about surrogates. Regardless of whether UCS-2 or UTF-16 is used, Microsoft documentation always refers to any Unicode encoding as 'Unicode'. I attribu
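
The practical difference between UCS-2 and UTF-16 is the surrogate mechanism: UTF-16 represents a code point above U+FFFF as two 16-bit code units, which UCS-2 has no way to express. A minimal sketch of the standard calculation follows, in Python; the function name `to_surrogate_pair` is an assumption made for illustration.

```python
from typing import Tuple


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a supplementary-plane code point into its UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary-plane code point")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits -> lead surrogate
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits -> trail surrogate
    return high, low


high, low = to_surrogate_pair(0x1D11E)     # U+1D11E MUSICAL SYMBOL G CLEF
print(f"{high:04X} {low:04X}")
```

Software that assumes one 16-bit unit per character sees the pair as two separate values, which is why levels of surrogate awareness differ between applications.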

Re: Newbie questions: 1) Surrogates in WinXP? 2) Unicode in PostScript?

2004-04-04 Thread Mahesh T. Pai
Dan Smith said on Fri, Apr 02, 2004 at 03:04:22PM -0500,: > 1) The documentation we've found for Unicode support in Windows seems vague on how Unicode is implemented. A good deal of it seems to imply that a character is always represented by exactly two bytes, no more, no less, under all