Re: What's in a wchar_t string on unix?
As specified in C99 (and maybe earlier), if the macro __STDC_ISO_10646__ is defined, then wchar_t values are UCS-4: each wchar_t holds one ISO 10646 code point (UTF-32), not a zero-extended UTF-16 code unit. Otherwise, wchar_t is an opaque type and you can't be sure what it is.

Noah

On Mon, Mar 01, 2004 at 11:13:58 -0800, Rick Cameron wrote:
> Hi, all
>
> This may be an FAQ, but I couldn't find the answer on unicode.org.
>
> It seems that most flavours of unix define wchar_t to be 4 bytes. If the
> locale is set to be Unicode, what's in a wchar_t string? Is it UTF-32, or
> UTF-16 with the code units zero-extended to 4 bytes?
>
> Cheers
>
> - rick cameron
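A minimal C sketch of the check described above, assuming a hosted C99 compiler. The function names wchar_iso10646_version and wchar_fits_all_code_points are made up for this illustration; only __STDC_ISO_10646__ and WCHAR_MAX come from the standard.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: returns the ISO/IEC 10646 revision (yyyymmL) that
 * wchar_t values are promised to follow, or 0 if the implementation makes
 * no such promise and wchar_t must be treated as opaque. */
long wchar_iso10646_version(void)
{
#ifdef __STDC_ISO_10646__
    return __STDC_ISO_10646__;
#else
    return 0L;
#endif
}

/* Hypothetical helper: returns 1 if a single wchar_t is wide enough to hold
 * any code point up to U+10FFFF, 0 otherwise (e.g. a 16-bit wchar_t). */
int wchar_fits_all_code_points(void)
{
    return WCHAR_MAX >= 0x10FFFF;
}
```

A program could call wchar_iso10646_version() once at startup and, when it returns 0, avoid assuming anything about the numeric values stored in a wchar_t string.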
Re: Questions on ZWNBS - for line initial holam plus alef
On Mon, Aug 11, 2003 at 12:57:11 -0700, Kenneth Whistler wrote:
> Kent asked:
>
> > How should a freestanding double diacritic be encoded (for purposes of
> > meta-discussions, and the like): or
> diacritic, SPACE>?
>
> It *could* be represented as , of course,
> or for that matter , or other possibilities.
> The combining character sequence, in either case, is the
> sequence.

How should a text rendering library deal with ? Should the character after the diacritic be drawn under the right half of the diacritic, or beyond its rightmost ink?

Noah
Re: Vi problem
Try running vim in a UTF-8 locale:

$ LANG=en_US.UTF-8 vim

Also, see ":help termencoding".

Noah

On Mon, Aug 18, 2003 at 17:50:42 +0200, Stefan Persson wrote:
> Hi!
>
> I am using Vi (version Vi IMproved 6.1) on Linux using UTF-8 (xterm
> -u8). If a UTF-8 character, when misinterpreted as Latin-1,
> contains a control character, that character is displayed as something
> different. For example, the Swedish capital "Ä" is displayed as a
> square box followed by '~D'. Is there a way to get rid of this problem?
>
> Stefan
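For illustration only, here is a rough C sketch of how a program on a POSIX system might decide whether it is running in a UTF-8 locale. This is not a claim about how vim does it. The function names codeset_is_utf8 and current_locale_is_utf8 are hypothetical; setlocale, nl_langinfo(CODESET) and strcasecmp are the standard POSIX calls.

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <strings.h>

/* Hypothetical helper: returns 1 if the codeset name denotes UTF-8.
 * Both "UTF-8" and "UTF8" spellings are accepted, case-insensitively. */
int codeset_is_utf8(const char *codeset)
{
    return codeset != NULL &&
           (strcasecmp(codeset, "UTF-8") == 0 ||
            strcasecmp(codeset, "UTF8") == 0);
}

/* Hypothetical helper: adopts the locale from the environment (LANG,
 * LC_ALL, LC_CTYPE) and reports whether its character encoding is UTF-8. */
int current_locale_is_utf8(void)
{
    setlocale(LC_CTYPE, "");
    return codeset_is_utf8(nl_langinfo(CODESET));
}
```

With LANG=en_US.UTF-8 set as in the command above, and that locale installed, current_locale_is_utf8() would be expected to return 1; in a Latin-1 locale it would return 0.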
Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)
According to the docs at http://www.microsoft.com/typography/otfntdev/indicot/other.htm, Uniscribe renders combining marks in isolation when they are applied to SPACE + ZWJ. (Without the ZWJ, it uses a dotted circle.) Perhaps this is an acceptable solution to the people calling for a new character.

From that page:

  Combining marks and signs that appear in text not in conjunction with a valid consonant base are considered invalid. Uniscribe displays these marks using the fallback rendering mechanism defined in the Unicode Standard (section 5.12, 'Rendering Non-Spacing Marks' of the Unicode Standard 3.1), i.e. positioned on a dotted circle.

  Please note that to render a sign standalone (in apparent isolation from any base) one should apply it on a space (see section 2.5 'Combining Marks' of the Unicode Standard). Uniscribe requires a ZWJ to be placed between the space and a mark for them to combine into a standalone sign.

Noah
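As a small sketch of the sequence the quoted documentation describes (SPACE, then ZWJ, then the mark), the following C function builds it as a wchar_t string. The name standalone_mark is hypothetical, and the code assumes wchar_t values are Unicode code points and that the mark is in the BMP; it only builds the string and says nothing about how any given renderer will draw it.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: writes SPACE, ZERO WIDTH JOINER, the given combining
 * mark, and a terminating 0 into out. Returns the number of characters
 * written (3, not counting the terminator), or 0 if out is NULL or has
 * room for fewer than 4 elements. */
size_t standalone_mark(wchar_t mark, wchar_t *out, size_t out_len)
{
    if (out == NULL || out_len < 4)
        return 0;
    out[0] = 0x0020; /* SPACE: the base the mark is applied to */
    out[1] = 0x200D; /* ZERO WIDTH JOINER: required by Uniscribe per the docs */
    out[2] = mark;   /* the combining mark or sign to show in isolation */
    out[3] = 0;      /* string terminator */
    return 3;
}
```

For example, passing 0x093F (DEVANAGARI VOWEL SIGN I) with a four-element buffer yields the string U+0020 U+200D U+093F.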
"savvy" images
I'm getting 404 Not Found for the "Unicode Savvy" images.

http://www.unicode.org/consortium/unisavvy.html

Noah
Re: Letterforms based on p
On Tue, Jun 10, 2003 at 10:30:30 -0400, Jim Allan wrote:
>
> One can quickly search for these symbols in the OED web edition.

How can one?

Noah
Re: International Font to be Used
If you are on a system that uses fontconfig, such as most recent Linux distros, you can find out which fonts support a particular language using fc-list; for example:

$ fc-list :lang=ko

Noah

On Mon, Jun 09, 2003 at 18:04:58 +0100, Raymond Mercier wrote:
>
> http://pfaedit.sourceforge.net/
>
> http://ourworld.compuserve.com/homepages/RaymondM/unisearch.htm
Re: UNESCO standard keyboards? (Re: Tamazight/berber language : ....)
fontconfig also has lists of characters needed for each language, which it uses to help decide which font to provide:

http://keithp.com/cgi-bin/cvsweb/fontconfig/fc-lang/

Noah

On Fri, Jun 06, 2003 at 8:14:48 -0700, Doug Ewell wrote:
> > It would be interesting to add some informative appendixes to Unicode
> > and later make them normative, to clearly state the subset of
> > characters that MUST be supported for each written language, and a
> > list of legacy equivalents that should be interpreted the same as
> > their recommended encoding in the context of that language.
Re: Fw: Unicode filename problems
There's also a highly portable open source implementation:

http://www.info-zip.org/pub/infozip/Zip.html

Noah

On Mon, Jun 02, 2003 at 15:05:39 +0100, Raymond Mercier wrote:
> > >
> >http://www.pkware.com/products/enterprise/white_papers/appnote.html
Re: New Unicode Savvy Logo
I had it up at http://gucharmap.sourceforge.net/ by Tue May 27 17:24:00 EDT 2003.

I agree with everybody that it's pretty ugly. But frankly, it's not nearly as ugly as most of www.unicode.org, especially the technical reports. Incidentally, the site is hard to navigate, too. But I'm not one to complain. ;-)

Noah

On Wed, May 28, 2003 at 0:30:04 -0700, Doug Ewell wrote:
> Announcement: New Unicode Savvy Logo
>
> Snazzy or not, is there some kind of
> award for the first Webmaster to actually use it?
>
> -Doug Ewell
> Fullerton, California
> http://users.adelphia.net/~dewell/
> 2003-05-28 07:29 UTC
Re: Detecting UTF-8 Locale Question
On Tue, Mar 25, 2003 at 12:01:11 -0500, Edward H Trager wrote:
>
> (3) Aside from xterm or mlterm running under Cygwin, are there other UTF-8
> competent terminals available on Win32? Which one are "the best"?

PuTTY supports UTF-8, and is open source.

Noah
Re: Surrogate supported in Mozilla 1.3
On Sun, Mar 16, 2003 at 9:48:46 +0330, Roozbeh Pournader wrote:
>
> > I will update the Unicode example intro page to reflect that Mozilla 1.3
> > supports this.
>
> Just for Windows, I guess. I have a Red Hat Linux 8.0 box, just installed
> code2001, but only see question marks on the page. The number of question
> marks looks to be right, but that's all.

It works for me with mozilla-snapshot (1.4a) on Debian Sid.

Noah
Re: FAQ entry
On Fri, Mar 07, 2003 at 17:27:08 +0100, David Oftedal wrote:
> We're not necessarily talking about Latin here. In Norwegian and Danish,
> æ is not a ligature, but a separate sound almost unpronounceable by
> English speakers.

I believe æ is also a character in the IPA.

Noah