Re: How is UTF8, UTF16 and UTF32 encoded?

2002-05-31 Thread Shlomi Tal
The best non-technical introduction I've seen for UTF-8 is "The Properties and Promizes (sic) of UTF-8" by Martin Dürst, here: http://www.ifi.unizh.ch/mml/mduerst/papers/PDF/IUC11-UTF-8.pdf And another good easy introduction is Richard Gillam's "Unicode Demystified", here: http://www.concent
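[Illustration, not part of the original message] As a rough companion to the question in the subject line, the sketch below shows the same code point serialized in the three encoding forms. It relies only on Python's built-in codecs; the helper name encode_forms is made up for this example, and big-endian byte order without a BOM is an arbitrary choice made for readability.

```python
# Sketch: one character, three Unicode encoding forms.
# "encode_forms" is a hypothetical helper name, not from the thread.

def encode_forms(ch: str) -> dict:
    """Return the bytes of `ch` in UTF-8, UTF-16BE and UTF-32BE as hex strings."""
    forms = {}
    for codec in ("utf-8", "utf-16-be", "utf-32-be"):
        data = ch.encode(codec)
        # Render each byte as two hex digits, separated by spaces.
        forms[codec] = " ".join(f"{b:02x}" for b in data)
    return forms


if __name__ == "__main__":
    # U+20AC EURO SIGN sits in the BMP; U+10400 needs a surrogate pair in UTF-16.
    for ch in ("\u20ac", "\U00010400"):
        print(f"U+{ord(ch):04X}", encode_forms(ch))
```

A BMP character takes one 16-bit unit in UTF-16 and a supplementary character takes two, while UTF-32 always uses four bytes and UTF-8 uses between one and four.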

Unicode and the digital divide. (derives from Re: Towards some more Private Use Area code points for ligatures.)

2002-05-31 Thread William Overington
>John, > >> You are trying to find a solution to a problem that has already >> been solved in a better way, and in the process you will create >> more problems for anyone who uses your solution. So give it >> up already. > >I will buy a drink for whoever can truly make William understand this poin

Re: Towards some more Private Use Area code points for ligatures

2002-05-31 Thread William Overington
Michael Everson wrote as follows. >At 11:24 +0100 2002-05-30, William Overington wrote: > >>I am also having a look at the idea of having code points for the famous >>combination border of the type used by Robert Granjon in the sixteenth >>century. > >Code points are assigned to characters. Even

(informative) Explanation of Microsoft Windows Text-File Modes

2002-05-31 Thread Shlomi Tal
Another FAQ-like essay of mine. Request for corrections. - Explanation of Microsoft Windows Text-File Modes by Shlomi Tal ([EMAIL PROTECTED]) Contents 1. Concepts 2. ANSI Mode 3. Unicode Mode 4. UTF-8 Mode -- Prelimi

Re: (informative) Explanation of Microsoft Windows Text-File Modes

2002-05-31 Thread Michael (michka) Kaplan
From: "Shlomi Tal" <[EMAIL PROTECTED]> > Another FAQ-like essay of mine. Very interesting > Request for corrections. Ok, if you insist. :-) > Microsoft Windows can handle text in at least one of three modes: > > 1. 8-bit stream with 256-character repertoire > 2. 16-bit stream with 65536-c

Re: Unicode and the digital divide. (derives from Re: Towards some more Private Use Area code points for ligatures.)

2002-05-31 Thread John H. Jenkins
On Friday, May 31, 2002, at 03:45 AM, William Overington wrote: >> > > This discussion seems to be related to the digital divide. > > Suppose that someone has access to a PC which has Windows 95 or Windows 98 > and has Microsoft Word 97 installed. The person wishes to produce a print > out of a

Re: Unicode and the digital divide.

2002-05-31 Thread Doug Ewell
I think I'm starting to see part of the problem here. William Overington wrote: > By there existing a publicly available document which includes within it a > pairing of ct with the code point U+E707 the possibility exists that some > people might include ct in a TrueType fount and might place

Re: Unicode and the digital divide.

2002-05-31 Thread John H. Jenkins
On Friday, May 31, 2002, at 10:11 AM, Doug Ewell wrote: > > Respefully, Nice one, Doug. Unfortunately, on my system, that collides with the ConScript version of Shavian which I have installed, so I got something unexpected. ☹ Which makes your point. As the Good Book says, "He that hath e

Re: Unicode and the digital divide. (derives from Re: Towards some more Private Use Area code points for ligatures.)

2002-05-31 Thread Philipp Reichmuth
JHJ> Let's alter the question. I'm a sinologist who wants to reproduce JHJ> precisely my twelfth-century copy of the Confucian Analects on my Mac Plus. No problem whatsoever. Mac Paint is the Way To Go (TM). Philipp

Re: Unicode and the digital divide.

2002-05-31 Thread John Cowan
Doug Ewell scripsit: > AFAIK, one of the reasons for > creating the ConScript Unicode Registry was to give font designers a > semi-standard place to put, say, Tengwar glyphs; I had no such idea in mind, though it's a good idea and I certainly endorse it after the fact. I just wanted to have fun

Re: Unicode and the digital divide.

2002-05-31 Thread Tom Gewecke
>I am not aware of any character assignments, official or PUA, gaining >widespread usage through this approach. AFAIK, one of the reasons for >creating the ConScript Unicode Registry was to give font designers a >semi-standard place to put, say, Tengwar glyphs; but if that practice >has caught on

RE: How is UTF8, UTF16 and UTF32 encoded?

2002-05-31 Thread Kenneth Whistler
Rick Cameron asked: > The Unicode Standard 2.0 had a table in Appendix A that is, I think, just > what you're asking for. I can't find this table in the online version of TUS > 3.0 (it's not very useful that the online index gives page numbers, when > there's no way to map a page number to the ap

Re: Towards some more Private Use Area code points for ligatures

2002-05-31 Thread John Hudson
At 01:01 5/31/2002, William Overington wrote: > >>I am also having a look at the idea of having code points for the famous > >>combination border of the type used by Robert Granjon in the sixteenth > >>century. To which Michael Everson replied: > >Code points are assigned to characters. Even in

Re: Unicode and the digital divide.

2002-05-31 Thread starner
>To prove this point, I'm not aware of any Tengwar font using the Conscript >points aside from Code2001. And I know only 2 experimental web pages that >use them. It certainly does not appear to have caught on even with "fans." OTOH, I know of a number of sites and fonts that use the Conscript en

22nd Unicode Conference, September 2002, San Jose, CA, USA

2002-05-31 Thread Misha . Wolf
Twenty-second International Unicode Conference (IUC22) Unicode and the Web: Evolution or Revolution? http://www.unicode.org/iuc/iuc22 September 9-13, 2002

Re: Towards some more Private Use Area code points for

2002-05-31 Thread John Cowan
John Hudson scripsit: > The majority of existing (8-bit) ornament fonts use ASCII codes for > ornaments, often arranged in such a way that, in the case of border units, > there is a logical layout on (US) keyboard. [...] > For a very large number of users, this is expected behaviour, so one >

Re: Towards some more Private Use Area code points for

2002-05-31 Thread John Hudson
At 12:54 5/31/2002, John Cowan wrote: >I rather like the Microsoft approach of assigning them [ornaments] to the >PUA range >F000-F0FF, which sort of preserves the pseudo-ASCII nature of them without >interfering with semantic text processing that assigns ASCII semantics to >the -00FF codepo
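[Illustration, not part of the original message] The mapping described in the quoted text, where 8-bit ornament codes are moved into the PUA range F000-F0FF, can be sketched as a fixed offset of 0xF000. This is only a minimal sketch under that assumption; the function names to_symbol_pua and from_symbol_pua are hypothetical and do not correspond to any Microsoft API.

```python
# Sketch: shift 8-bit "pseudo-ASCII" ornament codes into U+F000..U+F0FF.
# Function names are hypothetical, chosen for this example only.

PUA_BASE = 0xF000  # start of the range mentioned in the quoted text


def to_symbol_pua(data: bytes) -> str:
    """Map each 8-bit code 0x00..0xFF to the code point U+F000 + code."""
    return "".join(chr(PUA_BASE + b) for b in data)


def from_symbol_pua(text: str) -> bytes:
    """Reverse the mapping; reject code points outside U+F000..U+F0FF."""
    out = bytearray()
    for ch in text:
        offset = ord(ch) - PUA_BASE
        if not 0 <= offset <= 0xFF:
            raise ValueError(f"U+{ord(ch):04X} is outside U+F000..U+F0FF")
        out.append(offset)
    return bytes(out)


if __name__ == "__main__":
    shifted = to_symbol_pua(b"Ab")
    print([f"U+{ord(c):04X}" for c in shifted])
    print(from_symbol_pua(shifted))
```

Because the low byte is preserved, a keyboard layout designed around the 8-bit codes still lines up, while the shifted code points no longer carry ASCII semantics such as letter case or line breaking.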

Re: Unicode and the digital divide.

2002-05-31 Thread Kenneth Whistler
William Overington opined: > Yet I am very concerned that I may be in effect being told > here that Unicode is only really intended for people with the very latest > equipment using expensive solutions that are only realistically available > to rich corporations. I myself am very concerned abou

from 4 to null (was: 3 big bidi bugs)

2002-05-31 Thread Bernard Miller
Mark Davis wrote: > One could wish for a simpler algorithm (for that matter, one could > wish that people had uniform writing directions, or that Brits would > drive on the right side of the road). As to ByText, you are on your > own (in many ways). ByText? What’s that? One could wish for a simpl

Re: Towards some more Private Use Area code points for

2002-05-31 Thread Markus Scherer
For borders and arbitrary logos/symbols, it sounds like the best would be to do what someone else suggested on this list a few days ago: Define markup to specify a font and a glyph ID in that font to display something without the need for a pseudo-character encoding for it. Something for w3.or

Re: Unicode and the digital divide.

2002-05-31 Thread John H. Jenkins
On Friday, May 31, 2002, at 02:38 PM, Kenneth Whistler wrote: > The issue is *NOT* hardware. Take a look at www.dell.com. The > very, very, bottom-end system, a "Dimension 2200" desktop, comes these > days with a 1.3GHz Intel Celeron chip, oodles of multimegabytes of SDRAM, > a 20- to 40GB hard

Re: from 4 to null (was: 3 big bidi bugs)

2002-05-31 Thread Markus Scherer
There are a Java and a C++ reference implementation linked from the Bidi TR. The Java one is straightforward (and slow), written so that you can read each rule in the TR and see in the source that it works as specified. The C++ code is verified to produce the same results as the Java code. ICU'

Re: Unicode and the digital divide.

2002-05-31 Thread Tom Gewecke
>If you are concerned about the digital divide, then Unicode on the >web, and all that distributed processing and informational power in >those consumer PC's loaded up with Unicode-handling software for a >pittance, are your *friends* in the struggle to keep all economic >power from concentration

Re: Unicode and the digital divide. (derives from Re: Towards some more Private Use Area code points for ligatures.)

2002-05-31 Thread Peter_Constable
On 05/31/2002 04:45:00 AM "William Overington" wrote: >This discussion seems to be related to the digital divide. > >Suppose that someone has access to a PC which has Windows 95 or Windows 98 >and has Microsoft Word 97 installed. The person wishes to produce a print >out of a transcription of a

Logical_Order_Exception actually means Phonetic_Order_Exception ?

2002-05-31 Thread Samphan Raruenrom
8<->8 The definition of this newly introduced property in Unicode 3.2 :- http://www.unicode.org/unicode/reports/tr28/#database Logical_Order_Exception: There are a small number of characters (in the Thai and Lao scripts) that do