FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-05 Thread John Cowan
I've reformatted Pim Blokland's question as a Unicode FAQ. Q: What do the terms "turned", "inverted", "reversed", "rotated", "inverse", "digraph", and "ligature" used in the names of Unicode characters mean? A: These terms are basically typographical rather than Unicode-specific. A turned charac

Re: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-07 Thread Pim Blokland
John Cowan wrote: > Digraphs and ligatures are both made by combining two glyphs. In a digraph, > the glyphs remain separate but are placed close together. In a ligature, > the glyphs are fused into a single glyph. Oh, in that case I must say I think the UnicodeData.txt file doesn't do a very

Re: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-07 Thread John Cowan
Pim Blokland wrote: > For instance, the Danish ae (U+00E6) is not designated a ligature, It was in Unicode 1.0; I think politics were involved in that one. In Latin use, ae is most certainly a ligature, and likewise in the languages (including English) that have borrowed words involving it. In

RE: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-07 Thread Kent Karlsson
The names do NOT always provide correct descriptions of the characters. This is especially true for "digraph" and "ligature" (and in the case of U+00E6 too), as well as (e.g.) SCRIPT CAPITAL P, which is neither script, nor capital (it's lowercase), though it is a p... In addition, there are diff
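The character names discussed in this thread can be checked directly. The following is a minimal sketch, assuming Python 3 and its bundled `unicodedata` module (which reflects whatever Unicode version ships with the interpreter, not the 2003 data the posters were reading); the helper name `char_name` and the choice of sample code points are illustrative only.

```python
import unicodedata

# Code points mentioned in the thread.
SAMPLES = [0x00E6, 0x0133, 0xFB01, 0x2118]

def char_name(cp):
    """Return the formal Unicode character name for a code point."""
    return unicodedata.name(chr(cp))

for cp in SAMPLES:
    name = char_name(cp)
    # Flag whether the formal name itself contains the word LIGATURE.
    flag = "LIGATURE" in name
    print(f"U+{cp:04X}  {name}  (name says ligature: {flag})")
```

The output shows the mismatch the posters describe: U+00E6 is named as a letter and U+0133 as a ligature, regardless of how the glyphs are drawn.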

RE: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-07 Thread Kent Karlsson
> > For instance, the Danish ae (U+00E6) is not designated a ligature, > > It was in Unicode 1.0; I think politics were involved in that one. > In Latin use, ae is most certainly a ligature, and likewise in the > languages (including English) that have borrowed words involving it. > In Danish use,

Re: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-07 Thread John H. Jenkins
On Friday, March 7, 2003, at 04:26 AM, Pim Blokland wrote: Oh, in that case I must say I think the UnicodeData.txt file doesn't do a very good job. For instance, the Danish ae (U+00E6) is not designated a ligature, but the Dutch ij (U+0133) is, even though the "a" and "e" are clearly fused toget

Re: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-07 Thread Roozbeh Pournader
On Fri, 7 Mar 2003, John H. Jenkins wrote: > since different people speaking different languages > often have different perceptions of what a symbol is. Reminds me of ISIRI 3342 that officially considered symbol and character the same thing and used one word ("namaad", Noon, Meem, Alef, Dal) for

Re: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-07 Thread Pim Blokland
Kent Karlsson wrote: > Typographically, it's a ligature either way. You mean that both ae and ij should be called ligatures, although one is fused and the other isn't? OK, I can live with that. I'd rather the ij were called a digraph, though. The ij is considered by some to be one letter in Du

Re: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-07 Thread John Cowan
Kent Karlsson wrote: > E.g., it is quite legitimate to render, e.g. LIGATURE FI as an f followed > by an i, no ligation, whereas that is not allowed for the ae > ligature/letter, nor for the oe ligature. How do you know that? Either "Caesar" or "Cæsar" is good Latin. -- After fixing the Y2K
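One way to see the distinction drawn above in the character data itself is compatibility normalization. This is a rough sketch, assuming Python 3's `unicodedata` module; `compat_fold` is a hypothetical helper name, and NFKC is used here only as an illustration of which characters carry a compatibility decomposition, not as a statement about what renderers may or must do.

```python
import unicodedata

def compat_fold(text):
    """Apply NFKC, which replaces characters by their compatibility decompositions."""
    return unicodedata.normalize("NFKC", text)

# fi ligature, ij, ae, oe, and a word containing ae
for sample in ["\uFB01", "\u0133", "\u00E6", "\u0153", "C\u00E6sar"]:
    folded = compat_fold(sample)
    status = "unchanged" if folded == sample else "changed"
    print(f"{sample!r} -> {folded!r} ({status})")
```

U+FB01 and U+0133 fold to two-letter sequences, while U+00E6 and U+0153 are left alone, so "Cæsar" does not become "Caesar" under normalization.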

Re: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-07 Thread John Cowan
Pim Blokland wrote: > The ij is considered by some to be one letter in Dutch, and when written > down, an "i" and a "j" together look very much like a written y with > diaeresis. (See fonts like Script MT.) So I can understand foreigners > getting confused and encoding it that way (as a y with

Re: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-07 Thread Doug Ewell
Michael Everson wrote: >> You mean that both ae and ij should be called ligatures, although one >> is fused and the other isn't? >> OK, I can live with that. I'd rather the ij were called a digraph, >> though. > > These terms are not normative. Get used to it. The names themselves are normative,

Re: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-11 Thread Christopher John Fynn
John Cowan wrote: > Kent Karlsson scripsit: > > > E.g., it is quite legitimate to render, e.g. LIGATURE FI as an f followed > > by an i, no ligation, whereas that is not allowed for the ae > > ligature/letter, nor for the oe ligature. > How do you know that? Either "Cae

Re: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-11 Thread Mark Davis

Re: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-11 Thread John Hudson
At 07:17 AM 3/10/2003, Christopher John Fynn wrote: > How do you know that? Either "Caesar" or "Cæsar" is good Latin. No. Hart's Rules: ... The Chicago Manual of Style:... Hart's and Chicago both correctly specify current British and American classicist conventions for setting Latin text. Th

Re: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-11 Thread Jim Allan
John Cowan posted: How do you know that? Either "Caesar" or "Cæsar" is good Latin. Christopher John Fynn posted in response: No. Hart's Rules: << VOWEL-LIGATURES The combinations æ and œ should each be printed as two letters in Latin and Greek words, e.g. Aeneid, Aeschylus, Caesar, Oedipus, Pho

Re: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-11 Thread Curtis Clark
John Hudson wrote: The same people consider Latin a dead language, suitable only for study of ancient documents, which is clearly not the view taken at the Vatican, which continues to produce new documents in that language. In recent encyclicals, however, at least as published at www.vatican.va,

RE: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-12 Thread Alan Wood
Christopher John Fynn wrote: > Print e.g. oestrogen (where oe represents a single > sound), but, e.g., chloro-ethane (not chloroethane) to avoid > confusion. Please don't try to apply these rules to chemical nomenclature - there are already enough people who get the hyphens wrong, without encou

RE: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-12 Thread jarkko.hietaniemi
> The same people consider Latin a dead language, suitable only for > study of ancient documents, which is clearly not the view taken > at the Vatican, which continues to produce new documents in that language. > In recent encyclicals, however, at least as published at www.vatican.va, > the æ an

RE: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-12 Thread jarkko.hietaniemi
> One can get daily news in Latin, too: http://www.yle.fi/fbc/latini/ Correction: a weekly review.

Re: RE: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-12 Thread Rick McGowan
> One can get daily news in Latin, too: http://www.yle.fi/fbc/latini/ Complete with a very nice recitation in Latin! http://www.yle.fi/fbc/latini/recitatio.html