Re: Accented ij ligatures (and yery)

2003-07-30 Thread Anto'nio Martins-Tuva'lkin
On 2003.07.07, 00:25, Peter Kirk <[EMAIL PROTECTED]> wrote: > Maybe originally U+044B (cyrillic "y", "yery") was two separate > letters, It sure it (though I should provide some references to back this up? Hm, later...) > but it is certainly considered and used as one letter in Cyrillic > langua

Re: ISO 639 "duplicate" codes (was: Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures)

2003-07-15 Thread Addison Phillips [wM]
Philippe wrote: >>I have tried several times to do it. It does not work: you may >>effectively remove some tables you don't need, but trying >>to extract just the normalizer is a real nightmare. I tried it >>in the past, and abandoned: too tricky to maintain, and I >>retried it recently (one mont

Re: ISO 639 "duplicate" codes (was: Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures)

2003-07-14 Thread Mark Davis
Subject: Re: ISO 639 "duplicate" codes (was: Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures) > On Monday, July 14, 2003 5:34 AM, Mark Davis <[EMAIL PROTECTED]> wrote: > > > ... > > > Of course > > > Java already includes some parts

Re: ISO 639 "duplicate" codes (was: Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures)

2003-07-14 Thread Philippe Verdy
On Monday, July 14, 2003 5:34 AM, Mark Davis <[EMAIL PROTECTED]> wrote: > ... > > Of course > > Java already includes some parts of ICU, but other things are in > > ICU4J are difficult now to integrate in Java, simply because IBM > > forgot to modularize ICU so that it can be integrated slowly. >

Re: ISO 639 "duplicate" codes (was: Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures)

2003-07-13 Thread Mark Davis
___ http://www.macchiato.com ► “Eppur si muove” ◄ - Original Message - From: "Philippe Verdy" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Saturday, July 12, 2003 14:45 Subject: Re: ISO 639 "duplicate" codes (was: Re: Ligatures in Turkish a

Re: ISO 639 "duplicate" codes (was: Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures)

2003-07-12 Thread Philippe Verdy
On Saturday, July 12, 2003 4:17 PM, Jony Rosenne <[EMAIL PROTECTED]> wrote: > What has "iw" to with Hebrew? > > I wasn't involved with the change, but I'm glad it was done. Java and > other systems probably still use it because they never bothered to > check the latest version of 639. I know for

Re: ISO 639 "duplicate" codes (was: Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures)

2003-07-12 Thread Mark Davis
► “Eppur si muove” ◄ - Original Message - From: "Philippe Verdy" <[EMAIL PROTECTED]> To: "Doug Ewell" <[EMAIL PROTECTED]> Cc: <[EMAIL PROTECTED]> Sent: Saturday, July 12, 2003 00:27 Subject: Re: ISO 639 "duplicate" codes (was: Re: Ligatures

Re: ISO 639 "duplicate" codes (was: Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures)

2003-07-12 Thread Patrick Andries
Michael Everson" <[EMAIL PROTECTED]> écrivit : > At 08:11 -0400 2003-07-12, Patrick Andries wrote: > > >Just out of curiosity, why was « iw » deprecated ? Seems perfectly fine to > >me. And why was « he » chosen (Herero, Hemba, Hellenic Greek) ? > > Iwrit (iw), being a German transliteration of t

RE: ISO 639 "duplicate" codes (was: Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures)

2003-07-12 Thread Jony Rosenne
es (was: Re: Ligatures in > Turkish and Azeri, was: Accented ij ligatures) > > > > > Samedi 12 juillet à 6h51, Doug Ewell <[EMAIL PROTECTED]> écrivit : > > > The codes "iw" for Hebrew and "in" for Indonesian were deprecated > > FOURTEEN YE

Re: ISO 639 "duplicate" codes (was: Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures)

2003-07-12 Thread Michael Everson
At 08:11 -0400 2003-07-12, Patrick Andries wrote: Just out of curiosity, why was « iw » deprecated ? Seems perfectly fine to me. And why was « he » chosen (Herero, Hemba, Hellenic Greek) ? Iwrit (iw), being a German transliteration of the name of the Hebrew language, and Jiddisch (ji) were both t

Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures

2003-07-12 Thread Peter Kirk
On 12/07/2003 04:18, Michael Everson wrote: At 03:25 -0700 2003-07-12, Peter Kirk wrote: Does anyone know of a good resource on the web, or elsewhere, listing the alphabets used for different languages around the world? I know a project was attempted a few years ago at least for Europe. It woul

Re: ISO 639 "duplicate" codes (was: Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures)

2003-07-12 Thread Patrick Andries
Samedi 12 juillet à 6h51, Doug Ewell <[EMAIL PROTECTED]> écrivit : > The codes "iw" for Hebrew and "in" for Indonesian were deprecated > FOURTEEN YEARS AGO. It is not accurate or fair to refer to them as > "duplicates" of "he" and "id". The Registration Authority deprecates > such codes, rathe
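As a rough illustration of the deprecation point above: software that still encounters the withdrawn codes can map them to their replacements before comparing language tags. This is only a minimal sketch. The table name and the function `canonical_language_code` are hypothetical, and the "ji" to "yi" pair is added from the ISO 639 change history rather than from this message.

```python
# Hypothetical helper: map withdrawn ISO 639 two-letter codes to current ones.
# "iw" -> "he" and "in" -> "id" are the pairs named in the thread;
# "ji" -> "yi" (Yiddish) is included as a further known deprecation.
DEPRECATED_ISO639_1 = {
    "iw": "he",  # Hebrew
    "in": "id",  # Indonesian
    "ji": "yi",  # Yiddish
}


def canonical_language_code(code: str) -> str:
    """Return the current code for a possibly deprecated two-letter code."""
    normalized = code.strip().lower()
    return DEPRECATED_ISO639_1.get(normalized, normalized)
```

With this approach the deprecated codes are treated as aliases on input only, so they are never emitted again.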

Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures

2003-07-12 Thread Michael Everson
At 03:25 -0700 2003-07-12, Peter Kirk wrote: Does anyone know of a good resource on the web, or elsewhere, listing the alphabets used for different languages around the world? I know a project was attempted a few years ago at least for Europe. It would be useful to have this kind of data availa

Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures

2003-07-12 Thread Peter Kirk
On 11/07/2003 11:18, Philippe Verdy wrote: # T: special case for uppercase I and dotted uppercase I #- For non-Turkic languages, this mapping is normally not used. #- For Turkic languages (tr, az), this mapping can be used instead of the normal mapping for these characters. Is that wh
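The Turkic special case quoted above can be sketched in a few lines. This is an illustration under assumptions, not the full case-mapping algorithm: the function names are hypothetical, only the four i-like letters are handled specially, and every other character falls through to Python's default (language-neutral) mapping.

```python
DOTLESS_I = "\u0131"         # LATIN SMALL LETTER DOTLESS I
CAPITAL_DOTTED_I = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE


def lower_turkic(text: str) -> str:
    """Lowercase for tr/az: I -> dotless i, dotted capital I -> i."""
    # Handle the two special letters first, then apply the default mapping.
    return text.replace("I", DOTLESS_I).replace(CAPITAL_DOTTED_I, "i").lower()


def upper_turkic(text: str) -> str:
    """Uppercase for tr/az: i -> dotted capital I, dotless i -> I."""
    return text.replace("i", CAPITAL_DOTTED_I).replace(DOTLESS_I, "I").upper()
```

The caller has to know the language of the text (for example from a "tr" or "az" tag), since nothing in the characters themselves says which mapping applies.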

Re: ISO 639 "duplicate" codes (was: Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures)

2003-07-12 Thread Philippe Verdy
On Saturday, July 12, 2003 6:51 AM, Doug Ewell <[EMAIL PROTECTED]> wrote: > Philippe Verdy wrote: > > > Good luck with ISO language codes which does not even > > define them, and contain many duplicate codes even in > > the Alpha-2 space (he/iw, in/id), or unprecize codes > > matching sometimes

Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures

2003-07-12 Thread Peter_Constable
> Where does the fact of saying that a Grapheme Disjoiner... The character you should be referring to is not a new character GDJ, but rather is the existing ZWNJ, the functions of which include prevention of a ligature. - Peter ---

ISO 639 "duplicate" codes (was: Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures)

2003-07-11 Thread Doug Ewell
Philippe Verdy wrote: > Good luck with ISO language codes which does not even > define them, and contain many duplicate codes even in > the Alpha-2 space (he/iw, in/id), or unprecize codes > matching sometimes very imprecize families of languages > overlapping other language codes... The codes "

Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures

2003-07-11 Thread Philippe Verdy
On Friday, July 11, 2003 6:43 PM, Peter Kirk <[EMAIL PROTECTED]> wrote: > Agreed. But does Unicode actually treat them as non-normative samples? Not clear here: the reference documents say that these tables are normative for applications that want to implement a conforming case folding. But UTR#

Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures

2003-07-11 Thread Peter Kirk
On 11/07/2003 08:51, Philippe Verdy wrote: On Friday, July 11, 2003 3:50 PM, Peter Kirk <[EMAIL PROTECTED]> wrote: So I hope that what is fixed by Unicode is the name not of two languages but of an extensible family of scripts. I think you speak about family of languages? Not really. A se

Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures

2003-07-11 Thread Philippe Verdy
On Friday, July 11, 2003 3:50 PM, Peter Kirk <[EMAIL PROTECTED]> wrote: > So I hope that what is fixed by Unicode is the name not > of two languages but of an extensible family of scripts. I think you speak about family of languages? Good luck with ISO language codes which does not even define th

Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures

2003-07-11 Thread Peter Kirk
On 11/07/2003 05:56, Philippe Verdy wrote: Note also: the Soft_Dotted property was created and considered specially for Turkish and Azeri. Whatever it was that was specially created or adjusted for Turkish and Azeri, was it specifically restricted to these two languages? These are I think

Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures

2003-07-11 Thread Philippe Verdy
On Friday, July 11, 2003 1:12 PM, Kent Karlsson <[EMAIL PROTECTED]> wrote: > > Note also: the Soft_Dotted property was created and considered > > specially for Turkish and Azeri. > > Adding to the long, and unfortunately getting longer, list of > misleading statements from Philippe! No, the reas

RE: Ligatures in Turkish and Azeri, was: Accented ij ligatures

2003-07-11 Thread Kent Karlsson
> Note also: the Soft_Dotted property was created and considered > specially for Turkish and Azeri. Adding to the long, and unfortunately getting longer, list of misleading statements from Philippe! No, the reason for the Soft_Dotted property was/is to mark which characters (regardless of langua

Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures

2003-07-10 Thread James H. Cloos Jr.
> "Peter" == Peter Kirk <[EMAIL PROTECTED]> writes: Peter> Maybe, but it is hardly realistic to expect all existing Peter> Turkish and Azeri text to be recoded to insert a character in Peter> the middle of each f - i sequence. But a lot of it already does do that. In TeX Turkish uses f{}i to

Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures

2003-07-10 Thread Kenneth Whistler
> > and Philippe Verdy responded with another question: > > > > > Isn't there a "Grapheme Disjoiner" format control character to > > > force the absence of a ligature like , i.e. ? > > > > The answer to Philippe's rejoinder question is no, there is not > > a "Grapheme Disjoiner" format control c

Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures

2003-07-10 Thread Laurentiu Iancu
See also http://www.microsoft.com/typography/developers/opentype/detail.htm which explains how ligatures can be turned off on a language-dependent basis. Laurentiu Peter Kirk asked: > In Turkish and Azeri the sequences f - i and f - dotless i both occur, > and are fairly frequent. So it is inap

Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures

2003-07-10 Thread Peter Kirk
On 10/07/2003 11:37, Kenneth Whistler wrote: As Peter pointed out, however, it is neither expected nor reasonable to have to go back through and drop in ZWNJ's at every relevant location in existing Turkish or Azeri text, simply to prevent fi ligation. Such use of ZWNJ is intended to be exceptional

Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures

2003-07-10 Thread John Cowan
Philippe Verdy scripsit: > Where does the fact of saying that a Grapheme Disjoiner can be used > in Turkish to avoid that the f collapses the dot above a next lowercase i? It is settled that ZWNJ is the correct character to break ligatures. ZWJ means "make a ligature if you can; if not, shape cha
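For readers who want to see what breaking a ligature with ZWNJ looks like at the code-point level, here is a minimal sketch. The function name `break_fi_ligatures` is hypothetical, and the blanket replacement is shown only to make the mechanism concrete; elsewhere in this thread it is pointed out that retrofitting ZWNJ into all existing Turkish or Azeri text is not expected.

```python
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER


def break_fi_ligatures(text: str) -> str:
    """Insert ZWNJ between 'f' and a following dotted 'i'.

    A renderer that honours ZWNJ should then not form an fi ligature
    at those positions.
    """
    return text.replace("fi", "f" + ZWNJ + "i")
```

The function is idempotent: once the ZWNJ is in place, the sequence "fi" no longer occurs, so a second pass changes nothing.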

Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures

2003-07-10 Thread Philippe Verdy
On Thursday, July 10, 2003 8:37 PM, Kenneth Whistler <[EMAIL PROTECTED]> wrote: > Peter Kirk asked: > > > > In Turkish and Azeri the sequences f - i and f - dotless i both > > > occur, and are fairly frequent. So it is inappropriate in these > > > languages to use fi ligatures in which the dot on

Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures

2003-07-10 Thread Kenneth Whistler
Peter Kirk asked: > > In Turkish and Azeri the sequences f - i and f - dotless i both occur, > > and are fairly frequent. So it is inappropriate in these languages to > > use fi ligatures in which the dot on the i is lost or invisible, at > > least where the second character is a dotted i. Has any

Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures

2003-07-10 Thread Philippe Verdy
On Thursday, July 10, 2003 6:42 PM, Peter Kirk <[EMAIL PROTECTED]> wrote: > Anyway, I understood from the recent discussion of Hebrew that it is > Unicode policy not to do anything which could theoretically invalidate > existing text even if it could be proved that no such text existed. Where doe

Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures

2003-07-10 Thread Stefan Persson
Peter Kirk wrote: > Maybe, but it is hardly realistic to expect all existing Turkish and Azeri text to be recoded to insert a character in the middle of each f - i sequence. Aren't most Turkish and Azeri text coded as ISO-8859-9 and similar code pages? In that case, it would be enough to add t

Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures

2003-07-10 Thread Peter Kirk
On 10/07/2003 09:34, Stefan Persson wrote: Peter Kirk wrote: > Maybe, but it is hardly realistic to expect all existing Turkish and Azeri text to be recoded to insert a character in the middle of each f - i sequence. Aren't most Turkish and Azeri text coded as ISO-8859-9 and similar code page

Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures

2003-07-10 Thread Philippe Verdy
On Thursday, July 10, 2003 5:41 PM, Peter Kirk <[EMAIL PROTECTED]> wrote: > > Isn't there a "Grapheme Disjoiner" format control character to > > force the absence of a ligature like , i.e. ? > > > Maybe, but it is hardly realistic to expect all existing Turkish and > Azeri text to be recoded to i

Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures

2003-07-10 Thread Peter Kirk
On 10/07/2003 08:21, Philippe Verdy wrote: In Turkish and Azeri the sequences f - i and f - dotless i both occur, and are fairly frequent. So it is inappropriate in these languages to use fi ligatures in which the dot on the i is lost or invisible, at least where the second character is a dotted i

Re: Ligatures in Turkish and Azeri, was: Accented ij ligatures

2003-07-10 Thread Philippe Verdy
On Thursday, July 10, 2003 12:08 PM, Peter Kirk <[EMAIL PROTECTED]> wrote: > On 1st July Philippe Verdy wrote: > > > If fonts still want to display dots on these characters, that's a > > rendering problem: there already exists a lot of fonts used for > > languages other than Turkish and Azeri, wh

Ligatures in Turkish and Azeri, was: Accented ij ligatures

2003-07-10 Thread Peter Kirk
On 1st July Philippe Verdy wrote: If fonts still want to display dots on these characters, that's a rendering problem: there already exists a lot of fonts used for languages other than Turkish and Azeri, which do not display any dot on a lowercase ASCII i or j (dotted), and display a dot on their

Re: Accented ij ligatures (and yery)

2003-07-06 Thread Peter Kirk
Maybe originally U+044B (cyrillic "y", "yery") was two separate letters, but it is certainly considered and used as one letter in Cyrillic languages today. Encoding it as two letters would be about as sensible as insisting that w should be encoded as two u's or that i should be encoded as dotl

Re: Accented ij ligatures (and yery)

2003-07-03 Thread Anto'nio Martins-Tuva'lkin
On 2003.07.01, 15:09, Pim Blokland <[EMAIL PROTECTED]> wrote: > Maybe it was a bad idea to include ij as a character in Unicode at all, > but now it's there, there's no reason to ignore it when refining the > rules, to deprecate it practically. Food for thought: How would you compare U+0133 ("ij"

Re: Accented ij ligatures (was: Unicode Public Review Issues update)

2003-07-02 Thread Doug Ewell
Kent Karlsson wrote: >> Believe it or not, the IJ and ij digraphs *were* included for >> compatibility with an 8-bit legacy character set (ISO 6937). > > 6937 is a multibyte encoding (one or two bytes per character). > There are no combining characters at all in 6937, even though > there is a com

RE: Accented ij ligatures (was: Unicode Public Review Issues update)

2003-07-02 Thread Kent Karlsson
> In either cases, the "Soft_Dotted" property is probably overkill on > the existing or ligatures (should should have been better There is no point in having a soft-dotted property for the capital letter... > named "letters" and not "ligatures") for Dutch. Or is this update > needed to docume

RE: Accented ij ligatures (was: Unicode Public Review Issues update)

2003-07-02 Thread Kent Karlsson
> Believe it or not, the IJ and ij digraphs *were* included for > compatibility with an 8-bit legacy character set (ISO 6937). 6937 is a multibyte encoding (one or two bytes per character). There are no combining characters at all in 6937, even though there is a common misunderstanding that there

Re: Accented ij ligatures (was: Unicode Public Review Issues update)

2003-07-01 Thread Doug Ewell
Philippe Verdy wrote: >> Maybe it was a bad idea to include ij as a character in Unicode at >> all, but now it's there, there's no reason to ignore it when >> refining the rules, to deprecate it practically. > > No, that was needed for correct Dutch support. Look at the case > conversion of into
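The case-conversion argument can be checked against the Unicode data shipped with Python. The sketch below is illustrative only, and the function name `describe_ij` is hypothetical. It compares the single character U+0133 with the plain two-letter sequence "ij" under upper-casing, title-casing, and compatibility normalization.

```python
import unicodedata

IJ_SMALL = "\u0133"    # LATIN SMALL LIGATURE IJ
IJ_CAPITAL = "\u0132"  # LATIN CAPITAL LIGATURE IJ


def describe_ij() -> dict:
    """Collect how U+0133 behaves under case mapping and NFKC."""
    return {
        # Uppercasing the single character yields the capital digraph.
        "upper": IJ_SMALL.upper(),
        # Compatibility normalization folds it to the two letters i + j.
        "nfkc": unicodedata.normalize("NFKC", IJ_SMALL),
        # Title-casing a word that starts with U+0133 capitalizes both parts.
        "title_single": (IJ_SMALL + "s").title(),
        # Title-casing the two-letter spelling capitalizes only the i.
        "title_two_letters": "ijs".title(),
    }
```

The difference between the last two entries is the Dutch capitalization issue: language-neutral title-casing of the two-letter spelling does not produce the capital J that Dutch orthography expects.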

Re: Accented ij ligatures (was: Unicode Public Review Issues update)

2003-07-01 Thread Philippe Verdy
On Tuesday, July 01, 2003 4:09 PM, Pim Blokland <[EMAIL PROTECTED]> wrote: > Maybe it was a bad idea to include ij as a character in Unicode at > all, but now it's there, there's no reason to ignore it when > refining the rules, to deprecate it practically. No, that was needed for correct Dutch sup

Re: Accented ij ligatures

2003-07-01 Thread Stefan Persson
Pim Blokland wrote: When putting accents on the ij (which does happen!), the dots must go. Simple as that. Where should the accent be placed in that case? Should the accent be centered over "ij"? Should there be one accent over "i" and then the same over "j"? Or should the accent only be an ac

Re: Accented ij ligatures (was: Unicode Public Review Issues update)

2003-07-01 Thread Pim Blokland
Michael Everson schreef: > I think the answer is, regarding the soft dot property, please leave > the ij ligature alone. And I think not. When putting accents on the ij (which does happen!), the dots must go. Simple as that. Maybe it was a bad idea to include ij as a character in Unicode at all, bu

Re: Accented ij ligatures (was: Unicode Public Review Issues update)

2003-07-01 Thread Philippe Verdy
On Tuesday, July 01, 2003 1:55 PM, Kent Karlsson <[EMAIL PROTECTED]> wrote: > > My feeling about the proposed "Public Review" document should > > exclude the ligature, waiting for the decision about the new > > ligature approved in the first rounds by UTC and > > waiting for approval by ISO JTC.

RE: Accented ij ligatures (was: Unicode Public Review Issues update)

2003-07-01 Thread Kent Karlsson
> > I don't know of any instances where a ij digraph would keep the dots > > AND get additional accent marks, nor of any where the ij would > > appear with a dotless i and dotless j and a single dot above, > > centered between them. Can you give examples? > > No of course: So why do you care? >

Re: Accented ij ligatures (was: Unicode Public Review Issues update)

2003-06-30 Thread Michael Everson
I think the answer is, regarding the soft dot property, please leave the ij ligature alone. -- Michael Everson * * Everson Typography * * http://www.evertype.com

Re: Accented ij ligatures (was: Unicode Public Review Issues update)

2003-06-30 Thread Philippe Verdy
On Monday, June 30, 2003 9:13 PM, James H. Cloos Jr. <[EMAIL PROTECTED]> wrote: > So if you want two dots and an acute use ‹ij, U+0308, U+0301›: ij̈́ > > Of course a given font’s diaeresis will often not line up with the > stems of its ij, and a custom one should be used instead. Or > features an

Re: Accented ij ligatures (was: Unicode Public Review Issues update)

2003-06-30 Thread James H. Cloos Jr.
> "Philippe" == Philippe Verdy <[EMAIL PROTECTED]> writes: Philippe> But if one wants to restore the preious visual behavior, Philippe> even if it's incorrect for languages using this digraph as a Philippe> letter, what would be the behavior of using the following Philippe> sequence: Philippe

Re: Accented ij ligatures (was: Unicode Public Review Issues update)

2003-06-30 Thread Philippe Verdy
On Monday, June 30, 2003 1:58 PM, Pim Blokland <[EMAIL PROTECTED]> wrote: > Philippe Verdy schreef: > > > Interesting issue for the Latin Small "ij" Ligature (U+0133): > > Normally the Soft_Dotted is supposed to make one dot disappear when > > there's an additional diacritic above, but many appli

Accented ij ligatures (was: Unicode Public Review Issues update)

2003-06-30 Thread Pim Blokland
Philippe Verdy schreef: > Interesting issue for the Latin Small "ij" Ligature (U+0133): > Normally the Soft_Dotted is supposed to make one dot disappear when > there's an additional diacritic above, but many applications may > keep these two dots above, fitting the diacritic in the middle. > > Thi