Latin digraph characters (was: Re: Klingon silliness)

DougEwell2 Tue, 27 Feb 2001 08:57:37 -0800
In a message dated 2001-02-27 04:17:48 Pacific Standard Time, 
[EMAIL PROTECTED] writes:

>  No character set standard was ever designed by Slovaks. However, Slovak
>  linguists have always treated "ch" as a separate character. As they
>  do "dz" and "dz" with caron, but those are encoded in Unicode.

Adam mentions the Latin digraphs encoded for DZ at U+01F1/2/3 and for DZ with 
caron at U+01C4/5/6.  These characters, along with LJ at U+01C7/8/9 and NJ at 
U+01CA/B/C, were ostensibly added so that Cyrillic (Serbian) text could be 
converted 1-to-1 to the Latin (Croatian) script.  (DZ and DZ-caron 
are also used in Slovak, as Adam points out.)
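
For reference, here is a minimal Python sketch of the twelve code points named 
above and of what a 1-to-1 conversion could look like.  It assumes only the 
standard unicodedata module.  The names DIGRAPH_BASES, SERBIAN_TO_LATIN, and 
convert are hypothetical, and the Cyrillic pairings (DZHE, LJE, NJE) are an 
illustrative assumption; DZ is left out of the table because the message does 
not say which Cyrillic letter it would pair with.

```python
import unicodedata

# First code point of each digraph triple mentioned in the message.
# Each triple runs uppercase, titlecase, lowercase.
DIGRAPH_BASES = {
    "DZ with caron": 0x01C4,
    "LJ": 0x01C7,
    "NJ": 0x01CA,
    "DZ": 0x01F1,
}


def digraph_triples():
    """Return {label: [(code point, character name), ...]} for each triple."""
    return {
        label: [(base + i, unicodedata.name(chr(base + i))) for i in range(3)]
        for label, base in DIGRAPH_BASES.items()
    }


# Assumed pairing for illustration only: Serbian Cyrillic letters on the
# left, single-code-point Latin digraphs on the right. Titlecase forms are
# not handled here, since Cyrillic has no titlecase counterpart.
SERBIAN_TO_LATIN = {
    "\u040F": "\u01C4",  # CYRILLIC CAPITAL DZHE -> LATIN CAPITAL DZ WITH CARON
    "\u045F": "\u01C6",  # CYRILLIC SMALL DZHE   -> LATIN SMALL DZ WITH CARON
    "\u0409": "\u01C7",  # CYRILLIC CAPITAL LJE  -> LATIN CAPITAL LJ
    "\u0459": "\u01C9",  # CYRILLIC SMALL LJE    -> LATIN SMALL LJ
    "\u040A": "\u01CA",  # CYRILLIC CAPITAL NJE  -> LATIN CAPITAL NJ
    "\u045A": "\u01CC",  # CYRILLIC SMALL NJE    -> LATIN SMALL NJ
}
LATIN_TO_SERBIAN = {latin: cyr for cyr, latin in SERBIAN_TO_LATIN.items()}


def convert(text, table):
    """Map each character through the table; leave everything else as-is."""
    return "".join(table.get(ch, ch) for ch in text)


if __name__ == "__main__":
    for label, rows in digraph_triples().items():
        for cp, name in rows:
            print(f"U+{cp:04X}  {name}")
```

Because every table entry is one code point on each side, converting to Latin 
and back returns the original string with its length unchanged, which is the 
property the digraph characters were apparently meant to provide.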

This has always puzzled me, because Cyrillic includes lots of other 
characters that transliterate to two or more Latin letters.  CH, SH, SHCH, 
and ZH leap to mind; there may be more.  What was the thought process behind 
providing these compatibility characters only for the Serbo-Croatian 
additions to Cyrillic, but not for the other Cyrillic characters?

Of course, I am not at all suggesting that any such additional characters be 
added.  The existing compatibility characters require three code points each 
(uppercase, titlecase, and lowercase) and I was under the impression that 
they were deprecated, though I could find no mention of that in TUS 3.0.
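
The three-code-points-per-digraph cost and the compatibility status can both 
be checked with a short Python sketch.  It assumes the standard unicodedata 
module and Python's built-in case methods; the helper names case_forms and 
compat_expansion are hypothetical.  It shows only what the character database 
reports and says nothing about whether the characters are deprecated.

```python
import unicodedata


def case_forms(ch):
    """Return the (uppercase, titlecase, lowercase) forms of one character."""
    return ch.upper(), ch.title(), ch.lower()


def compat_expansion(ch):
    """Return the NFKC form, which replaces a digraph by separate letters."""
    return unicodedata.normalize("NFKC", ch)


if __name__ == "__main__":
    for base in (0x01C4, 0x01C7, 0x01CA, 0x01F1):
        lower = chr(base + 2)  # third member of each triple is lowercase
        forms = case_forms(lower)
        print(
            " ".join(f"U+{ord(c):04X}" for c in forms),
            "->",
            " / ".join(compat_expansion(c) for c in forms),
            "|",
            unicodedata.decomposition(lower),
        )
```

Each digraph carries a "<compat>" decomposition in the database, so 
compatibility normalization (NFKC or NFKD) turns it back into two ordinary 
letters.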

-Doug Ewell
 Fullerton, California

P.S.  I don't agree that the amount of traffic on this list is a problem.  
There are several interesting, on-topic threads going on here, and people are 
feeling compelled to participate.  Relatively few posts are straight "me 
too's" or of the especially annoying form "I have browser X, database Y, 
operating system Z, how can my app display my Unicode characters?"