>>>>> "Klaus" == Klaus Weide <[EMAIL PROTECTED]> writes:

...
  Klaus> On 9 Jun 2000, Sergei Pokrovsky wrote:

  >> Well, your piece works well if "display character set" is set to
  >> ASCII; it ASCIIzes as expected.  But it fails in the trivial
  >> case, when "display character set" is UTF-8; then the fallback
  >> branch is taken :)

  >> I normally have "UTF-8" for the display character set in my cfg;
  >> but it doesn't work.

  Klaus> I hope you have seen my followup message on that by now, with
  Klaus> the correction; and I hope that *that* will work as
  Klaus> expected. :)  Sorry for the non-working code.

It seems that the increased Solar activity caused some mail lossage :)
Anyway, I've taken the corrected version from the archive and it
really does work.  Thanks.

...
  >> BTW, the usual ASCIIzation of the Esperanto letters in Latin-3
  >> (and Unicode) is by adding x for the hat or breve;

  Klaus> Is this the only scheme in use?  I remember vaguely that I
  Klaus> read about some alternatives, something using 'h'.

That's right.  Using "h" is Official, or "Fundamental", because that
escape is offered in the Holy Scripture (Fundamento de Esperanto).
Now, there are different classes of people:

  1) those having an engineering turn of mind naturally prefer the
     x-convention, which offers many advantages (it is biunique, and
     thus suitable for automatic conversion; it gives good sorting
     with the usual ASCII-oriented functions);

  2) those educated in the humanities are usually shocked by such a
     non-traditional use of x, and argue that it damages the image of
     Esperanto in the eyes of the unaware public.

About 7 years ago most of the Esperanto articles in
soc.culture.esperanto used the x-convention (about 80%, I guess).
Since then the use of accented characters on computers has become much
easier and better standardized, which paradoxically led class (1) to
migrate to Unicode and stop using any ASCII ersatz; thus the
proportion seems to have reversed.  But in some cases where precision
is required, the x-convention remains unbeatable, e.g. when you have
to submit a search word to an HTML form.

...
  Klaus> Much of the existing def7_uni.tbl is from me, and it wasn't
  Klaus> meant to be the definite transliteration.  A lot of it is ad
  Klaus> hoc and can be improved; it's just that not many people have
  Klaus> shown interest.  If you make these changes, and think they
  Klaus> are of general use, please send patches.

Well, I attach a diff file.

  Klaus> There is a potential problem, in that those strings are not
  Klaus> language- or locale-specific.  So ŝ -> sx may be right for
  Klaus> Esperanto, but not for some other language that also uses
  Klaus> that character.

No, no other language uses the hat-accented consonants of Esperanto.
OTOH, "the short u" (ŭ) could be found in transcriptions of classical
Latin texts (though I've never seen such a thing on the WWW).

  Klaus> (Maybe ŝ -> sh or whatever is the alternative is better?)

Oh, that's a religious matter :)  Yes, that would produce "shipo" for
"ŝipo" (ship) and "sharko" for "ŝarko" (shark); but it would be almost
as imperfect as English spelling, failing on "dishaki", "flughaveno"
etc., where "sh" and "gh" are two distinct letters rather than a
digraph (as in the English "dishonor", "mishap").

-- 
Sergei
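P.S.  To illustrate what "biunique" and "good sorting" mean for the
x-convention, here is a minimal Python sketch.  The names TO_X,
FROM_X, to_x and from_x are made up for this illustration, and the
round trip assumes plain Esperanto text in which the letter "x" does
not otherwise occur (all-caps spellings such as "CX" are not handled).

```python
# Hypothetical helpers illustrating the x-convention; not part of Lynx.
# The twelve pairs are the accented Esperanto letters and their
# x-convention spellings.
TO_X = {
    "Ĉ": "Cx", "ĉ": "cx",
    "Ĝ": "Gx", "ĝ": "gx",
    "Ĥ": "Hx", "ĥ": "hx",
    "Ĵ": "Jx", "ĵ": "jx",
    "Ŝ": "Sx", "ŝ": "sx",
    "Ŭ": "Ux", "ŭ": "ux",
}
FROM_X = {ascii_form: letter for letter, ascii_form in TO_X.items()}


def to_x(text):
    """Replace each accented Esperanto letter by its x-convention pair."""
    return "".join(TO_X.get(ch, ch) for ch in text)


def from_x(text):
    """Undo to_x(); assumes 'x' appears only as an escape marker."""
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in FROM_X:
            out.append(FROM_X[pair])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


words = ["zorgo", "ŝipo", "suno"]
# Sorting on the x-convention form keeps "ŝipo" among the s-words,
# whereas sorting by raw code point pushes it past "zorgo".
by_x_form = sorted(words, key=to_x)
by_code_point = sorted(words)
```

With these definitions from_x(to_x(s)) gives back s for such text,
which is exactly what the h-convention cannot promise: a converter
cannot tell the "gh" of "flughaveno" from an escaped ĝ.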
*** src/chrtrans/Bak/def7_uni.tbl	Wed Dec  1 09:33:02 1999
--- src/chrtrans/def7_uni.tbl	Sun Jun 11 14:48:41 2000
***************
*** 107,126 ****
  # end of latin-1 repertoire
  0x41 U+0100 U+0102 U+0104 # A
  0x61 U+0101 U+0103 U+0105 # a
! 0x43 U+0106 U+0108 U+010a U+010c # C
  # The following line is an example for mapping several accented versions
  # of small letter 'c' to 'c':
! 0x63 U+0107 U+0109 U+010b U+010d # c
  0x44 U+010e
  0x64 U+010f
  U+0110:D/
  U+0111:d/
  0x45 U+0112 U+0114 U+0116 U+0118 U+011a # E
  0x65 U+0113 U+0115 U+0117 U+0119 U+011b # e
! 0x47 U+011c U+011e U+0120 U+0122 # G
! 0x67 U+011d U+011f U+0121 U+0123 # g
! 0x48 U+0124
! 0x68 U+0125
  U+0126:H/
  0x48 U+0127 # LATIN SMALL LETTER H BAR -> H
  0x49 U+0128 U+012a U+012c U+012e U+0130 # I
--- 107,130 ----
  # end of latin-1 repertoire
  0x41 U+0100 U+0102 U+0104 # A
  0x61 U+0101 U+0103 U+0105 # a
! 0x43 U+0106 U+010a U+010c # C
  # The following line is an example for mapping several accented versions
  # of small letter 'c' to 'c':
! 0x63 U+0107 U+010b U+010d # c
  0x44 U+010e
  0x64 U+010f
+ U+0108:Cx
+ U+0109:cx
  U+0110:D/
  U+0111:d/
  0x45 U+0112 U+0114 U+0116 U+0118 U+011a # E
  0x65 U+0113 U+0115 U+0117 U+0119 U+011b # e
! 0x47 U+011e U+0120 U+0122 # G
! 0x67 U+011f U+0121 U+0123 # g
! U+011c:Gx
! U+011d:gx
! U+0124:Hx
! U+0125:hx
  U+0126:H/
  0x48 U+0127 # LATIN SMALL LETTER H BAR -> H
  0x49 U+0128 U+012a U+012c U+012e U+0130 # I
***************
*** 127,134 ****
  0x69 U+0129 U+012b U+012d U+012f U+0131 # i
  U+0132:IJ
  U+0133:ij
! 0x4a U+0134
! 0x6a U+0135
  0x4b U+0136
  0x6b U+0137
  U+0138:kk
--- 131,138 ----
  0x69 U+0129 U+012b U+012d U+012f U+0131 # i
  U+0132:IJ
  U+0133:ij
! U+0134:Jx
! U+0135:jx
  0x4b U+0136
  0x6b U+0137
  U+0138:kk
***************
*** 151,164 ****
  U+0153:oe
  0x52 U+0154 U+0156 U+0158 # R
  0x72 U+0155 U+0157 U+0159 # r
! 0x53 U+015a U+015c U+015e U+0160 # S
! 0x73 U+015b U+015d U+015f U+0161 # s
  0x54 U+0162 U+0164 # T
  0x74 U+0163 U+0165 # t
  U+0166:T/
  U+0167:t/
! 0x55 U+0168 U+016a U+016c U+016e U+0172 # U
! 0x75 U+0169 U+016b U+016d U+016f U+0173 # u
  U+0170:U"
  U+0171:u"
  0x57 U+0174
--- 155,172 ----
  U+0153:oe
  0x52 U+0154 U+0156 U+0158 # R
  0x72 U+0155 U+0157 U+0159 # r
! 0x53 U+015a U+015e U+0160 # S
! 0x73 U+015b U+015f U+0161 # s
! U+015c:Sx
! U+015d:sx
  0x54 U+0162 U+0164 # T
  0x74 U+0163 U+0165 # t
  U+0166:T/
  U+0167:t/
! 0x55 U+0168 U+016a U+016e U+0172 # U
! 0x75 U+0169 U+016b U+016f U+0173 # u
! U+016c:Ux
! U+016d:ux
  U+0170:U"
  U+0171:u"
  0x57 U+0174
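
For readers who want to see what the patched lines amount to, here is
a rough Python sketch of how the two line shapes visible in the diff
could be read into a lookup table.  It is only an illustration, not
how Lynx itself processes the file: the names parse_tbl_lines and
transliterate are made up, and the sketch assumes that "0xNN U+...
U+..." maps every listed code point to the single byte NN and that
"U+XXXX:str" maps one code point to the replacement string.  Any
other syntax the real table may use is ignored.

```python
# Hypothetical reader for the two line shapes shown in the diff above.
def parse_tbl_lines(lines):
    """Build {unicode_char: ascii_replacement} from table-style lines."""
    mapping = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or pure comment
        if line.startswith("U+"):
            # Shape "U+0108:Cx": one code point, one replacement string.
            code, _, replacement = line.partition(":")
            mapping[chr(int(code[2:], 16))] = replacement
        elif line.startswith("0x"):
            # Shape "0x43 U+0106 U+010a # C": many code points, one byte.
            fields = line.split("#", 1)[0].split()
            target = chr(int(fields[0], 16))
            for field in fields[1:]:
                mapping[chr(int(field[2:], 16))] = target
    return mapping


def transliterate(text, mapping):
    """Replace every mapped character; leave everything else alone."""
    return "".join(mapping.get(ch, ch) for ch in text)


patched = [
    "0x63 U+0107 U+010b U+010d # c",
    "U+0108:Cx",
    "U+0109:cx",
    "0x73 U+015b U+015f U+0161 # s",
    "U+015c:Sx",
    "U+015d:sx",
]
table = parse_tbl_lines(patched)
ascii_word = transliterate("ŝipo", table)
```

With the patched lines, "ŝipo" comes out as "sxipo", while the other
accented forms of s (e.g. ś) still collapse to plain "s".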
