> The result is much better if you allow the ASCII conversion to be a string.
> This allows you to, e.g., "©" = "(c)", "½" = "1/2", and so on. This is also
> good for letters: "ß" = "ss", "å" = "aa", etc.

Et cetera? I think he needs more direction than that, especially since most
naïve algorithms are going to produce "a" from "å". Digraphs can be treated
as titlecase, as all capitals, or handled intelligently.
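
To make the "a" from "å" point concrete, here is a rough sketch of the kind of naïve algorithm I mean, assuming Python and its standard unicodedata module. The function name naive_ascii is made up for illustration. It decomposes each character and throws away whatever is not ASCII, so it can never produce "aa" or "ss".

```python
import unicodedata

def naive_ascii(text):
    """Decompose (NFD), then drop every non-ASCII code point."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if ord(ch) < 128)

# U+00E5 (a with ring) decomposes to "a" + combining ring, so only "a" survives.
print(naive_ascii("\u00e5"))
# U+00DF (sharp s) and U+00FE (thorn) have no decomposition and are simply lost.
print(repr(naive_ascii("\u00df\u00fe")))
```

Anything better than this needs an explicit table consulted before (or instead of) the decomposition step.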

00FE - "th"
00DE - "TH"
00F0 - "dh" ("th"?)
00D0 - "DH" ("TH"?)
0108 - "CH" (Esperanto)
0109 - "ch"
011C, 011D - "GH", "gh" (E-o)
0124, 0125 - "HH", "hh" (")
0134, 0135 - "JH", "jh" (")
015C, 015D - "SH", "sh" (")
017F - "s"
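
The list above could be carried as a plain lookup table. The following is only a sketch in Python: the names DIGRAPHS and transliterate and the smart_case flag are hypothetical, and where I gave two candidates I arbitrarily took the first. The "intelligent" casing here is one possible rule, not the only one: a capital digraph becomes titlecase when the next character is lowercase.

```python
# Explicit replacements taken from the list above (first candidate only).
DIGRAPHS = {
    "\u00fe": "th", "\u00de": "TH",   # thorn
    "\u00f0": "dh", "\u00d0": "DH",   # eth
    "\u0109": "ch", "\u0108": "CH",   # Esperanto c-circumflex
    "\u011d": "gh", "\u011c": "GH",   # Esperanto g-circumflex
    "\u0125": "hh", "\u0124": "HH",   # Esperanto h-circumflex
    "\u0135": "jh", "\u0134": "JH",   # Esperanto j-circumflex
    "\u015d": "sh", "\u015c": "SH",   # Esperanto s-circumflex
    "\u017f": "s",                    # long s
}

def transliterate(text, smart_case=True):
    """Replace listed characters; leave everything else untouched."""
    out = []
    for i, ch in enumerate(text):
        repl = DIGRAPHS.get(ch)
        if repl is None:
            out.append(ch)
            continue
        # Assumed casing rule: "SH" becomes "Sh" when a lowercase letter follows.
        if smart_case and len(repl) > 1 and repl.isupper():
            if text[i + 1:i + 2].islower():
                repl = repl.capitalize()
        out.append(repl)
    return "".join(out)

print(transliterate("\u015cipo"))   # capital S-circumflex followed by "ipo"
print(transliterate("\u015cIPO"))   # same word in all capitals
```

With smart_case=False the capitals always come out as "SH", "TH" and so on, which is the plain "capital" treatment.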

Depending on your goals, 015F and 0161 could be "sh", 0163 "ts",
017E "zh", etc.

0195 - "hw"
01A3 - "gh"(?)
01BF - "w"
01C0 - "|" ("c"?)
01C1 - "||"? ("x"?)
01C3 - "!" ("q"?)
0223 - "w" ("ou"? "8"?)

I omitted most capitals and those that can be found by decomposition
or name stripping, as well as a bunch I don't know anything about.
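
By "name stripping" I mean something like the following sketch, again assuming Python's unicodedata module. The function name strip_by_name and the regular expression are my own illustration. It reads the Unicode character name and keeps the base letter when the name has the form "LATIN SMALL/CAPITAL LETTER X WITH ...". This catches letters such as U+00F8 (o with stroke) that have no decomposition, and returns None for things like thorn, which is why those need explicit entries.

```python
import re
import unicodedata

# Matches names like "LATIN SMALL LETTER O WITH STROKE".
_NAME = re.compile(r"^LATIN (SMALL|CAPITAL) LETTER ([A-Z])(?: WITH .+)?$")

def strip_by_name(ch):
    """Return the base ASCII letter named in ch's Unicode name, or None."""
    try:
        name = unicodedata.name(ch)
    except ValueError:          # character has no name
        return None
    m = _NAME.match(name)
    if m is None:
        return None
    letter = m.group(2)
    return letter.lower() if m.group(1) == "SMALL" else letter

print(strip_by_name("\u00f8"))  # LATIN SMALL LETTER O WITH STROKE
print(strip_by_name("\u0141"))  # LATIN CAPITAL LETTER L WITH STROKE
print(strip_by_name("\u00fe"))  # LATIN SMALL LETTER THORN, not a single base letter
```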