The problem I noticed with the new ICS LatinIME is that it does not properly 
recognize Unicode characters from languages like Czech or Croatian.
Gingerbread LatinIME worked fine.
In the examples I'm referring to LatinIME builds in which I included a Croatian 
dictionary.

First, examples of expected behaviour:

1. ICS English - Start typing: "hom"

IME offers:
him, home, homogeneous, hometown, homemade, homosexual, homeland, homicide...


2. Gingerbread Croatian - Start typing: "šuš"

IME offers:
šuška, šušku, šuški, šušte, šuškom...



Here is what happens in CM9 ICS:

1. Croatian - Start typing: "šuš"

IME offers:
usput, sluša, šuma, slušanja, šumi, slušaš, usprokos, uskrsnuće, slušao, 
usluge...

IME should offer (from dictionary with frequency):
...
<w word="šuška" f="48"></w>
<w word="šuškanje" f="44"></w>
<w word="šuškalo" f="44"></w>
<w word="šuša" f="37"></w>
<w word="šuškanja" f="29"></w>
<w word="šuše" f="28"></w>
<w word="šuštanja" f="26"></w>
...
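
To make the expected result concrete, here is a minimal sketch of what I mean: 
filter the dictionary by the exact typed prefix and order the matches by 
frequency. This is deliberately naive and is not how LatinIME actually works 
internally (it has no proximity correction, for instance). The names 
WordEntry, suggestByPrefix and sampleCroatianEntries are made up for this 
illustration; only the words and frequencies come from the dictionary excerpt 
above.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical in-memory form of one <w word="..." f="..."> entry.
struct WordEntry {
    std::string word;  // UTF-8 encoded
    int frequency;
};

static bool higherFrequency(const WordEntry &a, const WordEntry &b) {
    return a.frequency > b.frequency;
}

// Keeps only the words that start with `prefix` (byte-wise UTF-8 comparison)
// and returns them sorted by descending frequency. Ties keep dictionary order.
std::vector<WordEntry> suggestByPrefix(const std::vector<WordEntry> &dict,
                                       const std::string &prefix) {
    std::vector<WordEntry> out;
    for (const WordEntry &entry : dict) {
        if (entry.word.compare(0, prefix.size(), prefix) == 0) {
            out.push_back(entry);
        }
    }
    std::stable_sort(out.begin(), out.end(), higherFrequency);
    return out;
}

// The Croatian entries quoted above, nothing more.
std::vector<WordEntry> sampleCroatianEntries() {
    return {
        {"šuška", 48},    {"šuškanje", 44}, {"šuškalo", 44}, {"šuša", 37},
        {"šuškanja", 29}, {"šuše", 28},     {"šuštanja", 26},
    };
}
```

With the sample entries, suggestByPrefix(sampleCroatianEntries(), "šušk") 
gives šuška, šuškanje, šuškalo, šuškanja in that order. The point is only that 
"š" has to be compared as "š", not treated as an unknown character.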


2. Czech - Start typing: "čer"

IME offers: čele, včetně, ČSSR, českou, dcera, čemu, česky, erik, českým...


IME should offer (from dictionary with frequency):
...
<w word="červ" f="255"></w>
<w word="červen" f="254"></w>
<w word="čele" f="170"></w>
<w word="černá" f="138"></w>
<w word="červenec" f="127"></w>
<w word="červená" f="103"></w>
<w word="června" f="96"></w>
... 


I believe that the problem begins with native/src/char_utils.cpp, but changes 
are needed elsewhere too. It is definitely missing the following entries:
    { 0x0106, 0x0107 },  // LATIN CAPITAL LETTER C WITH ACUTE
    { 0x010C, 0x010D },  // LATIN CAPITAL LETTER C WITH CARON
    { 0x0160, 0x0161 },  // LATIN CAPITAL LETTER S WITH CARON
    { 0x017D, 0x017E },  // LATIN CAPITAL LETTER Z WITH CARON

Possibly other characters for other languages are missing as well.
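
For reference, this is roughly how I understand such a capital/small pair 
table to be used. It is only a sketch: CapitalSmallPair, kPairs and 
toLowerSketch are hypothetical names, and I am assuming the lookup is a binary 
search over a table sorted by the capital code point, which would need to be 
checked against the real char_utils.cpp. If that assumption holds, new entries 
must be inserted in sorted position or the lookup will miss them.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>

// Hypothetical pair type; the real struct in char_utils.cpp may differ.
struct CapitalSmallPair {
    unsigned short capital;
    unsigned short small;
};

// Assumed to be sorted by `capital` so that a binary search works.
static const CapitalSmallPair kPairs[] = {
    { 0x0106, 0x0107 },  // LATIN CAPITAL LETTER C WITH ACUTE
    { 0x010C, 0x010D },  // LATIN CAPITAL LETTER C WITH CARON
    { 0x0160, 0x0161 },  // LATIN CAPITAL LETTER S WITH CARON
    { 0x017D, 0x017E },  // LATIN CAPITAL LETTER Z WITH CARON
};

static bool pairLess(const CapitalSmallPair &a, const CapitalSmallPair &b) {
    return a.capital < b.capital;
}

// Returns the lowercase code point for `c`, or `c` itself when the table
// has no entry for it.
unsigned short toLowerSketch(unsigned short c) {
    const CapitalSmallPair *begin = std::begin(kPairs);
    const CapitalSmallPair *end = std::end(kPairs);
    // An unsorted table would silently break the binary search below.
    assert(std::is_sorted(begin, end, pairLess));
    const CapitalSmallPair key = { c, 0 };
    const CapitalSmallPair *it = std::lower_bound(begin, end, key, pairLess);
    return (it != end && it->capital == c) ? it->small : c;
}
```

A code point that is not in the table comes back unchanged, so a missing 
entry means "Š" is never folded to "š".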

I also noticed that the corresponding character definitions are missing from 
the list in basechars.h. But after I added both sets of definitions and 
recompiled LatinIME, it still didn't help.
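
To show what I mean by the base character definitions, here is a small sketch 
of the mapping I would expect for these letters: each accented letter maps to 
its plain Latin base letter. The table layout and the names BaseCharEntry, 
kBaseChars, toBaseCharSketch and toBaseLowerSketch are my own invention for 
illustration. The real basechars.h may be organized quite differently (for 
example as an array indexed by code point), so treat this as an assumption.

```cpp
#include <cstddef>

// Hypothetical entry: accented code point -> plain base letter.
struct BaseCharEntry {
    unsigned short code;
    unsigned short base;
};

static const BaseCharEntry kBaseChars[] = {
    { 0x0106, 'C' }, { 0x0107, 'c' },  // C WITH ACUTE, capital and small
    { 0x010C, 'C' }, { 0x010D, 'c' },  // C WITH CARON, capital and small
    { 0x0160, 'S' }, { 0x0161, 's' },  // S WITH CARON, capital and small
    { 0x017D, 'Z' }, { 0x017E, 'z' },  // Z WITH CARON, capital and small
};

// Returns the base letter for `c`, or `c` itself when no mapping exists.
unsigned short toBaseCharSketch(unsigned short c) {
    const std::size_t count = sizeof(kBaseChars) / sizeof(kBaseChars[0]);
    for (std::size_t i = 0; i < count; ++i) {
        if (kBaseChars[i].code == c) {
            return kBaseChars[i].base;
        }
    }
    return c;
}

// Base letter folded to lowercase (ASCII range only in this sketch).
unsigned short toBaseLowerSketch(unsigned short c) {
    const unsigned short base = toBaseCharSketch(c);
    if (base >= 'A' && base <= 'Z') {
        return static_cast<unsigned short>(base - 'A' + 'a');
    }
    return base;
}
```

A quick check like toBaseLowerSketch(0x0160) == 's' against the rebuilt 
library would at least confirm whether the added definitions are picked up.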


It seems obvious that the prediction engine doesn't treat the undefined 
characters as valid ones. Why is something I'm still trying to figure out, and 
I would appreciate a hint or two on where to look. It seems to me that the 
problem could be in /LatinIME/native/src, but I don't see it. :(
I'm lost... :(
