Hello,

Why can't you store the Dzongkha words in the dictionary as whole words, together with their Tsheg marks:

  Wordcount
  Syl1
  Syl1TshegSyl2/flags
  Syl3TshegSyl4TshegSyl5/flags
  Syl1TshegSyl4/flags
  ..

? This is how all languages that use the Latin character set store their words (except that they do not store Tshegs, but checking would work perfectly well with Tshegs too).

Is the Tsheg also used between words in Dzongkha, or is there a space or a different symbol?

-eleonora

Hi,

Dzongkha text flows in a continuum. Dzongkha words consist of one or more syllables. In a multisyllable word, the syllables are separated by the Tibetan inter-syllabic mark called Tsheg [Unicode: 0F0B]. This Tsheg is a small dot, typed on the Dzongkha keyboard with the [Space Bar].

So the basic problem with the Dzongkha spell checker is that this Tsheg causes hunspell to spell-check Dzongkha words syllable by syllable, and if we store syllables instead of words in the .dic file, a multitude of invalid words can be formed.

A comparable example is English phrases borrowed from Latin, such as "ad hoc" and "alma mater". If we list "ad", "hoc", "alma" and "mater" separately in the .dic file, then combinations such as "ad alma", "ad mater" and "alma hoc" would also be accepted.

I have seen mentions of the ICU BreakIterator, ZWSP, etc. How do these work? Are there any links about them? How should we go about this?

Any ideas and suggestions are greatly appreciated.

Thanks in advance,
C. Norbu.
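
P.S. To make the problem described above concrete, here is a minimal Python sketch. It is not hunspell and does not use its API. It only imitates the two ways of storing the dictionary, using the Latin-derived phrases from the message as stand-ins for Dzongkha words, with the Tsheg (U+0F0B) as the separator. The names WORDS, SYLLABLES, accepted_by_syllables and accepted_by_words are made up for this illustration.

```python
TSHEG = "\u0f0b"  # TIBETAN MARK INTERSYLLABIC TSHEG

# Hypothetical word-level dictionary: whole words, syllables joined by Tsheg.
WORDS = {
    TSHEG.join(["ad", "hoc"]),
    TSHEG.join(["alma", "mater"]),
}

# Syllable-level dictionary: the same data, but every syllable listed alone.
SYLLABLES = {syl for word in WORDS for syl in word.split(TSHEG)}


def accepted_by_syllables(candidate: str) -> bool:
    """Imitates checking syllable by syllable (splitting at every Tsheg)."""
    return all(syl in SYLLABLES for syl in candidate.split(TSHEG))


def accepted_by_words(candidate: str) -> bool:
    """Imitates checking the whole Tsheg-joined word against the word list."""
    return candidate in WORDS


good = TSHEG.join(["ad", "hoc"])   # a real entry
bad = TSHEG.join(["ad", "alma"])   # an invalid combination of valid syllables

print(accepted_by_syllables(good), accepted_by_words(good))
print(accepted_by_syllables(bad), accepted_by_words(bad))
```

The syllable-level check accepts the invalid combination because each piece is valid on its own, while the word-level check rejects it. The word-level check only helps if the checker receives the whole word, which in continuous Dzongkha text still requires some way of finding word boundaries.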