On Mon, 2012-02-13 at 22:39 +0000, Richard Wordingham wrote:
> The spell-checker seems to break up a phrase consisting of just กุหลาบ
> into 3 or 4 words.

Hmm, so I played around with this, and here's what I think the problem is.

We have some customized break iterator rules in LibreOffice, so we use those and *not* the built-in ICU ones. But we lack a customized Thai rule set, so we end up using some ultra-generic word-breaking rules for Thai and never go near the special Thai iterator built into ICU :-(

I think this change:

http://cgit.freedesktop.org/libreoffice/core/commit/?id=475d0c59c66fb7752d230f76130b17145aad0c12

should improve matters a lot. It makes "กุหลาบ" get treated as a single word in the unit test there now, anyway. The Northern Thai one is still not considered a single word, though; that might be due to the oldish ICU we're still using.

After some googling, I'm unsure whether the "right way to go" to further improve the Thai break iterators is simply to have another go at upgrading ICU to get the latest and greatest there, or for "someone" to have a go at integrating libthai into LibreOffice and hand off break iteration for Thai to that.

Either way, the link above and the related unit test give an entry point to the relevant code.

C.
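
P.S. For anyone skimming, the fallback problem described above can be sketched roughly as follows. This is only an illustration in Python, not the actual LibreOffice code: the function name, the table contents, and the rule-set labels are all made up for the example, and the real logic in the commit may well look different.

```python
# Hypothetical sketch of break-iterator selection. None of these names
# come from LibreOffice or ICU; they only illustrate the fallback order.

# Languages with a hand-written (customized) rule set. Contents are invented.
CUSTOM_RULES = {"en": "custom-en-rules", "hu": "custom-hu-rules"}

# Languages where the library's built-in iterator should win over the
# generic rules (assumption for illustration: Thai needs dictionary support).
BUILTIN_PREFERRED = frozenset({"th"})


def choose_word_breaker(lang, prefer_builtin=BUILTIN_PREFERRED):
    """Return a (kind, rules) pair describing which word breaker to use."""
    if lang in CUSTOM_RULES:
        # A customized rule set exists for this language: use it.
        return ("custom", CUSTOM_RULES[lang])
    if lang in prefer_builtin:
        # No customized rules, but the built-in iterator is known to be
        # better than the generic fallback for this language.
        return ("builtin", lang)
    # Last resort: the ultra-generic rules.
    return ("custom", "generic-rules")


# Old behaviour (no built-in preference): Thai falls through to generic rules.
print(choose_word_breaker("th", prefer_builtin=frozenset()))
# Intended behaviour: Thai is handed to the built-in iterator.
print(choose_word_breaker("th"))
```

The point is just the ordering: a language with no customized rules should get a chance at the built-in iterator before dropping to the generic rules.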