Re: Adding Extension for Experimental Thai Spelling

2012-09-27 Thread Richard Wordingham
On Thu, 27 Sep 2012 21:08:13 +0700 Nathan Wells wrote: > Firstly, you are right, I was mistaken about ICU and the breakiterator > working for sentences (I just tried it right now and it does work, > but just not with the normal "khan" or "period" of Khmer rather it > works with Latin sentence mar

Re: Adding Extension for Experimental Thai Spelling

2012-09-27 Thread Richard Wordingham
On Thu, 27 Sep 2012 11:52:26 +0700 Nathan Wells wrote: >> 1. If you are shutting off the ICU breakiterator for text following, >> we >> should probably also do it for text preceding. Thus if there is a >> ZWSP or ZWNBSP (U+2060 WJ) anywhere in a text then ICU break >> iteration is disabled for th

Re: Adding Extension for Experimental Thai Spelling

2012-09-27 Thread Martin Hosken
Dear Nathan, > Here are some new ideas, ordered by desirability, with number one being the > most desired, to number three being the least. > > 1) When a zero-width space is detected (U+200B), shut off ICU breakiterator > for Khmer spell checking for characters following the zero-width space > un

Re: Adding Extension for Experimental Thai Spelling

2012-09-27 Thread Nathan Wells
Thanks for your input Richard, Firstly, you are right, I was mistaken about ICU and the breakiterator working for sentences (I just tried it right now and it does work, but just not with the normal "khan" or "period" of Khmer rather it works with Latin sentence markers which is not enough). I had

Re: Adding Extension for Experimental Thai Spelling

2012-09-26 Thread Nathan Wells
Thanks Martin, > 1. If you are shutting off the ICU breakiterator for text following, we > should probably also do it for text preceding. Thus if there is a ZWSP or > ZWNBSP (U+2060 WJ) anywhere in a text then ICU break iteration is disabled > for the whole sentence. Yes, I think you are right. I
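The rule quoted above (any ZWSP or WJ in a sentence disables automatic break iteration for that whole sentence) can be sketched roughly as follows. This is only an illustration in Python, not LibreOffice or ICU code: `words_for_spellcheck` and the `auto_segment` callback are hypothetical names, with `auto_segment` standing in for whatever dictionary-based breaker (such as the ICU breakiterator) is in use. Splitting on ZWSP and treating WJ as glue is an assumption about what "disabled" would mean in practice.

```python
from typing import Callable, List

ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE: manual word boundary
WJ = "\u2060"    # U+2060 WORD JOINER: manual "no break here" mark

def words_for_spellcheck(sentence: str,
                         auto_segment: Callable[[str], List[str]]) -> List[str]:
    """Hypothetical helper: pick the word list handed to the spell checker."""
    if ZWSP in sentence or WJ in sentence:
        # Manual marks present: skip automatic segmentation for the whole
        # sentence and trust only the explicit ZWSP boundaries.
        # Assumption: WJ is not part of the spelling, so it is dropped.
        chunks = (chunk.replace(WJ, "") for chunk in sentence.split(ZWSP))
        return [chunk for chunk in chunks if chunk]
    # No manual marks: fall back to the automatic breaker.
    return auto_segment(sentence)
```

With a stand-in breaker, a sentence containing one ZWSP comes back split only at that ZWSP, while a sentence with no marks is passed to `auto_segment` unchanged.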

Re: Adding Extension for Experimental Thai Spelling

2012-09-26 Thread Nathan Wells
Hello Again, Thank you all for your input! This is a deeper problem than I first thought...sorry for the delayed response, but I hope a solution can be found, even though the current ICU breakiterator is not at 100% for Khmer. Here are some new ideas, ordered by desirability, with number one bei

Re: Adding Extension for Experimental Thai Spelling

2012-07-27 Thread Richard Wordingham
On Thu, 26 Jul 2012 16:33:00 +0700 Martin Hosken wrote: > 1. use of U+2060 makes string searching and spell checking harder > (unless WJ chars are stripped for searching and spell checking). They > are not part of the spelling of a word, so their introduction in the > underlying text stream is pr
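The parenthetical in the quoted point 1 (stripping WJ characters for searching and spell checking) could look something like this minimal Python sketch. The function names and the plain-set dictionary are hypothetical and only illustrate the idea; they are not Hunspell or LibreOffice API.

```python
WORD_JOINER = "\u2060"  # U+2060 WORD JOINER (WJ)

def strip_word_joiners(text: str) -> str:
    """Return text with every U+2060 removed."""
    return text.replace(WORD_JOINER, "")

def is_known_word(word: str, dictionary: set) -> bool:
    """Look the word up after removing WJ, which is not part of its spelling.

    `dictionary` is a hypothetical stand-in for a real spelling dictionary.
    """
    return strip_word_joiners(word) in dictionary
```

The same normalisation would have to be applied to both the search pattern and the searched text for plain string searching to keep working.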

Re: Adding Extension for Experimental Thai Spelling

2012-07-26 Thread Martin Hosken
Dear All, > > An automatic word and line breaker is very necessary for Khmer and > > Thai because traditionally they have no spaces between words, and so > > line-breaking and spell checking require the use of a zero-width space > > between words which is counterintuitive for most native speakers,

Re: Adding Extension for Experimental Thai Spelling

2012-07-25 Thread Nathan Wells
Thanks for your reply. Yes, a "view->word boundaries" mode would be very helpful (or even incorporating the current "view->field shading" to include viewing 'gray marks' at the automatic ICU breaking so that users can see what is being done). Would this be hard to implement? Also, we are making

Re: Adding Extension for Experimental Thai Spelling

2012-07-25 Thread Caolán McNamara
I'll cc this to the list if you don't mind, in order to archive it. I have no immediate great ideas. But I wonder if a "view->word boundaries" mode would be helpful, i.e. something that indicates the boundaries of the words that the software thinks exist. On Sun, 2012-07-15 at 21:40 +0700, Nathan

Re: Adding Extension for Experimental Thai Spelling

2012-07-12 Thread sungkhum
> that though. > > C.

Re: Adding Extension for Experimental Thai Spelling

2012-07-12 Thread Caolán McNamara
On Sun, 2012-07-08 at 08:08 -0700, sungkhum wrote: > I have two questions: is there a way to have the LibreOffice spelling > checker (Hunspell) also recognize word-breaks using the ICU break iterator > for Khmer so that Cambodians no longer have to add zero-width spaces > manually (as it seems to w

Re: Adding Extension for Experimental Thai Spelling

2012-07-08 Thread sungkhum

Re: Adding Extension for Experimental Thai Spelling

2012-02-17 Thread Richard Wordingham
On Fri, 17 Feb 2012 14:10:21 +0000 Caolán McNamara wrote: > On Thu, 2012-02-16 at 23:24 +0000, Richard Wordingham wrote: > Indeed, yeah, I suppose, assuming its as complicated as "Thai", that > the right direction would be for someone to write for icu new > dictionary-based breakiterators for the

Re: Adding Extension for Experimental Thai Spelling

2012-02-17 Thread Caolán McNamara
On Thu, 2012-02-16 at 23:24 +0000, Richard Wordingham wrote: > I wouldn't expect a dictionary-based line breaker to handle words from > other languages. (There's a whole slew of Mon-Khmer languages in > Thailand, and they mostly use the Thai script when they happen to get > written.) Indeed, yeah

Re: Adding Extension for Experimental Thai Spelling

2012-02-17 Thread Németh László
Hi, 2012/2/17 Richard Wordingham: > It's a vast improvement - it gives LibreOffice a real Thai > spell-checker. Thank you. I have one worry for Siamese - Németh László > suggested that there might be a licensing issue back in > http://openoffice.2283327.n4.nabble.com/Thai-line-breaking-td279131

Re: Adding Extension for Experimental Thai Spelling

2012-02-16 Thread Richard Wordingham
On Tue, 14 Feb 2012 16:19:17 +0000 Caolán McNamara wrote: > I think this change: > http://cgit.freedesktop.org/libreoffice/core/commit/?id=475d0c59c66fb7752d230f76130b17145aad0c12 > should improve matters a lot. It's a vast improvement - it gives LibreOffice a real Thai spell-checker. Thank you

Re: Adding Extension for Experimental Thai Spelling

2012-02-14 Thread Eike Rathke
Hi, On Tuesday, 2012-02-14 16:19:17 +0000, Caolán McNamara wrote: > We have some customized break iterator rules in LibreOffice, so we're > using those ones and *not* the built-in icu ones. But we lack a > customized Thai one, so we're using some ultra-generic word breaking > stuff for Thai and n

Re: Adding Extension for Experimental Thai Spelling

2012-02-14 Thread Caolán McNamara
On Mon, 2012-02-13 at 22:39 +0000, Richard Wordingham wrote: > The spell-checker seems to break up a phrase consisting of just กุหลาบ > into 3 or 4 words. Hmm, so I played around with this and here's what I think is the problem... We have some customized break iterator rules in LibreOffice, so we

Re: Adding Extension for Experimental Thai Spelling

2012-02-13 Thread Richard Wordingham
Thank you to everyone who's offered me advice. On Mon, 13 Feb 2012 15:08:20 +0000 Caolán McNamara wrote: > I don't think we have any way to override our breakiterators from > extensions. Ah well, I'll just have to try to get Thai spell-checking working for myself and then worry about sharing m

Re: Adding Extension for Experimental Thai Spelling

2012-02-13 Thread Caolán McNamara
On Sat, 2012-02-11 at 16:23 +0000, Richard Wordingham wrote: > Is it possible to create an experimental alternative to the Thai > break iterator that can be shared with other people as a LibreOffice > extension? I don't think we have any way to override our breakiterators from extensions. FWIW, i

Re: Adding Extension for Experimental Thai Spelling

2012-02-13 Thread Michael Meeks
On Sat, 2012-02-11 at 16:23 +0000, Richard Wordingham wrote: > As I understand it, the lack of a usable Thai spell-checker for > LibreOffice (unlike, say, a Khmer spell-checker) is due to the Thai > break iterator. In common with many, I know nothing about Thai ;-) but my friend Tim does

Re: Adding Extension for Experimental Thai Spelling

2012-02-13 Thread Michael Stahl
On 11/02/12 17:23, Richard Wordingham wrote: > As I understand it, the lack of a usable Thai spell-checker for > LibreOffice (unlike, say, a Khmer spell-checker) is due to the Thai > break iterator. (I had expected Thai and Khmer to face similar > problems, for neither has a visible word separator

Adding Extension for Experimental Thai Spelling

2012-02-11 Thread Richard Wordingham
As I understand it, the lack of a usable Thai spell-checker for LibreOffice (unlike, say, a Khmer spell-checker) is due to the Thai break iterator. (I had expected Thai and Khmer to face similar problems, for neither has a visible word separator and syllable boundaries are often unclear in both.)