Thanks for your input, Richard,
Firstly, you are right, I was mistaken about ICU and the breakiterator
working for sentences (I just tried it right now and it does work, but
not with the normal Khmer khan or period; rather, it works only with Latin
sentence markers, which is not enough). I had
Dear Nathan,
Here are some new ideas, ordered by desirability, with number one being the
most desired, to number three being the least.
1) When a zero-width space is detected (U+200B), shut off ICU breakiterator
for Khmer spell checking for characters following the zero-width space
until
On Thu, 27 Sep 2012 11:52:26 +0700
Nathan Wells sungk...@gmail.com wrote:
1. If you are shutting off the ICU breakiterator for text following, we
should probably also do it for text preceding. Thus if there is a
ZWSP or ZWNBSP (U+2060 WJ) anywhere in a text then ICU break iteration is
On Thu, 27 Sep 2012 21:08:13 +0700
Nathan Wells sungk...@gmail.com wrote:
Firstly, you are right, I was mistaken about ICU and the breakiterator
working for sentences (I just tried it right now and it does work,
but not with the normal Khmer khan or period; rather, it
works only with Latin
Hello Again,
Thank you all for your input!
This is a deeper problem than I first thought...sorry for the delayed
response, but I hope a solution can be found, even though the current ICU
breakiterator is not at 100% for Khmer.
Here are some new ideas, ordered by desirability, with number one
Thanks Martin,
1. If you are shutting off the ICU breakiterator for text following, we
should probably also do it for text preceding. Thus if there is a ZWSP or
ZWNBSP (U+2060 WJ) anywhere in a text then ICU break iteration is disabled
for the whole sentence.
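
A minimal sketch of the rule proposed above, in Python rather than in
LibreOffice's own code: if a sentence contains a ZWSP (U+200B) or WJ
(U+2060) anywhere, automatic breaking is skipped and only the manual marks
are used. The names split_words and auto_segment are hypothetical;
auto_segment merely stands in for whatever ICU-based word breaker is
available, and none of this is taken from the actual implementation.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE: manual word-break mark
WJ = "\u2060"    # WORD JOINER: manual "do not break here" mark


def has_manual_marks(sentence):
    """Return True if the author placed any ZWSP or WJ in the sentence."""
    return ZWSP in sentence or WJ in sentence


def split_words(sentence, auto_segment):
    """Split a sentence into words for spell checking.

    auto_segment is a hypothetical callable (str -> list of str) standing
    in for an automatic, dictionary-based word breaker.
    """
    if not has_manual_marks(sentence):
        # No manual marks: rely entirely on the automatic breaker.
        return auto_segment(sentence)

    # Manual marks present: disable automatic breaking for the whole
    # sentence. Break only at ordinary whitespace and at ZWSP, and drop
    # WJ, which is not part of the spelling of a word.
    words = []
    for chunk in sentence.split():
        for piece in chunk.split(ZWSP):
            piece = piece.replace(WJ, "")
            if piece:
                words.append(piece)
    return words
```

With a sentence that has no marks the result is whatever auto_segment
returns; with even one mark, the author's own breaks win for the whole
sentence.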
Yes, I think you are right. If
On Thu, 26 Jul 2012 16:33:00 +0700
Martin Hosken martin_hos...@sil.org wrote:
1. use of U+2060 makes string searching and spell checking harder
(unless WJ chars are stripped for searching and spell checking). They
are not part of the spelling of a word, so their introduction in the
underlying
Dear All,
An automatic word and line breaker is very necessary for Khmer and
Thai because traditionally they have no spaces between words, and so
line-breaking and spell checking require the use of a zero-width space
between words which is counterintuitive for most native speakers, and
I'll cc this to the list if you don't mind, in order to archive it. I
have no immediate great ideas. But I wonder if a view-word boundaries
mode would be helpful, i.e. something that indicates the boundaries of
the words that the software thinks exist.
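
As a rough illustration of what such a view-word boundaries mode might
show, the Python sketch below renders a visible marker at each break
offset. The function name show_boundaries and the "|" marker are
assumptions made for this example only; the offsets would come from
whichever break iterator the software uses, and they are passed in here as
a plain list.

```python
def show_boundaries(text, breaks, marker="|"):
    """Return text with a visible marker at each interior break offset.

    breaks is a list of character offsets into text. Offsets at the very
    start or end of the text are ignored, since no marker is needed there.
    """
    pieces = []
    prev = 0
    for pos in sorted(set(breaks)):
        if 0 < pos < len(text):
            pieces.append(text[prev:pos])
            prev = pos
    pieces.append(text[prev:])
    return marker.join(pieces)
```

For example, show_boundaries("abcdef", [2, 4]) gives "ab|cd|ef", which
lets a user see at a glance where the software believes the words end.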
On Sun, 2012-07-15 at 21:40 +0700, Nathan
Thanks for your reply.
Yes, a view-word boundaries mode would be very helpful (or
even incorporating the current view-field shading to include viewing
'gray marks' at the automatic ICU breaking so that users can see what is
being done). Would this be hard to implement?
Also, we are making some
On Sun, 2012-07-08 at 08:08 -0700, sungkhum wrote:
I have two questions: is there a way to have the LibreOffice spelling
checker (Hunspell) also recognize word-breaks using the ICU break iterator
for Khmer so that Cambodians no longer have to add zero-width spaces
manually (as it seems to work
___
LibreOffice mailing list
LibreOffice@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo
Hi,
2012/2/17 Richard Wordingham richard.wording...@ntlworld.com:
It's a vast improvement - it gives LibreOffice a real Thai
spell-checker. Thank you. I have one worry for Siamese - Németh László
suggested that there might be a licensing issue back in
On Thu, 2012-02-16 at 23:24 +, Richard Wordingham wrote:
I wouldn't expect a dictionary-based line breaker to handle words from
other languages. (There's a whole slew of Mon-Khmer languages in
Thailand, and they mostly use the Thai script when they happen to get
written.)
Indeed, yeah, I
On Fri, 17 Feb 2012 14:10:21 +
Caolán McNamara caol...@redhat.com wrote:
On Thu, 2012-02-16 at 23:24 +, Richard Wordingham wrote:
Indeed, yeah, I suppose, assuming it's as complicated as Thai, that
the right direction would be for someone to write for icu new
dictionary-based
On Tue, 14 Feb 2012 16:19:17 +
Caolán McNamara caol...@redhat.com wrote:
I think this change:
http://cgit.freedesktop.org/libreoffice/core/commit/?id=475d0c59c66fb7752d230f76130b17145aad0c12
should improve matters a lot.
It's a vast improvement - it gives LibreOffice a real Thai
On Mon, 2012-02-13 at 22:39 +, Richard Wordingham wrote:
The spell-checker seems to break up a phrase consisting of just กุหลาบ
into 3 or 4 words.
Hmm, so I played around with this and here's what I think is the
problem...
We have some customized break iterator rules in LibreOffice, so
Hi,
On Tuesday, 2012-02-14 16:19:17 +, Caolán McNamara wrote:
We have some customized break iterator rules in LibreOffice, so we're
using those ones and *not* the built-in icu ones. But we lack a
customized Thai one, so we're using some ultra-generic word breaking
stuff for Thai and not
On 11/02/12 17:23, Richard Wordingham wrote:
As I understand it, the lack of a usable Thai spell-checker for
LibreOffice (unlike, say, a Khmer spell-checker) is due to the Thai
break iterator. (I had expected Thai and Khmer to face similar
problems, for neither has a visible word separator
On Sat, 2012-02-11 at 16:23 +, Richard Wordingham wrote:
As I understand it, the lack of a usable Thai spell-checker for
LibreOffice (unlike, say, a Khmer spell-checker) is due to the Thai
break iterator.
In common with many, I know nothing about Thai ;-) but my friend Tim
does -
On Sat, 2012-02-11 at 16:23 +, Richard Wordingham wrote:
Is it possible to create an experimental alternative to the Thai
break iterator that can be shared with other people as a LibreOffice
extension?
I don't think we have any way to override our breakiterators from
extensions.
FWIW,
Thank you to everyone who's offered me advice.
On Mon, 13 Feb 2012 15:08:20 +
Caolán McNamara caol...@redhat.com wrote:
I don't think we have any way to override our breakiterators from
extensions.
Ah well, I'll just have to try to get Thai spell-checking working for
myself and then
As I understand it, the lack of a usable Thai spell-checker for
LibreOffice (unlike, say, a Khmer spell-checker) is due to the Thai
break iterator. (I had expected Thai and Khmer to face similar
problems, for neither has a visible word separator and syllable
boundaries are often unclear in both.)