https://bugs.freedesktop.org/show_bug.cgi?id=46950
Bug #: 46950
Summary: Incorrect word breaking for spell-checking
Classification: Unclassified
Product: LibreOffice
Version: unspecified
Platform: Other
OS/Version: All
Status: UNCONFIRMED
Severity: normal
Priority: medium
Component: Linguistic component
AssignedTo: libreoffice-bugs@lists.freedesktop.org
ReportedBy: n...@math.technion.ac.il

It appears that before LibreOffice passes text to the spell-checker, it breaks the text into separate words. The problem is that (apparently) it does this using general, language-agnostic rules, while different languages may have different rules about which characters can be part of a word and which characters break words.

My problem is specifically with Hebrew spell-checking. In Hebrew, the quote characters ' and " are used not just for quoting; they have an additional, unrelated use as in-word characters:

1. The single-quote is used to mark foreign sounds. E.g., the word ג'ירפה has a single-quote character after the gimmel, which means it should be pronounced "j", not "g".

2. The double-quote is used inside acronyms, to mark them as such. For example, מנכ"ל is the acronym for CEO, and מנכ"לים is its plural. Both have a quote in the middle of the word, and these words, including the quote, are in the dictionary.

Because of this, the Hebrew hunspell dictionary includes the following lines in he_IL.aff:

    BREAK 3
    BREAK ^"
    BREAK "$
    BREAK ^'

This means that " only breaks words when it is at the beginning or end of a word, and ' only when it is at the beginning. In the middle of a word, these characters never indicate a word break in Hebrew. With this setting, hunspell correctly word-breaks and spell-checks Hebrew text.

Unfortunately, LibreOffice doesn't respect these instructions. It appears to break up the words incorrectly before sending them to hunspell. The end result is that all Hebrew words that are acronyms or contain foreign sounds are incorrectly marked as errors, which is very annoying.
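To make the difference concrete, here is a minimal Python sketch contrasting the two behaviors described above. It is only an illustration: `naive_words` is a hypothetical stand-in for a language-agnostic splitter (not LibreOffice's actual code), and `apply_break_rules` is a simplified model of the three anchored BREAK rules, not hunspell's real algorithm.

```python
import re

# The three anchored rules from he_IL.aff, as quoted in this report.
BREAK_RULES = ['^"', '"$', "^'"]


def naive_words(text):
    """Hypothetical language-agnostic splitter: any non-letter ends a word."""
    return re.findall(r"[^\W\d_]+", text)


def apply_break_rules(token, rules=BREAK_RULES):
    """Simplified model of anchored BREAK rules.

    A rule starting with ^ strips its character(s) only from the start of
    the token; a rule ending with $ strips only from the end. Characters in
    the middle of the token are never touched.
    """
    changed = True
    while changed and token:
        changed = False
        for rule in rules:
            if rule.startswith("^") and token.startswith(rule[1:]):
                token = token[len(rule) - 1:]
                changed = True
            elif rule.endswith("$") and token.endswith(rule[:-1]):
                token = token[:-(len(rule) - 1)]
                changed = True
    return token


if __name__ == "__main__":
    acronym = 'מנכ"ל'
    # The naive splitter cuts the acronym at the in-word double quote.
    print(naive_words(acronym))
    # The anchored rules strip surrounding quotes but keep the in-word one.
    print(apply_break_rules('"' + acronym + '"'))
```

With the naive splitter the acronym is cut into two fragments, neither of which is in the dictionary, while the anchored rules remove only the surrounding quotes and pass the whole word on for checking.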