https://bugs.documentfoundation.org/show_bug.cgi?id=116666

            Bug ID: 116666
           Summary: Fix Hungarian sorting
           Product: LibreOffice
           Version: Inherited From OOo
          Hardware: All
                OS: All
            Status: UNCONFIRMED
          Severity: normal
          Priority: medium
         Component: Localization
          Assignee: libreoffice-bugs@lists.freedesktop.org
          Reporter: nem...@numbertext.org

Hungarian orthography rules contain the following extra requirements for
sorting words and sentences:

– expand simplified double consonants;

– ignore spaces and hyphens;

– prefer lower case homonyms.

(Source: http://helyesírás.mta.hu/helyesiras/default/akh12#F2_4)
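
The second and third requirements can be sketched as a sort key. This is only an illustration, not the LibreOffice or ICU implementation: the name loose_key and the set of ignored characters are assumptions, and the primary comparison uses plain code point order rather than the Hungarian alphabet.

```python
# Hypothetical helper: ignore spaces and hyphens, prefer lower case homonyms.
IGNORED = {" ", "-", "\u2010", "\u00ad"}  # space, hyphen-minus, hyphen, soft hyphen (assumed set)

def loose_key(text):
    """Return a sort key that skips IGNORED characters and breaks ties by case."""
    stripped = "".join(ch for ch in text if ch not in IGNORED)
    primary = stripped.lower()                        # case-insensitive comparison first
    case_tiebreak = [ch.isupper() for ch in stripped]  # False (lower case) sorts first
    return (primary, case_tiebreak)

words = ["Nagy", "a-c", "nagy", "ab"]
print(sorted(words, key=loose_key))
```

With a plain code point sort, "a-c" would come before "ab" because the hyphen is compared as a character; with this key the hyphen is skipped.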

Expanding double consonants (e.g. sorting “ccs”, the long “cs”, as “cscs”) is
still not perfect, but in my analysis it reduces the number of badly sorted
positions to about one fifth of what ordering without expansion produces
(3843 vs. 19425 in 4 million word forms).
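
A minimal sketch of such an expansion follows. The table of simplified forms and the function name expand_double_consonants are assumptions made for illustration, not code from the proposed fix. The replacement is purely textual, so it can misfire where a matching letter sequence is not really a long consonant (for example at a compound boundary), which is consistent with the expansion being "still not perfect".

```python
import re

# Assumed table: simplified doubled digraphs/trigraph and their full forms.
EXPANSIONS = {
    "ccs": "cscs", "ddz": "dzdz", "ddzs": "dzsdzs",
    "ggy": "gygy", "lly": "lyly", "nny": "nyny",
    "ssz": "szsz", "tty": "tyty", "zzs": "zszs",
}

# Longest alternatives first, so "ddzs" is tried before "ddz".
_PATTERN = re.compile("|".join(sorted(EXPANSIONS, key=len, reverse=True)))

def expand_double_consonants(word):
    """Lower-case the word and expand every simplified double consonant."""
    return _PATTERN.sub(lambda m: EXPANSIONS[m.group(0)], word.lower())

print(expand_double_consonants("hosszú"))
```

The expanded string would then be passed to the collator in place of the original word.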

A more important advantage: with full expansion, it is possible to automate
Hungarian sorting with manual (or, in the future, Hunspell-based) preprocessing.
(Unfortunately, the ICU collation algorithm alone is not yet enough for
Hungarian.) Inserting soft hyphens is a quick workaround here, too, as it is
for the similar problem of single consonants, e.g. “igazság” -> igaz­ság
(igaz[U+AD]ság) is correctly sorted before “igaztalan”.
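
Why the soft hyphen helps can be shown with a toy collator. This is a sketch under stated assumptions, not how ICU works internally: it splits a word into Hungarian letters by greedy longest match, treats U+00AD as an ignorable character that also prevents a digraph from forming, and ranks letters by the 44-letter alphabet (ignoring finer rules such as vowel length). The names hu_letters and hu_key are hypothetical.

```python
import unicodedata

SOFT_HYPHEN = "\u00ad"
ALPHABET = unicodedata.normalize(
    "NFC",
    "a á b c cs d dz dzs e é f g gy h i í j k l ly m n ny "
    "o ó ö ő p q r s sz t ty u ú ü ű v w x y z zs",
).split()
RANK = {letter: i for i, letter in enumerate(ALPHABET)}
# Multi-character letters, longest first, so "dzs" wins over "dz".
MULTI = sorted((l for l in ALPHABET if len(l) > 1), key=len, reverse=True)

def hu_letters(word):
    """Split a word into Hungarian letters; a soft hyphen is dropped but blocks digraphs."""
    word = unicodedata.normalize("NFC", word).lower()
    out, i = [], 0
    while i < len(word):
        if word[i] == SOFT_HYPHEN:
            i += 1
            continue
        for m in MULTI:
            if word.startswith(m, i):
                out.append(m)
                i += len(m)
                break
        else:
            out.append(word[i])
            i += 1
    return out

def hu_key(word):
    """Rank each letter by alphabet position; unknown characters sort last."""
    return [RANK.get(letter, len(ALPHABET)) for letter in hu_letters(word)]

print(sorted(["igaztalan", "igazság"], key=hu_key))
print(sorted(["igaztalan", "igaz\u00adság"], key=hu_key))
```

Without the soft hyphen, "zs" in “igazság” is read as the single letter "zs", which ranks after "z", so the word lands after “igaztalan”. With the soft hyphen, "z" and "s" are compared separately and the word sorts first.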

