Quoting Artavazd Mertarjyan <[EMAIL PROTECTED]>:

>
> Hi All!
>
> Thanks for detailed answers!
> I agree that HunSpell is better than MySpell and I'm going to localize it
> for Armenian too.
>
> In "hu" project CVS (2.0.1) the Hungarian language isn't defined as UTF-8.
> Does that mean you are not using UTF-8 for Hungarian, or do you have another
> solution?

Hi,

You can find the source of the Hungarian OOo 2.0.1 build on our build server:

http://ftp.fsf.hu/OpenOffice.org_hu/2.0.1/

See the hu_HU_u8.aff and hu_HU_u8.dic files in the

http://ftp.fsf.hu/OpenOffice.org_hu/2.0.1/OOo_2.0.1_src_hu_additional.tar.gz

file, and in the builds.
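
To check which encoding a given .aff file declares, a small script along
these lines may help. It is only a sketch: it assumes the encoding is
declared with a "SET <encoding>" line, as in Hunspell affix files, and the
sample strings are made-up excerpts, not the contents of hu_HU_u8.aff. The
function name aff_encoding is hypothetical.

```python
def aff_encoding(aff_text):
    """Return the encoding named by the first SET line, or None if absent."""
    # Drop a leading byte order mark, which some UTF-8 files carry.
    for line in aff_text.lstrip("\ufeff").splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "SET":
            return fields[1]
    return None


# Hypothetical excerpts for illustration only.
utf8_sample = "SET UTF-8\nTRY abc\n"
legacy_sample = "SET ISO8859-2\nTRY abc\n"

print(aff_encoding(utf8_sample))
print(aff_encoding(legacy_sample))
```

A file without a SET line returns None here, so the caller decides what
default to assume.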

> If you are using UTF-8 now, have you compared these two solutions? Which is
> faster?
>
> I have some doubts about the performance of HunSpell's UTF-8 spell checker.
> Maybe for Armenian it would be better to use the same algorithm in
> HunSpell?

Unicode encoding has a little overhead.

Using a UTF-8 dictionary is 10-20 percent slower on Hungarian texts (it checks
80,000-90,000 words/s instead of 100,000 words/s on my machine).
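
The percentage follows directly from the throughput figures quoted above.
A short sketch of the arithmetic (the function name slowdown_percent is
just an illustrative choice):

```python
def slowdown_percent(baseline_wps, measured_wps):
    """Relative drop in throughput, as a percentage of the baseline."""
    return (baseline_wps - measured_wps) * 100 / baseline_wps


baseline = 100_000  # words/s with the 8-bit dictionary
for utf8_wps in (90_000, 80_000):  # words/s with the UTF-8 dictionary
    print(utf8_wps, "words/s is", slowdown_percent(baseline, utf8_wps), "percent slower")
```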

But I think UTF-8 Armenian spell checking will be faster than _8-bit_ Hungarian
spell checking, because the bottleneck is the complexity of the
morphology (the affix description) and the compound word support.
Hungarian uses double suffix stripping plus compounding, and it is fast
enough with UTF-8 encoding, too.
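
To illustrate why the morphology dominates the cost, here is a toy sketch
of double suffix stripping plus two-part compounding. It is not the actual
Hunspell implementation, which works from affix rules with flags and
conditions; the stem set, the suffix list and all function names are
invented for the example. The point is only that each word may need many
candidate analyses, regardless of the character encoding.

```python
STEMS = {"work", "house"}       # hypothetical toy dictionary
SUFFIXES = ("er", "s", "ing")   # hypothetical toy suffix list


def strip_once(word):
    """Yield every form obtained by removing one known suffix."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix):
            yield word[: -len(suffix)]


def accepts(word):
    """Accept a stem carrying zero, one or two suffixes."""
    if word in STEMS:
        return True
    for once in strip_once(word):
        if once in STEMS:
            return True
        if any(twice in STEMS for twice in strip_once(once)):
            return True
    return False


def accepts_compound(word):
    """Accept a simple word, or a bare stem followed by an accepted word."""
    if accepts(word):
        return True
    return any(
        word[:i] in STEMS and accepts(word[i:])
        for i in range(1, len(word))
    )


for candidate in ("workers", "workhouses", "workerss"):
    print(candidate, accepts_compound(candidate))
```

Every split point and every suffix combination is a dictionary lookup, so
a richer affix description multiplies the work per word.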

We need the best spell checking and other Lingucomponent support for Armenian,
too. Please write (and file issues in IssueZilla) about the problems of
Armenian OOo, for example the need for an Armenian breakiterator patch, Armenian
hyphenation with the special Armenian hyphen character, etc. UTF-8 encoding has
already been set in the Thesaurus component of OOo 2.0.1, thanks to the report
of the Nepali developers.

Best regards,

Laci

> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]