On 2012-06-04 21:39, Dominique Pellé wrote:
> Marcin Miłkowski <list-addr...@wp.pl> wrote:
>
>  > I replaced the 1.1.12 version with a current one, and it works for me on
>  > the command line for 32-bit JVM. I will check in other libraries, please
>  > see if it helps.
>
> Breton spelling checking:
>
> * evel-se is now no longer marked as mistake (good)
> * Gwelloc'h is still marked as a spurious mistake even though
>    it's correct. I think it's because the tokenizer hack
>    changes the apostrophe U+0027 into U+2019.

I fixed this one just now. For this to work, I had to add the following 
lines to the Breton affix file:

ICONV 1
ICONV ' ’

Now there's no error shown. I also added ’ to WORDCHARS.
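
To illustrate why a conversion table can matter here, below is a minimal Python sketch. It is not Hunspell's implementation: the convert() helper and the one-word dictionary are hypothetical, and it assumes the dictionary stores the typographic apostrophe (U+2019). If a dictionary stores U+0027 instead, the table would have to point the other way.

```python
ASCII_APOSTROPHE = "\u0027"        # '
TYPOGRAPHIC_APOSTROPHE = "\u2019"  # ’


def convert(token, table):
    """Apply a simple replacement table, loosely in the spirit of ICONV."""
    for source, target in table:
        token = token.replace(source, target)
    return token


# Hypothetical dictionary; assumed to store the typographic form.
dictionary = {"Gwelloc\u2019h"}

# Same direction as the affix line "ICONV ' ’": U+0027 becomes U+2019.
table = [(ASCII_APOSTROPHE, TYPOGRAPHIC_APOSTROPHE)]

word = "Gwelloc'h"
found_raw = word in dictionary                         # lookup without conversion
found_converted = convert(word, table) in dictionary   # lookup after conversion
```

Without the conversion the two apostrophe variants are different strings, so the lookup misses; with it, both sides agree on one form.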

>
> But LanguageTool outputs some strange numbers now!?

That was only for testing.

> French spelling checking:
>
> * "Il sera" and "Ils ont" are now no longer marked as
>    mistake (good)
> * "Jusqu'à" is still marked as a spelling error even though
>    it's correct. That must be because of the tokenization.
>    What is in the dictionary is "jusqu'à", but LT
>    tokenizes and checks separately: jusqu '  à

Right. It's because French has only these:

WORDCHARS -’

I added ' to my local file and it runs fine, and I think this is a safe fix.
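
A rough Python sketch of the effect, under stated assumptions: tokenize() below is a deliberately simplistic, hypothetical tokenizer, not the one used by LT or Hunspell. It only shows how the set of word characters decides whether "jusqu'à" stays in one piece.

```python
def tokenize(text, wordchars):
    """Split text into tokens; letters, digits and wordchars stay inside a word."""
    tokens, current = [], ""
    for ch in text:
        if ch.isalnum() or ch in wordchars:
            current += ch
        else:
            if current:
                tokens.append(current)
                current = ""
            if not ch.isspace():
                tokens.append(ch)  # punctuation becomes its own token
    if current:
        tokens.append(current)
    return tokens


# Only "-" and the typographic apostrophe count as word characters.
split_tokens = tokenize("jusqu'à", "-\u2019")

# With the ASCII apostrophe added as well.
whole_tokens = tokenize("jusqu'à", "-\u2019'")
```

In the first call the ASCII apostrophe breaks the word into three tokens, so none of them matches the dictionary entry; in the second the word survives whole.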

> I've also run my script to find the start up time in
> seconds for all languages when checking a 2-words
> sentence "foo bar".

Hunspell should not affect startup time much. It is only activated 
during checking, and most of its time is spent generating 
suggestions.

> Same test using -d HUNSPELL_RULE to
> disable hunspell (it's about the same time!?)

That is expected: loading hunspell barely affects performance. 
Only _running_ hunspell does.

> The startup time was significantly faster before
> the Hunspell changes.  These are the startup times
> before Hunspell (svn r6963):

Hm, we added new features, such as a slightly more complex configuration 
file, but that is not used on the command line. I have no idea what is 
happening.

> Any reason for LT to become slower now even
> when disabling Hunspell with -d HUNSPELL_RULE?

It must be something other than hunspell.

> I assume that it's the Hunspell change that slows
> down between r6963 (fast) and r7231 (slow) but
> I could be wrong since there are other changes.

That's what I think too: one of the other changes is the more likely cause.

> My script to measure startup time is available here:

The startup figure in this measurement might be misleading. Try profiling 
HUNSPELL_RULE with the -p switch.
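
As a rough illustration of why a single wall-clock number can mislead, here is a minimal Python sketch that times the loading phase and the checking phase separately. Everything in it is hypothetical: time_phases(), fake_load() and fake_check() are stand-ins and have nothing to do with LT's actual code or its -p output.

```python
import time


def time_phases(load, check, text):
    """Time the load step and the check step separately."""
    t0 = time.perf_counter()
    resource = load()
    t1 = time.perf_counter()
    result = check(resource, text)
    t2 = time.perf_counter()
    return {"load_s": t1 - t0, "check_s": t2 - t1, "result": result}


def fake_load():
    # Stand-in for loading a dictionary.
    return {"foo"}


def fake_check(words, text):
    # Stand-in for spell checking: report words missing from the set.
    return [w for w in text.split() if w not in words]


timings = time_phases(fake_load, fake_check, "foo bar")
```

Comparing load_s and check_s per run shows which phase actually got slower, which one total per language cannot.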

Regards,
Marcin

_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel