users suggesting words for the spell checker

2014-10-13 Thread Daniel Naber
Hi, I've activated a new feature on languagetool.org for German that lets users suggest a word for the spell checker. There's a new menu item on potentially misspelled words that directs users to a page on community.languagetool.org where they can submit this word. No registration is needed

API now always up-to-date

2014-10-13 Thread Daniel Naber
Hi, I've modified our snapshot creation script so that it automatically deploys the snapshot as our HTTP API server. This API is also used by the check on www.languagetool.org, so the website now always uses the latest snapshot of LT. Updates happen once a day. If tests fail, the new version

Re: API now always up-to-date

2014-10-13 Thread R.J. Baars
Great! Hi, I've modified our snapshot creation script so that it automatically deploys the snapshot as our HTTP API server. This API is also used by the check on www.languagetool.org, so the website now always uses the latest snapshot of LT. Updates happen once a day. If tests fail, the new

Re: users suggesting words for the spell checker

2014-10-13 Thread Jan Schreiber
I love the idea! A user-friendly, hassle-free way to submit false positives in Hunspell was indeed quite desirable. On 13.10.2014 09:42, Daniel Naber wrote: Hi, I've activated a new feature on languagetool.org for German that lets users suggest a word for the spell checker. There's a new

Re: switching from Hunspell to Morfologik

2014-10-13 Thread Jan Schreiber
In case anyone's interested in the exported plain text file, it is here: http://sourceforge.net/projects/germandict/files/Morfologik/de_frequency.7z I sorted the words by frequency class and additionally sorted the largest A class of least frequent words by word length. The frequency

Re: switching from Hunspell to Morfologik

2014-10-13 Thread Daniel Naber
On 2014-10-11 12:00, Daniel Naber wrote: to provide LT as a 100% pure Java software, I'd like to switch from Hunspell (native code) to Morfologik (Java-based). For that, I think the following languages are easy to switch: Asturian Galician Khmer Spanish I've

English disambiguation issue - verb tagged as adjective

2014-10-13 Thread Robin Dunn
Hi, In the following sentence 'feeling' is tagged as an adjective (JJ) but not a verb (VBG). This is a problem because I can't pick up the following error where feeling is actually a verb but your has been used instead of you're. *I hope that your feeling better soon.* Here's the

Rule group - use message, short, url, example tags only once

2014-10-13 Thread Robin Dunn
Hi, Is it possible to specify the message, short, url, example tags only once per rule group e.g. rather than repeating them for every rule in the group if the content of these tags is the same for all rules in the group? Thanks Robin.

Re: Rule group - use message, short, url, example tags only once

2014-10-13 Thread Daniel Naber
On 2014-10-13 16:51, Robin Dunn wrote: Is it possible to specify the message, short, url, example tags only once per rule group e.g. rather than repeating them for every rule in the group if the content of these tags is the same for all rules in the group? That's only possible for url so

IndexOutOfBoundsException with min=0 attribute in pattern rule

2014-10-13 Thread Robin Dunn
Hi, I get the following error when using the min=0 attribute in this English pattern rule. Does anyone know what might be causing this? If I remove the min=0 tag the error disappears. Thanks Robin. rule pattern marker

Re: Rule group - use message, short, url, example tags only once

2014-10-13 Thread Robin Dunn
I see, yes keeping the examples specific for each rule rather than one for the whole rule group makes sense for testing so I guess we should keep it that way. Would be nice if the message and short could be defined only once for the rule group. By the way as you may have noticed by my recent

Re: IndexOutOfBoundsException with min=0 attribute in pattern rule

2014-10-13 Thread Jaume Ortolà i Font
Hi, I think a token min=0 at the end or at the start of a pattern is useless. The pattern is equivalent with or without this token. The error probably comes from a bug. Nobody tried token min=0 at the end of a pattern precisely because it is useless. Regards, Jaume Ortolà 2014-10-13 17:07

Re: Rule group - use message, short, url, example tags only once

2014-10-13 Thread Daniel Naber
On 2014-10-13 17:15, Robin Dunn wrote: If you still have a place for an English maintainer I like to sign up, Absolutely! We're very happy about contributions for any languages, but especially for English, as it didn't get the attention it should. Also in terms of contributing to the code

Re: switching from Hunspell to Morfologik

2014-10-13 Thread Jan Schreiber
Dawid, thanks a ton for the clarifications. TBH I blindly followed the instructions on the Wiki page you mentioned and I wasn't really sure what I was doing. Please let's make sure I don't misunderstand you. with SUFFIX encoder (which your .info file implicitly picks) This encoder expands

Re: switching from Hunspell to Morfologik

2014-10-13 Thread Juan Martorell
On 13 October 2014 16:10, Daniel Naber daniel.na...@languagetool.org wrote: I've switched Spanish to Morfologik now, would be nice if someone could test it. Suggestions should be at least as good as before, they consider word frequencies now. I was trying to develop one rule and I was

Re: switching from Hunspell to Morfologik

2014-10-13 Thread Dawid Weiss
Hi Jan, To be honest I'm not really familiar with LT's code either, so I'm not sure what the dictionary wrappers in LT are actually doing ;) I just chipped in because I'm familiar with morfologik-stemming, so I tested your dictionary and provided my feedback. This encoder expands suffixes,