Re: Some thoughts on grammar Nazis, LanguageTool, and the world in itself

2014-05-20 Thread Daniel Naber
On 2014-05-19 16:16, Juan Martorell wrote:

 Therefore my suggestion is that we discuss KPIs in terms of performance
 and maintainability rather than in terms of bringing rules to LT, and
 that we set some goals. I remember there was a nice simplification
 effort some time ago.

The more rules you have, the more difficult it becomes to add yet 
another rule that's really useful. But considering that there are 
several languages with fewer than 100 rules[1], I think adding rules 
there is one of the most useful things one can do.

For languages with many rules, quality might be improved by evaluating 
LT against an error corpus. We have such an evaluation for English now 
(class RealWordCorpusEvaluator) that runs LT against Jenny Pedler's 
corpus[2] of errors from people with dyslexia.
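
To illustrate the idea: such an evaluation essentially counts how many annotated errors the checker finds and how many of its matches are false alarms. The following is only a rough sketch and not how RealWordCorpusEvaluator is implemented; the corpus format, the toy checker and all names are made up for the example.

```python
from typing import Callable, Iterable, Set, Tuple

Span = Tuple[int, int]  # (start, end) character offsets of an error


def evaluate(checker: Callable[[str], Set[Span]],
             corpus: Iterable[Tuple[str, Set[Span]]]) -> Tuple[float, float]:
    """Return (precision, recall) of a checker over an annotated corpus."""
    true_pos = false_pos = false_neg = 0
    for sentence, expected in corpus:
        found = checker(sentence)
        true_pos += len(found & expected)   # annotated errors that were found
        false_pos += len(found - expected)  # false alarms
        false_neg += len(expected - found)  # annotated errors that were missed
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


def toy_checker(sentence: str) -> Set[Span]:
    """Hypothetical checker: flags every occurrence of 'form' as an error."""
    spans = set()
    start = sentence.find("form")
    while start != -1:
        spans.add((start, start + 4))
        start = sentence.find("form", start + 1)
    return spans


# Hypothetical mini corpus: (sentence, spans of the annotated errors).
CORPUS = [
    ("We went form here to there.", {(8, 12)}),  # real error, found
    ("Please fill in the form.", set()),         # correct usage, false alarm
    ("I think their is a problem.", {(8, 13)}),  # real error, missed
]

print(evaluate(toy_checker, CORPUS))  # (0.5, 0.5)
```

Precision shows how many of the reported matches are real errors, recall how many of the real errors are reported. A rule that improves recall but ruins precision is a candidate for being disabled by default.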

By the way, I did not develop the new rule editor in the expectation 
that hundreds of people will now contribute rules. Instead, it's an 
attempt to gain new long-term contributors, maybe even maintainers. We 
still don't have a maintainer for English, so there's nobody who sets a 
direction for what kind of rules are added or what kind of rules are 
disabled by default.

Regards
  Daniel

[1] https://languagetool.org/languages/
[2] http://www.dcs.bbk.ac.uk/~jenny/resources.html


--
Accelerate Dev Cycles with Automated Cross-Browser Testing - For FREE
Instantly run your Selenium tests across 300+ browser/OS combos.
Get unparalleled scalability from the best Selenium testing platform available
Simple to use. Nothing to install. Get started now for free.
http://p.sf.net/sfu/SauceLabs
___
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel


Some thoughts on grammar Nazis, LanguageTool, and the world in itself

2014-05-19 Thread Jan Schreiber
Marcin wrote:
 This is perfect English and you could probably find Jane Austen or
 Charles Dickens using such constructions.

I think this little discussion reveals a fundamental problem of an Open
Source grammar checker, and perhaps of grammar checkers in general.

In the back of my head, I've thought this for years. The more popular LT
becomes over time, the more time we will probably have to spend fighting
off prescriptivist poppycock[1].

There are a lot of people around who are well-meaning and intelligent
and in some way interested in language and grammar, but are not trained
linguists. This type of person will just take some grammar book (or even
worse, a style manual) from their shelves and try to translate all the
rules they happen to find there into LT rules, and they are unable to
understand what the rule was originally intended to do. Eventually they
end up doing more harm than good.

The Hemingway app[2] is a good example of what I have in mind here. It
complains about adverbs and gives useless advice such as "1 adverbs. Aim
for 0 or fewer." The idea behind the rule is probably something like
"don't use verb + adverb if the same thought can be expressed more
clearly by choosing a more descriptive verb", e.g. don't say
"She moved quickly."
but rather
"She ran."
The second sentence is easier to understand and expresses a fact more
accurately in fewer words, so it is in some way an improvement over the
first.

But it is by no means trivial to translate this idea to the XML
formalism, or (more generally speaking) to something a computer can
understand. Running around and telling people not to use adverbs is
certainly not the way to go; it is just confusing.
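
To make the difficulty concrete, here is a deliberately naive sketch of such a rule. It is neither the Hemingway app's logic nor an LT rule; the function name and the "-ly" heuristic are assumptions made up for the example.

```python
import re


def naive_adverb_flags(text: str) -> list:
    """Flag every word ending in '-ly', a crude stand-in for 'adverb'."""
    return [match.group(0) for match in re.finditer(r"\b\w+ly\b", text)]


print(naive_adverb_flags("She moved quickly."))
# ['quickly']
print(naive_adverb_flags("The only reply came from her family."))
# ['only', 'reply', 'family']
print(naive_adverb_flags("She ran fast."))
# []
```

The heuristic flags a noun like "family" and misses an adverb like "fast". Even with proper part-of-speech tags, it could not tell whether a more descriptive verb exists, which is the actual point of the advice.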

My first thought when Daniel published the new rule editor was, "Let's
hope this tool will not bring us a bunch of grammar Nazis and nitpicky
smartasses."

These are just some thoughts I wanted to share. I'm not sure if I said
something useful, or if these are just smartass remarks themselves.

[1] http://languagelog.ldc.upenn.edu/nll/?cat=5
[2] http://languagelog.ldc.upenn.edu/nll/?p=10416



Re: Some thoughts on grammar Nazis, LanguageTool, and the world in itself

2014-05-19 Thread R.J. Baars
Would it be possible to:

- add a level of 'checking profiles'
- that contain the familiar categories
- where the categories contain only rule(set) IDs (file_name+rule_id)
- while the rules themselves are stored in a number of rule files

Of all profiles, only one can be active for the active language, so LT
pre-loads all rules of that profile.
Multiple files could make editing easier.
Keeping only the rule IDs in the categories avoids the redundancy of
having the same rules defined in multiple profiles.
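
A rough sketch of how that could look as a data structure. Nothing like this exists in LT as far as this proposal goes; the file names, rule IDs, profile names and the function below are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

RuleRef = Tuple[str, str]  # (file_name, rule_id)


@dataclass
class Profile:
    name: str
    # category name -> references to rules; the rules live in the rule files
    categories: Dict[str, List[RuleRef]] = field(default_factory=dict)


# Hypothetical rule files: file name -> rule ID -> rule definition.
RULE_FILES = {
    "grammar.xml": {"AGREEMENT": "subject-verb agreement", "A_VS_AN": "a vs. an"},
    "style.xml": {"WORDINESS": "wordy phrases"},
}

# Hypothetical profiles that only reference rules, never copy them.
PROFILES = {
    "strict": Profile("strict", {
        "Grammar": [("grammar.xml", "AGREEMENT"), ("grammar.xml", "A_VS_AN")],
        "Style": [("style.xml", "WORDINESS")],
    }),
    "casual": Profile("casual", {
        "Grammar": [("grammar.xml", "AGREEMENT")],
    }),
}

# Exactly one active profile per language.
ACTIVE_PROFILE = {"en": "casual"}


def load_active_rules(language: str) -> Dict[RuleRef, str]:
    """Resolve the rule references of the language's active profile."""
    profile = PROFILES[ACTIVE_PROFILE[language]]
    loaded = {}
    for refs in profile.categories.values():
        for file_name, rule_id in refs:
            loaded[(file_name, rule_id)] = RULE_FILES[file_name][rule_id]
    return loaded


print(load_active_rules("en"))
# {('grammar.xml', 'AGREEMENT'): 'subject-verb agreement'}
```

Because a profile stores only (file_name, rule_id) pairs, a rule that appears in several profiles is still defined in exactly one place.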


Just a thought.

Ruud

