Re: [lingu-dev] About proofreader and spell checker interaction

Thomas Lange - Sun Germany - ham02 - Hamburg Tue, 21 Apr 2009 07:17:14 -0700

Here is Marcin's reply (2nd posting):



Hi Thomas,

[snip]


>> >>     // the proofreader may return SPELLING but right now our core
>> >>     // does only handle PROOFREADING if the result is from the
>> >>     // proofreader...
>> >>     // (later on we may wish to color spelling errors found by the
>> >>     // proofreader differently for example. But no special handling
>> >>     // right now.)
>> >>     if (rDesc.nType == text::TextMarkupType::SPELLCHECK)
>> >>         rDesc.nType = text::TextMarkupType::PROOFREADING;
>>     

Ah...

I was using bonsai before and couldn't find the sources yesterday, 
that's why I asked.


> > Currently, where spell checking is still a separate process and there is
> > no coordination between it and proofreading, it is explicitly disabled.
> > The reason for this is that it may be bad to have two different and
> > independent components spell check the same text. There is no mechanism
> > to prevent/solve inconsistencies.
>   

ah, you're right. It's non-trivial.

[...]


> > Currently spell checkers are chained (that is up for discussion as well
> > though, since without chaining the route to take seems to be rather
> > obvious). That means if any of several spell checkers for a given
> > language says this text is correct, then no error will be reported. That
> > would allow for spell checker A to check normal English text, and for
> > spell checker B to know only about English medical words. Those two
> > spell checkers can easily be chained and you will get a result that is
> > better than using just a single one. Without chaining you would need a
> > spell checker that has to take care of both tasks in one sweep.
>   

I'd say that chaining is OK as far as normal (non-context) spellers are 
concerned. For grammar checkers it should be different, as they work on 
a different principle (most of the time): instead of accepting a word 
from a finite list, they search for an error from a finite list to say 
that they don't accept the text. So instead of using OR (a disjunction) 
of results, use AND (a conjunction) here: the text passes only if no 
proofreader raises an error, and if any of them raises one, display it.
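
To make the OR/AND difference concrete, here is a rough sketch. The types 
Speller, Proofreader and Issue and both function names are made up for 
illustration; they are not the OOo linguistic API.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-ins, not the real OOo interfaces.
using Speller = std::function<bool(const std::string&)>;  // true = word accepted
struct Issue { std::size_t start; std::size_t length; std::string message; };
using Proofreader = std::function<std::vector<Issue>(const std::string&)>;

// Chained spellers: a word is correct if ANY speller accepts it (OR).
bool acceptedByChain(const std::vector<Speller>& chain, const std::string& word) {
    for (const auto& speller : chain)
        if (speller(word)) return true;
    return false;
}

// Proofreaders: the text is clean only if ALL of them raise nothing (AND),
// so every issue raised by any of them is collected for display.
std::vector<Issue> collectIssues(const std::vector<Proofreader>& readers,
                                 const std::string& text) {
    std::vector<Issue> all;
    for (const auto& reader : readers) {
        std::vector<Issue> found = reader(text);
        all.insert(all.end(), found.begin(), found.end());
    }
    return all;
}
```

With a general speller and a medical speller in the chain, a medical term 
is accepted as soon as one of them knows it, while an empty result from 
collectIssues means that no proofreader objected.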

[...]


> > Thus the problems at hand and to be discussed are:
> > 
> > a) should we give up on chained spell checkers even though there are
> > good uses for them? The simple fact that vanilla OOo has only one spell
> > checker does not mean there aren't other spell checkers around that
> > already make use of that chaining... Or that someone would like to make
> > use of it in the future.
>   

The easiest solution would be to define that a proofreader whose 
isSpellChecker() returns true should be chained as all spell checkers 
are. If not, then it should be treated in the following manner: whenever 
a proofreader returns an error marked as spellcheck, display it in red, 
unless this error has been found earlier by another checker. Yet in that 
case the comment should stay in place, so only change the color, nothing 
else. (Even in a spellchecking dialog, the error could be reported later 
than normal spelling errors.)
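
One possible reading of this rule, as a sketch only: Span, MarkupType and 
mergeMarkup are invented names, and treating "found earlier" as "overlaps 
a span that the dedicated speller already flagged" is my assumption.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical types; the names are illustrative, not the OOo API.
enum class MarkupType { Spellcheck, Proofreading };
struct Span { std::size_t start; std::size_t length; MarkupType type; };

// True if the two character ranges share at least one position.
bool overlaps(const Span& a, const Span& b) {
    return a.start < b.start + b.length && b.start < a.start + a.length;
}

// Keep every span from the dedicated speller, then add proofreader spans.
// A proofreader span typed Spellcheck is dropped when it overlaps a span
// the speller already flagged; all other spans keep their own type.
std::vector<Span> mergeMarkup(const std::vector<Span>& spellerSpans,
                              const std::vector<Span>& proofreaderSpans) {
    std::vector<Span> merged = spellerSpans;
    for (const auto& p : proofreaderSpans) {
        bool duplicate = false;
        if (p.type == MarkupType::Spellcheck) {
            for (const auto& s : spellerSpans) {
                if (overlaps(p, s)) { duplicate = true; break; }
            }
        }
        if (!duplicate) merged.push_back(p);
    }
    return merged;
}
```

The display code could then pick the color from the type of each merged 
span (red for Spellcheck) without touching the comment attached to it.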


> > But even if we give up on chaining but still have a grammar checker that
> > is also a spell checker AND a second checker that is only a spell
> > checker, we still have to decide if we want to make use of the second
> > one. If we want to make use of that one as well, how to merge the
> > results? Should it simply be that the grammar checker's spell checker is
> > only allowed to mark errors where the second one has found none? 
>   

That seems reasonable, otherwise multiple errors would be displayed in 
the same position.


> > Or should it be allowed to
> > overrule errors found by the second one as not-to-be-reported as well?
>   

That is interesting. Well, I didn't think of it as we never say "this is 
acceptable", we only return errors. The API has no way of overruling 
results. I would say an easier solution would be to explicitly say that 
spellcheckers should accept all words regardless of context, so they 
would accept "Sri" without "Lanka" or "Burkina" without "Faso". Next, a 
grammar checker would see if Lanka is preceded by Sri, or Sri is 
followed by Lanka etc.
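
As a sketch of that division of labour: the speller accepts each word on 
its own, and a separate context rule checks the neighbours. The function 
checkPair and its token-based input are hypothetical, just to show the 
idea.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the indices of tokens that break a (hypothetical) pairing rule:
// `first` must be followed by `second`, and `second` must be preceded
// by `first`. Each token is assumed to have passed the speller already.
std::vector<std::size_t> checkPair(const std::vector<std::string>& tokens,
                                   const std::string& first,
                                   const std::string& second) {
    std::vector<std::size_t> errors;
    for (std::size_t i = 0; i < tokens.size(); ++i) {
        if (tokens[i] == first &&
            (i + 1 >= tokens.size() || tokens[i + 1] != second))
            errors.push_back(i);
        if (tokens[i] == second && (i == 0 || tokens[i - 1] != first))
            errors.push_back(i);
    }
    return errors;
}
```

For the tokens "in", "Sri", "Lanka" this reports nothing, while a lone 
"Sri" or a lone "Lanka" is flagged by the context rule rather than by the 
speller.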

Of course, this presupposes that developers of proofreaders are in touch 
with developers of spellchecker dictionaries so that dictionaries would 
be properly prepared.

Yet, as you probably know, Laci Nemeth wants to add some limited 
context-check to hunspell. That would already create some problems... 
Probably in such a case, another process in hunspell should use the 
proofreader mechanism to look up the context, but first the individual 
words should be normally accepted.

My proposal doesn't require any change to the API - it would only define 
what to do with text markup = spellcheck in the case where the grammar 
checker is not a spellchecker, and in the case where it is.


> > Or do we need even more complex handling for this problem?
>   

I cannot see a use for it.

Marcin

PS. BTW, I've heard that the comment being visible only after clicking 
"Explain" is definitely less usable than the previous dialog box that we 
had in LanguageTool. Users I talked to prefer to have the explanation 
displayed without clicking. I find this intuitive as well. Maybe we 
should ask people from the UX project to comment on this?
