Re: [lingu-dev] About grammar checker

Thomas Lange Wed, 27 Jul 2005 04:23:52 -0700


Hi all,

I just wanted to mention a few document-level checks that couldn't be performed if the checking is done paragraph by paragraph. I don't think any of them is all that important, but they are worth considering.
1. For a letter, checking that the opening (Dear Sir) and closing (Yours faithfully) match.


Good point.
But for the sake of the later implementation in the applications (where
paragraphs are the units usually represented), I think the grammar
checking will simply use paragraphs as its largest accepted unit.

I'm sorry to be so blunt about that but considering the core I think
that's what it will be.

One thing that maybe(!) can be accomplished is to carry over to the
next sentence the information whether the previous sentence was
correctly finished, if the API provides a flag for that.
But even in that case I'm not sure about the actual implementation
in the different applications later on. It might still not happen.
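
To illustrate what such a flag could look like, here is a minimal sketch
in Python. The function name check_sentence and the parameter
previous_finished are made up for this example and are not part of any
existing API; the single rule shown is only a placeholder.

```python
def check_sentence(sentence, previous_finished=True):
    """Check one sentence and report whether it was properly finished.

    Hypothetical sketch: returns (messages, finished), where `finished`
    is the flag a caller would pass into the next call.
    """
    messages = []
    text = sentence.strip()
    # Only complain about a lowercase start if the previous sentence
    # really ended; otherwise this text may just be its continuation.
    if previous_finished and text[:1].islower():
        messages.append("sentence should start with a capital letter")
    # Ignore trailing quotes and brackets when looking for end punctuation.
    finished = text.rstrip("\"')]").endswith((".", "!", "?"))
    return messages, finished


# The caller carries the flag from one call to the next.
first_messages, flag = check_sentence("The checker stops here")
second_messages, _ = check_sentence("and continues here.", previous_finished=flag)
```

Whether the applications would actually pass the flag along is a
separate question, as said above.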

2. Checking consistency of capitalisation and punctuation in lists.
3. Consistency of terms throughout (for example choosing one of 'appendixes' or 'appendices' and using it exclusively).


That's a very nice idea!

But I think that is probably done best after the Google summer of
code since the time is quite limited and there is still much to be
done. But of course it's Keli who must decide what he can do in
the remaining time.

A similar thing for the usage of American and British English spelling
would be nice. That is, it should be either color or colour, but not
both in the same text.
But that already raises the question of what happens in the next
paragraph when colour is used there and the previous paragraph used
color.
It would be nice to detect that too.
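
A rough sketch of such a consistency check in Python, assuming the whole
text is available as a list of paragraphs. The variant groups and the
function name find_mixed_variants are invented for illustration; a real
checker would take the groups from a lexicon.

```python
import re
from collections import defaultdict

# Hypothetical variant groups, for illustration only.
VARIANT_GROUPS = [
    {"color", "colour"},
    {"appendixes", "appendices"},
]


def find_mixed_variants(paragraphs):
    """Return {first_variant: [(paragraph_index, conflicting_variant), ...]}.

    The first variant encountered in a group is treated as the accepted
    one; every later use of a different variant is reported.
    """
    group_of = {word: i for i, group in enumerate(VARIANT_GROUPS) for word in group}
    chosen = {}  # group index -> first variant seen
    conflicts = defaultdict(list)
    for index, paragraph in enumerate(paragraphs):
        for word in re.findall(r"[A-Za-z]+", paragraph.lower()):
            group = group_of.get(word)
            if group is None:
                continue
            first = chosen.setdefault(group, word)
            if word != first:
                conflicts[first].append((index, word))
    return dict(conflicts)
```

Note that the result depends on which variant is encountered first.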

But that kind of ability will surely require the grammar checker to
have some kind of state information. How else would it know in the
second paragraph what was found and accepted in the previous one?
If something like that should be possible we must allow for state
information to be saved between calls.
But since the state of one document should not be mixed up with that
of another one that is also open and being checked in the background,
we also need some kind of information in the API calls that
identifies the document the paragraph comes from.
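
As a minimal sketch of that idea in Python: the checker keeps its state
in a table keyed by a document identifier that the caller passes with
every paragraph. The class StatefulChecker, its methods and the doc_id
parameter are assumptions made for this example, not an existing
interface.

```python
class StatefulChecker:
    """Hypothetical paragraph-level checker with per-document state."""

    def __init__(self, variant_groups):
        # Map each variant to its (hashable) group.
        self._group_of = {w: frozenset(g) for g in variant_groups for w in g}
        # doc_id -> {group: variant accepted for that document}
        self._state = {}

    def check_paragraph(self, doc_id, text):
        """Check one paragraph; state is looked up by doc_id only."""
        chosen = self._state.setdefault(doc_id, {})
        problems = []
        for raw in text.lower().split():
            word = raw.strip(".,;:!?\"'()")
            group = self._group_of.get(word)
            if group is None:
                continue
            first = chosen.setdefault(group, word)
            if word != first:
                problems.append(f"'{word}' conflicts with earlier '{first}'")
        return problems

    def close_document(self, doc_id):
        """Drop the state when the document is closed."""
        self._state.pop(doc_id, None)


# Calls from two documents may arrive interleaved in any order.
checker = StatefulChecker([{"color", "colour"}])
checker.check_paragraph("doc-A", "The color is red.")
checker.check_paragraph("doc-B", "The colour is blue.")
result_a = checker.check_paragraph("doc-A", "A nice colour.")
result_b = checker.check_paragraph("doc-B", "Another colour.")
```

Here only the second paragraph of doc-A is flagged, no matter how the
calls from the two documents are interleaved.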

Also, currently the background (spell)checking starts just somewhere
around the cursor/view and not from the beginning of the document,
wraps around at the end and continues until it reaches the starting
position.
For the above kind of check one might like to change this, since
otherwise it will be somewhat arbitrary which of color or colour is
encountered first.
Unfortunately there is an obvious drawback: if the cursor is somewhere
in the middle of the document you won't see errors from background
grammar checking displayed for a while, since you have to wait for
grammar checking (starting at the beginning of the document) to reach
the position currently visible.

Currently for spellchecking it is implemented that background spelling
starts in the visible areas and thus you will see results right away.
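
The trade-off between the two orders can be sketched in a few lines of
Python. The function names are made up for this example, and paragraphs
are simply represented by their index.

```python
def wrap_around_order(paragraph_count, start):
    """Order used when checking starts at the visible paragraph and
    wraps around at the end of the document."""
    return [(start + i) % paragraph_count for i in range(paragraph_count)]


def from_beginning_order(paragraph_count):
    """Order used when checking always starts at the first paragraph."""
    return list(range(paragraph_count))


def calls_until_visible(order, visible):
    """Number of paragraphs checked before the visible one has results."""
    return order.index(visible) + 1


# A document with 5 paragraphs, the view being at paragraph 3.
view_first = wrap_around_order(5, 3)
top_down = from_beginning_order(5)
```

With the view-first order the visible paragraph is checked first but
paragraph 0 is only reached third; with the top-down order the first
use of a term is always seen first, at the cost of checking four
paragraphs before the visible one gets its results.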

Perhaps the API should support both paragraph-level calls (for real-time checking) and document-at-once (for stand-alone one-off checks after a document has been written).


What do you mean by document-at-once checks?
Do you mean to pass the whole document's text in a single call?
That can't possibly be done.

And if you want something like starting with the first paragraph and
continuing until the last one without being interrupted by anything
else, so that you implicitly know "the last call checked the
previous paragraph", that won't happen either.
The reason for this is that a document has no exclusive access to
the grammar checker, thus a call for another document, e.g. from a
different application, or even an API call from just about anywhere
is allowed at any time. In general you can't infer the text of a
single document from the order of the calls to the grammar checker.

Well, of course the API could be explicitly designed like that, but I
would consider this a bad design, explicitly asking for trouble.

One (sensible) way of working is to turn off real-time spelling and grammar checks so that they do not interrupt the flow of your writing, and then to check spelling and grammar at the end. Clearly, performance is less of a problem when doing this.

For spellchecking this is already done: since you can only edit a
single word at a time, background spelling is suspended for that word
only, until the cursor leaves the word or Delete or Backspace is used.

And if something similar is necessary for grammar checking, the core
needs to take care of it in the future. Thus it is at least nothing
Keli has to bother with. That has to be done by the
application-specific developers.

Also, would it be possible to provide style information to the grammar checker (so it gets XML instead of plain text)? Users might want to set up rules that check, for example, that certain keywords are always bold or that Latin terms appear in italics. This might also be necessary for checking things like the relative placement of footnote markers and quotation marks.


Interesting idea!
-> Falko: you might want to keep that one in mind.

But as for now the time for Google Summer of Code is limited, and
having to parse XML or at least get an XML parser running, even if it
might only take two days, is too much for the timeline of the project.

It shouldn't be too troublesome to change the interface accordingly
later. AFAIK it will basically still be text being passed on; it's only
a change in semantics, at least as far as the API is concerned.

But thanks for the idea!


Regards,
Thomas

