Re: [lingu-dev] lost postings[3/5]: [SoC] Grammar checker API

Bruno Sant'Anna Thu, 01 Jun 2006 04:46:17 -0700

On 6/1/06, Thomas Lange <[EMAIL PROTECTED]> wrote:

Hi Bruno,

> >     >    2. The grammar checker should run in a different thread to not
> >     block
> >     >       OpenOffice.
> >
> >     You mean when grammar checking is done automatically (in the
> >     background like automatic spell checking) only?
> >
> >
> > No, not just in background. I was planning to implement both of them,
> > automatic checking and interactive checking, but for both of them we can
> > create threads. My idea in creating threads is to not block OOo. When a
> > user wants interactive checking it doesn't matter, but when an automatic
> > check starts it must run in the background and the main process (OOo)
> > must continue.

Having a thread of its own would be OK.
Of course you have to take care of some things:
First and topmost:
- you must make sure that the grammar checking thread does not take
the upper hand on the CPU's time when it is running.
It would be unacceptable if the main office activity (e.g. formatting)
were to slow down considerably.
Thus you must not use polling to access every text portion (sentence,
paragraph,...) in order to check the document.
Probably the grammar checker should either explicitly ask for the next
portion to check or the document should notify the grammar checker
about the portions to be checked.
Either way it looks troublesome to get/maintain the ability to check
subsequent text in order to be able to check text across paragraph
borders.
- You need to think about what to do if the paragraph being checked is
currently being edited as well. Should it not be checked? Should editing
be prohibited? Do you simply ignore what was edited in the document when
the dialog writes its modified version of the text back to the
document? Or would you like to introduce API means to resolve issues
when this problem occurs?
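
One way to picture the "document notifies the grammar checker" variant described above is a worker thread that blocks on a queue instead of polling. This is only a minimal sketch in Python, not OOo code: `GrammarCheckWorker`, `notify_changed` and `toy_check` are hypothetical names invented for illustration, and the sketch says nothing about thread priorities, which would have to be handled separately.

```python
import queue
import threading


class GrammarCheckWorker:
    """Background checker that sleeps until the document notifies it."""

    _STOP = object()  # sentinel that tells the worker to shut down

    def __init__(self, check_fn):
        self._check_fn = check_fn      # hypothetical: text -> list of messages
        self._pending = queue.Queue()  # portions waiting to be checked
        self.results = {}              # portion id -> messages found
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def notify_changed(self, portion_id, text):
        """Called by the document when a portion needs (re)checking."""
        self._pending.put((portion_id, text))

    def _run(self):
        while True:
            item = self._pending.get()  # blocks without burning CPU time
            if item is self._STOP:
                break
            portion_id, text = item
            self.results[portion_id] = self._check_fn(text)

    def stop(self):
        """Let the worker finish what is queued, then wait for it to exit."""
        self._pending.put(self._STOP)
        self._thread.join()


def toy_check(text):
    # Stand-in rule only: flag a doubled word such as "is is".
    words = text.lower().split()
    return [f"repeated word: {a}" for a, b in zip(words, words[1:]) if a == b]


worker = GrammarCheckWorker(toy_check)
worker.notify_changed(1, "This is is a test.")
worker.notify_changed(2, "This one is fine.")
worker.stop()
print(worker.results)
```

The main thread only pays for a `put()` per changed portion; the checker does nothing at all while the queue is empty.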

Um, Thomas, I am seriously thinking about changing this behavior, from passing the whole paragraph to passing just single sentences (look in the main thread, I posted a complete explanation). It will not just help with several languages, it can avoid this kind of problem as well. It will be suitable both for automatic checking (letting the user change the paragraph as he wants) and for interactive checking; it will check in real time. What do you think? (I sent you an e-mail about it yesterday, did you receive it?)

Anyway when you decide about a thread model to use please also ask for
Mathias' advice since he is much more familiar with threads and
related problems than me.

OK, I'll ask him. =)

> > Here I want to add a thing: I'm planning to implement both modes, OK
> > (automatic and interactive). When a user requests the interactive method
> > (e.g. clicking on a button "Check Grammar") the API provides the current
> > text, I mean everything, not just a block. I think it is safe; it can
> > be slow but the user is prepared for waiting since he asked to check.

Err... No please!
I do not find that acceptable at all. Doing this invites trouble
and is possibly prone to errors as well.
Also other tasks in the background like loading another document or
formatting a third one may be affected negatively (probably performance
only though).
Also it is always possible(!), maybe unlikely (but that should be
no reason to ignore it), that the document gets changed by means of API
by other objects/components/applications... etc. And to prevent this
would require getting exclusive access to the whole document for a
longer period of time.
And given our current architecture it is probably even required to
acquire the solar mutex for the whole of the process. And acquiring that
is quite similar to 'nothing else goes now'. That is definitely not wanted.
Even databases try to provide as fine a granularity as possible to
prevent such exclusive access to large entities.

Thus you should not try to handle more than a paragraph at a time!
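
To illustrate the granularity argument, here is a rough sketch of per-paragraph locking with a version counter: the paragraph is locked only briefly to take a snapshot and to write back, and a result is dropped if the paragraph was edited in between. It is written in Python for brevity; the `Paragraph` class and its methods are assumptions made for this example and do not correspond to any existing OOo API or to the solar mutex.

```python
import threading


class Paragraph:
    def __init__(self, text):
        self.text = text
        self.version = 0              # bumped on every change
        self.lock = threading.Lock()  # guards this paragraph only

    def edit(self, new_text):
        """A user (or another API client) changes the paragraph."""
        with self.lock:
            self.text = new_text
            self.version += 1

    def snapshot(self):
        """Copy text and version under a short lock."""
        with self.lock:
            return self.text, self.version

    def apply_correction(self, corrected, based_on_version):
        """Write back only if nobody edited the paragraph in the meantime."""
        with self.lock:
            if self.version != based_on_version:
                return False          # stale result: caller should re-check
            self.text = corrected
            self.version += 1
            return True


para = Paragraph("He go home.")

# Normal case: snapshot, check without holding the lock, write back.
text, seen = para.snapshot()
corrected = text.replace("He go", "He goes")
ok_first = para.apply_correction(corrected, seen)

# Conflict case: an edit arrives between snapshot and write-back.
text2, seen2 = para.snapshot()
para.edit("He goes home now.")
ok_stale = para.apply_correction("He goes home!", seen2)

print(ok_first, ok_stale, para.text)
```

Rejecting the stale correction is just one possible policy; the open questions above (prohibit editing, ignore edits, or resolve via API) remain design decisions.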

This changes when we let the API split the paragraph into sentences too.

> > In
> > the automatic checking, after every change of a paragraph the API sends
> > it to checker, I was thinking about setting a time limit too, for
> > example, 60 seconds, what do you think?

I'm quite unsure what you want to do here.
Do you want to stop grammar checking after 60s or at least 60s after
nothing was found incorrect?

This changes here too; we don't really need a time limit for automatic checking.

If it was activated by the user it should be active all the time.
This of course does not mean that the grammar checker will always be
active. It only means the grammar checker gets notified that
'something is to be done', or maybe that it periodically checks by
itself whether something needs to be done.

It could be like this, other ideas are of course possible as well:
The grammar checker gets notified that the document is to be checked.
(Maybe the initial paragraph is provided as well).
It then checks that paragraph and marks all wrong text parts. When the
paragraph was checked it asks the document for the next paragraph to be
checked and so on until the document does not provide another paragraph
to be checked.
Of course with this approach there is no defined order of the paragraphs to
check. Thus it would require additional means if the grammar checker
needs to check across paragraph borders.
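
The pull-style loop described above could look roughly like the following sketch. It is plain Python rather than UNO, and `Document`, `next_paragraph_to_check`, `mark_errors` and the toy rule `find_double_spaces` are all hypothetical names chosen for the example.

```python
class Document:
    def __init__(self, paragraphs):
        self.paragraphs = list(paragraphs)
        # At first every paragraph still needs checking.
        self._dirty = set(range(len(self.paragraphs)))
        self.marks = {}  # paragraph index -> list of (start, end) spans

    def next_paragraph_to_check(self):
        """Return (index, text) of some unchecked paragraph, or None."""
        if not self._dirty:
            return None
        index = self._dirty.pop()  # set.pop() gives no defined order
        return index, self.paragraphs[index]

    def mark_errors(self, index, spans):
        self.marks[index] = spans


def find_double_spaces(text):
    # Stand-in for a real grammar rule: report runs of two spaces.
    spans = []
    pos = text.find("  ")
    while pos != -1:
        spans.append((pos, pos + 2))
        pos = text.find("  ", pos + 2)
    return spans


def check_document(doc, check_fn):
    # Keep asking until the document has nothing left to hand out.
    while True:
        item = doc.next_paragraph_to_check()
        if item is None:
            break
        index, text = item
        doc.mark_errors(index, check_fn(text))


doc = Document(["All good here.", "Two  spaces here.", "Also  two  here."])
check_document(doc, find_double_spaces)
print(doc.marks)
```

Because the document decides which paragraph comes next, the checker never sees neighbouring paragraphs together, which is exactly why checking across paragraph borders would need additional means.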

Um, I think when the API defines sentence endings it will be easier to check text across paragraph borders, e.g. for enumerations.

This is part of the reason why I once asked if it is "really" necessary
to check across paragraph borders. It will complicate things.
And maybe the gain is less than the trouble.

BTW: Just curious, does anyone know if other applications' grammar
checkers are capable of properly checking a single sentence that goes
across more than one paragraph?

> >     For example by means of an abstract API to iterate
> >     through and modify the text of a document.
> >     And pushing that question one step further:
> >     Is the grammar checkers implementation to iterate through the
> >     text or should there be a different object that iterates through
> >     the text and calls the grammar checker to process it?
> >
> >
> > For me the second one: it can handle details like formatting. Letting
> > the grammar checkers act directly could be dangerous for text formatting.

Same for me. I just wanted to ask the question because I like to know
if someone has a good argument against this approach. ;-)

Two other questions just popped into my mind:
- Should a grammar checker always implement the spell
checking API as well?
If not it would easily be possible that a spellchecker disagrees
with the grammar checker about a specific word or at least the
suggestions for it.
- Or more basically: if we have a grammar checker for a language should
we allow a spellchecker for the same language to be active as well?

The answers are probably obvious but I'd like to see if we agree here.

> >     BTW: The I18N break-iterator is not that bad with abbreviations.
> >     I think it has a list of those. But citations and similar things
> >     might pose a huge problem to it.
> >
> >
> > Question: Can grammar checkers use I18N break-iterator?

Sure, it is a UNO service that can be invoked and used by anyone.
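
To make the abbreviation point concrete, here is a toy sentence splitter with a small abbreviation list. It is explicitly not the I18N break-iterator and does not call any UNO service; `ABBREVIATIONS` and `split_sentences` are made up for this sketch, and the list is far from complete.

```python
ABBREVIATIONS = {"e.g.", "i.e.", "etc.", "dr.", "mr."}  # illustrative only


def split_sentences(text):
    """Return (start, end) spans of sentences; end is exclusive."""
    spans = []
    start = 0
    i = 0
    while i < len(text):
        if text[i] in ".!?":
            # The space-delimited token that ends at position i.
            word_start = text.rfind(" ", 0, i) + 1
            token = text[word_start:i + 1].lower()
            at_end = i + 1 == len(text)
            followed_by_space = not at_end and text[i + 1] == " "
            if (at_end or followed_by_space) and token not in ABBREVIATIONS:
                spans.append((start, i + 1))
                # Skip the spaces before the next sentence.
                j = i + 1
                while j < len(text) and text[j] == " ":
                    j += 1
                start = j
                i = j
                continue
        i += 1
    if start < len(text):
        spans.append((start, len(text)))  # trailing text without a terminator
    return spans


sample = "Ask Dr. Smith first. He knows, e.g. about threads."
sentence_spans = split_sentences(sample)
for begin, end in sentence_spans:
    print(repr(sample[begin:end]))
```

Quotations, ellipses and abbreviations at the end of a sentence are not handled here, which hints at why citations and similar things are hard for any break iterator.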

> >     And another question would be:
> >     Having the grammar checker being called with sentences, does it mean
> >     when an error is found the whole paragraph is presented to the user
> >     (could be really large!) or does the UI only display the sentence of
> >     where the error occurred?
> >
> >
> > The  sentence for sure, for these we will have a list with start
> > position and end position indexes. =)

Likely not indices. As Oliver mentioned in one of his postings, indices
have no meaning at all if fields, bookmarks etc. are used in the text, or
if the text got changed meanwhile.
TextRanges are much better defined objects. Indices may only be
appropriate in the API to the actual grammar checker itself because in
the end every grammar checker will know only about text and positions in
text. Because of this it is likely that you need to keep track of both.
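
Keeping track of both could mean handing the checker a flat string while remembering where each piece came from. The sketch below is a simplified model in Python and does not use real TextRange objects: `FlatView`, the `(kind, content)` portion tuples and `locate` are assumptions made for illustration only.

```python
from bisect import bisect_right


class FlatView:
    """Flat string for the checker plus a table mapping back to portions."""

    def __init__(self, portions):
        # portions: list of (kind, content); only kind == "text" is checked.
        self._starts = []       # flat start offset of each text portion
        self._portion_ids = []  # index of that portion in the paragraph
        pieces = []
        offset = 0
        for i, (kind, content) in enumerate(portions):
            if kind != "text" or not content:
                continue        # fields, bookmarks etc. are left out
            self._starts.append(offset)
            self._portion_ids.append(i)
            pieces.append(content)
            offset += len(content)
        self.text = "".join(pieces)

    def locate(self, flat_index):
        """Map a checker index to (portion index, offset in that portion)."""
        if not 0 <= flat_index < len(self.text):
            raise IndexError(flat_index)
        slot = bisect_right(self._starts, flat_index) - 1
        return self._portion_ids[slot], flat_index - self._starts[slot]


portions = [
    ("text", "Dated "),
    ("field", "2006-06-01"),     # e.g. a date field, invisible to the checker
    ("text", "this are wrong."),
]
view = FlatView(portions)
error_start = view.text.index("this are")
print(view.text)
print(view.locate(error_start))
```

The checker reports plain positions in `view.text`; the caller translates them back into document terms before anything is marked or changed. The table is only valid for the snapshot it was built from, so it has to be rebuilt when the text changes.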

OK, we have to figure out something. I think I will have a better idea when coding =).

> > I think the safe way of implementing changes is by showing dialogs. Even
> > in automatic checking, it just shows the mistakes; a user has to right
> > click on one and a dialog appears. Have you figured out another way to do it?

I don't get what you mean here by showing a dialog even for automatic
checking. As I see it, the whole point of automatic checking is
to not disrupt the user's workflow by raising dialogs.
Of course there should be the context menu if you right-click on the
wrong text.

Imagine the grammar checker made a mistake: so that it does not simply change the sentence without any notice, the user must permit the change.
