Thanks, Pablo, for your points.

On 7 April 2016 at 11:07, Pablo Saratxaga <pa...@walon.org> wrote:

> Li Wed, Apr 06, 2016 at 02:55:52PM +0200, Juan Martorell scrijha:
>
> >    It is quite common to attach some pronouns to the verb thus including
> >    information about direct and/or indirect object, or passive/impersonal
> >    voice. Combinations are huge, some like:
>
> but for a proper synthesis, the verb itself has to be correctly
> tagged first, so as to know if a pronoun can be added, or if two can be added.
>
> blind automatic generation will lead to a huge mass of incorrect forms.
>

I'd like to see how huge it will be. Some rare or infrequent forms will
surely do no harm, but I'm not sure how harmful some incorrect forms can be
for our purpose.


> For example "morirteme" would be wrong.
>

This counterexample is not the best, IMHO. Google gives some results. The
correct spelling includes a diacritic (*morírteme*), but the point is that
this word, even if rare, is grammatically correct and makes full sense in
its context.

Consider: "*Por muy enfermo que estés, ni se te ocurra morírteme ahora.*"
"*Con el viento que hace, si sales así vestido vas a morírteme de frío. Ponte
una chaqueta, anda.*"


>
> With the prefixes it is even more difficult, as the adequacy of
> a prefix depends not only on grammatical properties, but also on
> meaning and usage.
>
> For example, while desforestar, deshacer are ok; desmorir, descaer are odd.
> I think automatic use of prefixes (that is, add them to *ALL* verbs) would
> be wrong.
>

Agreed to some extent. Even though "*desforestar*" is valid, "*deforestar*"
is the preferred spelling. "*descaer*" is in the RAE's dictionary, with
"*decaer*" being the most used spelling.
Following your example, even though *desmorir* is not in the RAE's
dictionary, it may be a neologism whose figurative content conveys meaning
to the reader. I mean, in religious or philosophical texts, *desmorir* (to
*undie*) can make sense when talking about alternative timelines, where
"*resucitar*" (to resurrect) fits worse, being active and losing the
undo sense of *undying*.

Bottom line, the point of grammar proofreading is syntax rather than
spelling or semantics, so it would be worth allowing some flexibility
while giving a mild warning for rare forms. This setting may be tuned via
category activation.
This is a good case for statistical insertion:

   1. produce the word
   2. check upon the word database created from a large corpus
   3. decide its insertion based on its frequency
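
The three steps above could be sketched roughly as follows in Python. This is
only an illustration: the toy corpus, the function names (`generate`,
`decide`) and the `accept_at` threshold are all made up for the example, and
real generation would need proper morphology instead of plain concatenation.

```python
from collections import Counter

# Hypothetical toy corpus; a real run would use counts from a large corpus.
CORPUS = "no vayas a morirte ahora . quiero deshacer el nudo . deshacer es facil".split()
FREQ = Counter(CORPUS)


def generate(prefix, verb):
    # Step 1: produce the word (naive concatenation, no morphology).
    return prefix + verb


def decide(word, freq=FREQ, accept_at=2):
    # Steps 2 and 3: look the word up in the corpus counts and
    # decide on insertion based on its frequency.
    count = freq[word]
    if count >= accept_at:
        return "accept"
    if count > 0:
        return "rare"  # insert, but flag with a mild warning
    return "reject"


for verb in ("hacer", "morir"):
    word = generate("des", verb)
    print(word, decide(word))
```

The "rare" outcome is where the mild warning mentioned above would fit; the
threshold would have to be tuned against the real corpus.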



> My approach would be to define some tags to apply to verbs (nouns, etc)
> that can accept a given prefix.
>

This is compatible with statistical insertion, IMO.
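
A rough sketch of how the two could be combined, in Python. Everything here
is hypothetical (the tag names such as `PFX_DES`, the tiny lexicon, the
`classify` function and its `min_count` threshold); it only shows tags being
checked first, with corpus frequency as a fallback for untagged forms.

```python
# Hypothetical tag lexicon: each verb lists the prefix tags it accepts.
VERB_TAGS = {
    "hacer": {"PFX_DES", "PFX_RE"},
    "forestar": {"PFX_DE", "PFX_DES"},
    "morir": set(),
}
# Hypothetical mapping from tag to the prefix it licenses.
PREFIXES = {"PFX_DES": "des", "PFX_DE": "de", "PFX_RE": "re"}


def tagged_forms(verb):
    # All prefixed forms licensed by the verb's tags.
    return sorted(PREFIXES[t] + verb for t in VERB_TAGS.get(verb, ()))


def classify(prefix, verb, corpus_count=0, min_count=1):
    word = prefix + verb
    if word in tagged_forms(verb):
        return "tagged"    # explicitly allowed by a tag
    if corpus_count >= min_count:
        return "attested"  # not tagged, but seen in the corpus
    return "unknown"       # neither tagged nor attested


print(tagged_forms("hacer"))
print(classify("des", "morir", corpus_count=3))
```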
------------------------------------------------------------------------------
_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel
