2014-06-06 20:45 GMT+02:00 Juan Martorell <juan.martor...@gmail.com>:

>
> *1st and foremost: disambiguator:*
>
> My current strategy for disambiguation is to start with the longest
> constructions and then work down to the two-token constructions. Positive
> and negative examples should be included.
>

I can point out some strategies for disambiguation.  I will try to make a
summary.
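
As a starting point, the longest-first idea quoted above could be sketched
roughly like this. This is only an illustration in Python, not how the
LanguageTool disambiguator works internally; the function name, the pattern
table and the labels are all made up for the example.

```python
def match_longest_first(tokens, patterns):
    """Scan left to right, trying the longest construction first.

    `patterns` maps lowercase token tuples to a label. Both the table and
    the labels are hypothetical; they are not LanguageTool data.
    Returns a list of (start_index, length, label) tuples.
    """
    max_len = max((len(p) for p in patterns), default=0)
    matches = []
    i = 0
    while i < len(tokens):
        # Try the longest window that still fits, then shrink down to 2 tokens.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            key = tuple(t.lower() for t in tokens[i:i + n])
            if key in patterns:
                matches.append((i, n, patterns[key]))
                i += n  # skip past the matched construction
                break
        else:
            i += 1  # nothing matched at this position
    return matches


# Hypothetical pattern table, for illustration only.
EXAMPLE_PATTERNS = {
    ("a", "pesar", "de"): "PREP_LOCUTION",
    ("a", "pesar"): "SHORTER_CANDIDATE",
    ("sin", "embargo"): "CONJ_LOCUTION",
}
```

With this table, "A pesar de todo" is matched by the three-token
construction before the two-token one gets a chance, which is the point of
going from long to short. A sentence with none of the constructions gives an
empty result, which covers the negative example.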



> *2nd stage: Dictionary*
>
> Some pronouns are attached to verbs and they need to be identified to get
> a correct POS tag.
>

Take a look at the Catalan tokenizer; you need to do something similar. In
Spanish, you also need to remove diacritics (accents) from the verb once the
pronouns are split off (e.g. "dámelo" is "da" + "me" + "lo"), which is not
needed in Catalan.
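
To make the idea concrete, here is a minimal sketch in Python. It is not the
Catalan tokenizer's code and not a LanguageTool API: the pronoun list is a
partial, illustrative one, `known_verb_forms` stands in for a dictionary
lookup, and the limit of two pronouns is an assumption. It also ignores verb
forms that carry an accent of their own.

```python
import unicodedata

# Illustrative subset of Spanish enclitic pronouns (an assumption, not a full list).
CLITICS = ("me", "te", "se", "lo", "la", "le", "nos", "os", "los", "las", "les")


def strip_acute(text):
    """Remove acute accents but keep other marks, such as the tilde in 'ñ'."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch != "\u0301")
    return unicodedata.normalize("NFC", kept)


def split_clitics(word, known_verb_forms):
    """Return (verb, [pronouns]) if `word` is a known verb form followed by
    one or two enclitic pronouns, otherwise None.

    `known_verb_forms` is a hypothetical stand-in for a dictionary lookup.
    """
    def search(stem, found):
        base = strip_acute(stem)
        if found and base in known_verb_forms:
            return base, found
        if len(found) < 2:  # assumed limit of two attached pronouns
            for clitic in CLITICS:
                if stem.endswith(clitic) and len(stem) > len(clitic):
                    result = search(stem[:-len(clitic)], [clitic] + found)
                    if result:
                        return result
        return None

    return search(word.lower(), [])
```

For example, with a toy verb set containing "da" and "comprando",
split_clitics("dámelo", ...) yields "da" plus "me" and "lo", and
"comprándolo" yields "comprando" plus "lo". A noun such as "palo" is left
alone because "pa" is not in the verb set.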

Regards,
Jaume Ortolà