On 5 April 2016 at 16:29, Jaume Ortolà i Font <jaumeort...@gmail.com> wrote:
> 2014-06-06 20:45 GMT+02:00 Juan Martorell <juan.martor...@gmail.com>:
>
>> *1st and foremost: disambiguator:*
>>
>> My current strategy for disambiguation is to start with the longer
>> constructions and then work down to the two-token constructions. Positive
>> and negative examples should be included.
>
> I can point out some strategies for disambiguation. I will try to make a
> summary.

That's a great opportunity for Wiki improvement!

>> *2nd stage: Dictionary*
>>
>> Some pronouns are attached to verbs and they need to be identified to get
>> a correct POS tag.
>
> Take a look at the Catalan tokenizer. You need to do something similar. In
> Spanish, you also need to remove diacritics (accents), which is not needed
> in Catalan.

I will. I'm sure it will be very helpful. I'll report my findings here.
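
To make the pronoun problem concrete, here is a rough Python sketch of the idea as I understand it: peel enclitic pronouns off the end of a token and, because the written accent often exists only on account of the attached pronouns, also try the stem with the accent removed. This is not the Catalan tokenizer's code and not any LanguageTool API. The names `strip_accents` and `split_clitics` and the tiny `verb_forms` set are hypothetical stand-ins for a real dictionary lookup.

```python
# Closed set of Spanish enclitic pronouns.
CLITICS = ("me", "te", "se", "nos", "os", "lo", "la", "le", "los", "las", "les")

# Map accented vowels to plain ones; ñ and ü are deliberately left alone.
_ACCENTS = str.maketrans("áéíóúÁÉÍÓÚ", "aeiouAEIOU")


def strip_accents(word):
    """Remove acute accents from vowels, e.g. 'dá' -> 'da'."""
    return word.translate(_ACCENTS)


def split_clitics(token, verb_forms, max_clitics=3):
    """Return (verb_form, [clitics]) or None if no split yields a known verb.

    verb_forms is an assumed stand-in for a dictionary lookup: a set of
    lowercase verb forms written without any attached pronouns.
    """

    def search(stem, clitics):
        # Accept a stem only after at least one clitic has been removed.
        if clitics:
            for candidate in (stem, strip_accents(stem)):
                if candidate in verb_forms:
                    return candidate, clitics
        if len(clitics) == max_clitics:
            return None
        # Try every clitic that could end the current stem, backtracking
        # when a choice leads nowhere (e.g. 'os' versus 'los').
        for clitic in CLITICS:
            if stem.endswith(clitic) and len(stem) > len(clitic):
                found = search(stem[: -len(clitic)], [clitic] + clitics)
                if found:
                    return found
        return None

    return search(token.lower(), [])
```

With a toy set such as `{"da", "diciendo", "comprar"}`, `split_clitics("dámelo", ...)` gives `("da", ["me", "lo"])`, and a word like "hola" is left alone because "ho" is not a known verb form. The lookup against known verb forms is what keeps ordinary words that merely end in "la" or "os" from being split.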
------------------------------------------------------------------------------
_______________________________________________
Languagetool-devel mailing list
Languagetool-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/languagetool-devel