2014-06-06 20:45 GMT+02:00 Juan Martorell <juan.martor...@gmail.com>:

> *1st and foremost: disambiguator:*
>
> My current strategy for disambiguation is starting by the longer
> constructions and then downsizing to the two tokens constructions. Positive
> and negative examples should be included.

I can point out some strategies for disambiguation. I will try to make a
summary.

> *2nd stage: Dictionary*
>
> Some pronouns are attached to verbs and they need to be identified to get
> a correct POS tag.

Take a look at the Catalan tokenizer. You need to do something similar. In
Spanish, you also need to remove diacritics (accents), which is not needed in
Catalan.

Regards,
Jaume Ortolà
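
To illustrate the dictionary point, here is a minimal Python sketch of the idea: peel the attached pronouns off the end of the word, remove the acute accent from what is left, and look that up as a verb form. This is not the Catalan tokenizer's code or any LanguageTool API. The names `ENCLITICS`, `strip_acute`, `split_enclitics` and the set `known_verb_forms` are hypothetical and stand in for a real dictionary lookup. Forms that also drop a letter (e.g. "vámonos", "sentaos") are not handled.

```python
# Illustrative list of Spanish enclitic pronouns (an assumption, not taken
# from any LanguageTool resource).
ENCLITICS = ("me", "te", "se", "nos", "os", "lo", "la", "le", "los", "las", "les")

# Maps accented vowels to plain vowels; leaves ñ and ü untouched.
_ACUTE = str.maketrans("áéíóúÁÉÍÓÚ", "aeiouAEIOU")


def strip_acute(word):
    """Remove acute accents from vowels."""
    return word.translate(_ACUTE)


def split_enclitics(word, known_verb_forms, max_pronouns=2):
    """Return (verb_form, [pronouns]) if the word is verb + enclitics, else None.

    known_verb_forms is a hypothetical stand-in for a dictionary lookup.
    """
    candidates = [(word.lower(), [])]
    for _ in range(max_pronouns):
        next_candidates = []
        for stem, pronouns in candidates:
            for p in ENCLITICS:
                # Peel one pronoun off the end, keeping a non-empty stem.
                if stem.endswith(p) and len(stem) > len(p):
                    next_candidates.append((stem[:-len(p)], [p] + pronouns))
        for stem, pronouns in next_candidates:
            # The accent is only there because pronouns were attached,
            # so drop it before the lookup.
            base = strip_acute(stem)
            if base in known_verb_forms:
                return base, pronouns
        candidates = next_candidates
    return None


known_verb_forms = {"da", "comprando", "decir"}
for w in ("dámelo", "comprándolo", "decirles", "caballo"):
    print(w, split_enclitics(w, known_verb_forms))
```

The lookup against known verb forms is what keeps an ordinary word such as "caballo" from being split just because it ends in "lo".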