> Spanish POS + lemmatizer using this approach.
>>
>
> +1, it would be nice to have control over the dictionary; maybe we can
> come up with a format to store it in. That would allow us to easily
> include it in our models as a resource for feature generation and would
> eliminate the dependency on external libraries.

That would be great! The format should then take into account
morphological features.
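
To make the idea concrete, here is a minimal sketch of what such a format
could look like. Everything in it is an assumption for illustration only: a
tab-separated layout of word, tag, and lemma, the function names
`load_dictionary` and `lemmatize`, and the EAGLES-style Spanish tags in the
sample. The tag column is where the morphological features would live, so
that the same surface form can map to different lemmas.

```python
from typing import Dict, Optional, Tuple

# Hypothetical format (not an agreed one): "word<TAB>tag<TAB>lemma" per line.
# Blank lines and lines starting with '#' are ignored.
SAMPLE = """\
# word\ttag\tlemma
casas\tNCFP000\tcasa
canto\tVMIP1S0\tcantar
canto\tNCMS000\tcanto
"""

Entries = Dict[Tuple[str, str], str]


def load_dictionary(text: str) -> Entries:
    """Parse the tab-separated text into a (word, tag) -> lemma mapping."""
    entries: Entries = {}
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 3:
            raise ValueError(f"line {line_no}: expected 3 fields, got {len(fields)}")
        word, tag, lemma = fields
        # Lowercase the surface form so lookups are case-insensitive.
        entries[(word.lower(), tag)] = lemma
    return entries


def lemmatize(entries: Entries, word: str, tag: str) -> Optional[str]:
    """Return the lemma for (word, tag), or None if the pair is unknown."""
    return entries.get((word.lower(), tag))
```

With this layout, looking up "canto" with a verb tag and with a noun tag
gives two different lemmas, which is the reason for keying on the tag and
not on the word alone.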


>> Of course, another method would be to re-implement John Carroll and
>> colleagues' finite-state approach for English (and similar rule-based
>> approaches for other languages) which removes the dependence on a
>> dictionary. I will be exploring this further on.
>>
>
> +1
>
> We should define an interface which allows us to use different
> implementations, like we did for the other components.

+1. It seems that we have European languages represented here. Do we have
anybody working on Eastern languages, such as Chinese? It would be nice to
check those too.
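
As a rough illustration of the common interface discussed above, the sketch
below puts a dictionary-backed and a rule-based implementation behind one
abstract class. All names here (`Lemmatizer`, `DictionaryLemmatizer`,
`SuffixRuleLemmatizer`) are hypothetical, and the suffix rules are a toy
stand-in: they are not the finite-state approach by Carroll and colleagues,
only a placeholder showing that callers would not need to know which
implementation they are using.

```python
from abc import ABC, abstractmethod
from typing import Dict, Sequence, Tuple


class Lemmatizer(ABC):
    """Hypothetical common interface: map a (word, tag) pair to a lemma."""

    @abstractmethod
    def lemmatize(self, word: str, tag: str) -> str:
        ...


class DictionaryLemmatizer(Lemmatizer):
    """Looks the pair up in a (word, tag) -> lemma mapping."""

    def __init__(self, entries: Dict[Tuple[str, str], str]) -> None:
        self._entries = entries

    def lemmatize(self, word: str, tag: str) -> str:
        # Fall back to the surface form when the pair is not in the dictionary.
        return self._entries.get((word.lower(), tag), word)


class SuffixRuleLemmatizer(Lemmatizer):
    """Toy rule-based variant: the first matching suffix rule wins."""

    # (tag prefix, suffix to strip, replacement); illustrative English rules.
    DEFAULT_RULES: Sequence[Tuple[str, str, str]] = (
        ("NNS", "ies", "y"),
        ("NNS", "s", ""),
        ("VBG", "ing", ""),
    )

    def __init__(self, rules: Sequence[Tuple[str, str, str]] = DEFAULT_RULES) -> None:
        self._rules = rules

    def lemmatize(self, word: str, tag: str) -> str:
        for tag_prefix, suffix, replacement in self._rules:
            if (tag.startswith(tag_prefix)
                    and word.endswith(suffix)
                    and len(word) > len(suffix)):
                return word[: -len(suffix)] + replacement
        return word
```

Code that consumes lemmas would depend only on `Lemmatizer`, so a
dictionary for one language and a rule set for another could be swapped
without touching the caller.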
