Thanks

Yes, I have done this already, and it works great. The strength of a hybrid POS
tagger is that you can control exceptional tagging cases and correct output
mistakes. It would be cool if the API had support for this.

Radu



On Mar 15, 2011, at 12:44 PM, Jörn Kottmann <[email protected]> wrote:

> On 3/13/11 12:54 AM, Radu Simionescu wrote:
>> Hello
>> 
>> I am building a POS tagger for Romanian for my dissertation. I want to be
>> able to restrict the outcomes even more than just using a dictionary. I
>> want to use some rules for disambiguation, based on the context. This would
>> allow me to use a smaller corpus, and also to fix consistent output mistakes.
>> 
>> So I want to be able to give the POS tagger the possible set of outcomes for
>> each word of the input separately. Since training a model doesn't really
>> use the POS dictionary, I figured I could do this by making small
>> modifications to the API, because the dictionary can change from one
>> sentence or word to the next. Please let me know if I am wrong.
>> 
> 
> There is no out-of-the-box support for this, but I believe it should be easy
> to implement. All you need to do is write a custom sequence validator that
> does what you described above.
> 
> Just have a look at the POSTaggerME class; you need to modify the constructor
> to give it a custom feature generator. We should open a Jira issue and extend
> our API to pass in a sequence validator object.
> 
> Jörn
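
For illustration, the per-token restriction discussed above could be sketched
roughly as follows. This is a minimal sketch, not the OpenNLP API: the
TagSequenceValidator interface is a hypothetical stand-in declared locally and
modeled on opennlp.tools.util.SequenceValidator, and PerTokenTagValidator, the
demo class, and the tag names (DT, NN, VB) are assumptions made up for the
example. It assumes the beam search calls the validator once per candidate tag
and drops candidates for which it returns false.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in modeled on opennlp.tools.util.SequenceValidator;
// declared locally so the sketch compiles without OpenNLP on the classpath.
interface TagSequenceValidator {
    boolean validSequence(int i, String[] inputSequence,
                          String[] outcomesSequence, String outcome);
}

// Restricts each token position to its own set of allowed tags, so the
// "dictionary" can differ from one word (and one sentence) to the next.
final class PerTokenTagValidator implements TagSequenceValidator {
    private final List<Set<String>> allowedTags;

    PerTokenTagValidator(List<Set<String>> allowedTags) {
        this.allowedTags = allowedTags;
    }

    @Override
    public boolean validSequence(int i, String[] inputSequence,
                                 String[] outcomesSequence, String outcome) {
        // Positions without an entry are left unrestricted.
        if (i >= allowedTags.size()) {
            return true;
        }
        Set<String> allowed = allowedTags.get(i);
        // A null or empty set also means "no restriction for this token".
        return allowed == null || allowed.isEmpty() || allowed.contains(outcome);
    }
}

final class PerTokenTagValidatorDemo {
    public static void main(String[] args) {
        String[] tokens = {"the", "run"};
        // Token 0 may only be tagged DT; token 1 is unrestricted.
        List<Set<String>> allowed = Arrays.asList(
                new HashSet<String>(Arrays.asList("DT")),
                Collections.<String>emptySet());
        TagSequenceValidator validator = new PerTokenTagValidator(allowed);

        String[] noTagsYet = new String[0];
        System.out.println(validator.validSequence(0, tokens, noTagsYet, "DT"));
        System.out.println(validator.validSequence(0, tokens, noTagsYet, "NN"));
        System.out.println(validator.validSequence(1, tokens, new String[] {"DT"}, "VB"));
    }
}
```

Context-based disambiguation rules could be layered on the same hook by also
inspecting outcomesSequence, which holds the tags already chosen for the
preceding tokens.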
