Re: Tokenizer issue - Quotation marks

James Kosin Tue, 22 Feb 2011 19:53:21 -0800

On 2/22/2011 8:18 AM, Rohana Rajapakse wrote:
> Hi,
>
> I am using OpenNLP-1.5 to tokenize text. I tried the text: The army had
> made "mistakes". It gives me "mistakes as a token (note that the opening
> quote is part of the token). But if I change the word mistakes to Mistakes
> (i.e. capital M) in the input text, then I get the token Mistakes
> (correctly).
>
> Is anyone aware of this issue, and any idea how to work around it?
>


Can you see if this model will work for you?
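
If the model does not help, one possible workaround is to pad the quotation marks with spaces before the text reaches the tokenizer, so the quote can no longer be glued to the following word. The sketch below is only an illustration under that assumption: padQuotes and whitespaceSplit are hypothetical helper names, not part of OpenNLP, and the whitespace split merely stands in for the real tokenizer call so the example runs without a model file.

```java
import java.util.Arrays;
import java.util.List;

public class QuotePadding {

    // Hypothetical helper (not an OpenNLP API): put a space on each side of
    // every double quote, then collapse runs of whitespace to a single space.
    static String padQuotes(String text) {
        return text.replaceAll("\"", " \" ")
                   .replaceAll("\\s+", " ")
                   .trim();
    }

    // Stand-in for the real tokenizer: split the padded text on spaces.
    // In real use the padded string would be handed to the OpenNLP tokenizer.
    static List<String> whitespaceSplit(String text) {
        return Arrays.asList(padQuotes(text).split(" "));
    }

    public static void main(String[] args) {
        // The sentence from the original report.
        System.out.println(whitespaceSplit("The army had made \"mistakes\"."));
    }
}
```

With this padding, the opening and closing quotes come out as separate tokens from mistakes. Note that padding changes character offsets, so any spans returned by the tokenizer would refer to the padded string and not to the original input.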

Thanks,
James
