Hi James,

It works for double quotes, but not for single quotes (i.e. it fails for 'mistakes'). Is it a training issue, then (the training data not having cases with words enclosed within single/double quotes)?
I have noticed that your model file is much smaller than the model file available for download. Is that because your training data set is smaller? How does that affect tokenization overall? Are there training sets available to download?

Regards,
Rohana

-----Original Message-----
From: James Kosin [mailto:[email protected]]
Sent: 23 February 2011 11:26
To: [email protected]
Subject: Re: Tokenizer issue - Quotation marks

On 2/23/2011 3:23 AM, Rohana Rajapakse wrote:
> Thanks for the replies.
>
> Which model are you referring to? Where can I find it?
>
> Thanks
>
> Rohana
>
Sorry, I thought I responded directly to your email with the attachment. I sent it again offline.

James
