On 2/24/11 12:10 PM, Rohana Rajapakse wrote:
Thanks a lot.
I did create a name finder training data file using CoNLL some time ago. Will
have a look at how I did it. I may be able to convert this training file to
produce a tokenizer training file.
Thanks a lot. Will let you know how I get on with this. Would contribute
anything that might be useful to others.
English detokenizer rules would be helpful for many, so it would be nice if
you could contribute yours.
There is a general one you can use to start with in
opennlp-tools/src/test/resources/opennlp/tools/latin-detokenizer.xml
The name finder training data you have should be good enough to start
with, depending on how it's tokenized.
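
A rough sketch of such a conversion, in Python for brevity. It assumes the
name finder file has one whitespace-tokenized sentence per line with
<START:type> ... <END> span tags, and that the tokenizer training file marks
token boundaries without whitespace using <SPLIT>. The function name
to_tokenizer_line and the two small punctuation sets are hypothetical and only
for illustration; real data would need proper detokenizer rules (quotes,
hyphens, and so on).

```python
import re

# Span tags used in name finder training data, e.g. <START:person> and <END>.
TAG = re.compile(r"^<(START(:[^>]+)?|END)>$")

# Hypothetical, deliberately small rule sets (assumptions, not the real rules):
# tokens that attach to the previous token, and tokens the next token attaches to.
ATTACH_LEFT = {".", ",", ":", ";", "!", "?", ")", "]", "}", "%"}
ATTACH_RIGHT = {"(", "[", "{", "$"}


def to_tokenizer_line(name_finder_line):
    """Turn one name finder training line into one tokenizer training line."""
    # Drop the name span tags; only the tokens themselves are needed.
    tokens = [t for t in name_finder_line.split() if not TAG.match(t)]
    out = []
    for i, tok in enumerate(tokens):
        if i > 0:
            prev = tokens[i - 1]
            if tok in ATTACH_LEFT or prev in ATTACH_RIGHT:
                # Token boundary with no whitespace in the original text.
                out.append("<SPLIT>")
            else:
                # Ordinary whitespace-separated boundary.
                out.append(" ")
        out.append(tok)
    return "".join(out)


if __name__ == "__main__":
    line = "<START:person> Pierre Vinken <END> , 61 years old , will join the board ."
    print(to_tokenizer_line(line))
```

How good the result is depends entirely on how the original data was
tokenized, so the output should be spot-checked before training on it.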
Jörn