Re: Need en-sent.train file

Jörn Kottmann Tue, 12 Feb 2013 05:02:29 -0800

On 02/08/2013 06:46 PM, Surendra wrote:

Hi,
I am a postgraduate student in computer science. I am working on sentence 
boundary detection for a local Indian language. Could you please provide the 
format of the training file and a sample file like en-sent.train, which would be 
helpful for me in creating a model?

The sentence detector training data used to train the en-sent.bin model is not Open Source. The easiest way to get training data is to get a corpus and just extract the sentences for the training. There are a couple of freely or cheaply available corpora which could be used. Some are already supported by OpenNLP; have a look at the manual.
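
As a rough sketch of the extraction step, assuming the corpus has already been split into documents and sentences, and assuming the plain-text layout described in the OpenNLP manual (one sentence per line, an empty line marking a document boundary), the sentences could be written out like this. The function name and the sample data are hypothetical and only illustrate the layout; they are not part of OpenNLP.

```python
from typing import Iterable, List


def to_sentence_train_format(documents: Iterable[List[str]]) -> str:
    """Render documents as one sentence per line, with a blank line between documents."""
    blocks = []
    for sentences in documents:
        # Collapse internal whitespace so each sentence occupies exactly one line.
        lines = [" ".join(sentence.split()) for sentence in sentences]
        # Drop sentences that were empty or whitespace-only.
        lines = [line for line in lines if line]
        if lines:
            blocks.append("\n".join(lines))
    if not blocks:
        return ""
    # A single empty line separates documents; the file ends with a newline.
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Hypothetical sample corpus: two short documents.
    corpus = [
        ["This is the first sentence.", "This is the second one."],
        ["A new document starts here."],
    ]
    print(to_sentence_train_format(corpus), end="")
```

The resulting text can be saved as a UTF-8 file and passed to the OpenNLP sentence detector trainer; the manual documents the exact invocation and the corpus formats that are supported directly.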


Jörn
