On 09/01/13 08:40, ali koyuncu wrote:
And then I will save them in a binary file
such as en-sent.bin. Is that right? (InputStream modelIn = new FileInputStream("en-sent.bin");)

No, this is not accurate. You have not really read the documentation, have you? First of all, you need to train a model, and this is why you need the data (correctly identified sentences). Refer to the 'Sentence Detector Training' section here:
http://opennlp.apache.org/documentation/1.5.2-incubating/manual/opennlp.html#tools.sentdetect.detection

So the procedure goes something like this:

1. find or construct the data you need (turk-sent.train) *
2. train the sentence detector
3. save the trained model (turk-sent.bin)
4. finally use it

There is documentation and code snippets on the above website; make sure you read it first. If I were you, I would copy some news articles from a Turkish newspaper and do the sentence splitting by eye in order to produce the training set.

* For English, the data would look something like this (one sentence per line):

Pierre Vinken, 61 years old, will join the board as a nonexecutive director Nov. 29.
Mr. Vinken is chairman of Elsevier N.V., the Dutch publishing group.
Rudolph Agnew, 55 years old and former chairman of Consolidated Gold Fields PLC, was named a director of this British industrial conglomerate.
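
If you split the sentences by hand, a small helper can write them out in the one-sentence-per-line layout shown above. This is only a rough sketch using plain Java, not anything from OpenNLP: the class name TrainFileWriter and the method toTrainingText are made up for illustration, and the English sentences are placeholders for your own Turkish ones.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of OpenNLP): turns hand-split sentences
// into training text with exactly one sentence per line.
class TrainFileWriter {

    // Collapses line breaks and runs of whitespace inside each sentence,
    // so a sentence that was wrapped in the source text stays on one line.
    // Blank entries are skipped.
    static String toTrainingText(List<String> sentences) {
        StringBuilder sb = new StringBuilder();
        for (String s : sentences) {
            String oneLine = s.trim().replaceAll("\\s+", " ");
            if (!oneLine.isEmpty()) {
                sb.append(oneLine).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Placeholder data; replace with sentences you split yourself.
        List<String> sentences = Arrays.asList(
            "Pierre Vinken, 61 years old, will join the board as a nonexecutive director\nNov. 29.",
            "Mr. Vinken is chairman of Elsevier N.V., the Dutch publishing group.");

        // File name taken from step 1 of the procedure; written as UTF-8.
        Path out = Paths.get("turk-sent.train");
        Files.write(out, toTrainingText(sentences).getBytes(StandardCharsets.UTF_8));
    }
}
```

Writing the file as UTF-8 is an assumption on my part; whatever encoding you pick, use the same one when you train.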


Hope that helps,

Jim
