Re: sentence detector newline behavior

Jörn Kottmann Fri, 24 Jan 2014 06:33:43 -0800

On 01/23/2014 10:06 PM, Tim Miller wrote:

Just an FYI, a while back I did some of these annotations myself on MIMIC to get around this issue. I replaced the newline character with a special (non-English) character, then pre-processed ctakes input to replace newlines with that character, then did sentence detection, then added the newlines back in. I would be happy to share these annotations and my code modifications.

I would be really happy to get access to your annotations so I can test the new line support in OpenNLP with it.

Instead of a special char you would now have to use tags (<CR> and <LF>) to encode the new lines in the training data. The tags only need to be inserted into the training data; for the actual sentence detection the document string can be passed in as it is.
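
As a rough illustration of the encoding step, here is a minimal Python sketch. The function names are hypothetical and not part of OpenNLP or cTAKES, and it assumes the training file holds one sentence per line and that the tags are written inline exactly where the line break occurred; the exact placement the trainer expects should be checked against the OpenNLP sources.

```python
def encode_newlines(sentence: str) -> str:
    """Replace raw line-break characters with the <CR> and <LF> tags.

    Hypothetical helper: only the tag names come from the message above.
    """
    return sentence.replace("\r", "<CR>").replace("\n", "<LF>")


def to_training_lines(sentences: list[str]) -> str:
    """Join annotated sentences into training text, one sentence per line.

    Line breaks inside a sentence survive as tags, so the only real
    newlines left in the output are the sentence separators.
    """
    return "\n".join(encode_newlines(s) for s in sentences)


if __name__ == "__main__":
    annotated = ["BP 120/80\r\nHR 72", "Patient is stable."]
    print(to_training_lines(annotated))
```

Only the training data goes through this step; at detection time the raw document string would be passed in unchanged.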


Jörn
