OK, got it! I see what you mean. In a nutshell:

You want the output of the AggregateNameFinder to be usable as training data, for example to train a separate model. That would indeed require output with no nested SGML tags. I was only thinking about the evaluation aspect.

Jim



On 17/04/12 16:00, Jim - FooBar(); wrote:
Could you name a few of the "applications" that cannot deal with overlapping/intersecting spans?

I am still struggling to see why two individual but overlapping Span objects would cause problems. They are completely independent and will be processed serially. Each one has its own start offset and its own end offset, and each prediction will be either correct or wrong.
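To make the point concrete, here is a minimal sketch of what I mean by "overlapping". The Span class below is a hypothetical stand-in that I wrote for illustration, not OpenNLP's own class, and I am assuming token offsets with an exclusive end. Each span can be scored on its own; overlap is only a property of a pair of spans.

```java
public class SpanOverlapSketch {

    // Hypothetical stand-in for a name finder span (not the OpenNLP class).
    // Offsets are token indices; end is exclusive.
    public static final class Span {
        public final int start;
        public final int end;
        public final String type;

        public Span(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }

        // True when the two spans share at least one token.
        public boolean overlaps(Span other) {
            return start < other.end && other.start < end;
        }
    }

    public static void main(String[] args) {
        Span a = new Span(0, 3, "person");
        Span b = new Span(2, 5, "organization");
        Span c = new Span(5, 6, "location");

        System.out.println(a.overlaps(b)); // a and b share token 2
        System.out.println(b.overlaps(c)); // b ends where c starts, no shared token
    }
}
```

Nothing here stops a and b from being evaluated one after the other, which is why I did not see a problem on the evaluation side.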

Unless, of course, you're referring to the actual annotation that comes back on screen when you deploy any name finder. Hmm, that is probably what you meant.

Well, in that case the problem is not that overlapping spans mess up our evaluation statistics, but rather keeping the output compatible with the format the trainer expects, meaning: no nested SGML tags and no characters immediately before or after an SGML tag.
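As a rough sketch of that constraint (my own illustration, not code from OpenNLP): assuming the trainer expects the whitespace-separated `<START:type> ... <END>` layout, the output has to drop overlapping spans before writing the tags. The Span class, dropOverlapping and render below are hypothetical names; I am also assuming token offsets with an exclusive end, and an arbitrary policy of keeping the earliest span and preferring the longer one on ties.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TrainingFormatSketch {

    // Hypothetical stand-in for a name finder span; token offsets, end exclusive.
    public static final class Span {
        public final int start;
        public final int end;
        public final String type;

        public Span(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }
    }

    // Keeps the earliest span (the longer one on equal starts) and drops
    // empty spans and any span that overlaps one already kept.
    public static List<Span> dropOverlapping(List<Span> spans) {
        List<Span> sorted = new ArrayList<>(spans);
        sorted.sort((x, y) -> x.start != y.start
                ? Integer.compare(x.start, y.start)
                : Integer.compare(y.end, x.end));
        List<Span> kept = new ArrayList<>();
        int lastEnd = 0;
        for (Span s : sorted) {
            if (s.end > s.start && s.start >= lastEnd) {
                kept.add(s);
                lastEnd = s.end;
            }
        }
        return kept;
    }

    // Writes tokens with tags as separate whitespace-delimited items, so no
    // tag is nested and no character touches a tag directly.
    // Assumes every span lies within the token array.
    public static String render(String[] tokens, List<Span> spans) {
        List<Span> kept = dropOverlapping(spans);
        StringBuilder out = new StringBuilder();
        int next = 0; // index of the next kept span to open or close
        for (int i = 0; i < tokens.length; i++) {
            if (next < kept.size() && kept.get(next).start == i) {
                out.append("<START:").append(kept.get(next).type).append("> ");
            }
            out.append(tokens[i]);
            if (next < kept.size() && kept.get(next).end == i + 1) {
                out.append(" <END>");
                next++;
            }
            if (i < tokens.length - 1) {
                out.append(' ');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] tokens = {"John", "Smith", "Jr", "works", "at", "Acme"};
        List<Span> spans = Arrays.asList(
                new Span(0, 2, "person"),
                new Span(1, 3, "person"),        // overlaps the first span, dropped
                new Span(5, 6, "organization"));
        System.out.println(render(tokens, spans));
    }
}
```

Which of two overlapping spans to keep is a policy decision; an aggregate finder could just as well prefer the longest span or the most confident one.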

Is this what you're trying to get across, Jörn?

Jim


On 17/04/12 15:42, Jörn Kottmann wrote:
On 04/17/2012 04:40 PM, Jim - FooBar(); wrote:
By "applications" you mean programs external or unrelated to OpenNLP?

An application which is using the output of the name finder.

Jörn

