Re: Merging the output of multiple name finders

Jim - FooBar(); Tue, 17 Apr 2012 09:23:58 -0700


On 17/04/12 17:07, Jörn Kottmann wrote:

On 04/17/2012 05:30 PM, Jim - FooBar(); wrote:
I think the root of all our problems here is the fact that we're trying to generalise something that was not intended for that purpose. If you remember, before writing the AggregateNameFinder I had similar code in the TokenNameFinderEvaluator class. After your suggestion I moved it to the name-find package, but I had strong objections to doing so because the merging of results should be an evaluation issue only. Now, however, we've dug ourselves a hole... now we're saying to the user "you can use the aggregate finder instead of the individual ones but we will decide on some things for you!"... see? It is no longer an evaluation issue - it has become a separate name-finder that people might use for annotating their corpus, which can lead to strange behaviour if we don't resolve nested tags. We tried to improve the evaluation and we've ended up discussing hard-coded rules as to how such a name-finder must behave...
The whole point of the evaluation is to measure how well a specific
name finder setup performs on some test data. The evaluator should not help
in any way with producing these names, because that is the responsibility of
the name finder. That is one reason.

Another is that people want to run the name finders later, to produce names,
in exactly the same way as they were run in the evaluation. Therefore they must
merge the names in the same manner as was done during evaluation.
But if the logic is hard-coded into the evaluator, it must be duplicated.

For these reasons I think that the merging logic
does not belong in the evaluators.
A separate name finder is indeed a good place for it because it solves the
two points explained above and makes it easily interchangeable with a different
implementation.

+1 from me to implement a simple baseline merger as well.

Jörn

Yes, I get your point. We've discussed this before and generally I do agree... it is just that I hadn't thought that people would use the AggregateNameFinder to produce names. The whole idea from day 1 was to improve the evaluation. Should I go ahead and modify the AggregateNameFinder to sort all the prediction spans according to a comparator that looks at the start offsets (in increasing order) and checks for overlaps?
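
For illustration, here is a minimal sketch of that sort-and-check idea. It uses a hypothetical Span class and merge method, not OpenNLP's own opennlp.tools.util.Span or the actual AggregateNameFinder code. It also assumes one possible overlap policy that the thread has not settled on: the span that starts first wins, and on equal starts the longer span wins.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SpanMergeSketch {

    /** Hypothetical stand-in for a predicted name: token offsets [start, end) plus a type. */
    static final class Span {
        final int start;
        final int end; // exclusive
        final String type;

        Span(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }

        /** True if the two half-open ranges share at least one token. */
        boolean overlaps(Span other) {
            return start < other.end && other.start < end;
        }

        @Override
        public String toString() {
            return "[" + start + ".." + end + ") " + type;
        }
    }

    /**
     * Sorts the predictions by increasing start offset (longer span first on
     * ties) and keeps a span only if it does not overlap the last span kept.
     * Because the kept spans are sorted and disjoint, comparing against the
     * last one is enough.
     */
    static List<Span> merge(List<Span> predictions) {
        List<Span> sorted = new ArrayList<>(predictions);
        sorted.sort(Comparator.comparingInt((Span s) -> s.start)
                .thenComparing(Comparator.comparingInt((Span s) -> s.end).reversed()));

        List<Span> kept = new ArrayList<>();
        for (Span candidate : sorted) {
            if (kept.isEmpty() || !kept.get(kept.size() - 1).overlaps(candidate)) {
                kept.add(candidate);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Pretend these came from three different name finders.
        List<Span> predictions = Arrays.asList(
                new Span(1, 3, "organization"),
                new Span(4, 5, "location"),
                new Span(0, 2, "person"),
                new Span(0, 1, "person"));
        for (Span s : merge(predictions)) {
            System.out.println(s);
        }
    }
}
```

With the sample input in main, the spans starting at 0 and 1 all overlap, so only the longest span starting at 0 survives from that group, together with the location span. A different policy (for example, preferring a particular finder or the highest-probability span) would only change the comparator and the keep/drop test.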

Jim
