Guys, I need to fix the CLI for SimpleTokenizer. Otherwise I have no objections.
Aliaksandr

On Thu, May 3, 2012 at 1:34 PM, jim.foobar <[email protected]> wrote:

> On 03/05/12 12:16, Jörn Kottmann wrote:
>
>> On 05/03/2012 10:58 AM, Jim - FooBar(); wrote:
>>
>>> I can also provide the "AggregateNameFinder" class which takes any
>>> number of name-finders and merges their results in order to get better
>>> evaluation statistics. Internally, it uses the
>>> "NameFinderME.dropOverlappingSpans()" method to get rid of nested
>>> spans, which however does the simplistic thing of keeping the earliest
>>> span (ignoring the type of the span completely). I think being able to
>>> merge results from several name-finders is a killer feature that a lot
>>> of people will appreciate, even if I don't think keeping the earliest
>>> span is sensible when trying to evaluate several finders on multiple
>>> entity types...
>>
>> +1 to implement it based on NameFinderME.dropOverlappingSpans.
>>
>> In my opinion that is still a good baseline. We can come up with more
>> specialized and sophisticated approaches, e.g. based on probabilities
>> and limited to statistical name finders.
>>
>> Jörn
>
> Yes, I agree it is not a bad baseline, but pretty soon we'll have to
> either look at the probabilities (if someone is trying to merge several
> models) or at the actual class of the namefinder that gave a particular
> prediction and reason on that... For example, if a prediction came from
> a dictionary, there is really no point in doubting it is there. It must
> be correct! Anyway, I'd love to see this feature in 1.5.3, and a couple
> of weeks (what William needs) is not that long...
>
> Jim
>
> PS: btw, I've actually been using the aggregate name-finder in my
> private build for almost 3 weeks now... I'm passing it 2 dictionary
> finders of different types and a maxent model that can also predict
> 2 types. Everything works just fine! :)
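
For anyone following the thread, the baseline discussed above (pool the spans from several finders, then keep the earliest span wherever two overlap, ignoring the type) could be sketched roughly like this. It is a standalone illustration and not OpenNLP code: SketchSpan and AggregateSketch are hypothetical names, spans are assumed to be half-open token ranges, and preferring the longer span when two start at the same token is my own assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a typed span over token indices [start, end).
final class SketchSpan {
    final int start;
    final int end;
    final String type;

    SketchSpan(int start, int end, String type) {
        this.start = start;
        this.end = end;
        this.type = type;
    }

    // True if the two half-open ranges share at least one token.
    boolean overlaps(SketchSpan other) {
        return start < other.end && other.start < end;
    }

    @Override
    public String toString() {
        return type + "[" + start + ".." + end + ")";
    }
}

// Hypothetical aggregate: merges the outputs of several finders.
final class AggregateSketch {

    // perFinder holds one list of spans per name finder, all over the same sentence.
    static List<SketchSpan> merge(List<List<SketchSpan>> perFinder) {
        List<SketchSpan> all = new ArrayList<>();
        for (List<SketchSpan> spans : perFinder) {
            all.addAll(spans);
        }

        // Earliest start first; on equal start, the longer span first (assumption).
        all.sort((a, b) -> a.start != b.start
                ? Integer.compare(a.start, b.start)
                : Integer.compare(b.end, a.end));

        // Keep a span only if it does not overlap the last kept one.
        // The span type is deliberately ignored, as in the baseline described.
        List<SketchSpan> kept = new ArrayList<>();
        SketchSpan last = null;
        for (SketchSpan s : all) {
            if (last == null || !s.overlaps(last)) {
                kept.add(s);
                last = s;
            }
        }
        return kept;
    }
}
```

With one finder returning person[0..2) and another returning location[1..3) and person[4..5), the merge keeps person[0..2) and person[4..5). The location span is dropped purely because it starts later, which is exactly the weakness Jim points out. A probability-based or finder-class-based rule would replace the "keep the earliest" check in the loop.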
