Guys, I need to fix the CLI for SimpleTokenizer. Otherwise I have no objections.

Aliaksandr

On Thu, May 3, 2012 at 1:34 PM, jim.foobar <[email protected]> wrote:

> On 03/05/12 12:16, Jörn Kottmann wrote:
>
>> On 05/03/2012 10:58 AM, Jim - FooBar(); wrote:
>>
>>>  I can also provide the "AggregateNameFinder" class which takes any
>>> number of name-finders and merges their results in order to get better
>>> evaluation statistics. Internally, it uses the
>>> "NameFinderME.dropOverlappingSpans()"
>>> method to get rid of nested spans, which however does the simplistic thing
>>> of keeping the earliest span (ignoring the type of the span completely). I
>>> think being able to merge results from several name-finders is a killer
>>> feature that a lot of people will appreciate, even if I don't think keeping
>>> the earliest span is sensible when trying to evaluate several finders on
>>> multiple entity types...
>>>
>>
>> +1 to implement it based on NameFinderME.dropOverlappingSpans.
>>
>> In my opinion that is still a good baseline. We can come up with more
>> specialized and sophisticated approaches, e.g. based on probabilities
>> and limited to statistical name finders.
>>
>> Jörn
>>
>>
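
For readers following along, the "keep the earliest span" baseline discussed above can be sketched roughly as follows. This is a minimal illustration, not the AggregateNameFinder from this thread and not the OpenNLP source: the class SpanMergeSketch, its nested Span type, and the method mergeKeepingEarliest are hypothetical stand-ins, and the tie-break (longer span first when two spans start at the same token) is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of the baseline; all names here are illustrative only. */
final class SpanMergeSketch {

    /** Minimal stand-in for a typed span: token offsets [start, end) plus an entity type. */
    static final class Span {
        final int start;
        final int end;
        final String type;

        Span(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }

        /** True when the two half-open ranges share at least one token. */
        boolean overlaps(Span other) {
            return start < other.end && other.start < end;
        }

        @Override
        public String toString() {
            return "[" + start + ".." + end + ") " + type;
        }
    }

    /** Orders spans by start offset; on ties the longer span comes first (assumption). */
    static final Comparator<Span> EARLIEST_FIRST = new Comparator<Span>() {
        @Override
        public int compare(Span a, Span b) {
            if (a.start != b.start) {
                return Integer.compare(a.start, b.start);
            }
            return Integer.compare(b.end, a.end);
        }
    };

    /**
     * Pools the spans of all finders, sorts them, and keeps a span only if it
     * does not overlap the last span that was kept. The entity type is ignored,
     * which is exactly the limitation raised in the thread.
     */
    static List<Span> mergeKeepingEarliest(List<List<Span>> perFinderSpans) {
        List<Span> pooled = new ArrayList<Span>();
        for (List<Span> spans : perFinderSpans) {
            pooled.addAll(spans);
        }
        Collections.sort(pooled, EARLIEST_FIRST);

        List<Span> kept = new ArrayList<Span>();
        Span last = null;
        for (Span candidate : pooled) {
            // Kept spans are sorted and disjoint, so checking the last one is enough.
            if (last == null || !last.overlaps(candidate)) {
                kept.add(candidate);
                last = candidate;
            }
        }
        return kept;
    }
}
```

With one finder returning [0..2) person and another returning [1..3) organization, the sketch keeps the person span simply because it starts first, regardless of which finder is more trustworthy.
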
> Yes, I agree it is not a bad baseline, but pretty soon we'll have to
> either look at the probabilities (if someone is trying to merge several
> models) or at the actual class of the name finder that gave a particular
> prediction and reason on that. For example, if a prediction came from a
> dictionary, there is really no point in doubting it; it must be
> correct! Anyway, I'd love to see this feature in 1.5.3, and a couple of
> weeks (what William needs) is not that long...
>
> Jim
>
> ps: btw, I've actually been using the aggregate name-finder in my private
> build for almost 3 weeks now... I'm passing it 2 dictionary finders of
> different types and a maxent model that can also predict 2 types.
> Everything works just fine! :)
>
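
The alternative Jim describes, trusting dictionary hits first and otherwise looking at probabilities, might look something like the sketch below. Everything in it is an assumption made for illustration: the class PriorityMergeSketch, the Prediction type with its fromDictionary flag and probability field, and the greedy ranking rule are hypothetical and do not come from OpenNLP or from the private build mentioned above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of a source-aware merge; not an OpenNLP API. */
final class PriorityMergeSketch {

    /** A predicted name: token offsets [start, end), entity type, and where it came from. */
    static final class Prediction {
        final int start;
        final int end;
        final String type;
        final boolean fromDictionary; // assumption: dictionary hits are fully trusted
        final double probability;     // assumption: model confidence in [0, 1]

        Prediction(int start, int end, String type, boolean fromDictionary, double probability) {
            this.start = start;
            this.end = end;
            this.type = type;
            this.fromDictionary = fromDictionary;
            this.probability = probability;
        }

        /** True when the two half-open ranges share at least one token. */
        boolean overlaps(Prediction other) {
            return start < other.end && other.start < end;
        }
    }

    /** Dictionary hits first, then higher probability, then earlier start. */
    static final Comparator<Prediction> MOST_TRUSTED_FIRST = new Comparator<Prediction>() {
        @Override
        public int compare(Prediction a, Prediction b) {
            int bySource = Boolean.compare(b.fromDictionary, a.fromDictionary);
            if (bySource != 0) {
                return bySource;
            }
            int byProbability = Double.compare(b.probability, a.probability);
            if (byProbability != 0) {
                return byProbability;
            }
            return Integer.compare(a.start, b.start);
        }
    };

    /** Restores document order for the final result. */
    static final Comparator<Prediction> BY_START = new Comparator<Prediction>() {
        @Override
        public int compare(Prediction a, Prediction b) {
            return Integer.compare(a.start, b.start);
        }
    };

    /**
     * Greedy merge: visit predictions from most to least trusted and keep one
     * only if it does not overlap anything already kept.
     */
    static List<Prediction> merge(List<Prediction> pooled) {
        List<Prediction> ranked = new ArrayList<Prediction>(pooled);
        Collections.sort(ranked, MOST_TRUSTED_FIRST);

        List<Prediction> kept = new ArrayList<Prediction>();
        for (Prediction candidate : ranked) {
            boolean clashes = false;
            for (Prediction winner : kept) {
                if (winner.overlaps(candidate)) {
                    clashes = true;
                    break;
                }
            }
            if (!clashes) {
                kept.add(candidate);
            }
        }
        Collections.sort(kept, BY_START);
        return kept;
    }
}
```

Unlike the earliest-span baseline, a dictionary hit at [1..3) would here win over a model prediction at [0..2), even though the model's span starts first. Whether dictionary hits should always win is a design choice rather than a given.
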
