Why do you need cross-validation? Are you creating the dictionary dynamically from the corpus?
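
One possible explanation for the numbers quoted below (precision=0, recall=0, f-score=-1) is that none of the spans found by the dictionary is counted as equal to a reference span, for example because the two sides carry different type labels. That is an assumption on my part; nothing in the thread confirms it. The sketch below is self-contained Java and is not OpenNLP code: SketchSpan and SpanScoreSketch are hypothetical names, and the -1 for an undefined F-measure simply mirrors the value reported in the thread.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in for a typed entity span; this is NOT OpenNLP's Span class.
final class SketchSpan {
    final int start;
    final int end;
    final String type; // null when the finder assigns no type (assumption)

    SketchSpan(int start, int end, String type) {
        this.start = start;
        this.end = end;
        this.type = type;
    }

    // Two spans match only if offsets AND type agree.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SketchSpan)) {
            return false;
        }
        SketchSpan s = (SketchSpan) o;
        return start == s.start && end == s.end && Objects.equals(type, s.type);
    }

    @Override
    public int hashCode() {
        return Objects.hash(start, end, type);
    }
}

final class SpanScoreSketch {
    // Returns {precision, recall, fMeasure}; fMeasure is -1 when it is undefined.
    static double[] score(List<SketchSpan> reference, List<SketchSpan> predicted) {
        Set<SketchSpan> gold = new HashSet<>(reference);
        int truePositives = 0;
        for (SketchSpan p : predicted) {
            if (gold.contains(p)) {
                truePositives++;
            }
        }
        double precision = predicted.isEmpty() ? 0.0 : (double) truePositives / predicted.size();
        double recall = reference.isEmpty() ? 0.0 : (double) truePositives / reference.size();
        double f = (precision + recall > 0)
                ? 2 * precision * recall / (precision + recall)
                : -1.0;
        return new double[] {precision, recall, f};
    }

    public static void main(String[] args) {
        List<SketchSpan> reference = Arrays.asList(
                new SketchSpan(0, 1, "drug"), new SketchSpan(4, 6, "drug"));
        // Same offsets, but no type: every prediction fails the equality check.
        List<SketchSpan> untyped = Arrays.asList(
                new SketchSpan(0, 1, null), new SketchSpan(4, 6, null));
        // Same offsets and same type: every prediction matches.
        List<SketchSpan> typed = Arrays.asList(
                new SketchSpan(0, 1, "drug"), new SketchSpan(4, 6, "drug"));

        System.out.println(Arrays.toString(score(reference, untyped))); // [0.0, 0.0, -1.0]
        System.out.println(Arrays.toString(score(reference, typed)));   // [1.0, 1.0, 1.0]
    }
}
```

If something like this is going on, printing a few reference spans next to the spans the dictionary returns for the same sentence should show the difference straight away.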

On Sat, Mar 17, 2012 at 10:07 AM, Jim - FooBar(); <[email protected]> wrote:
> Hello again,
>
> basically, with the dictionary almost sorted, I'd like to return to a
> previous issue that wasn't quite resolved. I made a mistake and posted that
> I couldn't use the *cross-validator* with the dictionary, while in fact I
> meant to say that I can't use the *regular evaluator* with the
> DictionaryNameFinder... it always returns precision=0, recall=0, f-score=-1,
> regardless of finding loads of entities... Has anyone tried doing that?
>
> I'm not sure why this happens - from what I can see DictionaryNameFinder
> implements TokenNameFinder (as does NameFinderME) and that is all the
> TokenNameFinderEvaluator needs! I would expect this to work but it
> doesn't... I'd like to start working on integrating the 2 but I can't, unless
> I can be sure that both evaluations work, at least independently.
>
> Jim
>
>
> On 13/03/12 17:00, Jim - FooBar(); wrote:
>>
>> Ooops, I'm really sorry... I am using the regular evaluator, not the
>> cross-validator... The previous message should have been about the standard
>> evaluator... My bad! There is simply no point in using the cross-validator
>> with the dictionary - there is nothing to train! I apologise for the mistake,
>> it won't happen again! The thing is I'm pretty stressed out!
>>
>> Jim
>>
>>
>> On 13/03/12 16:51, Jörn Kottmann wrote:
>>>
>>> You only do cross validation because you need to take some data out to
>>> train your model on. That is why you can pass in all the training
>>> parameters to the TokenNameFinderCrossValidator.
>>>
>>> Maybe I am mistaken but it cannot take a TokenNameFinder object as an
>>> argument, right?
>>> Did you subclass it?
>>>
>>> Anyway, since the DictionaryNameFinder cannot be trained you should
>>> just use the evaluators; they are simpler and give you the same result.
>>>
>>> Jörn
>>>
>>> On 03/13/2012 05:34 PM, Jim - FooBar(); wrote:
>>>>
>>>> Hey guys,
>>>>
>>>> First of all I can imagine you must be sick and tired of me reporting a
>>>> bug or improvement every single day! That is the nature of open-source
>>>> though, isn't it? :-)
>>>>
>>>> Today's issue came literally out of nowhere! Again it has to do with
>>>> cross-validation but with the DictionaryNameFinder this time - NOT the
>>>> maxent model... Ok, here goes:
>>>>
>>>> Basically, both the maxentNameFinder and the DictionaryNameFinder can do
>>>> NER. Also both classes implement TokenNameFinder so from Java's perspective
>>>> either can be passed as argument to the TokenNameFinderCrossValidator
>>>> constructor... However, I tried doing that this morning, in an effort to get
>>>> some numbers for my dictionary, and all I get is 0 precision, 0 recall and -1
>>>> FMeasure, regardless of finding loads of drugs! I think (not 100% sure) the
>>>> problem is that the CrossValidator expects annotated text (in order to
>>>> verify) but the DictionaryNameFinder can only be deployed on un-annotated
>>>> text... To be honest I don't see any other reason why it won't use the
>>>> dictionary instead of the model since both classes conform to the same
>>>> interface - otherwise Java would complain!
>>>>
>>>> Any ideas?
>>>>
>>>>
>>>> Jim
>>>>
>>>> P.S.: I'm not sure if I can call this a bug or a massive
>>>> improvement... it all started when I started thinking how I can evaluate my
>>>> trained model when it joins forces with the dictionary... I can see with my
>>>> eyes there is some improvement but it is crucial that I get some numbers...
