It would be interesting to compare the results of OpenNLP's perceptron-trained models, GIS-trained models, and a vanilla CRF implementation (i.e., one not tailored to a specific task). That would let us make a better-informed decision about whether we should spend the effort to implement a CRF. Every once in a while we see people ask "what can I do?". Maybe the answer should be: given an ObjectStream<Event> or a DataIndexer, train a CRFModel that extends AbstractModel. Your training class must extend AbstractEventTrainer and be serializable using AbstractModelWriter.
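To make the shape of that task a bit more concrete, here is a rough Java sketch. Everything in it is an assumption for illustration: the nested types are simplified stand-ins that only borrow the names used above (they are not OpenNLP's real classes or signatures), and the "training" is a placeholder that counts feature/outcome co-occurrences rather than fitting a CRF. Serialization is left out entirely.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins: these types only borrow names from the message above.
// They are NOT OpenNLP's classes, and every signature here is an assumption.
class CrfPluginSketch {

    /** Stand-in for a training event: one outcome plus its context features. */
    static final class Event {
        final String outcome;
        final String[] context;

        Event(String outcome, String... context) {
            this.outcome = outcome;
            this.context = context;
        }
    }

    /** Stand-in for the model base class: maps a context to per-outcome scores. */
    static abstract class AbstractModel {
        abstract Map<String, Double> eval(String... context);

        /** Returns the highest-scoring outcome, or null if the model has no outcomes. */
        String bestOutcome(String... context) {
            String best = null;
            double bestScore = Double.NEGATIVE_INFINITY;
            for (Map.Entry<String, Double> e : eval(context).entrySet()) {
                if (e.getValue() > bestScore) {
                    best = e.getKey();
                    bestScore = e.getValue();
                }
            }
            return best;
        }
    }

    /** Stand-in for the trainer base class: events in, model out. */
    static abstract class AbstractEventTrainer {
        abstract AbstractModel train(List<Event> events);
    }

    /** Placeholder model: sums per-outcome feature weights. Not a real CRF. */
    static final class CRFModel extends AbstractModel {
        private final Map<String, Map<String, Double>> weights; // outcome -> feature -> weight

        CRFModel(Map<String, Map<String, Double>> weights) {
            this.weights = weights;
        }

        @Override
        Map<String, Double> eval(String... context) {
            Map<String, Double> scores = new LinkedHashMap<>();
            for (Map.Entry<String, Map<String, Double>> e : weights.entrySet()) {
                double score = 0.0;
                for (String feature : context) {
                    score += e.getValue().getOrDefault(feature, 0.0);
                }
                scores.put(e.getKey(), score);
            }
            return scores;
        }
    }

    /** Placeholder trainer: counts feature/outcome co-occurrences as weights. */
    static final class CRFTrainer extends AbstractEventTrainer {
        @Override
        AbstractModel train(List<Event> events) {
            Map<String, Map<String, Double>> weights = new LinkedHashMap<>();
            for (Event event : events) {
                Map<String, Double> w =
                        weights.computeIfAbsent(event.outcome, k -> new HashMap<>());
                for (String feature : event.context) {
                    w.merge(feature, 1.0, Double::sum);
                }
            }
            return new CRFModel(weights);
        }
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
                new Event("person", "w=John", "cap"),
                new Event("other", "w=the", "lower"),
                new Event("person", "w=Mary", "cap"));
        AbstractModel model = new CRFTrainer().train(events);
        System.out.println(model.bestOutcome("cap"));
    }
}
```

A real contribution would replace the counting with actual CRF parameter estimation and plug into the existing model reader/writer machinery; the only point here is the trainer-produces-model contract.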
Just my 2 cents.

Daniel

On 2/7/17, 9:51 AM, "Damiano Porta" <[email protected]> wrote:

I have good results with perceptron, but +1 for CRF

2017-02-07 15:42 GMT+01:00 Russ, Daniel (NIH/CIT) [E] <[email protected]>:

> Hi Jörn,
>
> I think the best entity recognition systems use CRFs. At some point
> we might want to consider adding them. As you know, ME classifiers suffer
> from the label bias problem (see Lafferty et al.
> <http://repository.upenn.edu/cgi/viewcontent.cgi?article=1162&context=cis_papers>).
> CRFs deal with that issue. I believe that perceptrons suffer from the
> same problem. If you think the results are better, I have no problem.
> I think that our long-term goal should be to add a CRF, and make it the
> default for the NameFinder.
>
> Daniel
>
> On 2/6/17, 12:40 PM, "Joern Kottmann" <[email protected]> wrote:
>
> > Hello all,
> >
> > I would like to propose switching the default training algorithm from
> > maxent GIS to perceptron for the Name Finder. In all the data sets I
> > tried, perceptron performs better than maxent GIS, and I believe that
> > would be a much more sensible default.
> >
> > A user can always override the default by providing the algorithm
> > parameter for training.
> >
> > What do you think?
> >
> > Jörn
