Hi all,

I spent the last couple of days on the coreferencer training. We now have a parser for the MUC format (coref and names) and a training format with tools for training the coref component.
Our coreferencer can link noun phrases together. To train it, I used our name finder and parser to produce a parse tree with named entities, and then merged the mentions from the MUC training file into that parse tree. But I wonder what the best/right way to do that is.

The current approach in DefaultParse.addMention doesn't really seem to work: a coreferencer trained with it performs much worse than one instantiated from the models on the old SourceForge page. I am not (yet) sure what the problem really is.

It would be nice to get a review of the approach I took to merge the mentions into an existing parse tree. The code that does this can be found in DefaultParse.addMention and in FullParseCorefEnhancerStream.

Jörn
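
P.S. To make the question more concrete, here is a minimal sketch of the kind of merge I have in mind. It is not the actual DefaultParse.addMention code. The MentionMerge and Node classes and this addMention signature are hypothetical, made up for illustration only. The sketch assumes token-based spans [start, end) and that the children of every node are ordered and tile their parent's span. A mention is inserted only if it lines up with constituent boundaries; otherwise it is reported as crossing.

```java
import java.util.ArrayList;
import java.util.List;

public class MentionMerge {

    /** Hypothetical minimal tree node covering the token span [start, end). */
    static final class Node {
        final String label;
        final int start;
        final int end;
        final List<Node> children = new ArrayList<>();

        Node(String label, int start, int end) {
            this.label = label;
            this.start = start;
            this.end = end;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }

        boolean covers(int s, int e) {
            return start <= s && e <= end;
        }
    }

    /**
     * Tries to make the mention span [s, e) a constituent below root.
     * Returns true if a node with exactly that span already exists or was
     * inserted, false if the span is empty, lies outside the tree, or
     * crosses constituent boundaries.
     */
    static boolean addMention(Node root, int s, int e, String label) {
        if (s >= e || !root.covers(s, e)) {
            return false;
        }

        // Descend to the smallest node that still covers the mention.
        Node node = root;
        boolean descended = true;
        while (descended) {
            descended = false;
            for (Node child : node.children) {
                if (child.covers(s, e)) {
                    node = child;
                    descended = true;
                    break;
                }
            }
        }

        // The mention is already a constituent; nothing to insert.
        if (node.start == s && node.end == e) {
            return true;
        }

        // Look for a run of adjacent children that exactly tiles [s, e).
        int first = -1;
        int last = -1;
        for (int i = 0; i < node.children.size(); i++) {
            Node c = node.children.get(i);
            if (c.start == s) first = i;
            if (c.end == e) last = i;
        }
        if (first < 0 || last < first) {
            return false; // the mention crosses a constituent boundary
        }

        // Wrap that run of children in a new mention node.
        Node mention = new Node(label, s, e);
        List<Node> run = node.children.subList(first, last + 1);
        mention.children.addAll(run);
        run.clear();
        node.children.add(first, mention);
        return true;
    }
}
```

The open question for me is what to do in the crossing case, where the sketch simply returns false: skip the mention, or restructure the tree so the mention fits?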
