2011/7/4 Jörn Kottmann <[email protected]>:
> On 7/4/11 12:02 PM, Olivier Grisel wrote:
>>
>> That would require a lot of work though. Better start by (re-)training
>> the existing code base on some fresh data.
>
> +1, exactly that is where we have to start, optimizing the feature
> generation or implement new complicated feature really requires the
> ability to train it.
>
> When we start experimenting with new features, a build in evaluation tool
> would also be good to have.
>
> And then of course we should port the code to be language independent,
> currently it only works for English.

Yeah, lots of work here ...
The main problem for multi-lingual support is probably less about
hard-coded language specific hacks and more about the general lack of
freely available annotated training / evaluation corpora for coreference
resolution.

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
