+1 to doing this. I already removed that from Chalk for similar reasons.
Also, the best way to do coreference these days is to build on the
rule-based sieve approach given in this paper:

http://www.mitpressjournals.org/doi/abs/10.1162/COLI_a_00152

-Jason


On Wed, Apr 17, 2013 at 4:31 PM, Jörn Kottmann <kottm...@gmail.com> wrote:

> Hi all,
>
> I am proposing that we move the coref component into the sandbox until we
> manage to train and test it on a publicly available dataset. In its current
> state the code is complicated to maintain because, without training, it
> can't be tested properly, which makes bigger changes to OpenNLP difficult,
> for example the maxent refactoring.
>
> I tried to implement parsers for the MUC corpus and added training code,
> but it does not yet work as well as the current models on SourceForge.
> More work is needed to get everything fixed.
>
> Additionally, the code should be refactored like the other components in
> OpenNLP, e.g. one model instantiation, built-in evaluation, simple
> training, etc. There is a Jira issue with all the details.
>
> Any opinions?
>
> Jörn
>



-- 
Jason Baldridge
Associate Professor, Department of Linguistics
The University of Texas at Austin
http://www.jasonbaldridge.com
http://twitter.com/jasonbaldridge
