Hi Jörn,
Thanks very much for the info. It sounds like, at the very least, it could be something optional that we make available. I will pay close attention if/when discussions about using Maven start happening.

The OpenNLP parser is great, but the Berkeley parser has a really nifty way of learning grammars. I would love to build a parser that can run with that grammar and replicate the good performance, but that is a big project. I will let the OpenNLP team know if I make any progress on that.

Tim

On 08/01/2012 08:00 AM, Jörn Kottmann wrote:
On 08/01/2012 01:01 PM, Miller, Timothy wrote:
There was some chatter last week about resources potentially being downloaded via Maven for license compatibility reasons. Just wondering if that brings about the possibility of using external libraries that are not Apache-licensed that would also be auto-downloaded under certain Maven build commands. Specifically, I was thinking of the GPL-licensed Berkeley parser, which I've used to get significantly higher accuracy than the OpenNLP parser we currently wrap in our constituency parser module.

Making scripts or Maven build commands that download stuff is fine, but it might turn out to be quite limiting for your users who need the freedom of the Apache License. That will be a problem if Berkeley is the only option.

The HBase people, for example, have an optional dependency on LZO, which is GPL,
and people there just need to download and install it themselves.
See here:
http://hbase.apache.org/book/lzo.compression.html

Speaking as an OpenNLP committer now, it would of course be nice to make our parser better.
If you want to work on that, we will be happy to receive some patches.

Jörn

