Hi Jörn,
Thanks very much for the info. It sounds like at the very least it
could be an optional thing that we can make available. I will pay close
attention if/when discussions about using Maven start happening.
The OpenNLP parser is great, but the Berkeley parser has a really nifty
way of learning grammars. I would love to build a parser that can run
with that grammar and replicate the good performance, but that is a big
project. I will let the OpenNLP team know if I make any progress on that.
Tim
On 08/01/2012 08:00 AM, Jörn Kottmann wrote:
On 08/01/2012 01:01 PM, Miller, Timothy wrote:
There was some chatter last week about resources potentially being
downloaded via Maven for license compatibility reasons. Just
wondering if that brings about the possibility of using external
libraries that are not Apache-licensed that would also be
auto-downloaded under certain Maven build commands. Specifically I
was thinking of the GPL-licensed Berkeley parser, which I've used to
get significantly higher accuracy than the OpenNLP parser we
currently wrap in our constituency parser module.
Making scripts or Maven build commands which download stuff is fine,
but it might turn out to be quite limiting for your users who need the
freedom of the AL. That will be a problem if Berkeley is the only option.
The HBase people, for example, have an optional dependency on LZO, which
is GPL, and people there just need to download and install it themselves.
See here:
http://hbase.apache.org/book/lzo.compression.html
Speaking as an OpenNLP committer now, it would of course be nice to
make our parser better.
If you want to work on that we will be happy to get some patches.
Jörn