[ https://issues.apache.org/jira/browse/DATAFU-88?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313241#comment-14313241 ]
Matthew Hayes commented on DATAFU-88:
-------------------------------------
That looks pretty useful. However, I see that this library is licensed under
GPL v3, which means we cannot autojar it into our datafu-pig jar. It would have
to be downloaded separately.

[~jghoman], what would the implications be if we added a wrapper around a GPL v3
library? If it were a dependency that had to be downloaded separately, would
that still be a problem? Is there any precedent for Apache projects depending on
GPL-licensed libraries?
> Port Stanford Core NLP Functionality to DataFu
> ----------------------------------------------
>
> Key: DATAFU-88
> URL: https://issues.apache.org/jira/browse/DATAFU-88
> Project: DataFu
> Issue Type: New Feature
> Affects Versions: 1.3.0
> Reporter: Russell Jurney
> Assignee: Russell Jurney
> Labels: lemmatizer, nlp, pig, pig_udf, stanford, stemmer
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> For starters, I need the Stanford Core NLP stemmer and lemmatizer.
> It looks like I may be able to add something generic and pass arguments
> through to code like: props.put("annotators", "tokenize, ssplit, pos, lemma");
> Helpful example of lemmatizing at
> http://stackoverflow.com/questions/1578062/lemmatization-java
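
To illustrate the "something generic" idea in the description above, the annotator list could be built from arguments instead of being hard-coded. This is only a minimal sketch under assumptions, not DataFu code: the class name AnnotatorProps and the method build are hypothetical, and the example uses only java.util.Properties so that it runs without the CoreNLP jar on the classpath.

```java
import java.util.Properties;

// Hypothetical helper, not part of DataFu or Stanford CoreNLP.
class AnnotatorProps {

    // Joins the requested annotator names into the comma-separated
    // "annotators" property that the description's props.put(...) call sets.
    static Properties build(String... annotators) {
        Properties props = new Properties();
        props.setProperty("annotators", String.join(", ", annotators));
        return props;
    }

    public static void main(String[] args) {
        Properties props = build("tokenize", "ssplit", "pos", "lemma");
        System.out.println(props.getProperty("annotators"));
        // With CoreNLP available, props would then be handed to the pipeline:
        // StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
    }
}
```

A wrapper UDF could accept the annotator names as constructor arguments and call a helper like this once, so the same class would cover lemmatizing as well as other annotator combinations.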