[ https://issues.apache.org/jira/browse/DATAFU-14?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881478#comment-13881478 ]
Russell Jurney commented on DATAFU-14:
--------------------------------------
I think we should keep the single REGISTER setup. We just need a way to add
jars manually, and I don't know how to do that yet. We don't need all the
Lucene jars, just lucene-core and maybe one other package.
FWIW, I do plan on porting quite a lot of the Lucene string functionality over
to UDFs, but that should only require lucene-core at 2.3MB, plus one or two
other jars smaller than that.
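
For reference, here is a rough sketch of what word n-gram tokenization looks
like in plain Java with no Lucene dependency at all. It is only an
illustration of the technique: NGramSketch and ngrams are hypothetical names,
and this is neither the NGramTokenize UDF on the linked branch nor Lucene's
ShingleFilter API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone sketch; not part of DataFu or Lucene.
public class NGramSketch {

    // Returns every run of exactly n consecutive whitespace-separated words,
    // in order of appearance. Returns an empty list when the text has fewer
    // than n words or when n is less than 1.
    public static List<String> ngrams(String text, int n) {
        List<String> out = new ArrayList<>();
        String trimmed = text.trim();
        if (n < 1 || trimmed.isEmpty()) {
            return out;
        }
        String[] words = trimmed.split("\\s+");
        for (int i = 0; i + n <= words.length; i++) {
            // Join words[i .. i+n) back into a single space-separated token.
            out.add(String.join(" ", Arrays.copyOfRange(words, i, i + n)));
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [the quick, quick brown, brown fox]
        System.out.println(ngrams("the quick brown fox", 2));
    }
}
```

A hand-rolled version like this skips the analysis Lucene provides (stop
words, stemming, position handling), which is why pulling in lucene-core is
still the plan.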
> Add NGram Tokenizer to datafu.pig.text.lucene
> ---------------------------------------------
>
> Key: DATAFU-14
> URL: https://issues.apache.org/jira/browse/DATAFU-14
> Project: DataFu
> Issue Type: Improvement
> Environment: plants
> Reporter: Russell Jurney
>
> See
> https://github.com/rjurney/datafu/blob/lucene/src/java/datafu/pig/text/lucene/NGramTokenize.java
> Held up by
> http://stackoverflow.com/questions/21064520/how-to-use-lucene-shinglefilter-could-not-find-implementing-class-for-org-apach/21067142?noredirect=1#21067142