[ https://issues.apache.org/jira/browse/PIG-3190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alan Gates updated PIG-3190:
----------------------------

    Status: Open  (was: Patch Available)

Canceling patch until issues around location and build failures are resolved.

> Add LuceneTokenizer and SnowballTokenizer to Pig - useful text tokenization
> ---------------------------------------------------------------------------
>
>                 Key: PIG-3190
>                 URL: https://issues.apache.org/jira/browse/PIG-3190
>             Project: Pig
>          Issue Type: Bug
>          Components: internal-udfs
>    Affects Versions: 0.11
>            Reporter: Russell Jurney
>            Assignee: Russell Jurney
>             Fix For: 0.12
>
>         Attachments: PIG-3190-2.patch, PIG-3190-3.patch, PIG-3190.patch
>
>
> TOKENIZE is literally useless. The Lucene Standard/Snowball tokenizers, as
> used by varaha, are much more useful for actual tasks:
> https://github.com/Ganglion/varaha/blob/master/src/main/java/varaha/text/TokenizeText.java