[jira] [Updated] (PIG-3190) Add LuceneTokenizer and SnowballTokenizer to Pig - useful text tokenization
[ https://issues.apache.org/jira/browse/PIG-3190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-3190:
    Fix Version/s: (was: 0.17.0) 0.18.0

> Add LuceneTokenizer and SnowballTokenizer to Pig - useful text tokenization
> ---
>
>              Key: PIG-3190
>              URL: https://issues.apache.org/jira/browse/PIG-3190
>          Project: Pig
>       Issue Type: Bug
>       Components: internal-udfs
> Affects Versions: 0.11
>         Reporter: Russell Jurney
>         Assignee: Russell Jurney
>          Fix For: 0.18.0
>
>      Attachments: PIG-3190-2.patch, PIG-3190-3.patch, PIG-3190.patch
>
>
> TOKENIZE is literally useless. The Standard/Snowball tokenizers in
> Lucene, as used by varaha, are much more useful for actual tasks:
> https://github.com/Ganglion/varaha/blob/master/src/main/java/varaha/text/TokenizeText.java

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (PIG-3190) Add LuceneTokenizer and SnowballTokenizer to Pig - useful text tokenization
[ https://issues.apache.org/jira/browse/PIG-3190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-3190:
    Fix Version/s: (was: 0.16.0) 0.17.0
[jira] [Updated] (PIG-3190) Add LuceneTokenizer and SnowballTokenizer to Pig - useful text tokenization
[ https://issues.apache.org/jira/browse/PIG-3190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-3190:
    Fix Version/s: (was: 0.15.0) 0.16.0
[jira] [Updated] (PIG-3190) Add LuceneTokenizer and SnowballTokenizer to Pig - useful text tokenization
[ https://issues.apache.org/jira/browse/PIG-3190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-3190:
    Fix Version/s: (was: 0.14.0) 0.15.0
[jira] [Updated] (PIG-3190) Add LuceneTokenizer and SnowballTokenizer to Pig - useful text tokenization
[ https://issues.apache.org/jira/browse/PIG-3190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-3190:
    Fix Version/s: (was: 0.13.0) 0.14.0
[jira] [Updated] (PIG-3190) Add LuceneTokenizer and SnowballTokenizer to Pig - useful text tokenization
[ https://issues.apache.org/jira/browse/PIG-3190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-3190:
    Fix Version/s: (was: 0.12.0) 0.13.0
[jira] [Updated] (PIG-3190) Add LuceneTokenizer and SnowballTokenizer to Pig - useful text tokenization
[ https://issues.apache.org/jira/browse/PIG-3190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated PIG-3190:
    Status: Open (was: Patch Available)

Canceling patch until issues around location and build failures are resolved.
[jira] [Updated] (PIG-3190) Add LuceneTokenizer and SnowballTokenizer to Pig - useful text tokenization
[ https://issues.apache.org/jira/browse/PIG-3190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Russell Jurney updated PIG-3190:
    Status: Patch Available (was: Open)
[jira] [Updated] (PIG-3190) Add LuceneTokenizer and SnowballTokenizer to Pig - useful text tokenization
[ https://issues.apache.org/jira/browse/PIG-3190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Russell Jurney updated PIG-3190:
    Attachment: PIG-3190-3.patch

Changed name from Lucene -> Standard, and general cleanup based on feedback.
[jira] [Updated] (PIG-3190) Add LuceneTokenizer and SnowballTokenizer to Pig - useful text tokenization
[ https://issues.apache.org/jira/browse/PIG-3190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Russell Jurney updated PIG-3190:
    Attachment: PIG-3190-2.patch

Working patch with unit test.
[jira] [Updated] (PIG-3190) Add LuceneTokenizer and SnowballTokenizer to Pig - useful text tokenization
[ https://issues.apache.org/jira/browse/PIG-3190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Russell Jurney updated PIG-3190:
    Attachment: (was: PIG-3190.patch)
[jira] [Updated] (PIG-3190) Add LuceneTokenizer and SnowballTokenizer to Pig - useful text tokenization
[ https://issues.apache.org/jira/browse/PIG-3190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Russell Jurney updated PIG-3190:
    Attachment: PIG-3190.patch

Initial working patch, sans tests.
[jira] [Updated] (PIG-3190) Add LuceneTokenizer and SnowballTokenizer to Pig - useful text tokenization
[ https://issues.apache.org/jira/browse/PIG-3190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Russell Jurney updated PIG-3190:
    Description: TOKENIZE is literally useless. The Standard/Snowball tokenizers in Lucene, as used by varaha, are much more useful for actual tasks: https://github.com/Ganglion/varaha/blob/master/src/main/java/varaha/text/TokenizeText.java
        (was: TOKENIZE is literally useless. The Lucene tokenizer in varaha is much more useful for actual tasks: https://github.com/Ganglion/varaha/blob/master/src/main/java/varaha/text/TokenizeText.java)
    Summary: Add LuceneTokenizer and SnowballTokenizer to Pig - useful text tokenization
        (was: Add LuceneTokenizer to Pig - useful text tokenization)