[ https://issues.apache.org/jira/browse/PIG-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jie Li updated PIG-2691:
------------------------
    Attachment: PIG-2691.patch.2

Fixed one unit test for TOKENIZE in TestLogicalPlanBuilder, and added another unit test using the query in the description. Passed all the other unit tests of test-commit.

> Duplicate TOKENIZE schema
> -------------------------
>
>                 Key: PIG-2691
>                 URL: https://issues.apache.org/jira/browse/PIG-2691
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Gianmarco De Francisci Morales
>              Labels: simple
>         Attachments: PIG-2691.patch, PIG-2691.patch.2
>
>
> TOKENIZE produces a fixed named schema that results in duplicates if used more than once in the same generate statement.
> We could parameterize the schema on the name of the field being tokenized.
> {code}
> grunt> q = LOAD 'file' AS (source:chararray, target:chararray);
> grunt> e = FOREACH q GENERATE TOKENIZE(source), TOKENIZE(target);
> 2012-05-09 20:18:37,235 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1108:
> <line 2, column 14> Duplicate schema alias: bag_of_tokenTuples
> grunt> e = FOREACH q GENERATE TOKENIZE(source) as s_entities, TOKENIZE(target) as t_entities;
> grunt> describe e
> e: {s_entities: {tuple_of_tokens: (token: chararray)},t_entities: {tuple_of_tokens: (token: chararray)}}
> {code}
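
To illustrate the suggestion in the description (deriving the schema alias from the name of the field being tokenized), here is a minimal standalone sketch. It does not use the Pig API and is not taken from the attached patches; the class name, the helper methods, and the "_from_" naming rule are all assumptions made for illustration. Only the alias bag_of_tokenTuples comes from the error message above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TokenizeAliasSketch {
    // Fixed alias reported in the "Duplicate schema alias" error.
    static final String FIXED_BAG_ALIAS = "bag_of_tokenTuples";

    // Hypothetical naming rule: suffix the fixed alias with the input field name.
    static String bagAliasFor(String inputField) {
        return FIXED_BAG_ALIAS + "_from_" + inputField;
    }

    // Returns the first alias that occurs twice, or null if all aliases are unique.
    static String firstDuplicate(List<String> aliases) {
        Set<String> seen = new HashSet<>();
        for (String alias : aliases) {
            if (!seen.add(alias)) {
                return alias;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Fields tokenized in the same GENERATE statement, as in the description.
        String[] fields = {"source", "target"};

        List<String> fixed = new ArrayList<>();
        List<String> parameterized = new ArrayList<>();
        for (String field : fields) {
            fixed.add(FIXED_BAG_ALIAS);            // same alias for every call
            parameterized.add(bagAliasFor(field)); // alias depends on the field
        }

        System.out.println("fixed aliases, first duplicate: " + firstDuplicate(fixed));
        System.out.println("parameterized aliases, first duplicate: " + firstDuplicate(parameterized));
    }
}
```

With the fixed alias, two TOKENIZE calls in one statement collide. With a field-dependent alias they do not, provided the tokenized fields have different names; tokenizing the same field twice would still need an explicit "as" clause.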