[ https://issues.apache.org/jira/browse/PIG-1272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844403#action_12844403 ]
Daniel Dai commented on PIG-1272:
---------------------------------

Rolled back the patch because "TestMultiQuery.testMultiQueryJiraPig1157" fails. Strangely, Hudson returned +1. I checked, and Hudson actually skipped TestMultiQuery.testMultiQueryJiraPig1157.

> Column pruner causes wrong results
> ----------------------------------
>
>                 Key: PIG-1272
>                 URL: https://issues.apache.org/jira/browse/PIG-1272
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Viraj Bhat
>            Assignee: Daniel Dai
>             Fix For: 0.7.0
>
>         Attachments: PIG-1272-1.patch
>
>
> For a simple script, the column pruner optimization removes certain columns
> from the original relation, which produces wrong results.
> Input file "kv" contains the following columns (tab separated):
> {code}
> a	1
> a	2
> a	3
> b	4
> c	5
> c	6
> b	7
> d	8
> {code}
> Running this script in Pig 0.6:
> {code}
> kv = load 'kv' as (k,v);
> keys= foreach kv generate k;
> keys = distinct keys;
> keys = limit keys 2;
> rejoin = join keys by k, kv by k;
> dump rejoin;
> {code}
> produces:
> (a,a)
> (a,a)
> (a,a)
> (b,b)
> (b,b)
> Running this in Pig 0.5, without the column pruner, results in:
> (a,a,1)
> (a,a,2)
> (a,a,3)
> (b,b,4)
> (b,b,7)
> When we disable the "ColumnPruner" optimization, it gives the right results.
> Viraj
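
For reference, the result the quoted script should produce can be checked outside Pig with a small Python sketch. This is only an illustration of the join semantics, not Pig code: the variable names are made up for the example, and the sketch assumes that `limit keys 2` happens to keep the keys `a` and `b` (as in the reported output), which Pig does not guarantee in general.

```python
# Rows of the tab-separated input file "kv" from the report.
kv = [
    ("a", 1), ("a", 2), ("a", 3), ("b", 4),
    ("c", 5), ("c", 6), ("b", 7), ("d", 8),
]

# keys = foreach kv generate k;  keys = distinct keys;
distinct_keys = sorted({k for k, _ in kv})

# keys = limit keys 2;
# Assumption: the first two keys in sorted order are kept (a and b),
# matching the output shown in the report.
limited_keys = distinct_keys[:2]

# rejoin = join keys by k, kv by k;
# A correct join keeps every column of both inputs: (keys.k, kv.k, kv.v).
rejoin = [(key, k, v) for key in limited_keys for (k, v) in kv if key == k]

for row in rejoin:
    print(row)
```

Each row has three fields, as in the Pig 0.5 output. The Pig 0.6 output has only two, which suggests that the pruner dropped `v` from `kv` even though the join still needs it.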