[ https://issues.apache.org/jira/browse/PIG-2290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13106841#comment-13106841 ]
Thejas M Nair commented on PIG-2290:
------------------------------------

This change is not backward compatible. It can break existing Pig queries.

It can be argued that the current TOBAG implementation has correct and consistent behavior: it puts each argument into a tuple and adds that tuple to a bag. The bag always contains single-element tuples.

I think there has to be a very strong reason to break backward compatibility. This could go into a new UDF (say TOBAG_2?), or it could go into a major version upgrade of Pig where we make a bunch of non-backward-compatible changes.

> TOBAG wraps tuple parameters in another tuple
> ---------------------------------------------
>
>                 Key: PIG-2290
>                 URL: https://issues.apache.org/jira/browse/PIG-2290
>             Project: Pig
>          Issue Type: Bug
>          Components: internal-udfs
>    Affects Versions: 0.9.0
>            Reporter: Ryan Hoegg
>            Assignee: Dmitriy V. Ryaboy
>         Attachments: pig-2290.patch
>
>
> The TOBAG function indiscriminately wraps all parameters in a tuple. When I
> pass a list of tuples to the function, I would expect it to return a bag
> containing those tuples. Instead, it returns a bag containing single element
> tuples, where each tuple contains one of the tuples passed in.
> Example:
> {code:title=tuples.txt}
> (mike,608)
> (ryan,11624)
> (justin,2317)
> {code}
> {code:title=Demonstration using pig 0.9.0}
> grunt> TUPLE_DATA = LOAD 'tuples.txt' AS (T:tuple(name:chararray,street_number:int));
> grunt> BAGGED = FOREACH TUPLE_DATA GENERATE TOBAG(T);
> grunt> DESCRIBE BAGGED;
> BAGGED: {{(name: chararray,street_number: int)}}
> grunt> DUMP BAGGED;
> ({((mike,608))})
> ({((ryan,11624))})
> ({((justin,2317))})
> {code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
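
To make the two behaviors under discussion concrete, here is a rough Python model. It is not Pig code and not the attached patch. Tuples are modeled as Python tuples and bags as lists; the names tobag_current and tobag_proposed are made up for illustration, and the assumption that the proposed change would still wrap non-tuple arguments is mine, not something stated in the issue.

```python
def tobag_current(*args):
    """Model of the behavior described in the comment above:
    every argument is wrapped in a one-field tuple, then added to the bag."""
    return [(arg,) for arg in args]


def tobag_proposed(*args):
    """Model of what the reporter expects: tuple arguments go into the bag
    as they are. Wrapping non-tuple arguments is an assumption."""
    return [arg if isinstance(arg, tuple) else (arg,) for arg in args]


row = ("mike", 608)

print(tobag_current(row))   # [(('mike', 608),)]
print(tobag_proposed(row))  # [('mike', 608)]
```

In this model the two functions agree for scalar arguments and differ only for tuple arguments. That difference is where the compatibility concern comes from: a script that today reaches through the extra wrapping tuple to get at the name field would find one less level of nesting after the change.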