[ https://issues.apache.org/jira/browse/DATAFU-39?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984719#comment-13984719 ]
Matthew Hayes commented on DATAFU-39:
-------------------------------------

I don't think you should worry about it. I can't see a more concise way to do this. The only alternative I see is to write a UDF, but I don't know if this UDF would be that useful in general. The only benefit it would add is making the code more concise and maybe slightly more efficient, but I think it could result in code that is more confusing, as this can be expressed in Pig Latin in a pretty readable way.

> RFE: BagSum
> -----------
>
>                 Key: DATAFU-39
>                 URL: https://issues.apache.org/jira/browse/DATAFU-39
>             Project: DataFu
>          Issue Type: New Feature
>            Reporter: Sam Steingold
>
> I need a new function {{BagSum}} which would help me solve the problem
> described in
> [http://stackoverflow.com/questions/22945236/how-do-i-accumulate-vectors-into-a-map].
> Test case:
> {code}
>   /**
>
>   define BagSum datafu.pig.bags.BagSum();
>
>   data = LOAD 'input' AS (id:int, key:chararray, val:int);
>   describe data;
>
>   data2 = FOREACH (GROUP data BY id) GENERATE group as id,
>     BagSum(data.(key,val),data.key) as keys;
>   describe data2;
>
>   STORE data2 INTO 'output';
>    */
>   @Multiline
>   private String bagSumTest;
>
>   @Test
>   public void bagSumTest() throws Exception
>   {
>     PigTest test = createPigTestFromString(bagSumTest);
>     writeLinesToFile("input",
>                      "(1,A,1)","(1,B,2)","(2,A,3)","(3,A,4)","(1,C,5)","(1,C,6)",
>                      "(3,A,7)","(2,B,8)","(1,A,9)","(2,A,10)");
>     test.runScript();
>     assertOutput(test, "data2", "(1,{(A,10),(B,2),(C,11)})",
>                  "(2,{(A,13),(B,8)})","(3,{(A,11)})");
>   }
> {code}
> Thanks.
> (alternatively, please tell me how to implement this using existing features)
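
For reference, the comment above says the operation can be expressed in plain Pig Latin but does not show the script. The following is only a sketch of one way it might be written with built-in operators, assuming the 'input' schema from the test case in the issue; the relation names by_id_key and sums are hypothetical, and this is not necessarily the formulation the commenter had in mind.

{code}
-- Sketch only: sum val per (id, key), then collect the (key, sum) pairs per id.
data = LOAD 'input' AS (id:int, key:chararray, val:int);

-- First grouping: one group per (id, key) pair.
by_id_key = GROUP data BY (id, key);

-- Flatten the composite group key and sum the values inside each group.
sums = FOREACH by_id_key GENERATE
         FLATTEN(group) AS (id, key),
         SUM(data.val) AS val;

-- Second grouping: gather the per-key sums into one bag per id.
data2 = FOREACH (GROUP sums BY id) GENERATE
          group AS id,
          sums.(key, val) AS keys;

STORE data2 INTO 'output';
{code}

This uses two GROUP operations where the proposed UDF would need one, which is probably the conciseness and efficiency trade-off the comment refers to. Pig does not guarantee the order of tuples within a bag, and SUM over int values yields a long, so a test comparing against the expected output in the issue may need to account for both.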