This set results from a JOIN:

(04f4c2fd-8be2-41c3-b045-283de80909ba,1966,2L)
(04f4c2fd-8be2-41c3-b045-283de80909ba,3845,2L)

Using Pig, I GROUP this and get:

(669a4b47-d3c3-4950-9ec0-f1e24064d9d9,{(669a4b47-d3c3-4950-9ec0-f1e24064d9d9,1634,2L),(669a4b47-d3c3-4950-9ec0-f1e24064d9d9,1966,2L)})

After FOREACH...GENERATE:

({(1966),(3845)},{(2L),(2L)})

What I want to do is derive:

(1966|3845,2L)
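
To make the target concrete, here is a minimal sketch of the transformation I'm after, written in plain Python rather than Pig Latin. The function name flatten_group and the "|" separator argument are hypothetical, and I'm assuming the second bag always collapses to a single distinct value per group:

```python
# Illustration only: plain Python, not Pig Latin.
# flatten_group is a hypothetical name, not a Pig built-in.
def flatten_group(ids, counts, sep="|"):
    """Join the distinct ids with `sep` and collapse the counts to one value."""
    # dict.fromkeys drops duplicates while keeping first-seen order
    distinct_ids = list(dict.fromkeys(ids))
    distinct_counts = list(dict.fromkeys(counts))
    # Assumption: every tuple in a group carries the same count
    if len(distinct_counts) != 1:
        raise ValueError("expected a single distinct count per group")
    return (sep.join(str(i) for i in distinct_ids), distinct_counts[0])

# The two bags from the FOREACH...GENERATE output above
result = flatten_group([1966, 3845], [2, 2])
print(result)  # ('1966|3845', 2)
```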

The trouble is that everything is bagged up by the GROUP, and I'm not sure
how to unbag the values for output so that I can apply things like CONCAT or
DISTINCT to the fields. I have tried nested FOREACH statements, but I can't
seem to drill down far enough to dereference the values the way I'd like.

Is this a job for a UDF, or is there anything in Pig Latin itself that I can
use to accomplish this?

Thanks!
-M@
