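This is straightforward, a fundamental Pig pattern really: group by the (user_id, movie_id) pair, then count the rows in each group.

a = GROUP your_data BY (user_id, movie_id);
b = FOREACH a GENERATE FLATTEN(group), COUNT(your_data);

Note that built-in function names such as COUNT are case-sensitive in Pig, and the fields in GENERATE need a comma between them.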
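
For completeness, here is a rough end-to-end sketch of the same idea. The input path 'your_data', the aliases (plays, cleaned, grouped, counts), the field name ts, and the comma-delimited chararray schema are all assumptions on my part, so adjust them to your actual data. The TRIM step is only there because the sample rows show spaces after the commas.

```pig
-- Assumed path and schema; replace with the real location and types.
plays = LOAD 'your_data' USING PigStorage(',')
        AS (user_id:chararray, movie_id:chararray, ts:chararray);

-- Strip stray whitespace so ' abc' and 'abc' fall into the same group.
cleaned = FOREACH plays GENERATE TRIM(user_id) AS user_id,
                                 TRIM(movie_id) AS movie_id;

-- One group per distinct (user_id, movie_id) pair.
grouped = GROUP cleaned BY (user_id, movie_id);

-- FLATTEN(group) expands the key tuple into two columns;
-- COUNT counts the tuples in each group's bag.
counts = FOREACH grouped GENERATE FLATTEN(group) AS (user_id, movie_id),
                                  COUNT(cleaned) AS play_count;

DUMP counts;
```

On the sample rows in the question this should give one row per user and movie with its play count, in the (user_id, movie_id, count) shape you described. Row order is not guaranteed unless you add an ORDER step.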

-----Original Message-----
From: Chengi Liu [mailto:[email protected]]
Sent: Wednesday, May 14, 2014 1:25 PM
To: [email protected]
Subject: Frequency count in pig

Hi,

My data is in format:

user_id,movie_id,timestamp
123, abc,unix_timestamp
123, def, ...
123, abc, ...
234, sda, ...

Now, I want to compute the number of times each movie is played in pig.. So the output I am expecting is:

123,abc,2
123,def,1
234,sda,1

and so on.. how do i do this in pig
