On Mon, Dec 20, 2010 at 9:39 AM, Antonio Piccolboni <anto...@piccolboni.info> wrote:
> For an easy solution, use Hive. Let's say your record contains userid and
> friendid and the table is called friends.
> Then you would do
> select A.userid, B.friendid from friends A join friends B on (A.friendid = B.userid)
>
> This is off the top of my head, sorry if some details are off, but I've done it
> in the past on large datasets (~100M rows). That's it. Do that in Java and
> tell me if it isn't at least 50 lines of code.

In raw Java it will be a lot of code. In Plume, it should be just a few lines, most of which will have to do with reading the data. Pig and Hive will definitely be the most concise, though.
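
To make concrete what the quoted self-join computes, here is a minimal sketch that runs the same query against an in-memory SQLite table from Python. This is only an illustration on toy data, not Hive and not how the job would run on a cluster; the helper name `friends_of_friends` and the sample rows are made up for the example, and the `ORDER BY` is added only so the output is deterministic.

```python
import sqlite3


def friends_of_friends(pairs):
    """Run the friends self-join on (userid, friendid) pairs.

    Hypothetical helper for illustration: it loads the pairs into an
    in-memory SQLite table and returns (userid, friend_of_friend) rows.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE friends (userid INTEGER, friendid INTEGER)")
    conn.executemany("INSERT INTO friends VALUES (?, ?)", pairs)
    # Same join as in the quoted message; ORDER BY is an addition here
    # so that the result order is stable.
    rows = conn.execute(
        "SELECT A.userid, B.friendid "
        "FROM friends A JOIN friends B ON (A.friendid = B.userid) "
        "ORDER BY A.userid, B.friendid"
    ).fetchall()
    conn.close()
    return rows


# Made-up sample data: 1 -> 2, 2 -> 3, 2 -> 4, 3 -> 1
sample = [(1, 2), (2, 3), (2, 4), (3, 1)]
print(friends_of_friends(sample))
# [(1, 3), (1, 4), (2, 1), (3, 2)]
```

Note that if friendships are stored in both directions, e.g. (1, 2) and (2, 1), the join as written also returns (1, 1), so you would probably want to filter out rows where A.userid = B.friendid.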