On Mon, Dec 20, 2010 at 9:39 AM, Antonio Piccolboni <anto...@piccolboni.info
> wrote:

> For an easy solution, use Hive. Let's say your record contains userid and
> friendid and the table is called friends.
> Then you would do
> select A.userid, B.friendid from friends A join friends B on (A.friendid =
> B.userid)
>
> This is off the top of my head, so sorry if some details are off, but I've
> done it in the past on large datasets (~100M rows). That's it. Do that in
> Java and tell me if it isn't at least 50 lines of code.
>

In raw Java it will be a lot of code.

In Plume, it should be just a few lines, most of which will have to do with
reading the data.

Pig and Hive will definitely be the most concise though.
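
For anyone following along, here is a rough sketch of what the quoted
self-join computes, written as a plain in-memory hash join in Python. It is
only an illustration of the logic, not how Hive, Pig, or Plume would execute
it on a cluster; the function name friends_of_friends and the sample rows are
made up for this example.

```python
from collections import defaultdict

def friends_of_friends(friends):
    """Pair A.userid with B.friendid wherever A.friendid == B.userid.

    `friends` is a list of (userid, friendid) tuples (a hypothetical
    in-memory stand-in for the `friends` table in the quoted query).
    """
    # Build side: index the "B" copy of the table by userid.
    by_user = defaultdict(list)
    for userid, friendid in friends:
        by_user[userid].append(friendid)

    # Probe side: for each "A" row, look up rows whose userid is A.friendid.
    return [
        (a_user, b_friend)
        for a_user, a_friend in friends
        for b_friend in by_user.get(a_friend, ())
    ]

# Made-up sample data: user 1 is friends with 2, user 2 with 3 and 4, etc.
rows = [(1, 2), (2, 3), (2, 4), (3, 1)]
result = friends_of_friends(rows)
print(result)
```

Like the query, this keeps duplicates and does not filter out pairs where a
user ends up matched with themselves; that would need an extra condition.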
