Sorry for the delay.

1) This works similarly to how Hadoop distributes keys to the reducers:
there is a HashPartitioner that rewrites the vertices into n files, where n
is the number of tasks.
2) The block size doesn't matter in this case, because each filesplit will
be associated with one of the partitioned files.
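To illustrate point 1, here is a minimal sketch of the idea (not Hama's actual HashPartitioner implementation; the class and method names are made up for the example): each vertex id is hashed modulo n, and all vertices with the same result end up in the same partition file, so task i can simply read part i.

```java
// Sketch only: illustrates hash partitioning of vertices into n files,
// mirroring how Hadoop maps keys to reducers. Not the real Hama code.
public class HashPartitionSketch {

    // Returns the partition index (0..numTasks-1) for a vertex id.
    // Math.floorMod keeps the result non-negative even for negative hashes.
    static int partitionFor(int vertexId, int numTasks) {
        return Math.floorMod(Integer.hashCode(vertexId), numTasks);
    }

    public static void main(String[] args) {
        int numTasks = 5;
        // Each vertex lands deterministically in one of part0..part4,
        // so bsp task i only ever sees the vertices written to part i.
        for (int id = 0; id < 10; id++) {
            System.out.println("vertex " + id + " -> part"
                    + partitionFor(id, numTasks));
        }
    }
}
```

Because the assignment is deterministic, the same vertex always goes to the same file, which is what guarantees that bsp0 only processes part0 and there is no mixing.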

On 19 April 2012 03:01, Praveen Sripati <[email protected]> wrote:

> 1. Lets say the input is partitioned into part0, part1, part2, part3 and
> part4. How is it ensured that bsp0 processes part0, bsp1 processes part1
> and so on and there is no mix? We don't want bsp0 to process part1.
>
> private void send(BSPPeerProtocol peer, BSPMessage msg) throws IOException
> {
>  int mod = ((Integer) msg.getTag()) % peer.getAllPeerNames().length;
>  peer.send(peer.getAllPeerNames()[mod], msg);
> }
>
> 2) If the partition file size is more than the HDFS block size and 1+ bsp
> task processes a single partition, how is this scenario handled?
>
> Thanks,
> Praveen
>



-- 
Thomas Jungblut
Berlin <[email protected]>
