I don't think you need to split your input file so that each map is assigned
one key. Your goal is load balancing. Each of your map tasks will initiate a
new MR sub-job. This sub-job will be assigned its own master/workers, which
means the map tasks of the sub-job may be scheduled to workers across the
cluster.
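(A minimal sketch of the sub-job pattern described above, assuming the new
org.apache.hadoop.mapreduce API; the mapper class and per-key paths are
hypothetical, not the poster's actual code:)

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    // Hypothetical mapper: each key it receives kicks off one MR sub-job.
    public class PerKeySubJobMapper extends Mapper<Text, Text, Text, Text> {

      @Override
      protected void map(Text key, Text value, Context context)
          throws IOException, InterruptedException {
        Job subJob = new Job(new Configuration(), "sub-job-for-" + key);
        subJob.setJarByClass(PerKeySubJobMapper.class);

        // Assumption: input/output are laid out per key; adjust to taste.
        FileInputFormat.addInputPath(subJob, new Path("/data/in/" + key));
        FileOutputFormat.setOutputPath(subJob, new Path("/data/out/" + key));
        // ... set the sub-job's mapper/reducer/output types here ...

        // The sub-job's own tasks get scheduled across the cluster,
        // which is where the load balancing comes from.
        try {
          subJob.waitForCompletion(true);
        } catch (ClassNotFoundException e) {
          throw new IOException(e);
        }
      }
    }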
Thanks for the suggestions!
On Mon, May 23, 2011 at 5:50 PM, Harsh J wrote:
Vincent,
You _might_ lose locality by splitting beyond the block splits, and
the tasks, although better 'parallelized', may end up performing
worse. A good way to increase task counts instead is to go the
block-size route (a lower block size gets you more splits, at the
cost of a little extra NN space).
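(To illustrate the block-size route: a fragment only, where the config key
is the 0.20-era dfs.block.size and 16 MB is an arbitrary example value:)

    Configuration conf = new Configuration();
    // Files written with this conf get smaller blocks, hence more
    // block-based splits per file, at the cost of extra NN metadata.
    conf.setLong("dfs.block.size", 16L * 1024 * 1024); // 16 MB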
Look at NLineInputFormat.
Sent from my iPhone
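(A minimal sketch of the NLineInputFormat route, assuming a Hadoop version
that ships the new-API class and a hypothetical text file listing one key
per line; note NLineInputFormat splits text input, not sequence files:)

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.NLineInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class OneKeyPerMap {
      public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "one-key-per-map");
        job.setJarByClass(OneKeyPerMap.class);

        // Each split covers N lines of the text input; with N = 1 every
        // map task receives exactly one line, i.e. one key to work on.
        job.setInputFormatClass(NLineInputFormat.class);
        NLineInputFormat.setNumLinesPerSplit(job, 1);

        // Hypothetical paths: a key-per-line listing and an output dir.
        FileInputFormat.addInputPath(job, new Path("/data/keys.txt"));
        FileOutputFormat.setOutputPath(job, new Path("/data/out"));
        // ... set mapper/reducer/output types as usual ...

        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }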
Look at getSplits() of SequenceFileInputFormat.
-Joey
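(A rough sketch of what a per-key getSplits() override might look like,
assuming Text keys and a sequence file sorted by key; the class name is
hypothetical, and a real version would snap split starts to sync markers:)

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.InputSplit;
    import org.apache.hadoop.mapreduce.JobContext;
    import org.apache.hadoop.mapreduce.lib.input.FileSplit;
    import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;

    // Hypothetical: one split per distinct key, assuming the file is
    // sorted by key so each key's records are contiguous.
    public class PerKeySequenceFileInputFormat
        extends SequenceFileInputFormat<Text, Text> {

      @Override
      public List<InputSplit> getSplits(JobContext context)
          throws IOException {
        Configuration conf = context.getConfiguration();
        List<InputSplit> splits = new ArrayList<InputSplit>();
        for (Path file : getInputPaths(context)) {
          FileSystem fs = file.getFileSystem(conf);
          SequenceFile.Reader reader =
              new SequenceFile.Reader(fs, file, conf);
          try {
            Text key = new Text();
            Text splitKey = null;
            long splitStart = reader.getPosition();
            long recordStart = splitStart;
            while (reader.next(key)) {
              if (splitKey == null) {
                splitKey = new Text(key);
              } else if (!splitKey.equals(key)) {
                // Key changed: the previous key's records span
                // [splitStart, recordStart).
                splits.add(new FileSplit(file, splitStart,
                    recordStart - splitStart, null));
                splitStart = recordStart;
                splitKey = new Text(key);
              }
              recordStart = reader.getPosition();
            }
            if (splitKey != null) {
              splits.add(new FileSplit(file, splitStart,
                  recordStart - splitStart, null));
            }
          } finally {
            reader.close();
          }
          // Caveat: SequenceFileRecordReader seeks to the next sync
          // marker past a split's start, so real boundaries should be
          // aligned with sync points to avoid skipping records.
        }
        return splits;
      }
    }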
On May 23, 2011 5:09 AM, "Vincent Xue" wrote:
Hello Hadoop Users,
I would like to know if anyone has ever tried splitting an input
sequence file by key instead of by size. I know this is unusual for
the MapReduce paradigm, but I am in a situation where I need to
perform some large tasks on each key pair in a load-balanced
fashion.