Re: Poor scalability with map reduce application

2011-06-23 Thread Alberto Andreotti
oblem would be to spread network traffic from the shuffle over a longer period of time at a cost of having the reducer using resources earlier. Either way he would see this effect across both sets of runs if he is using the defau

Re: Poor scalability with map reduce application

2011-06-22 Thread Harsh J
g the reducer using resources earlier. Either way he would see this effect across both sets of runs if he is using the default parameters. I guess it would all depend on what kind of network layout the cluster is on. Matt

Re: Poor scalability with map reduce application

2011-06-22 Thread Alberto Andreotti
ns if he is using the default parameters. I guess it would all depend on what kind of network layout the cluster is on. Matt -Original Message- From: Harsh J [mailto:ha...@cloudera.com] Sent: Tuesday, June 21, 2011 12:09 P

Re: Poor scalability with map reduce application

2011-06-21 Thread Harsh J
- From: Harsh J [mailto:ha...@cloudera.com] Sent: Tuesday, June 21, 2011 12:09 PM To: common-user@hadoop.apache.org Subject: Re: Poor scalability with map reduce application Alberto, On Tue, Jun 21, 2011 at 10:27 PM, Alberto Andreotti wrote: I don&

Re: Poor scalability with map reduce application

2011-06-21 Thread Alberto Andreotti
is using the default parameters. I guess it would all depend on what kind of network layout the cluster is on. Matt -Original Message- From: Harsh J [mailto:ha...@cloudera.com] Sent: Tuesday, June 21, 2011 12:09 PM

Re: Poor scalability with map reduce application

2011-06-21 Thread Alberto Andreotti
-Original Message- From: Harsh J [mailto:ha...@cloudera.com] Sent: Tuesday, June 21, 2011 12:09 PM To: common-user@hadoop.apache.org Subject: Re: Poor scalability with map reduce application Alberto, On Tue, Jun 21, 2011 at 10:27 PM, Alberto Andreotti

RE: Poor scalability with map reduce application

2011-06-21 Thread GOEKE, MATTHEW (AG/1000)
12:09 PM To: common-user@hadoop.apache.org Subject: Re: Poor scalability with map reduce application Alberto, On Tue, Jun 21, 2011 at 10:27 PM, Alberto Andreotti wrote: I don't know if speculative maps are on, I'll check it. One thing I observed is that reduces begin befo

Re: Poor scalability with map reduce application

2011-06-21 Thread Harsh J
Alberto, On Tue, Jun 21, 2011 at 10:27 PM, Alberto Andreotti wrote: I don't know if speculative maps are on, I'll check it. One thing I observed is that reduces begin before all maps have finished. Let me also check if the difference is on the map side or in the reduce. I believe it's ba

Re: Poor scalability with map reduce application

2011-06-21 Thread Alberto Andreotti
Hi Harsh, thanks for your answer! The cluster is homogeneous: every node has the same number of cores and amount of memory, and is equally reachable in the network. The data is generated specifically for each run. I mean, I write the input data in 4 nodes for one run and in 7 nodes for another. So the input

Re: Poor scalability with map reduce application

2011-06-21 Thread Harsh J
Alberto, Please add more practical info, such as whether your cluster is homogeneous, whether the number of maps and reduces in both runs is consistent (i.e., same data and same number of reducers on 4 vs. 7?), and whether map speculation is on. Also, do you notice a difference in time for a single map task

Poor scalability with map reduce application

2011-06-21 Thread Alberto Andreotti
Hello, I'm working with an application to calculate the temperatures of a square board. I divide the board into a mesh, and represent the board as a list of (key, value) pairs, with the key being the linear position of a cell within the mesh and the value its temperature. I distribute the data during