Does anyone have any thoughts on how to have Giraph not require a fixed number of workers, and instead be able to start a superstep with a possibly smaller number of workers?
On Thu, Oct 10, 2013 at 4:12 PM, Manuel Lagang <manuellag...@gmail.com> wrote:
> I'm trying to understand the meaning of the 3 parameters to
> GiraphConfiguration.setWorkerConfiguration: minWorkers, maxWorkers, and
> minPercentResponded. I want my Giraph jobs to co-exist nicely with other
> jobs in the cluster, and it's not always the case that I can get a fixed
> number of map slots for my job before the job times out. Thus, I would like
> the job to be able to start with a possibly smaller set of workers, ideally
> being able to pick up workers in later supersteps.
>
> So I tried <minWorkers=10,maxWorkers=50,minPercentResponded=100>,
> expecting this to mean that it can start with 10 workers provided that 100%
> of those workers respond. But this setting ends up again waiting for all 50
> workers.
>
> Then I tried <minWorkers=10,maxWorkers=50,minPercentResponded=20>,
> expecting that minPercentResponded was just a redundant expression of
> minWorkers/maxWorkers. But this setting leads to null pointer exceptions in
> org.apache.giraph.comm.SendCache.removeWorkerData(SendCache.java:199).
>
> So I must be confused about the meaning of these variables, and what the
> legal values are. Can anyone enlighten me on how (if possible) I can get
> the behavior I want?
>
> Thanks,
> Manuel Lagang
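
To make the question concrete, here is a minimal sketch of one possible reading of the three parameters that matches the behavior described in the quoted message. It is an assumption, not confirmed Giraph semantics: minPercentResponded is taken as a percentage of maxWorkers (not of minWorkers), and minWorkers as a floor on healthy workers that is checked afterwards. The class WorkerStartCheck and the method canStart are hypothetical names for illustration and are not part of the Giraph API.

```java
// Hypothetical model, NOT Giraph code: one reading of minWorkers, maxWorkers
// and minPercentResponded that matches the two observations in the thread.
final class WorkerStartCheck {

    // Assumption: the job may proceed once the share of responding workers,
    // measured against maxWorkers, reaches minPercentResponded, and at least
    // minWorkers of the responders are healthy.
    static boolean canStart(int responded, int healthy,
                            int minWorkers, int maxWorkers,
                            float minPercentResponded) {
        boolean enoughResponses =
            responded * 100.0f / maxWorkers >= minPercentResponded;
        return enoughResponses && healthy >= minWorkers;
    }

    public static void main(String[] args) {
        // <10, 50, 100>: 10 responders are only 20% of 50, so the job keeps waiting.
        System.out.println(canStart(10, 10, 10, 50, 100.0f)); // false
        // <10, 50, 100>: only a full set of 50 responders satisfies 100%.
        System.out.println(canStart(50, 50, 10, 50, 100.0f)); // true
        // <10, 50, 20>: 10 responders reach the 20% threshold.
        System.out.println(canStart(10, 10, 10, 50, 20.0f));  // true
    }
}
```

Under this reading, the first configuration waits for all 50 workers, as reported, while the second lets the job start with 10. It does not explain the null pointer exception in SendCache, which would still need to be looked at separately.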