I'm trying to understand the meaning of the three parameters to
GiraphConfiguration.setWorkerConfiguration: minWorkers, maxWorkers, and
minPercentResponded. I want my Giraph jobs to co-exist nicely with other
jobs in the cluster, and it's not always the case that I can get a fixed
number of map slots for my job before the job times out. Thus, I would like
the job to be able to start with a possibly smaller set of workers, ideally
being able to pick up workers in later supersteps.

So I tried <minWorkers=10,maxWorkers=50,minPercentResponded=100>, expecting
this to mean that it can start with 10 workers provided that 100% of those
workers respond. But with this setting the job again waits for all 50 workers.

Then I tried <minWorkers=10,maxWorkers=50,minPercentResponded=20>,
expecting that minPercentResponded was just a redundant expression of
minWorkers/maxWorkers. But this setting leads to NullPointerExceptions in
org.apache.giraph.comm.SendCache.removeWorkerData(SendCache.java:199).
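
For reference, here is the arithmetic behind that second guess as a minimal sketch. The class and the percentOfMax helper are hypothetical names made up for illustration, not part of Giraph, and the sketch only shows how I arrived at 20. It does not claim that Giraph actually interprets minPercentResponded this way, which is exactly what I'm unsure about.

```java
public class WorkerPercentSketch {
    // Hypothetical helper (not a Giraph API): the percentage of maxWorkers
    // that minWorkers represents.
    static float percentOfMax(int minWorkers, int maxWorkers) {
        return 100.0f * minWorkers / maxWorkers;
    }

    public static void main(String[] args) {
        int minWorkers = 10;
        int maxWorkers = 50;
        float minPercentResponded = percentOfMax(minWorkers, maxWorkers);

        // In the real job these three values are what I pass along, assuming
        // a GiraphConfiguration instance named conf:
        //   conf.setWorkerConfiguration(minWorkers, maxWorkers, minPercentResponded);
        System.out.println(minPercentResponded);
    }
}
```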

So I must be confused about the meaning of these variables, and what the
legal values are. Can anyone enlighten me on how (if possible) I can get
the behavior I want?

Thanks,
Manuel Lagang
