I am running on Hadoop 2.2.0-cdh5.0.0-beta-1 in pure YARN mode. I have noticed two issues in org.apache.giraph.yarn.GiraphYarnClient.

The Giraph code base is 1.1.0-SNAPSHOT, downloaded on Jan 2, 2014.

Both issues are in the checkPerNodeResourcesAvailable method.

The first issue is that the number of containers available is calculated using node.getNumContainers(). According to the YARN documentation, this is the number of containers currently running on the node. So on a YARN cluster with no jobs running, all nodes report 0 containers.
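To make the difference between "running" and "available" concrete, here is a rough sketch. The Node class, its field names, and the 1024 MB container size are all made up for illustration; they are not YARN's API, and whether YARN exposes equivalent numbers would need checking.

```java
// Hypothetical stand-in for a per-node report; NOT the YARN NodeReport class.
final class CapacitySketch {

    static final class Node {
        final int runningContainers; // what a "currently running" count reports
        final int capabilityMb;      // assumed: total memory the node offers
        final int usedMb;            // assumed: memory already allocated

        Node(int runningContainers, int capabilityMb, int usedMb) {
            this.runningContainers = runningContainers;
            this.capabilityMb = capabilityMb;
            this.usedMb = usedMb;
        }
    }

    // Estimate how many more containers of the given size would fit on the node.
    static int estimateFreeContainers(Node node, int containerMb) {
        if (containerMb <= 0) {
            throw new IllegalArgumentException("containerMb must be positive");
        }
        int freeMb = Math.max(0, node.capabilityMb - node.usedMb);
        return freeMb / containerMb;
    }

    public static void main(String[] args) {
        // An idle node: nothing running, all memory free.
        Node idle = new Node(0, 8192, 0);
        System.out.println("running = " + idle.runningContainers);
        System.out.println("free    = " + estimateFreeContainers(idle, 1024));
    }
}
```

On the idle node the running count is 0 even though eight 1024 MB containers would fit, which is the mismatch described above.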


The second issue (in the same method) is this if block:

if (workers < numContainers) {
     throw new RuntimeException("Giraph job requires " + workers +
       " containers to run; cluster only hosts " + numContainers);

}

So with the current setup, if a cluster has 4 containers currently running and a Giraph job is submitted that requires 2 containers, the job will fail saying "Giraph job requires 2 containers to run; cluster only hosts 4".

The if statement should be "workers > numContainers", and numContainers should reflect the total number of containers available, not the number of containers currently running. I don't know YARN well, so I don't know if such a number is available at all.
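A minimal sketch of the check with the comparison turned around. This is not the actual Giraph code: the class and method names are made up, and availableContainers is assumed to be a real "how many containers can the cluster still host" figure obtained from somewhere else.

```java
// Hypothetical, stand-alone version of the check; not taken from GiraphYarnClient.
final class ContainerCheckSketch {

    // Fail only when the job needs MORE containers than are available.
    static void checkContainersAvailable(int workers, int availableContainers) {
        if (workers > availableContainers) {
            throw new IllegalStateException("Giraph job requires " + workers
                + " containers to run; cluster only hosts " + availableContainers);
        }
    }

    public static void main(String[] args) {
        // 2 workers on a cluster with room for 4: passes.
        checkContainersAvailable(2, 4);
        try {
            // 4 workers on a cluster with room for 2: rejected.
            checkContainersAvailable(4, 2);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With this direction of the comparison, the 2-worker job from the example above would pass on a cluster that can host 4 containers.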

For the time being I plan on getting this working by bypassing the check altogether.
