Hello,

I am a final-year BSc Computer Science student using Apache Giraph for my
final-year project and dissertation, and I would very much appreciate help
with the following issue.

I am using Apache Giraph 1.1.0-SNAPSHOT with Hadoop 0.20.203.0 and am
having trouble running the ConnectedComponents example. I use the following
command:

 hadoop jar /home/ghufran/Downloads/Giraph2/giraph/giraph-examples/target/giraph-examples-1.1.0-SNAPSHOT-for-hadoop-1.2.1-jar-with-dependencies.jar \
     org.apache.giraph.GiraphRunner \
     org.apache.giraph.examples.ConnectedComponentsComputation \
     -vif org.apache.giraph.io.formats.IntIntNullTextVertexInputFormat \
     -vip /user/ghufran/in/my_graph.txt \
     -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
     -op /user/ghufran/outCC \
     -w 1


I believe it gets stuck in the input superstep, as the following is
displayed in the terminal while the command is running:

14/03/30 10:48:46 INFO mapred.JobClient:  map 100% reduce 0%
14/03/30 10:48:50 INFO job.JobProgressTracker: Data from 1 workers -
Loading data: 0 vertices loaded, 0 vertex input splits loaded; 0 edges
loaded, 0 edge input splits loaded; min free memory on worker 1 - 109.01MB,
average 109.01MB
14/03/30 10:48:55 INFO job.JobProgressTracker: Data from 1 workers -
Loading data: 0 vertices loaded, 0 vertex input splits loaded; 0 edges
loaded, 0 edge input splits loaded; min free memory on worker 1 - 109.01MB,
average 109.01MB
14/03/30 10:49:00 INFO job.JobProgressTracker: Data from 1 workers -
Loading data: 0 vertices loaded, 0 vertex input splits loaded; 0 edges
loaded, 0 edge input splits loaded; min free memory on worker 1 - 108.78MB,
average 108.78MB
....

which I traced back to the following if statement in the toString() method
of org.apache.giraph.job.CombinedWorkerProgress (in giraph-core):

if (isInputSuperstep()) {
  sb.append("Loading data: ");
  sb.append(verticesLoaded).append(" vertices loaded, ");
  sb.append(vertexInputSplitsLoaded).append(
      " vertex input splits loaded; ");
  sb.append(edgesLoaded).append(" edges loaded, ");
  sb.append(edgeInputSplitsLoaded).append(" edge input splits loaded");
}
// ... (excerpt; code between these two statements omitted)
sb.append("; min free memory on worker ").append(
    workerWithMinFreeMemory).append(" - ").append(
    DECIMAL_FORMAT.format(minFreeMemoryMB)).append("MB, average ").append(
    DECIMAL_FORMAT.format(freeMemoryMB)).append("MB");

So it seems to me that the job never gets past loading the input: no
vertices or input splits are ever reported as loaded. Is there something
wrong with my input format class or, probably more likely, with the graph I
passed in?
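To double-check that the counters really never move, rather than eyeballing
the log, something like the following could be run over the captured output.
This is only a minimal sketch: ProgressLineCheck and its methods are
hypothetical names of my own, not part of Giraph, and the pattern assumes the
progress lines are unwrapped and worded exactly as in the output above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Giraph) that reads the load counters from a log. */
final class ProgressLineCheck {
  // Assumption: the counters are worded exactly as in the JobProgressTracker lines above.
  private static final Pattern COUNTERS = Pattern.compile(
      "(\\d+) vertices loaded, (\\d+) vertex input splits loaded");

  /** Returns {verticesLoaded, vertexInputSplitsLoaded}, or null if the line has no counters. */
  static long[] parseCounters(String line) {
    Matcher m = COUNTERS.matcher(line);
    if (!m.find()) {
      return null;
    }
    return new long[] {Long.parseLong(m.group(1)), Long.parseLong(m.group(2))};
  }

  /** True if at least one progress line was seen and all of them report zero vertices and splits. */
  static boolean neverLoadedAnything(Iterable<String> lines) {
    boolean sawProgress = false;
    for (String line : lines) {
      long[] counters = parseCounters(line);
      if (counters == null) {
        continue; // not a progress line, e.g. a ZooKeeper or JobClient message
      }
      sawProgress = true;
      if (counters[0] != 0 || counters[1] != 0) {
        return false;
      }
    }
    return sawProgress;
  }
}
```

Feeding it the lines of the full output at the end of this message should
tell me whether any vertex or split was ever counted.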

I pass in a small graph in which each line has the format vertex id, vertex
value, neighbours, separated by tabs. My graph is shown below:

1 0 2
2 1 1 3 4
3 2 2
4 3 2
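To rule out a malformed file, each line can at least be checked against the
layout I described. The sketch below is my own hypothetical checker
(GraphLineCheck is not a Giraph class and is not Giraph's parser); it assumes
the layout id, value, neighbours with tab or space separators. Whether
IntIntNullTextVertexInputFormat actually expects this layout is part of what I
am unsure about.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical checker (not Giraph's own parser) for the line layout described above. */
final class GraphLineCheck {
  /** One parsed line: vertex id, vertex value, and neighbour ids. */
  static final class ParsedVertex {
    final int id;
    final int value;
    final List<Integer> neighbours;

    ParsedVertex(int id, int value, List<Integer> neighbours) {
      this.id = id;
      this.value = value;
      this.neighbours = neighbours;
    }
  }

  /**
   * Parses "id value neighbour1 neighbour2 ..." (assumed layout).
   * Throws IllegalArgumentException if a field is missing or is not an integer.
   */
  static ParsedVertex parseLine(String line) {
    String[] tokens = line.trim().split("[\\t ]+");
    if (tokens.length < 2) {
      throw new IllegalArgumentException("expected at least id and value: " + line);
    }
    int id = Integer.parseInt(tokens[0]);
    int value = Integer.parseInt(tokens[1]);
    List<Integer> neighbours = new ArrayList<Integer>();
    for (int i = 2; i < tokens.length; i++) {
      neighbours.add(Integer.parseInt(tokens[i]));
    }
    return new ParsedVertex(id, value, neighbours);
  }
}
```

All four lines of my graph pass this check, so the file at least matches the
layout I intended.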

The full output from running my command is shown below. If anyone could
explain why I am not getting the expected output, I would greatly
appreciate it.

Many thanks,

Ghufran


FULL OUTPUT:


14/03/30 10:48:06 INFO utils.ConfigurationUtils: No edge input format
specified. Ensure your InputFormat does not require one.
14/03/30 10:48:06 INFO utils.ConfigurationUtils: No edge output format
specified. Ensure your OutputFormat does not require one.
14/03/30 10:48:06 INFO job.GiraphJob: run: Since checkpointing is disabled
(default), do not allow any task retries (setting mapred.map.max.attempts =
0, old value = 4)
14/03/30 10:48:07 INFO job.GiraphJob: run: Tracking URL:
http://ghufran:50030/jobdetails.jsp?jobid=job_201403301044_0001
14/03/30 10:48:45 INFO
job.HaltApplicationUtils$DefaultHaltInstructionsWriter:
writeHaltInstructions: To halt after next superstep execute:
'bin/halt-application --zkServer ghufran:22181 --zkNode
/_hadoopBsp/job_201403301044_0001/_haltComputation'
14/03/30 10:48:45 INFO zookeeper.ZooKeeper: Client
environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
14/03/30 10:48:45 INFO zookeeper.ZooKeeper: Client environment:host.name
=ghufran
14/03/30 10:48:45 INFO zookeeper.ZooKeeper: Client
environment:java.version=1.7.0_51
14/03/30 10:48:45 INFO zookeeper.ZooKeeper: Client
environment:java.vendor=Oracle Corporation
14/03/30 10:48:45 INFO zookeeper.ZooKeeper: Client
environment:java.home=/usr/lib/jvm/java-7-oracle/jre
14/03/30 10:48:45 INFO zookeeper.ZooKeeper: Client
environment:java.class.path=/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../conf:/usr/lib/jvm/java-7-oracle/lib/tools.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/..:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../hadoop-core-0.20.203.0.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/aspectjrt-1.6.5.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/aspectjtools-1.6.5.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/commons-beanutils-1.7.0.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/commons-beanutils-core-1.8.0.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/commons-cli-1.2.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/commons-codec-1.4.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/commons-collections-3.2.1.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/commons-configuration-1.6.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/commons-daemon-1.0.1.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/commons-digester-1.8.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/commons-el-1.0.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/commons-httpclient-3.0.1.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/commons-lang-2.4.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/commons-logging-1.1.1.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/commons-logging-api-1.0.4.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/commons-math-2.1.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/commons-net-1.4.1.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/core-3.1.1.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/hsqldb-1.8.0.10.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/jackson-core-asl-1.0.1.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/jackson-mapper-asl-1.0.1.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/jasper-compiler-5.5.12.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/jasper-runtime-5.5.12.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/jets3t-0.6.1.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/jetty-6.1.26.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/jetty-util-6.1.26.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/jsch-0.1.42.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/junit-4.5.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/kfs-0.2.2.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/log4j-1.2.15.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/mockito-all-1.8.5.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/oro-2.0.8.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/servlet-api-2.5-20081211.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/slf4j-api-1.4.3.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/slf4j-log4j12-1.4.3.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/xmlenc-0.52.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/jsp-2.1/jsp-2.1.jar:/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/jsp-2.1/jsp-api-2.1.jar
14/03/30 10:48:45 INFO zookeeper.ZooKeeper: Client
environment:java.library.path=/home/ghufran/Downloads/hadoop-0.20.203.0/bin/../lib/native/Linux-amd64-64
14/03/30 10:48:45 INFO zookeeper.ZooKeeper: Client
environment:java.io.tmpdir=/tmp
14/03/30 10:48:45 INFO zookeeper.ZooKeeper: Client
environment:java.compiler=<NA>
14/03/30 10:48:45 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
14/03/30 10:48:45 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
14/03/30 10:48:45 INFO zookeeper.ZooKeeper: Client
environment:os.version=3.8.0-35-generic
14/03/30 10:48:45 INFO zookeeper.ZooKeeper: Client environment:user.name
=ghufran
14/03/30 10:48:45 INFO zookeeper.ZooKeeper: Client
environment:user.home=/home/ghufran
14/03/30 10:48:45 INFO zookeeper.ZooKeeper: Client
environment:user.dir=/home/ghufran/Downloads/hadoop-0.20.203.0/bin
14/03/30 10:48:45 INFO zookeeper.ZooKeeper: Initiating client connection,
connectString=ghufran:22181 sessionTimeout=60000
watcher=org.apache.giraph.job.JobProgressTracker@209fa588
14/03/30 10:48:45 INFO mapred.JobClient: Running job: job_201403301044_0001
14/03/30 10:48:45 INFO zookeeper.ClientCnxn: Opening socket connection to
server ghufran/127.0.1.1:22181. Will not attempt to authenticate using SASL
(unknown error)
14/03/30 10:48:45 INFO zookeeper.ClientCnxn: Socket connection established
to ghufran/127.0.1.1:22181, initiating session
14/03/30 10:48:45 INFO zookeeper.ClientCnxn: Session establishment complete
on server ghufran/127.0.1.1:22181, sessionid = 0x1451263c44c0002,
negotiated timeout = 600000
14/03/30 10:48:45 INFO job.JobProgressTracker: Data from 1 workers -
Loading data: 0 vertices loaded, 0 vertex input splits loaded; 0 edges
loaded, 0 edge input splits loaded; min free memory on worker 1 - 109.01MB,
average 109.01MB
14/03/30 10:48:46 INFO mapred.JobClient:  map 100% reduce 0%
14/03/30 10:48:50 INFO job.JobProgressTracker: Data from 1 workers -
Loading data: 0 vertices loaded, 0 vertex input splits loaded; 0 edges
loaded, 0 edge input splits loaded; min free memory on worker 1 - 109.01MB,
average 109.01MB
14/03/30 10:48:55 INFO job.JobProgressTracker: Data from 1 workers -
Loading data: 0 vertices loaded, 0 vertex input splits loaded; 0 edges
loaded, 0 edge input splits loaded; min free memory on worker 1 - 109.01MB,
average 109.01MB
14/03/30 10:49:00 INFO job.JobProgressTracker: Data from 1 workers -
Loading data: 0 vertices loaded, 0 vertex input splits loaded; 0 edges
loaded, 0 edge input splits loaded; min free memory on worker 1 - 108.78MB,
average 108.78MB
