Can you paste your cluster information? I am also struggling to make it work on 75M vertices and hundreds of millions of edges.

On Fri, Jul 26, 2013 at 8:02 AM, jerome richard <jeromerichard...@msn.com> wrote:

> Hi,
>
> I encountered a critical scaling problem using Giraph. I made a very
> simple algorithm to test Giraph on large graphs: a connectivity test. It
> works on relatively large graphs (3 072 441 nodes and 117 185 083 edges)
> but not on very large graphs (52 000 000 nodes and 2 000 000 000 edges).
> In fact, during the processing of the biggest graph, Giraph core seems to
> fail after superstep 14 (15 on some jobs). The input graph size is 30
> GB stored as text and the output is also stored as text. 9 working jobs are
> used to compute the graph.
>
> Here is the stack trace of the jobs (it is the same for all 9 jobs):
>
> java.lang.IllegalStateException: run: Caught an unrecoverable exception exists: Failed to check /_hadoopBsp/job_201307260439_0006/_applicationAttemptsDir/0/_superstepDir/97/_addressesAndPartitions after 3 tries!
>     at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:101)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Unknown Source)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
>     at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: java.lang.IllegalStateException: exists: Failed to check /_hadoopBsp/job_201307260439_0006/_applicationAttemptsDir/0/_superstepDir/97/_addressesAndPartitions after 3 tries!
>     at org.apache.giraph.zk.ZooKeeperExt.exists(ZooKeeperExt.java:369)
>     at org.apache.giraph.worker.BspServiceWorker.startSuperstep(BspServiceWorker.java:678)
>     at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:248)
>     at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:91)
>     ... 7 more
>
> Could you help me solve this problem?
> If you need the code of the program, I can post it here (the code is
> relatively tiny).
>
> Thanks,
> Jérôme.

--
--Puneet