I am also having problems with triangle count. This algorithm seems very memory-hungry: I could not process even small graphs (~5 million vertices, 70 million edges) with less than 32 GB of RAM on each machine. If I had a graph with a billion edges, how much RAM would I need?
So now I am trying to understand how it works, and perhaps rewrite it. I would like to process big graphs without needing so much RAM on each machine.

On 20.07.2015 at 04:27, "Jack Yang" <j...@uow.edu.au> wrote:
> Hi there,
>
> I got an error when running one simple GraphX program.
>
> My setup is: Spark 1.4.0, Hadoop YARN 2.5, Scala 2.10, with four virtual
> machines.
>
> If I construct a small graph (6 nodes, 4 edges) and run:
>
> println("triangleCount: %s ".format(
>   hdfs_graph.triangleCount().vertices.count() ))
>
> it returns the correct result.
>
> But when I import a much larger graph (850,000 nodes, 5,000,000 edges),
> the error is:
>
> 15/07/20 12:03:36 WARN scheduler.TaskSetManager: Lost task 2.0 in stage
> 11.0 (TID 32, 192.168.157.131): java.lang.AssertionError: assertion failed
>   at scala.Predef$.assert(Predef.scala:165)
>   at org.apache.spark.graphx.lib.TriangleCount$$anonfun$7.apply(TriangleCount.scala:90)
>   at org.apache.spark.graphx.lib.TriangleCount$$anonfun$7.apply(TriangleCount.scala:87)
>   at org.apache.spark.graphx.impl.VertexPartitionBaseOps.leftJoin(VertexPartitionBaseOps.scala:140)
>   at org.apache.spark.graphx.impl.VertexRDDImpl$$anonfun$3.apply(VertexRDDImpl.scala:159)
>   at org.apache.spark.graphx.impl.VertexRDDImpl$$anonfun$3.apply(VertexRDDImpl.scala:156)
>   at org.apache.spark.rdd.ZippedPartitionsRDD2.compute(ZippedPartitionsRDD.scala:88)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>
> I run both graphs with the same submit command:
>
> spark-submit --class "sparkUI.GraphApp" --master spark://master:7077 \
>   --executor-memory 2G --total-executor-cores 4 myjar.jar
>
> Any thoughts? Is anything wrong with my machine or configuration?
>
> Best regards,
> Jack
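For what it's worth, the assertion in TriangleCount.scala is the documented precondition check: in Spark 1.4, triangleCount requires the edges to be in canonical orientation (srcId < dstId) and the graph to have been partitioned with Graph.partitionBy. A sketch of loading a graph that satisfies both preconditions (`sc` is an existing SparkContext and `edgesPath` is a hypothetical HDFS edge-list path, both stand-ins for your own setup):

```scala
import org.apache.spark.SparkContext
import org.apache.spark.graphx.{GraphLoader, PartitionStrategy}

def canonicalTriangleCount(sc: SparkContext, edgesPath: String): Long = {
  val graph = GraphLoader
    // canonicalOrientation = true flips each edge so that srcId < dstId,
    // which triangleCount asserts on in Spark 1.4
    .edgeListFile(sc, edgesPath, canonicalOrientation = true)
    // triangleCount also requires an explicit partitioning strategy
    .partitionBy(PartitionStrategy.RandomVertexCut)
  graph.triangleCount().vertices.count()
}
```

This only addresses the AssertionError, not the memory footprint; the algorithm still materializes each vertex's neighbor set, which is why dense, large graphs are so expensive.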