Hi,

I am adding a batch of vertices to a graph in GraphX using the method below, and I am running into a latency problem. The first addition of about 400 vertices to a graph with 100,000 nodes takes around 7 seconds. The next addition takes 15 seconds, and each subsequent addition takes longer than the previous one. Please help me solve this problem.
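My cluster currently consists of one machine with 8 cores and 8 GB of RAM, and I am running in local mode.

    import org.apache.spark.SparkContext
    import org.apache.spark.graphx.{Graph, VertexId}
    import org.apache.spark.rdd.RDD

    // gVertices, gEdges and inputGraph are mutable fields defined elsewhere.
    def addVertex(rdd: RDD[String], sc: SparkContext, session: String): Long = {
      val defaultUser = (0, 0)
      // Bring the ids to the driver and union them in one vertex at a time.
      rdd.collect().foreach { x =>
        val aVertex: RDD[(VertexId, (Int, Int))] =
          sc.parallelize(Array((x.toLong, (100, 100))))
        gVertices = gVertices.union(aVertex)
      }
      // Rebuild the graph from the enlarged vertex set and cache it.
      inputGraph = Graph(gVertices, gEdges, defaultUser)
      inputGraph.cache()
      gVertices = inputGraph.vertices
      gVertices.cache()
      val count = gVertices.count
      println(count)
      1L
    }

Thanks,
Udbhav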
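For comparison, here is a minimal sketch of a batched variant of the method above. It is not a confirmed fix. It assumes the slowdown comes from the driver-side loop, which calls union once per vertex and so adds roughly 400 levels to the RDD lineage on every call. The names BatchAddVertices, addVertices and Attr are hypothetical, and the toy graph in main is made up purely for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

object BatchAddVertices {
  // Vertex attribute type used in the original method.
  type Attr = (Int, Int)

  // Hypothetical helper: adds every id in `ids` with a single union,
  // instead of collecting to the driver and unioning one vertex at a time.
  def addVertices(graph: Graph[Attr, Int], ids: RDD[String]): Graph[Attr, Int] = {
    val defaultUser: Attr = (0, 0)
    // Build all new vertices as one distributed RDD.
    val newVertices: RDD[(VertexId, Attr)] = ids.map(id => (id.toLong, (100, 100)))
    // One union per batch, so the lineage grows by one level per call.
    val updated = Graph(graph.vertices.union(newVertices), graph.edges, defaultUser).cache()
    // Materialize the cached graph now rather than on the next action.
    updated.vertices.count()
    updated
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[*]").setAppName("batch-add-vertices")
    val sc = new SparkContext(conf)
    try {
      // Made-up toy graph standing in for the real 100,000-node graph.
      val vertices: RDD[(VertexId, Attr)] = sc.parallelize(Seq((1L, (1, 1)), (2L, (2, 2))))
      val edges: RDD[Edge[Int]] = sc.parallelize(Seq(Edge(1L, 2L, 0)))
      var graph: Graph[Attr, Int] = Graph(vertices, edges, (0, 0)).cache()

      // Add one batch of new vertex ids.
      graph = addVertices(graph, sc.parallelize(Seq("3", "4", "5")))
      println(graph.vertices.count())
    } finally {
      sc.stop()
    }
  }
}
```

Even in this form the lineage still grows by one union per batch, so setting a checkpoint directory with sc.setCheckpointDir and checkpointing the graph every few batches may also be worth trying.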