graph.vertices.collect().foreach(println)
Hi, does anyone have an idea about this? When I use graph.vertices.collect() in the Spark shell I get the vertex data in a readable form (truncated, since I have millions of records):

res34: Array[(org.apache.spark.graphx.VertexId, (Array[Double], Array[Double], Double, Double))] = Array((8501952,(Array(-1.0720014023085627, -0.16634457650848242),Array(-0.6103788820773441, -0.07786941417473896),-1.3895885415250486,1.0)), (8502656,(Array(0.4372264795058396, 0.013542308038655674),Array(0.8989895103168658, 0.10203414054868458),0.6711091242883567,1.0)), (1024388,(Array(0.5121940125583068, 0.023648040456737244)...

But when I print each vertex as below, the arrays come out as object hash codes, and I certainly want the data in the format above to do my further operations:

g.vertices.collect().foreach(println)

(8502417,([D@f068ac,[D@16825ec,0.6711091242883567,1.0))
(8502099,([D@91cd94,[D@4fb2ea,0.6711091242883567,1.0))
(1939777,([D@6b48d0,[D@17979f6,0.6368329031347298,0.0))
(8500191,([D@36063b,[D@d3339a,-0.37782997016492503,1.0))
(8500927,([D@1e0e2a8,[D@140ab89,0.6368329031347298,1.0))
(8500255,([D@bde454,[D@d96b14,0.6528450234389184,1.0))

Any idea is appreciated.

val newG = g.vertices.collect()
newG.foreach(println)

is also not working.

Thanks

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/graph-vertices-collect-foreach-println-tp28437.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
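(Editor's sketch, not part of the original message.) The `[D@f068ac` output is the default `toString` of a JVM `Array[Double]`, which prints a hash code; the REPL's `res34` display formats arrays specially, while `println` does not. One way to get readable output, assuming the vertex attribute type `(Array[Double], Array[Double], Double, Double)` shown above, is to format each array explicitly with `mkString`:

```scala
// Format each Array[Double] with mkString instead of relying on
// Array.toString, which prints only a hash code like [D@f068ac.
g.vertices.collect().foreach { case (id, (a1, a2, x, y)) =>
  println(s"($id,(Array(${a1.mkString(", ")}),Array(${a2.mkString(", ")}),$x,$y))")
}
```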
Graphx Examples for ALS
Hi, where can I find the ALS recommendation algorithm for large data sets? Please feel free to share your ideas/algorithms/logic for building a recommendation engine using Spark GraphX. Thanks in advance.

Thanks,
Balaji

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Graphx-Examples-for-ALS-tp28401.html
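(Editor's sketch, not part of the original message.) ALS ships with Spark MLlib rather than GraphX, so a common approach is to use `org.apache.spark.mllib.recommendation.ALS` directly. A minimal sketch, assuming a hypothetical `ratings.csv` with `user,product,rating` lines and a hypothetical user id 42:

```scala
import org.apache.spark.mllib.recommendation.{ALS, Rating}

// Parse "user,product,rating" lines into MLlib Rating objects.
val ratings = sc.textFile("ratings.csv").map { line =>
  val Array(user, product, rating) = line.split(',')
  Rating(user.toInt, product.toInt, rating.toDouble)
}

// Train a matrix-factorization model: rank 10, 10 iterations, lambda 0.01.
val model = ALS.train(ratings, 10, 10, 0.01)

// Top-5 product recommendations for the hypothetical user id 42.
model.recommendProducts(42, 5).foreach(println)
```

Since ALS scales with the number of ratings partitions, large data sets mostly come down to partitioning and the `rank`/`iterations` trade-off rather than a different algorithm.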
Bipartite projection with Graphx
Hi, is a bipartite projection possible with GraphX?

Rdd1:
#id  name
1    x1
2    x2
3    x3
4    x4
5    x5
6    x6
7    x7
8    x8

Rdd2:
#id    name
10001  y1
10002  y2
10003  y3
10004  y4
10005  y5
10006  y6

EdgeList:
#src id  dest id
1    10001
1    10002
2    10001
2    10002
2    10004
3    10003
3    10005
4    10001
4    10004
5    10003
5    10005
6    10003
6    10006
7    10005
7    10006
8    10005

val nodes = Rdd1 ++ Rdd2
val network = Graph(nodes, links)

With the above network I need to create projection graphs with edges like x1-x2 and a weight (see the image in the wiki link below).

Example: https://en.wikipedia.org/wiki/Bipartite_network_projection

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Bipartite-projection-with-Graphx-tp28360.html
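(Editor's sketch, not part of the original message.) GraphX has no built-in projection operator, but the weighted one-mode projection from the wiki link can be computed with a self-join on the edge RDD: key each edge by its y-node, join to enumerate x-pairs sharing a y-neighbor, and count shared neighbors as the weight. A sketch, assuming `links: RDD[Edge[Int]]` holds the x-to-y edge list above:

```scala
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

// Key each edge by its y-node; self-join enumerates all x-pairs that
// share that y-neighbor.
val byY: RDD[(VertexId, VertexId)] = links.map(e => (e.dstId, e.srcId))

val projEdges = byY.join(byY)
  .filter { case (_, (x1, x2)) => x1 < x2 }     // keep one direction per pair
  .map { case (_, (x1, x2)) => ((x1, x2), 1) }
  .reduceByKey(_ + _)                           // weight = # shared y-nodes
  .map { case ((x1, x2), w) => Edge(x1, x2, w) }

// x-node projection graph, e.g. x1-x2 with weight 2 for the data above.
val projection = Graph(Rdd1, projEdges)
```

The y-node projection is symmetric: key by `srcId` instead of `dstId`.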
Spark Graphx with Database
Hi all, I would like to know about Spark GraphX execution/processing together with a database. Yes, I understand Spark GraphX is in-memory processing, and to some extent we can manage querying that way, but I would like to do much more complex queries and processing. Please suggest a use case or the steps for the same.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Graphx-with-Database-tp28253.html
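(Editor's sketch, not part of the original message.) One common pattern is to keep the graph in a relational store, load it into GraphX via Spark's JDBC data source for the heavy in-memory processing, and write results back. A sketch under assumed names — the table `edges(src BIGINT, dst BIGINT, weight DOUBLE)`, the JDBC URL, and the database are all placeholders:

```scala
import org.apache.spark.graphx.{Edge, Graph}

// Hypothetical table edges(src, dst, weight); URL and table are placeholders.
val edgesDF = sqlContext.read.format("jdbc")
  .option("url", "jdbc:postgresql://dbhost/graphdb")
  .option("dbtable", "edges")
  .load()

val edgeRDD = edgesDF.rdd.map(r => Edge(r.getLong(0), r.getLong(1), r.getDouble(2)))

// Build the graph in memory; complex queries then run as GraphX operations,
// and results can be written back to the database with the same connector.
val g = Graph.fromEdges(edgeRDD, defaultValue = 0.0)
```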
Re: Graphx triplet comparison
Hi, thanks for the reply. Here is my code:

class BusStopNode(val name: String, val mode: String, val maxpasengers: Int) extends Serializable

case class busstop(override val name: String, override val mode: String, val shelterId: String, override val maxpasengers: Int)
  extends BusStopNode(name, mode, maxpasengers) with Serializable

case class busNodeDetails(override val name: String, override val mode: String, val srcId: Int, val destId: Int, val arrivalTime: Int, override val maxpasengers: Int)
  extends BusStopNode(name, mode, maxpasengers) with Serializable

case class routeDetails(override val name: String, override val mode: String, val srcId: Int, val destId: Int, override val maxpasengers: Int)
  extends BusStopNode(name, mode, maxpasengers) with Serializable

val busstopRDD: RDD[(VertexId, BusStopNode)] =
  sc.textFile("\\BusStopNameMini.txt").filter(!_.startsWith("#")).map { line =>
    val row = line split ","
    (row(0).toInt, new busstop(row(0), row(3), row(1) + row(0), row(2).toInt))
  }
busstopRDD.foreach(println)

val busNodeDetailsRdd: RDD[(VertexId, BusStopNode)] =
  sc.textFile("\\RouteDetails.txt").filter(!_.startsWith("#")).map { line =>
    val row = line split ","
    (row(0).toInt, new busNodeDetails(row(0), row(4), row(1).toInt, row(2).toInt, row(3).toInt, 0))
  }
busNodeDetailsRdd.foreach(println)

val detailedStats: RDD[Edge[BusStopNode]] =
  sc.textFile("\\routesEdgeNew.txt").filter(!_.startsWith("#")).map { line =>
    val row = line split ','
    Edge(row(0).toInt, row(1).toInt, new BusStopNode(row(2), row(3), 1))
  }

val busGraph = busstopRDD ++ busNodeDetailsRdd
busGraph.foreach(println)

val mainGraph = Graph(busGraph, detailedStats)
mainGraph.triplets.foreach(println)

val subGraph = mainGraph subgraph (epred = _.srcAttr.name == "101")

// Working fine
for (subTriplet <- subGraph.triplets) {
  println(subTriplet.dstAttr.name)
}

// Working fine
for (mainTriplet <- mainGraph.triplets) {
  println(mainTriplet.dstAttr.name)
}

// Causing error while iterating both at the same time
for (subTriplet <- subGraph.triplets) {
  for (mainTriplet <- mainGraph.triplets) { // NullPointerException occurs here
    if (subTriplet.dstAttr.name.toString.equals(mainTriplet.dstAttr.name)) {
      println("hello") // success case: destination names match in both subgraph and main graph
    }
  }
}

BusStopNameMini.txt:

101,bs,10,B
102,bs,10,B
103,bs,20,B
104,bs,14,B
105,bs,8,B

RouteDetails.txt:

#101,102,104 4 5 6
#102,103 3 4
#103,105,104 2 3 4
#104,102,101 4 5 6
#104,1015
#105,104,102 5 6 2
1,101,104,5,R
2,102,103,5,R
3,103,104,5,R
4,102,103,5,R
5,104,101,5,R
6,105,102,5,R

routesEdgeNew.txt (it contains two types of edges: bus-to-bus, where the edge value is distance, and bus-to-route, where the edge value is time):

#101,102,104 4 5 6
#102,103 3 4
#103,105,104 2 3 4
#104,102,101 4 5 6
#104,1015
#105,104,102 5 6 2
101,102,4,BS
102,104,5,BS
102,103,3,BS
103,105,4,BS
105,104,3,BS
104,102,4,BS
102,101,5,BS
104,101,5,BS
105,104,5,BS
104,102,6,BS
101,1,4,R,102
101,1,4,R,103
102,2,5,R
103,3,6,R
103,3,5,R
104,4,7,R
105,5,4,Z
101,2,9,R
105,5,4,R
105,2,5,R
104,2,5,R
103,1,4,R
101,103,4,BS
101,104,4,BS
101,105,4,BS
101,103,5,BS
101,104,5,BS
101,105,5,BS
1,101,4,R

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Graphx-triplet-comparison-tp28198p28205.html
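(Editor's sketch, not part of the original thread.) The NullPointerException is the classic nested-RDD problem: `mainGraph.triplets` is itself a distributed RDD, and it cannot be evaluated inside the closure of `subGraph.triplets`, because the graph's internal state does not exist on the executors. One workaround, assuming the classes and graphs above: collect the (small) subgraph side to the driver, broadcast it, and do the comparison in a single pass over `mainGraph.triplets`:

```scala
// Collect the subgraph's destination names to the driver and broadcast
// them, so the comparison runs as one distributed filter on mainGraph
// instead of an RDD nested inside another RDD's closure.
val subNames  = subGraph.triplets.map(_.dstAttr.name).collect().toSet
val subNamesB = sc.broadcast(subNames)

mainGraph.triplets
  .filter(t => subNamesB.value.contains(t.dstAttr.name))
  .foreach(t => println("hello: " + t.dstAttr.name))
```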
Graphx triplet comparison
Hi, I would like to know how to compare GraphX triplets in Scala. For example, there are two triplet sets:

val triplet1 = mainGraph.triplets.filter(condition1)
val triplet2 = mainGraph.triplets.filter(condition2)

Now I want to compare triplet1 and triplet2 using condition3.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Graphx-triplet-comparison-tp28198.html
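(Editor's sketch, not part of the original message.) Since both filtered sets are RDDs, they can be compared with a keyed join rather than nested iteration. A sketch, where `condition3` stands in as a hypothetical `(attr, attr) => Boolean` on the two edge attributes:

```scala
// Key each triplet set by its (srcId, dstId) pair, join on the key, and
// apply the third condition to the paired attributes.
val keyed1 = triplet1.map(t => ((t.srcId, t.dstId), t.attr))
val keyed2 = triplet2.map(t => ((t.srcId, t.dstId), t.attr))

val matched = keyed1.join(keyed2).filter { case (_, (attr1, attr2)) =>
  condition3(attr1, attr2)   // hypothetical comparison predicate
}
matched.collect().foreach(println)
```

Keying by something other than the endpoint ids (e.g. a vertex attribute) works the same way, as long as both sides use the same key.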
Graphx triplet loops causing null pointer exception
Hi, I am getting a NullPointerException when I execute a triplets loop inside another triplets loop.

This works fine:

for (mainTriplet <- mainGraph.triplets) {
  println(mainTriplet.dstAttr.name)
}

This also works fine:

for (subTriplet <- subGraph.triplets) {
  println(subTriplet.dstAttr.name)
}

Below is what causes the issue:

1. for (subTriplet <- subGraph.triplets) {
2.   println(subTriplet.dstAttr.name)
3.   for (mainTriplet <- mainGraph.triplets) {
4.     println(mainTriplet.dstAttr.name)
5.     // here I want to do an operation like subTriplet.dstAttr.name.toString.equals(mainTriplet.dstAttr.name)
6.   }
7. }

Line number 2 prints only one time, and I immediately get the error below:

ERROR Executor: Exception in task 0.0 in stage 10.0 (TID 10)
java.lang.NullPointerException
    at org.apache.spark.graphx.impl.GraphImpl.triplets$lzycompute(GraphImpl.scala:50)
    at org.apache.spark.graphx.impl.GraphImpl.triplets(GraphImpl.scala:49)

Please help, and let me know if anything more is required in my explanation.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Graphx-triplet-loops-causing-null-pointer-exception-tp28195.html
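(Editor's sketch, not part of the original message.) The stack trace points at `triplets$lzycompute` because the inner loop forces `mainGraph.triplets` to be evaluated inside the closure of the outer loop, on an executor where the driver-side graph state is null; Spark does not support using one RDD (or Graph) inside another RDD's operation. If both triplet sets are small enough, the simplest fix is to materialize them on the driver first and nest plain local loops:

```scala
// Materialize both triplet sets as driver-local arrays; nested iteration
// over local arrays never evaluates an RDD inside another RDD's closure.
val subTriplets  = subGraph.triplets.collect()
val mainTriplets = mainGraph.triplets.collect()

for (subTriplet <- subTriplets) {
  println(subTriplet.dstAttr.name)
  for (mainTriplet <- mainTriplets) {
    if (subTriplet.dstAttr.name == mainTriplet.dstAttr.name) {
      println("match: " + subTriplet.dstAttr.name)
    }
  }
}
```

For graphs too large to collect, a broadcast of one side's keys followed by a distributed `filter` on the other side achieves the same comparison without pulling everything to the driver.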