Glenn Strycker created SPARK-1883:
-------------------------------------

             Summary: spark graphx triplets.map does not return correct values
                 Key: SPARK-1883
                 URL: https://issues.apache.org/jira/browse/SPARK-1883
             Project: Spark
          Issue Type: Bug
            Reporter: Glenn Strycker

graph.triplets does not work: it returns incorrect results.

I have a graph with the following edges:

    orig_graph.edges.collect = Array(Edge(1,4,1), Edge(1,5,1), Edge(1,7,1), Edge(2,5,1),
      Edge(2,6,1), Edge(3,5,1), Edge(3,6,1), Edge(3,7,1), Edge(4,1,1), Edge(5,1,1),
      Edge(5,2,1), Edge(5,3,1), Edge(6,2,1), Edge(6,3,1), Edge(7,1,1), Edge(7,3,1))

When I run triplets.collect, I get only the last edge, repeated 16 times:

    orig_graph.triplets.collect = Array(
      ((7,1),(3,1),1), ((7,1),(3,1),1), ((7,1),(3,1),1), ((7,1),(3,1),1),
      ((7,1),(3,1),1), ((7,1),(3,1),1), ((7,1),(3,1),1), ((7,1),(3,1),1),
      ((7,1),(3,1),1), ((7,1),(3,1),1), ((7,1),(3,1),1), ((7,1),(3,1),1),
      ((7,1),(3,1),1), ((7,1),(3,1),1), ((7,1),(3,1),1), ((7,1),(3,1),1))

I've also tried writing various map steps before calling the triplets function, but I get the same results as above.

Similarly, the example on the GraphX programming guide page (http://spark.apache.org/docs/0.9.0/graphx-programming-guide.html) is incorrect.

    val facts: RDD[String] = graph.triplets.map(triplet => triplet.srcAttr._1 + " is the " + triplet.attr + " of " + triplet.dstAttr._1)

does not work, but

    val facts: RDD[String] = graph.triplets.map(triplet => triplet.srcAttr + " is the " + triplet.attr + " of " + triplet.dstAttr)

does work, although the results are meaningless. For my graph example, I get the following line repeated 16 times:

    1 is the 1 of 1
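
The symptom described above (the last element repeated once per edge) is what you would typically see if an iterator hands back one shared mutable object and the caller stores the references instead of copying the values. Whether GraphX does this internally is an assumption here, not something this report establishes. The sketch below reproduces the pattern in plain Scala with no Spark dependency. `MutableTriplet`, `reusingIterator`, `naive` and `copied` are hypothetical names made up for illustration, and the edge list is a four-edge subset of the graph above.

```scala
object ReusedObjectDemo {
  // Hypothetical mutable record standing in for an edge triplet (not a GraphX class).
  final class MutableTriplet(var srcId: Long, var dstId: Long, var attr: Int)

  // A subset of the (srcId, dstId) pairs from the report; every edge attribute is 1.
  val edges: Seq[(Long, Long)] = Seq((1L, 4L), (1L, 5L), (1L, 7L), (7L, 3L))

  // Iterator that overwrites and returns ONE shared object on every step.
  def reusingIterator: Iterator[MutableTriplet] = {
    val shared = new MutableTriplet(0L, 0L, 0)
    edges.iterator.map { case (s, d) =>
      shared.srcId = s
      shared.dstId = d
      shared.attr = 1
      shared
    }
  }

  // Storing the references first: every slot points at the same object,
  // so reading the fields afterwards shows only its final state.
  def naive: Seq[(Long, Long, Int)] =
    reusingIterator.toList.map(t => (t.srcId, t.dstId, t.attr))

  // Copying the fields into an immutable tuple while iterating keeps each value.
  def copied: Seq[(Long, Long, Int)] =
    reusingIterator.map(t => (t.srcId, t.dstId, t.attr)).toList

  def main(args: Array[String]): Unit = {
    println(naive)  // the last edge, repeated once per edge
    println(copied) // each edge once, in order
  }
}
```

If the same mechanism is at work in GraphX, the analogous check on a real graph would be to map each triplet to a tuple of its fields (for example srcId, dstId and attr) before calling collect, and compare that with the output of triplets.collect.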