Glenn Strycker created SPARK-1883:
-------------------------------------

             Summary: spark graphx triplets.map does not return correct values
                 Key: SPARK-1883
                 URL: https://issues.apache.org/jira/browse/SPARK-1883
             Project: Spark
          Issue Type: Bug
            Reporter: Glenn Strycker


graph.triplets returns incorrect results.

I have a graph with the following edges: 

orig_graph.edges.collect 
=  Array(Edge(1,4,1), Edge(1,5,1), Edge(1,7,1), Edge(2,5,1), Edge(2,6,1), 
Edge(3,5,1), Edge(3,6,1), Edge(3,7,1), Edge(4,1,1), Edge(5,1,1), Edge(5,2,1), 
Edge(5,3,1), Edge(6,2,1), Edge(6,3,1), Edge(7,1,1), Edge(7,3,1)) 

When I run triplets.collect, I only get the last edge repeated 16 times: 

orig_graph.triplets.collect 
= Array(((7,1),(3,1),1), ((7,1),(3,1),1), ((7,1),(3,1),1), ((7,1),(3,1),1), 
((7,1),(3,1),1), ((7,1),(3,1),1), ((7,1),(3,1),1), ((7,1),(3,1),1), 
((7,1),(3,1),1), ((7,1),(3,1),1), ((7,1),(3,1),1), ((7,1),(3,1),1), 
((7,1),(3,1),1), ((7,1),(3,1),1), ((7,1),(3,1),1), ((7,1),(3,1),1)) 

I've also tried adding various map steps before calling triplets, but I get 
the same results as above. 

Similarly, the example in the GraphX programming guide 
(http://spark.apache.org/docs/0.9.0/graphx-programming-guide.html) is 
incorrect. 

val facts: RDD[String] = 
  graph.triplets.map(triplet => 
    triplet.srcAttr._1 + " is the " + triplet.attr + " of " + 
      triplet.dstAttr._1) 

does not work, but 

val facts: RDD[String] = 
  graph.triplets.map(triplet => 
    triplet.srcAttr + " is the " + triplet.attr + " of " + triplet.dstAttr) 

does work, although the results are meaningless.  For my graph example, I get 
the following line repeated 16 times: 

1 is the 1 of 1
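
A minimal sketch of one thing to try, under two assumptions that this report
does not confirm: that the repeated triplet comes from GraphX reusing a single
EdgeTriplet object while iterating over a partition, and that every vertex
attribute in this graph is the Int 1 (which would also explain why `._1` is
rejected and why every line reads "1 is the 1 of 1"). The sketch copies each
triplet's fields into an immutable tuple before calling collect, and prints
vertex IDs instead of vertex attributes. The object name, the helper `fact`,
the local master setting, and the shortened edge list are illustrative only.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}

object TripletsCopySketch {
  // Hypothetical helper: formats one line from plain values,
  // not from the triplet object itself.
  def fact(srcId: Long, attr: Int, dstId: Long): String =
    s"$srcId is the $attr of $dstId"

  def main(args: Array[String]): Unit = {
    // App name and local master are placeholders for this sketch.
    val conf = new SparkConf().setAppName("triplets-copy-sketch").setMaster("local[1]")
    val sc = new SparkContext(conf)
    try {
      // A few of the edges from the report; every edge attribute is 1.
      val edges = sc.parallelize(Seq(
        Edge(1L, 4L, 1), Edge(1L, 5L, 1), Edge(2L, 5L, 1), Edge(7L, 3L, 1)))

      // Assumption: every vertex attribute is the Int 1.
      val graph = Graph.fromEdges(edges, 1)

      // Copy the fields of each triplet into an immutable tuple before
      // collecting, so no collected element refers to the triplet object.
      val copied = graph.triplets
        .map(t => ((t.srcId, t.srcAttr), (t.dstId, t.dstAttr), t.attr))
        .collect()
      copied.foreach(println)

      // Print vertex IDs so the lines differ even when all attributes are equal.
      graph.triplets
        .map(t => fact(t.srcId, t.attr, t.dstId))
        .collect()
        .foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

If the collected tuples are distinct while triplets.collect still repeats one
entry, that would point at object reuse rather than at the graph data.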



--
This message was sent by Atlassian JIRA
(v6.2#6252)
