Brennon York created SPARK-5790:
-----------------------------------

             Summary: VertexRDD's won't zip properly for `diff` capability
                 Key: SPARK-5790
                 URL: https://issues.apache.org/jira/browse/SPARK-5790
             Project: Spark
          Issue Type: Bug
          Components: GraphX
            Reporter: Brennon York


For VertexRDDs with differing numbers of partitions, one cannot run operations
like `diff`, because they throw an IllegalArgumentException. The code below
provides an example:

{code:scala}
import org.apache.spark.graphx._
import org.apache.spark.rdd._
val setA: VertexRDD[Int] = VertexRDD(sc.parallelize(0L until 3L).map(id => (id, id.toInt+1)))
setA.collect.foreach(println(_))
val setB: VertexRDD[Int] = VertexRDD(sc.parallelize(2L until 4L).map(id => (id, id.toInt+2)))
setB.collect.foreach(println(_))
val diff = setA.diff(setB)
diff.collect.foreach(println(_))
val setC: VertexRDD[Int] = VertexRDD(sc.parallelize(2L until 4L).map(id => (id, id.toInt+2)) ++ sc.parallelize(6L until 8L).map(id => (id, id.toInt+2)))
setA.diff(setC).collect
// java.lang.IllegalArgumentException: Can't zip RDDs with unequal numbers of partitions
{code}
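
A possible workaround, sketched under assumptions rather than taken from any fix: since `++` unions two RDDs and concatenates their partitions, the vertex set built from the union likely ends up with more partitions than `setA`. Co-partitioning the raw pairs with `setA`'s partitioner before building the VertexRDD should give both sides the same partition count. The names `rawC` and `setCAligned` are hypothetical, and the snippet assumes a spark-shell session (`sc` and the pair-RDD implicits in scope) with `setA` defined as in the example above.

{code:scala}
import org.apache.spark.graphx._
import org.apache.spark.rdd._

// Same data as setC in the report, kept as a plain pair RDD (hypothetical name).
val rawC: RDD[(VertexId, Int)] =
  sc.parallelize(2L until 4L).map(id => (id, id.toInt + 2)) ++
  sc.parallelize(6L until 8L).map(id => (id, id.toInt + 2))

// Compare partition counts to confirm the mismatch behind the exception.
println(setA.partitions.length)
println(VertexRDD(rawC).partitions.length)

// Assumption: reusing setA's partitioner aligns the partition counts,
// so the zip inside diff no longer sees unequal numbers of partitions.
val setCAligned: VertexRDD[Int] = VertexRDD(rawC.partitionBy(setA.partitioner.get))
setA.diff(setCAligned).collect.foreach(println(_))
{code}

This only sidesteps the problem on the caller's side; `diff` itself would still fail for arbitrary VertexRDDs with mismatched partitioning.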



