[ https://issues.apache.org/jira/browse/SPARK-1931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ankur Dave updated SPARK-1931:
------------------------------

    Description:
Commit 905173df57b90f90ebafb22e43f55164445330e6 introduced a bug in partitionBy where, after repartitioning the edges, it reuses the VertexRDD without updating the routing tables to reflect the new edge layout. This causes the following test to fail:
{code}
    val g = Graph(
      sc.parallelize(List((0L, "a"), (1L, "b"), (2L, "c"))),
      sc.parallelize(List(Edge(0L, 1L, 1), Edge(0L, 2L, 1)), 2))
    assert(g.triplets.collect.map(_.toTuple).toSet ==
      Set(((0L, "a"), (1L, "b"), 1), ((0L, "a"), (2L, "c"), 1)))
    val gPart = g.partitionBy(PartitionStrategy.EdgePartition2D)
    assert(gPart.triplets.collect.map(_.toTuple).toSet ==
      Set(((0L, "a"), (1L, "b"), 1), ((0L, "a"), (2L, "c"), 1)))
{code}

  was:
Commit 905173df57b90f90ebafb22e43f55164445330e6 introduced a bug in partitionBy where, after repartitioning the edges, it reuses the VertexRDD without updating the routing tables to reflect the new edge layout. This causes the following test to fail:
{code}
    val g = Graph(
      sc.parallelize(List((0L, "a"), (1L, "b"), (2L, "c"))),
      sc.parallelize(List(Edge(0L, 1L, 1), Edge(0L, 2L, 1)), 2))
    assert(g.triplets.collect.map(_.toTuple).toSet ===
      Set(((0L, "a"), (1L, "b"), 1), ((0L, "a"), (2L, "c"), 1)))
    val gPart = g.partitionBy(PartitionStrategy.EdgePartition2D)
    assert(gPart.triplets.collect.map(_.toTuple).toSet ===
      Set(((0L, "a"), (1L, "b"), 1), ((0L, "a"), (2L, "c"), 1)))
{code}


> Graph.partitionBy does not reconstruct routing tables
> -----------------------------------------------------
>
>                 Key: SPARK-1931
>                 URL: https://issues.apache.org/jira/browse/SPARK-1931
>             Project: Spark
>          Issue Type: Bug
>          Components: GraphX
>    Affects Versions: 1.0.0
>            Reporter: Ankur Dave
>            Assignee: Ankur Dave
>             Fix For: 1.0.0
>
>
> Commit 905173df57b90f90ebafb22e43f55164445330e6 introduced a bug in
> partitionBy where, after repartitioning the edges, it reuses the VertexRDD
> without updating the routing tables to reflect the new edge layout. This
> causes the following test to fail:
> {code}
>     val g = Graph(
>       sc.parallelize(List((0L, "a"), (1L, "b"), (2L, "c"))),
>       sc.parallelize(List(Edge(0L, 1L, 1), Edge(0L, 2L, 1)), 2))
>     assert(g.triplets.collect.map(_.toTuple).toSet ==
>       Set(((0L, "a"), (1L, "b"), 1), ((0L, "a"), (2L, "c"), 1)))
>     val gPart = g.partitionBy(PartitionStrategy.EdgePartition2D)
>     assert(gPart.triplets.collect.map(_.toTuple).toSet ==
>       Set(((0L, "a"), (1L, "b"), 1), ((0L, "a"), (2L, "c"), 1)))
> {code}

--
This message was sent by Atlassian JIRA
(v6.2#6252)