I think this is a separate issue with how the EdgeRDDImpl partitions
edges.  If you can merge this change in and rebuild, it should work:

   https://github.com/apache/spark/pull/4136/files

If you can't, I just called the Graph.partitionBy() method right after
constructing my graph but before performing any operations on it.  That
way, the EdgeRDDImpl class doesn't have to use the default partitioner.
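For example, the workaround looks roughly like this (a sketch only; the
input path and the choice of PartitionStrategy are illustrative, and it
assumes an existing SparkContext `sc`):

```scala
import org.apache.spark.graphx.{Graph, GraphLoader, PartitionStrategy}

// Load the graph, then repartition the edges explicitly BEFORE running
// any operations (e.g. Pregel-based algorithms), so EdgeRDDImpl does not
// fall back to the default partitioner.
val graph = GraphLoader
  .edgeListFile(sc, "hdfs://.../edges.txt")          // illustrative path
  .partitionBy(PartitionStrategy.EdgePartition2D)    // any strategy works

// Only now perform operations on `graph`.
val degrees = graph.degrees
```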

Hope this helps!
   Jay

On Tue Feb 03 2015 at 12:35:14 AM NicolasC <nicolas.ch...@inria.fr> wrote:

> On 01/29/2015 08:31 PM, Ankur Dave wrote:
> > Thanks for the reminder. I just created a PR:
> > https://github.com/apache/spark/pull/4273
> > Ankur
> >
>
> Hello,
>
> Thanks for the patch. I applied it on Pregel.scala (in the Spark 1.2.0
> sources) and rebuilt Spark. During execution, at the 25th iteration of
> Pregel, checkpointing is done and then the following exception is thrown:
>
> Exception in thread "main" org.apache.spark.SparkException: Checkpoint RDD CheckpointRDD[521] at reduce at VertexRDDImpl.scala:80(0) has different number of partitions than original RDD VertexRDD ZippedPartitionsRDD2[518] at zipPartitions at VertexRDDImpl.scala:170(2)
>         at org.apache.spark.rdd.RDDCheckpointData.doCheckpoint(RDDCheckpointData.scala:98)
>         at org.apache.spark.rdd.RDD.doCheckpoint(RDD.scala:1279)
>         at org.apache.spark.rdd.RDD$$anonfun$doCheckpoint$1.apply(RDD.scala:1281)
>         at org.apache.spark.rdd.RDD$$anonfun$doCheckpoint$1.apply(RDD.scala:1281)
>         at scala.collection.immutable.List.foreach(List.scala:318)
>         at org.apache.spark.rdd.RDD.doCheckpoint(RDD.scala:1281)
>         at org.apache.spark.SparkContext.runJob(SparkContext.scala:1285)
>         at org.apache.spark.SparkContext.runJob(SparkContext.scala:1351)
>         at org.apache.spark.rdd.RDD.reduce(RDD.scala:867)
>         at org.apache.spark.graphx.impl.VertexRDDImpl.count(VertexRDDImpl.scala:80)
>         at org.apache.spark.graphx.Pregel$.apply(Pregel.scala:155)
>         at org.apache.spark.graphx.lib.ShortestPaths$.run(ShortestPaths.scala:69)
>         ....
>
> Pregel.scala:155 is the following line in the pregel loop:
>
>        activeMessages = messages.count()
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>