Tobias Bertelsen created SPARK-5744:
------------------------------------
             Summary: RDD.isEmpty fails when rdd contains empty partitions
                 Key: SPARK-5744
                 URL: https://issues.apache.org/jira/browse/SPARK-5744
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
            Reporter: Tobias Bertelsen
            Priority: Critical

The implementation of {{RDD.isEmpty()}} fails if there are empty partitions. It was introduced by https://github.com/apache/spark/pull/4074

Example:

{code:scala}
sc.parallelize(Seq(), 1).isEmpty()
{code}

The above code throws an exception like this:

{code}
org.apache.spark.SparkDriverExecutionException: Execution error
  at org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:977)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1374)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1338)
  at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
Cause: java.lang.ArrayStoreException: [Ljava.lang.Object;
  at scala.runtime.ScalaRunTime$.array_update(ScalaRunTime.scala:88)
  at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1466)
  at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1466)
  at org.apache.spark.scheduler.JobWaiter.taskSucceeded(JobWaiter.scala:56)
  at org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:973)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1374)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1338)
  at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
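A possible interim workaround, sketched below: instead of calling {{isEmpty()}} (which internally takes an element and writes it into a driver-side result array typed by the RDD's element type, the step where the {{ArrayStoreException}} surfaces above), map each partition to a {{Boolean}} flag so the job's result type is always {{Boolean}}. This is only a sketch, not the official fix; {{isEmptySafe}} is a hypothetical helper name, and it makes a full pass over every partition rather than stopping at the first element.

{code:scala}
import org.apache.spark.rdd.RDD

// Hypothetical workaround sketch: each partition reports whether it has
// any elements; the RDD is empty iff no partition reported true. The
// collected result is Array[Boolean], so the driver-side array write
// never depends on the (possibly Nothing) element type of `rdd`.
def isEmptySafe[T](rdd: RDD[T]): Boolean =
  !rdd.mapPartitions(it => Iterator(it.hasNext)).collect().contains(true)

// Usage (same repro as above, assuming an active SparkContext `sc`):
// isEmptySafe(sc.parallelize(Seq(), 1))  should return true without throwing
{code}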