Github user srowen commented on a diff in the pull request:
https://github.com/apache/spark/pull/4534#discussion_r24520320
--- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
@@ -1253,9 +1253,9 @@ abstract class RDD[T: ClassTag](
/**
* @return true if and only if the RDD contains no elements at all. Note that an RDD
- * may be empty even when it has at least 1 partition.
+ * may be empty even when it has 1 or more partitions.
*/
- def isEmpty(): Boolean = partitions.length == 0 || take(1).length == 0
+ def isEmpty(): Boolean = partitions.length == 0 || mapPartitions(it => Iterator(!it.hasNext)).reduce(_&&_)
--- End diff ---
I'll try the test case, sure, to investigate. The case of an empty
partition should already be handled by `take()`, so I don't think that's it per se.
(I'm worried about this logic, since it will touch every partition when the
point was to avoid doing so. The two changes before this line aren't necessary.)
The exception looks more like funny business in handling `Seq()` (i.e. type
`Any`) somewhere along the line. I'll look.
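As an aside, the cost difference between the two approaches can be sketched outside Spark. This is a hedged illustration, not Spark code: partitions are modeled as plain Scala collections, and the names are hypothetical. The `take(1)`-style check can short-circuit at the first non-empty partition, while the `mapPartitions`/`reduce` version evaluates emptiness of every partition.

```scala
// Hypothetical sketch: an "RDD" stands in as a Seq of partitions.
object IsEmptySketch {
  // take(1)-style: exists short-circuits at the first non-empty partition.
  def isEmptyShortCircuit[T](partitions: Seq[Seq[T]]): Boolean =
    !partitions.exists(_.nonEmpty)

  // mapPartitions/reduce-style, mirroring the diff above: computes
  // emptiness for *every* partition, then ANDs the results. The
  // length-0 guard avoids reduce on an empty collection.
  def isEmptyTouchAll[T](partitions: Seq[Seq[T]]): Boolean =
    partitions.length == 0 || partitions.map(_.isEmpty).reduce(_ && _)
}
```

Both agree on the result (true only when no partition has any element); they differ only in how much work is done, which is the concern raised above about touching every partition.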