Github user goungoun commented on the issue:

    https://github.com/apache/spark/pull/20800

    @rxin, checking for emptiness is likely to be a common step in every ETL batch job, so I think this is the right place to provide that functionality. When a basic function that users expect to be provided is missing, people spend unnecessary time searching for one and writing their own ad hoc versions. That helps us neither write clean code nor deliver business value. I added one of the Stack Overflow discussions to SPARK-23627 for your reference. As a follow-up to this issue, I would also like to confirm that rdd.isEmpty is optimized internally.