[GitHub] spark issue #20800: [SPARK-23627][SQL] Provide isEmpty in Dataset
Github user goungoun commented on the issue: https://github.com/apache/spark/pull/20800 Thanks!! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20800: [SPARK-23627][SQL] Provide isEmpty in Dataset
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20800 Merged to master.
[GitHub] spark issue #20800: [SPARK-23627][SQL] Provide isEmpty in Dataset
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20800 Merged build finished. Test PASSed.
[GitHub] spark issue #20800: [SPARK-23627][SQL] Provide isEmpty in Dataset
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20800 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90615/ Test PASSed.
[GitHub] spark issue #20800: [SPARK-23627][SQL] Provide isEmpty in Dataset
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20800 **[Test build #90615 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90615/testReport)** for PR 20800 at commit [`f30d3ec`](https://github.com/apache/spark/commit/f30d3ec95c0d00f409f6536d10710b2f65fad787). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20800: [SPARK-23627][SQL] Provide isEmpty in Dataset
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20800 **[Test build #90615 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90615/testReport)** for PR 20800 at commit [`f30d3ec`](https://github.com/apache/spark/commit/f30d3ec95c0d00f409f6536d10710b2f65fad787).
[GitHub] spark issue #20800: [SPARK-23627][SQL] Provide isEmpty in Dataset
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20800 LGTM
[GitHub] spark issue #20800: [SPARK-23627][SQL] Provide isEmpty in Dataset
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20800 ok to test
[GitHub] spark issue #20800: [SPARK-23627][SQL] Provide isEmpty in Dataset
Github user goungoun commented on the issue: https://github.com/apache/spark/pull/20800 Regarding the additional check I mentioned: the following code shows that Spark users do not need to add take(1) themselves. ds.rdd.take(1).isEmpty is redundant, because RDD.isEmpty already calls take(1) internally. [RDD.scala](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/RDD.scala) `def isEmpty(): Boolean = withScope { partitions.length == 0 || take(1).length == 0 }`
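To illustrate the point about `RDD.isEmpty`, here is a minimal sketch in plain Scala that needs no Spark dependency. `MiniRdd` and `IsEmptyDemo` are hypothetical names invented for this illustration and are not Spark classes; only the body of `isEmpty` mirrors the snippet quoted above. Under those assumptions, it suggests why an extra `take(1)` in user code adds nothing: the built-in check already looks for at most one element.

```scala
// Hypothetical stand-in for an RDD: the data is just a sequence of partitions.
// This is NOT Spark's RDD; it only models the emptiness check.
final case class MiniRdd[T](partitions: Seq[Seq[T]]) {
  // Return at most n elements, scanning partitions lazily in order.
  def take(n: Int): List[T] = partitions.iterator.flatten.take(n).toList

  // Same shape as the quoted RDD.isEmpty: empty if there are no partitions,
  // or if asking for a single element yields nothing.
  def isEmpty: Boolean = partitions.length == 0 || take(1).length == 0
}

object IsEmptyDemo {
  val noPartitions: MiniRdd[Int] = MiniRdd[Int](Seq.empty)
  val emptyPartitions: MiniRdd[Int] = MiniRdd[Int](Seq(Seq.empty, Seq.empty))
  val nonEmpty: MiniRdd[Int] = MiniRdd(Seq(Seq.empty[Int], Seq(1, 2, 3)))

  def main(args: Array[String]): Unit = {
    for (rdd <- Seq(noPartitions, emptyPartitions, nonEmpty)) {
      // The user-side take(1).isEmpty agrees with the built-in check.
      println(s"isEmpty=${rdd.isEmpty} take(1).isEmpty=${rdd.take(1).isEmpty}")
    }
  }
}
```

In this model, `rdd.take(1).isEmpty` and `rdd.isEmpty` always agree, so the shorter call is enough.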
[GitHub] spark issue #20800: [SPARK-23627][SQL] Provide isEmpty in Dataset
Github user goungoun commented on the issue: https://github.com/apache/spark/pull/20800 @rxin, checking for emptiness is likely to be a common step in every ETL batch job, and I think this is the right place to provide that functionality. When a basic function that users expect to be provided is missing, people spend unnecessary time searching for it and writing their own workarounds. That helps neither clean code nor business value. I added one of the Stack Overflow discussions to SPARK-23627 for your reference. As a follow-up to this issue, I would also like to confirm that rdd.isEmpty is optimized internally.
[GitHub] spark issue #20800: [SPARK-23627][SQL] Provide isEmpty in Dataset
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20800 ok to test
[GitHub] spark issue #20800: [SPARK-23627][SQL] Provide isEmpty in Dataset
Github user rxin commented on the issue: https://github.com/apache/spark/pull/20800 So the API looks useful, but I don't know if this is the right implementation. How important is it to add this? It seems like the value is not super high either.
[GitHub] spark issue #20800: [SPARK-23627][SQL] Provide isEmpty in Dataset
Github user maropu commented on the issue: https://github.com/apache/spark/pull/20800 sorry, I can't do that. Also, cc: @gatorsmile @cloud-fan
[GitHub] spark issue #20800: [SPARK-23627][SQL] Provide isEmpty in Dataset
Github user maropu commented on the issue: https://github.com/apache/spark/pull/20800 LGTM, too.
[GitHub] spark issue #20800: [SPARK-23627][SQL] Provide isEmpty in Dataset
Github user goungoun commented on the issue: https://github.com/apache/spark/pull/20800 @HyukjinKwon, @maropu Just a gentle reminder. Jenkins is waiting for a comment like 'ok to test'.
[GitHub] spark issue #20800: [SPARK-23627][SQL] Provide isEmpty in Dataset
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/20800 LGTM
[GitHub] spark issue #20800: [SPARK-23627][SQL] Provide isEmpty in DataSet
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20800 Can one of the admins verify this patch?