[GitHub] spark issue #22157: [SPARK-25126] Avoid creating Reader for all orc files
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/22157 Thank you for pinging me, @srowen. @raofu, instead of changing the existing test coverage, we had better add additional test cases in which all files are corrupted. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22157 **[Test build #4285 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4285/testReport)** for PR 22157 at commit [`9afdac6`](https://github.com/apache/spark/commit/9afdac6da180b9c8959696941701c734e9a3fe8e).
Github user raofu commented on the issue: https://github.com/apache/spark/pull/22157 I fixed the test by making the first file the corrupted file. @srowen, can you help kick off a Jenkins run?
Github user srowen commented on the issue: https://github.com/apache/spark/pull/22157 The failure in OrcQuerySuite looks legitimate. The test corrupts the third of three files and then sets the reader to not ignore corrupt files, but with this change the third file is never actually read. I think that might be a good thing. @dongjoon-hyun do you have an opinion?
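The failure described in the comment above can be reproduced in miniature. What follows is a toy Python model, not Spark's ORC code: `read_schema`, `infer_schema_eager`, `infer_schema_lazy`, and the `ignore_corrupt` flag are hypothetical names standing in for the reader and the corrupt-file setting discussed in this thread.

```python
from typing import List, Optional, Set


class CorruptFileError(Exception):
    """Raised by the toy reader when a file cannot be parsed."""


def read_schema(path: str, corrupt: Set[str], opened: List[str],
                ignore_corrupt: bool) -> Optional[str]:
    """Pretend to open `path` and return its schema, recording the access."""
    opened.append(path)
    if path in corrupt:
        if ignore_corrupt:
            return None  # skip the bad file instead of failing
        raise CorruptFileError(path)
    return "struct<a:int>"  # placeholder schema string


def infer_schema_eager(paths: List[str], corrupt: Set[str], opened: List[str],
                       ignore_corrupt: bool = False) -> Optional[str]:
    """Open every file first, then keep the first schema found."""
    schemas = [read_schema(p, corrupt, opened, ignore_corrupt) for p in paths]
    return next((s for s in schemas if s is not None), None)


def infer_schema_lazy(paths: List[str], corrupt: Set[str], opened: List[str],
                      ignore_corrupt: bool = False) -> Optional[str]:
    """Stop at the first file that yields a schema."""
    for p in paths:
        schema = read_schema(p, corrupt, opened, ignore_corrupt)
        if schema is not None:
            return schema
    return None
```

In this model, with three files of which only the third is corrupt, the eager variant raises while the lazy variant returns after opening the first file alone, so a test that expects an exception no longer sees one. If the first file is the corrupt one, both variants raise.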
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22157 **[Test build #4283 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4283/testReport)** for PR 22157 at commit [`5a86b36`](https://github.com/apache/spark/commit/5a86b3618da695431c01ddbe4bb102a45f93b3b1).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22157 **[Test build #4283 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4283/testReport)** for PR 22157 at commit [`5a86b36`](https://github.com/apache/spark/commit/5a86b3618da695431c01ddbe4bb102a45f93b3b1).
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22157
> Do we have a similar issue for Parquet?

It looks like we do not, since we explicitly pick one file before reading during schema inference: https://github.com/apache/spark/blob/f984ec75ed6162ee6f5881716a8311c883aca22a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala#L229-L239
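As a rough illustration of picking a single file before any reader is created, here is a toy Python sketch. It is not the linked `ParquetFileFormat` code: the function names and file names are hypothetical, and the preference for summary files (`_common_metadata`, then `_metadata`) over the first data file is an assumption made for the example.

```python
from typing import Callable, List, Optional


def base_name(path: str) -> str:
    """Return the last path component."""
    return path.rsplit("/", 1)[-1]


def pick_file_to_touch(paths: List[str]) -> Optional[str]:
    """Choose exactly one file to read, without opening anything."""
    def first_named(name: str) -> Optional[str]:
        return next((p for p in paths if base_name(p) == name), None)

    # Assumed preference order: summary files first, then the first data file.
    first_data = next(
        (p for p in paths if not base_name(p).startswith("_")), None)
    return (first_named("_common_metadata")
            or first_named("_metadata")
            or first_data)


def infer_schema(paths: List[str],
                 read_footer: Callable[[str], str]) -> Optional[str]:
    """Read a schema from the single chosen file only."""
    chosen = pick_file_to_touch(paths)
    return read_footer(chosen) if chosen is not None else None
```

Because the selection happens on file names alone, the caller-supplied `read_footer` is invoked at most once, no matter how many files are listed.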
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22157 **[Test build #4282 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4282/testReport)** for PR 22157 at commit [`5a86b36`](https://github.com/apache/spark/commit/5a86b3618da695431c01ddbe4bb102a45f93b3b1).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
Github user rxin commented on the issue: https://github.com/apache/spark/pull/22157 Do we have a similar issue for Parquet?
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22157 **[Test build #4282 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4282/testReport)** for PR 22157 at commit [`5a86b36`](https://github.com/apache/spark/commit/5a86b3618da695431c01ddbe4bb102a45f93b3b1).
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22157 Can one of the admins verify this patch?