[GitHub] spark issue #22157: [SPARK-25126] Avoid creating Reader for all orc files

2018-08-22 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/22157
  
Thank you for pinging me, @srowen.

@raofu Instead of changing the existing test coverage, it would be better to 
add test cases in which all files are corrupted.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22157: [SPARK-25126] Avoid creating Reader for all orc files

2018-08-22 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22157
  
**[Test build #4285 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4285/testReport)** for PR 22157 at commit [`9afdac6`](https://github.com/apache/spark/commit/9afdac6da180b9c8959696941701c734e9a3fe8e).





[GitHub] spark issue #22157: [SPARK-25126] Avoid creating Reader for all orc files

2018-08-22 Thread raofu
Github user raofu commented on the issue:

https://github.com/apache/spark/pull/22157
  
I fixed the test by making the first file the corrupted file. @srowen, can 
you help kick off a Jenkins run?





[GitHub] spark issue #22157: [SPARK-25126] Avoid creating Reader for all orc files

2018-08-22 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/22157
  
The failure in OrcQuerySuite looks legitimate. The test corrupts the third 
of three files and then sets the reader to not ignore corrupt files, but with 
this change the third file is never actually read. I think that might be 
a good thing. @dongjoon-hyun do you have an opinion?
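
A toy model of the behaviour described above, for illustration only. It is plain Python, not Spark's ORC code, and every name in it (`read_schema`, `infer_schema_eager`, `infer_schema_lazy`, the `files` dict standing in for a directory) is hypothetical. It sketches why a corrupt third file can go unnoticed once readers are created lazily and inference stops at the first usable schema.

```python
class CorruptFileError(Exception):
    """Raised by the toy reader when a file cannot be parsed."""


def read_schema(path, files, ignore_corrupt_files, opened):
    """Open one file and return its schema, or None if it is corrupt and ignored."""
    opened.append(path)   # record every file a reader was created for
    schema = files[path]  # None stands in for a corrupt file
    if schema is None:
        if ignore_corrupt_files:
            return None
        raise CorruptFileError(path)
    return schema


def infer_schema_eager(paths, files, ignore_corrupt_files, opened):
    """Read every file first, then take the first schema found."""
    schemas = [read_schema(p, files, ignore_corrupt_files, opened) for p in paths]
    return next((s for s in schemas if s is not None), None)


def infer_schema_lazy(paths, files, ignore_corrupt_files, opened):
    """Read files one at a time and stop at the first schema found."""
    schemas = (read_schema(p, files, ignore_corrupt_files, opened) for p in paths)
    return next((s for s in schemas if s is not None), None)


# Three files; the third one is corrupt (hypothetical names and schema string).
files = {"part-0": "struct<a:int>", "part-1": "struct<a:int>", "part-2": None}
paths = ["part-0", "part-1", "part-2"]

# Lazy inference with "do not ignore corrupt files": the corrupt file is never opened.
opened_lazy = []
lazy_schema = infer_schema_lazy(paths, files, False, opened_lazy)

# Eager inference with the same setting: all files are opened, so the corrupt one raises.
opened_eager = []
try:
    infer_schema_eager(paths, files, False, opened_eager)
    eager_raised = False
except CorruptFileError:
    eager_raised = True
```

In this model, moving the corrupt file to the first position makes the lazy variant raise again, which is the kind of adjustment a test would need in order to keep exercising the failure path.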





[GitHub] spark issue #22157: [SPARK-25126] Avoid creating Reader for all orc files

2018-08-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22157
  
**[Test build #4283 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4283/testReport)** for PR 22157 at commit [`5a86b36`](https://github.com/apache/spark/commit/5a86b3618da695431c01ddbe4bb102a45f93b3b1).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #22157: [SPARK-25126] Avoid creating Reader for all orc files

2018-08-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22157
  
**[Test build #4283 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4283/testReport)** for PR 22157 at commit [`5a86b36`](https://github.com/apache/spark/commit/5a86b3618da695431c01ddbe4bb102a45f93b3b1).





[GitHub] spark issue #22157: [SPARK-25126] Avoid creating Reader for all orc files

2018-08-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22157
  
> Do we have a similar issue for Parquet?

It looks like we don't, since we explicitly pick one file before reading 
during schema inference:


https://github.com/apache/spark/blob/f984ec75ed6162ee6f5881716a8311c883aca22a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala#L229-L239
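
As a rough illustration of the "pick one file first" idea, here is a plain-Python sketch. It is not the linked Scala code: `pick_file_for_schema`, `infer_schema`, `fake_read_footer`, and the rule of simply taking the first path are assumptions made for this example, and the real selection logic may differ.

```python
def pick_file_for_schema(paths):
    """Choose the single file to read for schema inference (None if there are no files)."""
    return paths[0] if paths else None


def infer_schema(paths, read_footer):
    """Select a file up front, then read only that file's footer."""
    chosen = pick_file_for_schema(paths)
    if chosen is None:
        return None
    return read_footer(chosen)


touched = []


def fake_read_footer(path):
    """Stand-in reader that records which files were opened."""
    touched.append(path)
    return "schema-of-" + path


# Only the selected file is opened, however many paths are listed.
schema = infer_schema(["part-0", "part-1", "part-2"], fake_read_footer)
```

Because the selection happens before any read, the cost of inference in this model does not grow with the number of files.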





[GitHub] spark issue #22157: [SPARK-25126] Avoid creating Reader for all orc files

2018-08-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22157
  
**[Test build #4282 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4282/testReport)** for PR 22157 at commit [`5a86b36`](https://github.com/apache/spark/commit/5a86b3618da695431c01ddbe4bb102a45f93b3b1).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #22157: [SPARK-25126] Avoid creating Reader for all orc files

2018-08-20 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/22157
  
Do we have a similar issue for Parquet?






[GitHub] spark issue #22157: [SPARK-25126] Avoid creating Reader for all orc files

2018-08-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22157
  
**[Test build #4282 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4282/testReport)** for PR 22157 at commit [`5a86b36`](https://github.com/apache/spark/commit/5a86b3618da695431c01ddbe4bb102a45f93b3b1).





[GitHub] spark issue #22157: [SPARK-25126] Avoid creating Reader for all orc files

2018-08-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22157
  
Can one of the admins verify this patch?




