[GitHub] spark issue #22157: [SPARK-25126][SQL] Avoid creating Reader for all orc fil...

2018-08-22 Thread raofu
Github user raofu commented on the issue: https://github.com/apache/spark/pull/22157 @dongjoon-hyun, thanks a lot for the pointers! I've updated the PR description. Please let me know if there is any other information you'd like me to add

[GitHub] spark issue #22157: [SPARK-25126][SQL] Avoid creating Reader for all orc fil...

2018-08-22 Thread raofu
Github user raofu commented on the issue: https://github.com/apache/spark/pull/22157 @dongjoon-hyun Title updated. Thanks for adding the test coverage! I've merged your commit. Can you help kick off another Jenkins run? I don't think I have the permission to do

[GitHub] spark issue #22157: [SPARK-25126] Avoid creating Reader for all orc files

2018-08-22 Thread raofu
Github user raofu commented on the issue: https://github.com/apache/spark/pull/22157 I fixed the test by making the first file the corrupted file. @srowen, can you help kick off a Jenkins run?

[GitHub] spark pull request #22157: [SPARK-25126] Avoid creating Reader for all orc f...

2018-08-20 Thread raofu
GitHub user raofu opened a pull request: https://github.com/apache/spark/pull/22157 [SPARK-25126] Avoid creating Reader for all orc files In OrcFileOperator.ReadSchema, a Reader is created for every file although only the first valid one is used. This uses a significant amount
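The idea in the pull request description above (open readers one at a time and stop at the first usable one) can be illustrated with a minimal Scala sketch. This is not the code from the PR: `FakeReader`, `open`, and the `opened` counter are hypothetical stand-ins for illustration only, and no Spark or ORC API is used.

```scala
// Hypothetical stand-in for a file reader; NOT the Spark or ORC API.
final case class FakeReader(path: String, schema: Option[String])

object LazyReaderSketch {
  // Counts how many readers were created, to make the cost visible.
  var opened = 0

  // Pretend to open a file. By assumption, files whose names start
  // with "empty" carry no schema.
  def open(path: String): FakeReader = {
    opened += 1
    val schema = if (path.startsWith("empty")) None else Some(s"schema-of-$path")
    FakeReader(path, schema)
  }

  // Eager: a reader is created for every path before the first
  // usable schema is picked.
  def firstSchemaEager(paths: Seq[String]): Option[String] =
    paths.map(open).collectFirst { case FakeReader(_, Some(s)) => s }

  // Lazy: the iterator creates readers one at a time, and
  // collectFirst stops consuming it at the first usable schema.
  def firstSchemaLazy(paths: Seq[String]): Option[String] =
    paths.iterator.map(open).collectFirst { case FakeReader(_, Some(s)) => s }
}
```

With four paths where the second is the first to carry a schema, the eager version creates four readers and the lazy version creates two; both return the same schema.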

[GitHub] spark pull request #22113: [SPARK-25126] Lazily create Reader for orc files

2018-08-15 Thread raofu
Github user raofu commented on a diff in the pull request: https://github.com/apache/spark/pull/22113#discussion_r210473687 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcFileOperator.scala --- @@ -70,7 +70,7 @@ private[hive] object OrcFileOperator extends

[GitHub] spark pull request #22113: [SPARK-25126] Lazily create Reader for orc files

2018-08-15 Thread raofu
Github user raofu closed the pull request at: https://github.com/apache/spark/pull/22113

[GitHub] spark pull request #22113: [SPARK-25126] Lazily create Reader for orc files

2018-08-15 Thread raofu
GitHub user raofu opened a pull request: https://github.com/apache/spark/pull/22113 [SPARK-25126] Lazily create Reader for orc files ## What changes were proposed in this pull request? Currently a Reader is created for every orc file under the directory and then the first