GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/19651

[SPARK-20682][SPARK-15474][SPARK-21791] Add new ORCFileFormat based on ORC 1.4.1

## What changes were proposed in this pull request?

Since [SPARK-2883](https://issues.apache.org/jira/browse/SPARK-2883), Apache Spark has supported Apache ORC inside the `sql/hive` module, with a Hive dependency. This PR adds a new ORC data source inside `sql/core`, which is intended to eventually replace the old ORC data source. This PR resolves the following three issues.

- SPARK-20682: Add new ORCFileFormat based on Apache ORC 1.4.1
- SPARK-15474: ORC data source fails to write and read back empty dataframe
- SPARK-21791: ORC should support column names with dot

## How was this patch tested?

Pass Jenkins with all existing tests and the new tests for SPARK-15474 and SPARK-21791.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dongjoon-hyun/spark SPARK-20682

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19651.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #19651

----
commit fdde27416fe54036afbd9b809a363e7871df67cf
Author: Dongjoon Hyun <dongj...@apache.org>
Date:   2017-05-15T02:33:15Z

    [SPARK-20682][SPARK-15474][SPARK-21791] Add new ORCFileFormat based on Apache ORC 1.4.1

----