Konstantin Shaposhnikov created SPARK-6566:
----------------------------------------------
             Summary: Update Spark to use the latest version of Parquet libraries
                 Key: SPARK-6566
                 URL: https://issues.apache.org/jira/browse/SPARK-6566
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 1.3.0
            Reporter: Konstantin Shaposhnikov

There are a lot of bug fixes in the latest version of Parquet (1.6.0rc7), e.g. PARQUET-136.

It would be good to update Spark to use the latest Parquet version.

The following changes are required:

{code}
diff --git a/pom.xml b/pom.xml
index 5ad39a9..095b519 100644
--- a/pom.xml
+++ b/pom.xml
@@ -132,7 +132,7 @@
     <!-- Version used for internal directory structure -->
     <hive.version.short>0.13.1</hive.version.short>
     <derby.version>10.10.1.1</derby.version>
-    <parquet.version>1.6.0rc3</parquet.version>
+    <parquet.version>1.6.0rc7</parquet.version>
     <jblas.version>1.2.3</jblas.version>
     <jetty.version>8.1.14.v20131031</jetty.version>
     <orbit.version>3.0.0.v201112011016</orbit.version>
{code}

and

{code}
--- a/sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTableOperations.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTableOperations.scala
@@ -480,7 +480,7 @@ private[parquet] class FilteringParquetRowInputFormat
     globalMetaData = new GlobalMetaData(globalMetaData.getSchema,
       mergedMetadata, globalMetaData.getCreatedBy)
 
-    val readContext = getReadSupport(configuration).init(
+    val readContext = ParquetInputFormat.getReadSupportInstance(configuration).init(
       new InitContext(configuration,
         globalMetaData.getKeyValueMetaData,
         globalMetaData.getSchema))
{code}

I am happy to prepare a pull request if necessary.