[ https://issues.apache.org/jira/browse/SPARK-20937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16625943#comment-16625943 ]
Tim Armstrong commented on SPARK-20937:
---------------------------------------

[~srowen] I would say it is. I know Spark made the change a while ago, but the new default (INT{32,64}) for DECIMAL is still worse in many respects than the previous default: support for reading it is far from universal, and the PLAIN encoding is much less dense (you end up with lots of zero bytes).

> Describe spark.sql.parquet.writeLegacyFormat property in Spark SQL, DataFrames and Datasets Guide
> -------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-20937
>                 URL: https://issues.apache.org/jira/browse/SPARK-20937
>             Project: Spark
>          Issue Type: Improvement
>          Components: Documentation, SQL
>    Affects Versions: 2.3.0
>            Reporter: Jacek Laskowski
>            Priority: Trivial
>
> As a follow-up to SPARK-20297 (and SPARK-10400), in which the {{spark.sql.parquet.writeLegacyFormat}} property was recommended for Impala and Hive, the Spark SQL docs for [Parquet Files|https://spark.apache.org/docs/latest/sql-programming-guide.html#configuration] should have it documented.
> p.s. It was asked about in [Why can't Impala read parquet files after Spark SQL's write?|https://stackoverflow.com/q/44279870/1305344] on StackOverflow today.
> p.s. It's also covered in [~holden.ka...@gmail.com]'s "High Performance Spark: Best Practices for Scaling and Optimizing Apache Spark" book (in Table 3-10. Parquet data source options), which gives the option some wider publicity.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
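
For readers arriving from the linked StackOverflow question, a minimal sketch of how the property under discussion is set. The property name comes from this issue; the {{spark}} session and {{df}} DataFrame are hypothetical, and the output path is illustrative:

```scala
// Hedged illustration: write Parquet in the legacy (Spark 1.4-era / Hive-compatible)
// layout, so that older readers such as Impala and Hive can consume the files.
// Among other things, this affects how DECIMAL columns are encoded, which is the
// compatibility/density trade-off discussed in the comment above.
spark.conf.set("spark.sql.parquet.writeLegacyFormat", "true")

// df is assumed to be an existing DataFrame with a DECIMAL column.
df.write.parquet("/tmp/decimals-legacy-parquet")
```

The same flag can also be supplied at session creation, e.g. via `--conf spark.sql.parquet.writeLegacyFormat=true` on {{spark-submit}}, which is handy when the writing job cannot be modified.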