[ https://issues.apache.org/jira/browse/SPARK-43273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17716135#comment-17716135 ]
Andrew Grigorev commented on SPARK-43273:
-----------------------------------------

Just as icing on the cake, ClickHouse accidentally started to use LZ4_RAW by default for their Parquet output format :).

> Spark can't read parquet files with a newer LZ4_RAW compression
> ----------------------------------------------------------------
>
>                 Key: SPARK-43273
>                 URL: https://issues.apache.org/jira/browse/SPARK-43273
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.2.4, 3.3.3, 3.3.2, 3.4.0
>            Reporter: Andrew Grigorev
>            Priority: Trivial
>
> The parquet-hadoop version should be updated to 1.13.0.
>
> {code:java}
> java.util.concurrent.ExecutionException: org.apache.spark.SparkException: Job
> aborted due to stage failure: Task 2 in stage 1.0 failed 1 times, most recent
> failure: Lost task 2.0 in stage 1.0 (TID 3) (f2b63fdfa0a6 executor driver):
> java.lang.IllegalArgumentException: No enum constant
> org.apache.parquet.hadoop.metadata.CompressionCodecName.LZ4_RAW
>     at java.base/java.lang.Enum.valueOf(Enum.java:273)
>     at
> org.apache.parquet.hadoop.metadata.CompressionCodecName.valueOf(CompressionCodecName.java:26)
>     at
> org.apache.parquet.format.converter.ParquetMetadataConverter.fromFormatCodec(ParquetMetadataConverter.java:636)
>     ... {code}
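For anyone trying to reproduce this, here is a minimal sketch (not part of the original report; the input path is a placeholder and local mode is assumed). It simply reads a Parquet file that was written with the LZ4_RAW codec, for example a default ClickHouse Parquet export as mentioned in the comment, and on the affected Spark versions it fails with the IllegalArgumentException shown in the stack trace above.

{code:scala}
// Reproduction sketch: read an LZ4_RAW-compressed Parquet file with Spark.
// The path below is a placeholder for any file produced with the LZ4_RAW codec
// (e.g. a recent ClickHouse Parquet export). On the affected Spark versions the
// bundled parquet-hadoop has no CompressionCodecName.LZ4_RAW enum constant, so
// the read fails while decoding the file metadata.
import org.apache.spark.sql.SparkSession

object Lz4RawRepro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("lz4-raw-repro")
      .master("local[*]")
      .getOrCreate()

    // Expected failure on Spark 3.2.4 / 3.3.x / 3.4.0:
    //   java.lang.IllegalArgumentException: No enum constant
    //   org.apache.parquet.hadoop.metadata.CompressionCodecName.LZ4_RAW
    spark.read.parquet("/tmp/clickhouse-export.parquet").show()

    spark.stop()
  }
}
{code}

Per the description above, building Spark against a parquet-hadoop release that defines CompressionCodecName.LZ4_RAW (1.13.0) should make the same read succeed.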