[ https://issues.apache.org/jira/browse/SPARK-39743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17571995#comment-17571995 ]

shezm commented on SPARK-39743:
-------------------------------

[~yeachan153] 

{{spark.io.compression.zstd.level}} only applies to {{spark.io.compression.codec}}, which controls the compression of Spark's internal data (shuffle outputs, broadcast variables, RDD blocks, etc.). It does not affect output file formats such as Parquet.
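
For reference, a minimal sketch of where those internal settings apply (the values below are only illustrative):

{code:java}
import org.apache.spark.sql.SparkSession

// These two settings tune compression of Spark-internal data
// (shuffle outputs, broadcast variables, RDD blocks), not the files
// written out by the DataFrame writers.
val sparkInternal = SparkSession
  .builder()
  .master("local")
  .appName("internal compression example")
  .config("spark.io.compression.codec", "zstd")   // codec for internal data
  .config("spark.io.compression.zstd.level", 3)   // level used by that codec
  .getOrCreate()
{code}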

 

If you want to use a different zstd level when writing Parquet files, you can set 
{{parquet.compression.codec.zstd.level}} in the Spark configuration, for example:

 
{code:java}
import org.apache.spark.sql.SparkSession

val spark = SparkSession
  .builder()
  .master("local")
  .appName("spark example")
  .config("spark.sql.parquet.compression.codec", "zstd")   // write Parquet with zstd
  .config("parquet.compression.codec.zstd.level", 10)      // zstd level for the Parquet writer
  .getOrCreate()

val csvfile = spark.read.csv("file:///home/test_data/Reviews.csv")
csvfile.coalesce(1).write.parquet("file:///home/test_data/nn_parq_10")
{code}
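
Alternatively, the same property could be passed per write rather than on the session. This is only a sketch; it assumes DataFrameWriter options are forwarded to the Parquet Hadoop configuration, which I have not verified:

{code:java}
// Sketch only: per-write variant, assuming writer options reach the Parquet Hadoop conf.
csvfile.coalesce(1)
  .write
  .option("compression", "zstd")                        // Parquet compression codec
  .option("parquet.compression.codec.zstd.level", "10") // assumed pass-through of the level
  .parquet("file:///home/test_data/nn_parq_10")
{code}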
 

 

> Unable to set zstd compression level while writing parquet files
> ----------------------------------------------------------------
>
>                 Key: SPARK-39743
>                 URL: https://issues.apache.org/jira/browse/SPARK-39743
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.2.0
>            Reporter: Yeachan Park
>            Priority: Minor
>
> While writing zstd compressed parquet files, the setting 
> `spark.io.compression.zstd.level` does not have any effect on the zstd 
> compression level of the output.
> All files seem to be written with the default zstd compression level, and the 
> config option seems to be ignored.
> Using the zstd cli tool, we confirmed that setting a higher compression level 
> for the same file tested in spark resulted in a smaller file.


