GitHub user theelderbeever created a discussion: How do you write a single 
parquet file with a specified compression?

As the title says: how do you write a single Parquet file with a given 
configuration? The two `write_parquet` methods that exist take completely 
different arguments and config options, and neither offers much to work with. 
Additionally, all the examples for 

DataFusion 49.0.2

```rust
    let options = WriterProperties::builder()
        .set_compression(datafusion::parquet::basic::Compression::ZSTD(
            ZstdLevel::try_new(3)?,
        ))
        .build();

    let write_options = DataFrameWriteOptions::new().with_single_file_output(true);

    // This writes a single file but takes `TableParquetOptions`, which can't
    // configure compression. The docs say this is tied to `ParquetWriterOptions`,
    // but there is no way to convert between the two.
    df.repartition(Partitioning::RoundRobinBatch(1))?
        .write_parquet("data/data.zstd.parquet", write_options, None) 
        .await?;

    // This accepts the `WriterProperties` but can't be configured to write a
    // single file.
    ctx.write_parquet(
        df.repartition(Partitioning::RoundRobinBatch(1))?
            .create_physical_plan()
            .await?,
        "data/data.zstd.parquet",
        Some(options)
    )
    .await?;
```

GitHub link: https://github.com/apache/datafusion/discussions/17578

----
This is an automatically sent email for github@datafusion.apache.org.
To unsubscribe, please send an email to: 
github-unsubscr...@datafusion.apache.org

