[I] S3 compression Issue with Iceberg [iceberg]

via GitHub Wed, 04 Oct 2023 00:33:54 -0700


swat1234 opened a new issue, #8713:
URL: https://github.com/apache/iceberg/issues/8713


   Iceberg tables are not compressing Parquet files in S3. When the table 
properties below are used for compression, the file size increases compared 
with the uncompressed file. Can someone please assist?
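
   One possible factor, not confirmed anywhere in this thread: general-purpose codecs add fixed framing bytes, so a very small payload can come out larger than its input. The sketch below uses only Python's standard `gzip` module on made-up payloads; it does not involve Iceberg or Parquet, and the function name is hypothetical.

```python
import gzip


def gzip_size_change(payload: bytes) -> int:
    """Return compressed size minus original size (positive means it grew)."""
    return len(gzip.compress(payload)) - len(payload)


# Illustrative payloads only; these are not the files from this issue.
tiny = bytes(range(64))            # 64 distinct bytes, nothing to deduplicate
repetitive = b"iceberg," * 10_000  # 80,000 highly repetitive bytes

print("tiny payload size change:", gzip_size_change(tiny))
print("repetitive payload size change:", gzip_size_change(repetitive))
```

   The tiny payload grows because the gzip header and trailer outweigh any savings, while the large repetitive payload shrinks. Whether the same effect accounts for the sizes reported here would need to be checked against a larger data file.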
   
   1. File with UNCOMPRESSED codec, 682 bytes.
   
   00000-0-0129ba78-17f6-466f-b57b-695c678d64d5-00001.parquet
   
   },
         "properties" : {
           "codec" : "UNCOMPRESSED",
   
   -------------------------------
   2. File with GZIP codec, 733 bytes.
   
   00000-0-e6f22c0e-2e16-43aa-8a5f-efabee995876-00001.parquet
   
   "properties" : {
           "codec" : "GZIP",
   
   -------------------------------
   3. File with SNAPPY codec, 686 bytes.
   
   00000-0-36fd4aad-8c38-40f5-8241-78ffe4f0a032-00001.parquet
   
    "codec" : "SNAPPY",
           "path" : {
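
   To make the comparison concrete, the three sizes reported above can be put side by side. This is a minimal sketch that only does arithmetic on the numbers from this issue; the function name is hypothetical.

```python
# File sizes in bytes, as reported in this issue.
sizes = {"UNCOMPRESSED": 682, "GZIP": 733, "SNAPPY": 686}


def overhead_vs_uncompressed(sizes):
    """Return {codec: (extra_bytes, percent_change)} relative to UNCOMPRESSED."""
    base = sizes["UNCOMPRESSED"]
    return {
        codec: (size - base, round(100.0 * (size - base) / base, 2))
        for codec, size in sizes.items()
        if codec != "UNCOMPRESSED"
    }


for codec, (extra, pct) in overhead_vs_uncompressed(sizes).items():
    print(f"{codec}: {extra:+d} bytes ({pct:+.2f}%)")
```

   GZIP comes out 51 bytes larger and SNAPPY 4 bytes larger than the uncompressed file, so all three files are well under 1 KB and differ by only a few dozen bytes.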
   
   --------------------------------------------------------------
   Table Properties:
   
       "parquet.compression": "SNAPPY"
       "read.parquet.vectorization.batch-size": "5000"
       "read.split.target-size": "134217728"
       "read.parquet.vectorization.enabled": "true"
       "write.parquet.page-size-bytes": "1048576"
       "write.parquet.row-group-size-bytes": "134217728"
       "write_compression": "SNAPPY"
       "write.parquet.compression-codec": "snappy"
       "write.metadata.metrics.max-inferred-column-defaults": "100"
       "write.parquet.compression-level": "4"
       "write.target-file-size-bytes": "536870912"
       "write.delete.target-file-size-bytes": "67108864"
       "write.parquet.page-row-limit": "20000"
       "write.format.default": "parquet"
       "write.metadata.compression-codec": "gzip"
       "write.compression": "SNAPPY"
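
   Several keys in the list above mention compression. A small sketch, using only the values copied from this issue, pulls them out so they can be compared at a glance. Which of these keys the writer actually honors is not established in this thread, and the variable names are illustrative.

```python
# Table properties exactly as listed in this issue.
table_properties = {
    "parquet.compression": "SNAPPY",
    "read.parquet.vectorization.batch-size": "5000",
    "read.split.target-size": "134217728",
    "read.parquet.vectorization.enabled": "true",
    "write.parquet.page-size-bytes": "1048576",
    "write.parquet.row-group-size-bytes": "134217728",
    "write_compression": "SNAPPY",
    "write.parquet.compression-codec": "snappy",
    "write.metadata.metrics.max-inferred-column-defaults": "100",
    "write.parquet.compression-level": "4",
    "write.target-file-size-bytes": "536870912",
    "write.delete.target-file-size-bytes": "67108864",
    "write.parquet.page-row-limit": "20000",
    "write.format.default": "parquet",
    "write.metadata.compression-codec": "gzip",
    "write.compression": "SNAPPY",
}

# Keep only the keys whose name mentions compression.
compression_settings = {
    key: value for key, value in table_properties.items() if "compression" in key
}

for key, value in sorted(compression_settings.items()):
    print(f"{key} = {value}")
```

   Six of the sixteen properties relate to compression, and four of them name a data-file codec under different key spellings.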
   
   
   Thanks in advance!!



