syun64 commented on code in PR #358:
URL: https://github.com/apache/iceberg-python/pull/358#discussion_r1477114250
##########
pyiceberg/io/pyarrow.py:
##########
@@ -1720,13 +1720,22 @@ def write_file(table: Table, tasks: Iterator[WriteTask]) -> Iterator[DataFile]:
except StopIteration:
pass
+ compression_codec = table.properties.get("write.parquet.compression-codec")
+ compression_level = table.properties.get("write.parquet.compression-level")
+ if compression_codec == "uncompressed":
Review Comment:
Do we need this handling? Could we instead treat the string value 'none' as
distinct from leaving the write.parquet.compression-codec property unset?
I think it would be simpler to just propagate the string value directly to
the compression option of the
[ParquetWriter](https://arrow.apache.org/docs/python/generated/pyarrow.parquet.ParquetWriter.html):
"compression: str or dict, default ‘snappy’.
Specify the compression codec, either on a general basis or per-column.
Valid values: {‘NONE’, ‘SNAPPY’, ‘GZIP’, ‘BROTLI’, ‘LZ4’, ‘ZSTD’}."
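A rough sketch of what passing the values through could look like. The helper name `parquet_writer_kwargs` is hypothetical and not part of pyiceberg; the only things assumed are the two property keys from the diff and the `compression` and `compression_level` keyword arguments of `ParquetWriter`. It leaves a key out entirely when the property is unset, so the writer's own default applies, and it forwards an explicit value such as 'none' unchanged. Since the documentation quoted above lists only those six values, a value like 'uncompressed' may still need mapping before it reaches the writer.

```python
from typing import Any, Dict, Mapping


def parquet_writer_kwargs(properties: Mapping[str, str]) -> Dict[str, Any]:
    """Hypothetical helper: build ParquetWriter keyword arguments from table properties."""
    kwargs: Dict[str, Any] = {}

    # Forward the codec string as-is; an unset property adds no key,
    # so the writer falls back to its own default.
    codec = properties.get("write.parquet.compression-codec")
    if codec is not None:
        kwargs["compression"] = codec

    # Table properties are strings, so the level is converted to int here.
    level = properties.get("write.parquet.compression-level")
    if level is not None:
        kwargs["compression_level"] = int(level)

    return kwargs


# Example property sets: unset, explicit 'none', and codec plus level.
unset = parquet_writer_kwargs({})
explicit_none = parquet_writer_kwargs({"write.parquet.compression-codec": "none"})
zstd_level = parquet_writer_kwargs(
    {
        "write.parquet.compression-codec": "zstd",
        "write.parquet.compression-level": "3",
    }
)
```

The result would then be unpacked at the call site, for example `pq.ParquetWriter(path, schema, **parquet_writer_kwargs(table.properties))`, which keeps "not set" and "set to 'none'" as two separate cases without a special branch.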
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]