achasnovskiy opened a new pull request, #3173: URL: https://github.com/apache/iceberg-python/pull/3173
Closes #[2329](https://github.com/apache/iceberg-python/issues/2329)

# Rationale for this change

Some environments (e.g. AWS SCPs) require S3 uploads to include server-side encryption parameters (`ServerSideEncryption`, and for KMS also `SSEKMSKeyId`).

PyIceberg writes table files through `FileIO`; for the **fsspec** backend this ultimately uses `s3fs.S3FileSystem`, which supports passing these fields via `s3_additional_kwargs`. Previously, `FsspecFileIO` did not map any catalog/FileIO properties into `s3_additional_kwargs`, so users could not configure SSE for S3 writes through PyIceberg.

This change adds two optional configuration keys that are passed through to `s3fs` as `ServerSideEncryption` and `SSEKMSKeyId` respectively.

**Note:** Default `s3://` FileIO selection prefers `PyArrowFileIO` when installed. These new properties are implemented for **`FsspecFileIO` only**; users who need them should set `py-io-impl: pyiceberg.io.fsspec.FsspecFileIO` (or ensure the fsspec-backed FileIO is selected).

## Are these changes tested?

Yes.

- `make lint`
- `make test`

New unit tests assert that `s3fs.S3FileSystem` is constructed with the expected `s3_additional_kwargs` when `s3.server-side-encryption` and (optionally) `s3.sse-kms-key-id` are set.

## Are there any user-facing changes?

Yes. New catalog/FileIO configuration properties:

- `s3.server-side-encryption`: e.g. `AES256` or `aws:kms` (passed as `ServerSideEncryption` to S3 APIs via s3fs).
- `s3.sse-kms-key-id`: KMS key ID or ARN when using SSE-KMS (passed as `SSEKMSKeyId`).

Documentation updated in the S3 FileIO section of the configuration docs.

**Example (`.pyiceberg.yaml` snippet):**

```yaml
catalog:
  default:
    type: glue
    py-io-impl: pyiceberg.io.fsspec.FsspecFileIO
    s3.server-side-encryption: aws:kms
    s3.sse-kms-key-id: arn:aws:kms:us-east-1:123456789012:key/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
    # ... other catalog config (region, credentials, etc.)
```
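For readers who want to see the shape of the property-to-kwargs mapping described in the PR body, here is a minimal sketch. It is not the code from the PR: the helper name `build_s3_additional_kwargs` and the constant names are hypothetical, and only the two property keys and the two S3 parameter names come from the description above.

```python
from typing import Dict

# Property keys as named in the PR description.
# The Python constant names are hypothetical.
S3_SERVER_SIDE_ENCRYPTION = "s3.server-side-encryption"
S3_SSE_KMS_KEY_ID = "s3.sse-kms-key-id"


def build_s3_additional_kwargs(properties: Dict[str, str]) -> Dict[str, str]:
    """Hypothetical helper: translate FileIO properties into the dict
    that s3fs accepts as `s3_additional_kwargs`."""
    kwargs: Dict[str, str] = {}

    # Only forward a parameter when the user actually set the property,
    # so the default behaviour (no SSE parameters) stays unchanged.
    sse = properties.get(S3_SERVER_SIDE_ENCRYPTION)
    if sse:
        kwargs["ServerSideEncryption"] = sse

    key_id = properties.get(S3_SSE_KMS_KEY_ID)
    if key_id:
        kwargs["SSEKMSKeyId"] = key_id

    return kwargs


props = {
    "s3.server-side-encryption": "aws:kms",
    "s3.sse-kms-key-id": "arn:aws:kms:us-east-1:123456789012:key/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
}
extra = build_s3_additional_kwargs(props)
print(extra)

# With s3fs installed, the result would be handed over roughly like this:
#   fs = s3fs.S3FileSystem(s3_additional_kwargs=extra)
```

Skipping unset properties means an unconfigured catalog produces an empty dict, which should leave existing `FsspecFileIO` behaviour untouched; whether the PR handles the empty case in exactly this way is an assumption.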
