adamreeve opened a new pull request, #16779:
URL: https://github.com/apache/datafusion/pull/16779

   ## Which issue does this PR close?
   
   - Closes #16778.
   
   ## Rationale for this change
   
   See #16778. This allows encryption keys to be generated per file and 
retrieved based on encryption metadata stored in the Parquet files, rather 
than requiring readers to know the AES keys upfront.
   
   ## What changes are included in this PR?
   
   * Adds a new `EncryptionFactory` trait for types that generate file 
encryption and decryption properties. This is loosely based on the approach 
used by Spark (see 
[this comment](https://github.com/apache/datafusion/issues/15216#issuecomment-2852529965) 
for details).
   * Allows registering `EncryptionFactory` instances in the `RuntimeEnv`, 
similar to how `ObjectStore`s can be registered.
   * Updates the `crypto` configuration field in `TableParquetOptions` to allow 
setting an encryption factory id and the opaque configuration options required 
by that encryption factory.
   * Updates Parquet encryption and decryption code to use a registered 
`EncryptionFactory` where necessary.
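
For readers unfamiliar with the pattern, the sketch below illustrates the general shape of the design described above: a factory trait, a registry keyed by id, and opaque string options passed through to the factory. It is only an illustration using the standard library. The method names, the `FileKey` type, `FactoryRegistry`, `ToyFactory`, and the `master_secret` option are all hypothetical and are not the signatures added by this PR, and the key derivation is a placeholder rather than real cryptography.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for Parquet encryption/decryption properties.
#[derive(Debug, Clone, PartialEq)]
pub struct FileKey(pub Vec<u8>);

/// Illustrative trait only; the trait added by the PR may look different.
pub trait EncryptionFactory: Send + Sync {
    /// Produces a key for a file being written, plus the key metadata
    /// that would be stored in the file for later retrieval.
    fn encryption_key(
        &self,
        options: &HashMap<String, String>,
        file_path: &str,
    ) -> Result<(FileKey, String), String>;

    /// Recovers the key for a file being read from its stored key metadata.
    fn decryption_key(
        &self,
        options: &HashMap<String, String>,
        key_metadata: &str,
    ) -> Result<FileKey, String>;
}

/// Hypothetical registry, analogous to registering factories by id
/// in a runtime environment.
#[derive(Default)]
pub struct FactoryRegistry {
    factories: HashMap<String, Arc<dyn EncryptionFactory>>,
}

impl FactoryRegistry {
    /// Registers a factory under `id`, returning any factory it replaced.
    pub fn register(
        &mut self,
        id: &str,
        factory: Arc<dyn EncryptionFactory>,
    ) -> Option<Arc<dyn EncryptionFactory>> {
        self.factories.insert(id.to_string(), factory)
    }

    /// Looks up a factory by the id given in the table options.
    pub fn get(&self, id: &str) -> Result<Arc<dyn EncryptionFactory>, String> {
        self.factories
            .get(id)
            .cloned()
            .ok_or_else(|| format!("no encryption factory registered with id '{id}'"))
    }
}

/// Toy factory: NOT secure. It only shows that each file gets its own key
/// and that the reader needs just the stored metadata plus the options.
pub struct ToyFactory;

/// Placeholder derivation; a real factory would call a KMS or a proper KDF.
fn derive(secret: &str, key_id: &str) -> FileKey {
    FileKey(format!("{secret}:{key_id}").into_bytes())
}

/// Reads the hypothetical `master_secret` option.
fn master_secret(options: &HashMap<String, String>) -> Result<&String, String> {
    options
        .get("master_secret")
        .ok_or_else(|| "missing option 'master_secret'".to_string())
}

impl EncryptionFactory for ToyFactory {
    fn encryption_key(
        &self,
        options: &HashMap<String, String>,
        file_path: &str,
    ) -> Result<(FileKey, String), String> {
        let secret = master_secret(options)?;
        // Use the file path as the key id and store it as the key metadata.
        Ok((derive(secret, file_path), file_path.to_string()))
    }

    fn decryption_key(
        &self,
        options: &HashMap<String, String>,
        key_metadata: &str,
    ) -> Result<FileKey, String> {
        let secret = master_secret(options)?;
        Ok(derive(secret, key_metadata))
    }
}
```

In this sketch the writer looks up the factory by id, asks it for a per-file key, and stores the returned metadata alongside the data; the reader passes that metadata back to the same factory to recover the key, so it never needs to know the key in advance.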
   
   ## Are these changes tested?
   
   Yes, new unit tests and an example have been added.
   
   ## Are there any user-facing changes?
   
   Yes, this is a new user-facing feature.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

