Hi all,

Now that the low-level interface to Parquet encryption is merged in parquet-cpp and close to completion in parquet-mr, we need to get back to the subject of a high-level interface: one that makes Parquet encryption simple to use, almost transparent, and that helps with the management of encryption keys.

What has changed in this field since June '19, when we last discussed it?

- The basic Parquet encryption layer and its low-level interface are mostly complete.
- The two alternatives we had for a high-level interface (properties-driven and schema-driven) are no longer mutually exclusive. Together with Xinli, Gabor and Maya, we have created a simple Crypto Factory interface mechanism (PARQUET-1817 <https://issues.apache.org/jira/browse/PARQUET-1817>, already merged in parquet-mr/encryption) that allows plugging in either of the two alternatives, or any other implementation of a high-level encryption interface.
- The properties-driven interface, and the key management tools used for its implementation, have matured significantly and are already deployed in production.
- I presume the schema-driven interface (crypto-interface with schema activation) has matured significantly as well.

The draft design of the Properties-driven encryption is here:
https://docs.google.com/document/d/1boH6HPkG0ZhgxcaRkGk3QpZ8X_J91uXZwVGwYN45St4/edit?usp=sharing

- Key management tools (leveraged to build the properties-driven encryption, but with wider applicability), design:
  https://docs.google.com/document/d/1bEu903840yb95k9q2X-BlsYKuXoygE4VnMDl9xz_zhk/edit?usp=sharing
- Code: the draft pull request that implements Properties-driven encryption (and the Key management tools) is here:
  https://github.com/apache/parquet-mr/pull/615

Xinli informs us that the Schema-driven design doc is ready too, and a link will be sent soon.

All feedback from the community will be appreciated.

Cheers, Gidon.
