To avoid any confusion: the high-level interface described in the previous
mail is a possible future add-on.

The low-level API, already defined and implemented, exposes the full
capabilities of Parquet encryption.
The high-level interface will expose a (useful) subset of these capabilities,
as explained in the doc below.
Today, we have at least three companies building end-to-end data protection
systems using the low-level Parquet encryption API.
The API is simple; these teams focus on managing the keys and authentication
on top of it, and they are skilled enough to handle that.

In the high-level interface, we're using this experience to help less
experienced users with key management and authentication. There is
no auto-magic solution for that, but we will create a set of helper tools
and a simple interface to them.
The interface concepts will be somewhat similar to the low-level API: pass
a list of columns to be encrypted, but now,
instead of explicit keys and their metadata, pass a master key ID for each
column. See the doc for examples
of such table/column properties
<https://docs.google.com/document/d/1boH6HPkG0ZhgxcaRkGk3QpZ8X_J91uXZwVGwYN45St4/edit#heading=h.o9oq8a9wa6em>.
The translation of master key IDs into encryption keys/metadata will be
performed by these helper tools, KMS, etc.
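
To make the idea concrete, here is a minimal Python sketch of that
translation step. It is not the proposed interface and not code from the
prototype: the property format ("keyID:col,col;keyID:col"), the names
parse_column_keys / resolve_column_keys / ColumnKey, and the fetch_key
callback standing in for a KMS client are all assumptions made up for this
illustration. It resolves each master key ID directly to key bytes and
records the ID as the key metadata; a real helper could just as well wrap
per-column data keys instead.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class ColumnKey:
    """Hypothetical container for what an explicit-key API needs per column."""
    key: bytes           # encryption key bytes
    key_metadata: bytes  # opaque bytes that let a reader locate the key again


def parse_column_keys(prop: str) -> Dict[str, str]:
    """Parse 'keyA:col1,col2;keyB:col3' (assumed format) into {column: key ID}."""
    mapping: Dict[str, str] = {}
    for entry in (e.strip() for e in prop.split(";")):
        if not entry:
            continue  # tolerate a trailing ';'
        key_id, sep, cols = entry.partition(":")
        key_id = key_id.strip()
        if not sep or not key_id:
            raise ValueError(f"malformed entry: {entry!r}")
        for col in (c.strip() for c in cols.split(",")):
            if not col:
                raise ValueError(f"empty column name in: {entry!r}")
            if col in mapping:
                raise ValueError(f"column listed twice: {col!r}")
            mapping[col] = key_id
    return mapping


def resolve_column_keys(
    prop: str, fetch_key: Callable[[str], bytes]
) -> Dict[str, ColumnKey]:
    """Turn master key IDs into explicit keys plus key metadata.

    fetch_key is a stand-in for a KMS client call (an assumption here);
    each master key is fetched at most once.
    """
    fetched: Dict[str, bytes] = {}
    resolved: Dict[str, ColumnKey] = {}
    for col, key_id in parse_column_keys(prop).items():
        if key_id not in fetched:
            fetched[key_id] = fetch_key(key_id)
        resolved[col] = ColumnKey(fetched[key_id], key_id.encode("utf-8"))
    return resolved


if __name__ == "__main__":
    # Toy in-memory "KMS" with fixed 16-byte keys, for demonstration only.
    toy_kms = {"k1": bytes(16), "k2": bytes(range(16))}
    keys = resolve_column_keys("k1:ssn,salary;k2:address", toy_kms.__getitem__)
    for column, column_key in keys.items():
        print(column, column_key.key_metadata.decode("utf-8"))
```

The resulting per-column key and key metadata are the kind of explicit
inputs the low-level API takes today; the only thing the user supplied was
the column list and the master key IDs.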

In other words, we should build this bottom-up: complete and merge the
low-level APIs first, and then use the community experience with them
to design and build the high-level add-on interface and helper tools.

Cheers, Gidon.

---------- Forwarded message ---------
From: Gidon Gershinsky <[email protected]>
Date: Wed, Jun 5, 2019 at 4:51 PM
Subject: High level interface to Parquet encryption
To: <[email protected]>


Hi all,

As discussed at the last sync, we've briefly explored the current proposals
for the high-level interface to encryption. While the initial goal was to
merge them into a single doc, it turned out that PARQUET-1396 has evolved in
the meantime, becoming a full interface system. So we have two parallel
proposals, both presented for community discussion:

[1] Crypto Interface for Schema Activation of Parquet Encryption
<https://docs.google.com/document/d/17GTQAezl1ZC1pMNHjYU_bPVxMU6DIPjtXOiLclXUlyA/edit#heading=h.r9wntu3s8swd>
Corresponds to PARQUET-1396
<https://issues.apache.org/jira/browse/PARQUET-1396>

[2] Properties-based Interface to Parquet Encryption
<https://docs.google.com/document/d/1boH6HPkG0ZhgxcaRkGk3QpZ8X_J91uXZwVGwYN45St4/edit?usp=sharing>
I've created PARQUET-1568
<https://issues.apache.org/jira/browse/PARQUET-1568> for this one. Both the
title and the description of the Jira are subject to change. Doc [2] is not
a design draft, but rather a writeup of the current proposal and prototype
code, put together mainly to facilitate community feedback and
discussion of goals, approach, etc.

Cheers, Gidon
