Hi Itamar,

My suggestion would be to wrap a different API in Python - the high-level
encryption interface of
https://github.com/apache/arrow/pull/8023

This will enable interoperability with Apache Spark (and other frameworks),
where we don't expose the low-level Parquet encryption API.
If the low-level API is exposed in PyArrow instead (by wrapping PR 4826),
its interop with Spark will require an extra layer that would eventually
recreate the PR 8023 functionality, and it adds a risk of writing files that
won't be readable by Spark and others (and vice versa).

For additional information, please see these docs:
https://docs.google.com/document/d/1boH6HPkG0ZhgxcaRkGk3QpZ8X_J91uXZwVGwYN45St4/edit?usp=sharing
https://docs.google.com/document/d/1bEu903840yb95k9q2X-BlsYKuXoygE4VnMDl9xz_zhk/edit?usp=sharing


Cheers, Gidon


On Thu, Sep 3, 2020 at 5:12 PM Itamar Turner-Trauring <
ita...@pythonspeed.com> wrote:

> Hi,
>
> I'm looking into implementing this, and it seems like there are two parts:
> packaging, but also wrapping the APIs in Python. Is the latter item
> accurate? If so, any examples of similar existing wrapped APIs, or should I
> just come up with something on my own?
>
> Context:
> https://github.com/apache/arrow/pull/4826
> https://issues.apache.org/jira/browse/ARROW-8040
>
> Thanks,
>
> —Itamar
