b.com/apache/arrow/pull/8023
Cheers, Gidon
-- Forwarded message -
From: Gidon Gershinsky
Date: Thu, Feb 18, 2021 at 6:25 PM
Subject: Re: Exposing low-level Parquet encryption to Python user (or,
maybe not)
To: dev
Thanks, then we'll just go ahead and address the remaining comments.
Cheers, Gidon
On Thu, Feb 18, 2021 at 5:45 PM Antoine Pitrou wrote:
>
> I don't think there's any concern around having a process-global shared
> key cache. The discussion was just around the implementation.
I don't think there's any concern around having a process-global shared
key cache. The discussion was just around the implementation.
Also, FTR, a standalone LRU cache class is proposed here, which may
reduce the amount of original code in the Parquet encryption PR:
https://github.com/apache/ar
I believe the shared structures that were debated are the key caches.
Cheers, Gidon
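For illustration only (this is not the code from the PR): a process-global key cache needs no threading model beyond a single lock around the shared structure. The sketch below is in Python rather than C++ for brevity. `SharedKeyCache`, `get_or_insert`, and `fake_unwrap` are hypothetical names, and the LRU eviction policy is an assumption loosely modeled on the standalone LRU cache mentioned in this thread.

```python
import threading
from collections import OrderedDict


class SharedKeyCache:
    """Hypothetical LRU key cache; the lock is the only threading concept."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._entries = OrderedDict()   # oldest entry first
        self._lock = threading.Lock()   # guards _entries

    def get_or_insert(self, key_id, loader):
        """Return the cached key, calling loader(key_id) only on a miss."""
        with self._lock:
            if key_id in self._entries:
                self._entries.move_to_end(key_id)  # mark as most recently used
                return self._entries[key_id]
            # The loader runs while the lock is held, so concurrent callers
            # asking for the same key trigger at most one load.
            value = loader(key_id)
            self._entries[key_id] = value
            if len(self._entries) > self._capacity:
                self._entries.popitem(last=False)  # evict least recently used
            return value

    def __len__(self):
        with self._lock:
            return len(self._entries)


def fake_unwrap(key_id):
    # Stand-in for a call to a key management service (assumption).
    return "key-for-" + key_id


# Any thread may read any part of a file; all of them share one cache.
key_cache = SharedKeyCache(capacity=16)
workers = [
    threading.Thread(target=key_cache.get_or_insert, args=("footer", fake_unwrap))
    for _ in range(4)
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
```

Holding the lock during the load keeps the sketch short but serializes all misses; a real implementation might prefer finer-grained locking.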
On Thu, Feb 18, 2021 at 6:37 AM Micah Kornfield
wrote:
> >
> > I don't think any notion of threading should be present in the
> > implementation, except for the required locks around shared structures.
>
> I don't think any notion of threading should be present in the
> implementation, except for the required locks around shared structures.
I seem to recall the debate was how to model some class interactions to
determine what should be considered shared structures and what should not.
This certainly sounds good to me.
Cheers, Gidon
On Wed, Feb 17, 2021 at 7:36 PM Antoine Pitrou wrote:
>
> I don't think any notion of threading should be present in the
> implementation, except for the required locks around shared structures.
> I don't know where the idea of a "main thread" comes from, but it
> probably shouldn't exist in a C++ library.
I don't think any notion of threading should be present in the
implementation, except for the required locks around shared structures.
I don't know where the idea of a "main thread" comes from, but it
probably shouldn't exist in a C++ library.
Regards
Antoine.
On 17/02/2021 at 18:34, Gidon Gershinsky wrote:
Just to clarify. There are two options, which one do you refer to? A design
with a main thread that handles projections and the keys (relevant for the
projected columns); or the current code with any thread allowed to handle
full file reading, including the footer, column projections and their keys? Can
On 17/02/2021 at 12:47, Gidon Gershinsky wrote:
From the doc,
"To maintain consistency with the style of parquet-cpp, the above
structures should not be explicitly synchronized with individual mutexes.
In the case of a parquet::arrow::FileReader, the request to read a given
selection of row groups and columns is issued from a single main thread
I'm not sure a threading model is expected for an encryption layer. Am
I missing something?
Regards
Antoine.
On 17/02/2021 at 06:59, Gidon Gershinsky wrote:
Precisely, the main change is in the threading model. Afaik, the document
proposes a model that fits pandas, but might be problematic for other users
of this library.
Technically, this is not a showstopper though; if the community decides on
this model, it will be compatible with the high-level encry
Hi Antoine,
My part there is mostly review and some advice. The bulk of the work is
done by Tham, and by the community members who've reviewed the PR; my
frustration is with seeing it in limbo for a while now.
Regarding the remaining comments - currently, the main sticking points are
the change pr
I think some of the comments might be conflicting. One of the concerns, which
was covered in Ben's doc, was the threading model we expect in the library
(I would need to refresh my memory on it to offer an opinion).
On Tue, Feb 16, 2021 at 8:03 AM Antoine Pitrou wrote:
Hi Gidon,
On 16/02/2021 at 16:42, Gidon Gershinsky wrote:
Regarding the high-level layer, I think it is waiting for progress at
https://docs.google.com/document/d/11qz84ajysvVo5ZAV9mXKOeh6ay4-xgkBrubggCP5220/edit?usp=sharing
No activity there since last November. This is unfortunate, because Tham
has put a lot of work into coding the high-level layer (and addr
On Mon, Feb 15, 2021, at 2:49 PM, Micah Kornfield wrote:
> Sorry I realized I had a typo in my email. We should definitely namespace
> dangerous apis appropriately.
Decryption doesn't seem necessarily dangerous? In any case, I will start with
a PR for decryption only and we can see how that goes
Sorry I realized I had a typo in my email. We should definitely namespace
dangerous apis appropriately.
On Monday, February 15, 2021, Itamar Turner-Trauring
wrote:
On Fri, Feb 12, 2021, at 11:52 PM, Micah Kornfield wrote:
> 2. I'm open to exposing the lower level encryption libraries in python
> (without appropriate namespacing/communication). It seems at least for
> reading, there is potentially less harm (I'll caveat that with I'm not a
> security exper
My thoughts:
1. I've lost track of the higher level encryption implementation in C++.
I think we were trying to come to a consensus on the threading/thread
safety model?
2. I'm open to exposing the lower level encryption libraries in python
(without appropriate namespacing/communication). It se
Hi,
Since the PR for high-level C++ Parquet encryption API appears stalled
(https://github.com/apache/arrow/pull/8023), I'm looking into exposing the
low-level Parquet encryption API to Python.
Arguments for doing this: the low-level API is all the users I'm talking to
need, at the moment, so