[ 
https://issues.apache.org/jira/browse/PARQUET-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17098529#comment-17098529
 ] 

Venkata Satya Pradeep Srikakolapu commented on PARQUET-1178:
------------------------------------------------------------

Thank you [~gershinsky] for the quick reply. Could you please point me to an 
example of implementing this feature in Apache Arrow? I am interested in 
understanding key management for encrypting/decrypting columns with Apache Arrow. 

I am working with a customer in the health care space. My customer wants to 
encrypt sensitive columns when persisting data to disk (Data Lake). I see a 
few options for column encryption and key management:
 # Apache Arrow with Python - Would you recommend using Apache Arrow with Scala?
 # Parquet-MR - Not released yet 
 # Encryption With ORC files (similar to Parquet Modular encryption) - 
[https://jira.apache.org/jira/browse/ORC-14?jql=text%20~%20%22column%20level%20encryption%20to%20ORC%20files%22]

*Some context*: My customer uses Apache Parquet with Scala + Spark 
extensively, and is also planning to use Python with Parquet. 

Which of these options would you recommend as the best choice?

> Parquet modular encryption
> --------------------------
>
>                 Key: PARQUET-1178
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1178
>             Project: Parquet
>          Issue Type: New Feature
>            Reporter: Gidon Gershinsky
>            Assignee: Gidon Gershinsky
>            Priority: Major
>
> A mechanism for modular encryption and decryption of Parquet files. It allows 
> data to be kept fully encrypted in storage while enabling efficient analytics 
> on the data, via reader-side extraction / authentication / decryption of the 
> data subsets required by columnar projection and predicate push-down.
> Enables fine-grained access control to column data by encrypting different 
> columns with different keys.
> Supports a number of encryption algorithms, to account for different security 
> and performance requirements.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)