[ https://issues.apache.org/jira/browse/PARQUET-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17393788#comment-17393788 ]
Gabor Szadovszky commented on PARQUET-2071:
-------------------------------------------

I think it is a great idea to skip unnecessary deserialization/serialization steps in such cases. Meanwhile, we already have some tools with a similar approach, like trans-compression or prune columns. What do you think of implementing a more universal tool where you can configure the projection schema and the configuration of the target file? The tool could then decide which level of deserialization/serialization is required. For example, for trans-compression you need to decompress the pages, while for encryption you don't. What do you think?

> Encryption translation tool
> ----------------------------
>
>                 Key: PARQUET-2071
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2071
>             Project: Parquet
>          Issue Type: New Feature
>          Components: parquet-mr
>            Reporter: Xinli Shang
>            Assignee: Xinli Shang
>            Priority: Major
>
> When translating existing data to an encrypted state, we could develop a tool
> like TransCompression that translates the data at the page level, without
> reading it into records and rewriting it. This will speed up the process a lot.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
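
To make the "universal tool" idea in the comment above a bit more concrete, here is a minimal sketch of how such a tool might pick the cheapest processing level from the source and target configurations. Everything in it is hypothetical and for illustration only: `Level`, `FileConfig`, and `required_level` are not part of parquet-mr, and the rule that an encoding change forces a full decode is an assumption added here, not something stated in the issue. The only facts taken from the thread are that trans-compression needs page decompression while encryption and column pruning do not.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical levels: how far each page must be unpacked."""
    COPY_BYTES = 0   # pass compressed page bytes through untouched
    DECOMPRESS = 1   # decompress and recompress pages; values stay encoded
    DECODE = 2       # decode values and re-encode them (full rewrite)


@dataclass(frozen=True)
class FileConfig:
    """Hypothetical description of a file's layout (not a parquet-mr class)."""
    columns: frozenset
    codec: str                                  # e.g. "SNAPPY", "ZSTD"
    encoding: str = "PLAIN"
    encrypted_columns: frozenset = frozenset()


def required_level(source: FileConfig, target: FileConfig) -> Level:
    """Return the shallowest level that can produce `target` from `source`."""
    if not target.columns <= source.columns:
        raise ValueError("projection must be a subset of the source columns")
    # Column pruning and encryption alone work on the stored page bytes.
    level = Level.COPY_BYTES
    # Changing the codec (trans-compression) requires decompressing pages.
    if target.codec != source.codec:
        level = max(level, Level.DECOMPRESS)
    # Assumption for this sketch: changing the encoding requires decoding.
    if target.encoding != source.encoding:
        level = max(level, Level.DECODE)
    return level
```

Under these assumptions, a target that only adds encrypted columns or drops columns resolves to `Level.COPY_BYTES`, while a target with a different codec resolves to `Level.DECOMPRESS`. Taking the maximum over all requested changes lets one tool cover encryption translation, trans-compression, and column pruning in a single pass.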