[ https://issues.apache.org/jira/browse/HDFS-8747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651474#comment-14651474 ]
Yi Liu commented on HDFS-8747:
------------------------------

The design looks good overall. Besides Andrew's comments, I have a few questions/comments too:

*1.* About trash: it is there to prevent accidental deletion of files and directories when enabled. I think we may not need special support for encryption zones beyond what we have currently, since if users don't use {{-skipTrash}}, the deletion will fail, so people will not accidentally delete files/directories; they should know what they are doing if they use {{-skipTrash}}. On the other hand, the management is more complex, as Andrew discussed for Trash. Do you see any real problems with trash for encryption zones?

*2.* About Scratch: it's good and can solve some problems, nice work to support it. I did see that Hive needs this support if people want to use transparent encryption there.

> Provide Better "Scratch Space" and "Soft Delete" Support for HDFS Encryption Zones
> ----------------------------------------------------------------------------------
>
>                 Key: HDFS-8747
>                 URL: https://issues.apache.org/jira/browse/HDFS-8747
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: encryption
>    Affects Versions: 2.6.0
>            Reporter: Xiaoyu Yao
>            Assignee: Xiaoyu Yao
>         Attachments: HDFS-8747-07092015.pdf, HDFS-8747-07152015.pdf,
>                      HDFS-8747-07292015.pdf
>
>
> HDFS Transparent Data Encryption At-Rest was introduced in Hadoop 2.6 to
> allow creating an encryption zone on top of a single HDFS directory. Files
> under the root directory of the encryption zone are encrypted/decrypted
> transparently upon HDFS client write or read operations.
> Generally, it does not support rename (without data copying) across
> encryption zones or between an encryption zone and a non-encryption zone,
> because of the different security settings of encryption zones. However,
> there are certain use cases where efficient rename support is desired. This
> JIRA proposes better support for two such use cases, “Scratch Space” (a.k.a.
> staging area) and “Soft Delete” (a.k.a. trash), with HDFS encryption zones.
> “Scratch Space” is widely used in Hadoop jobs, which require efficient
> rename support. Temporary files from MR jobs are usually stored in a staging
> area outside the encryption zone, such as the “/tmp” directory, and then
> renamed to the target directories as specified once the data is ready to be
> further processed.
> Below is a summary of supported/unsupported cases in the latest Hadoop:
> * Rename within the encryption zone is supported.
> * Rename of the entire encryption zone by moving the root directory of the
> zone is allowed.
> * Rename of a sub-directory/file from an encryption zone to a non-encryption
> zone is not allowed.
> * Rename of a sub-directory/file from encryption zone A to encryption zone B
> is not allowed.
> * Rename from a non-encryption zone to an encryption zone is not allowed.
> “Soft delete” (a.k.a. trash) is a client-side feature that helps prevent
> accidental deletion of files and directories. If trash is enabled and a file
> or directory is deleted using the Hadoop shell, the file is moved to the
> .Trash directory of the user's home directory instead of being deleted.
> Deleted files are initially moved (renamed) to the Current sub-directory of
> the .Trash directory, with the original path preserved. Files and
> directories in the trash can be restored simply by moving them to a location
> outside the .Trash directory.
> Due to the limited rename support, deleting a sub-directory/file within an
> encryption zone with the trash feature enabled is not allowed. The client
> has to use the -skipTrash option to work around this. HADOOP-10902 and
> HDFS-6767 improved the error message but did not provide a complete solution
> to the problem.
> We propose to solve the problem by generalizing the mapping between an
> encryption zone and its underlying HDFS directories from 1:1 today to 1:N.
> The encryption zone should allow non-overlapping directories, such as
> scratch space or soft delete "trash" locations, to be added/removed
> dynamically after creation. This way, rename for "scratch space" and "soft
> delete" can be better supported without breaking the assumption that rename
> is only supported "within the zone".
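
As a rough illustration of the rename rules listed in the description and of the proposed 1:N mapping, here is a minimal Python sketch. It is not HDFS code: `zone_of`, `rename_allowed`, the zone names, and all paths are hypothetical, and the rule that a moved zone root must land outside every zone is an assumption made for this sketch.

```python
from posixpath import normpath


def zone_of(path, zone_dirs):
    """Return the name of the zone whose directory contains `path`, else None.

    `zone_dirs` maps a directory to a zone name. Several directories may map
    to the same zone name, which models the proposed 1:N mapping. Directories
    are assumed to be non-overlapping, as the proposal requires.
    """
    path = normpath(path)
    for root, zone in zone_dirs.items():
        root = normpath(root)
        if path == root or path.startswith(root.rstrip("/") + "/"):
            return zone
    return None


def rename_allowed(src, dst, zone_dirs):
    """Apply the rename rules from the description (a sketch, not HDFS logic)."""
    src, dst = normpath(src), normpath(dst)
    if src in {normpath(root) for root in zone_dirs}:
        # Moving a whole zone directory. Assumption: the destination must
        # lie outside every zone.
        return zone_of(dst, zone_dirs) is None
    # Otherwise source and destination must belong to the same zone,
    # or both must be outside any zone.
    return zone_of(src, zone_dirs) == zone_of(dst, zone_dirs)


# Today: one directory per zone (1:1).
TODAY = {"/secure": "zoneA", "/other": "zoneB"}

# Proposed: a scratch directory added to zoneA (1:N).
PROPOSED = dict(TODAY)
PROPOSED["/tmp/scratch-A"] = "zoneA"
```

With `TODAY`, `rename_allowed("/tmp/scratch-A/part-0", "/secure/part-0", TODAY)` is false, because the source lies outside any zone. With `PROPOSED` the same call is true, since both paths now resolve to `zoneA` and the rename stays "within the zone".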