[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043878#comment-14043878
 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-5890:
----------------------------------------------------

Sorry for jumping in late on this one.

bq. For some sensitive data, encryption while in flight (network) is not 
sufficient, it is required that while at rest it should be encrypted.
Not sure why this is a requirement. I can understand the requirement for 
encrypting it on the wire, but on disk the intermediate files are already 
secure, readable only by users who need to read them, and more importantly all 
this intermediate data is _very transitory_.

bq. .. writing a separate file for every spill just to store a few bytes of IV 
doesn't seem like a reasonable tradeoff in either performance or complexity.
I too echo the same sentiment.

If this really is a requirement, aren't we better off asking cluster admins to 
either install disks with local file-systems that support encryption 
specifically for intermediate data or just create some partitions that support 
encryption? That seems like the right layer to handle something like this, 
rather than adding a whole lot of complexity to the software that brings only 
a performance penalty.

Wearing my YARN hat, it is not enough to do this just for MapReduce. Every 
other framework running on YARN would need to add the same complexity, which 
is asking for too much. We are better off handling it at the 
file-system/partition/disk level.

> Support for encrypting Intermediate data and spills in local filesystem
> -----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5890
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5890
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: security
>    Affects Versions: 2.4.0
>            Reporter: Alejandro Abdelnur
>            Assignee: Arun Suresh
>              Labels: encryption
>         Attachments: MAPREDUCE-5890.3.patch, MAPREDUCE-5890.4.patch, 
> MAPREDUCE-5890.5.patch, MAPREDUCE-5890.6.patch, MAPREDUCE-5890.7.patch, 
> org.apache.hadoop.mapred.TestMRIntermediateDataEncryption-output.txt, 
> syslog.tar.gz
>
>
> For some sensitive data, encryption while in flight (network) is not 
> sufficient, it is required that while at rest it should be encrypted. 
> HADOOP-10150 & HDFS-6134 bring encryption at rest for data in filesystem 
> using Hadoop FileSystem API. MapReduce intermediate data and spills should 
> also be encrypted while at rest.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
