[jira] [Commented] (HADOOP-13887) Encrypt S3A data client-side with AWS SDK (S3-CSE)
[ https://issues.apache.org/jira/browse/HADOOP-13887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17425303#comment-17425303 ]

Dongjoon Hyun commented on HADOOP-13887:

This seems to land at branch-3.2. Could you update the Fix Version, please?

> Encrypt S3A data client-side with AWS SDK (S3-CSE)
> --------------------------------------------------
>
>                 Key: HADOOP-13887
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13887
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.8.0
>            Reporter: Jeeyoung Kim
>            Assignee: Mehakmeet Singh
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 3.4.0
>
>         Attachments: HADOOP-13887-002.patch, HADOOP-13887-007.patch,
> HADOOP-13887-branch-2-003.patch, HADOOP-13897-branch-2-004.patch,
> HADOOP-13897-branch-2-005.patch, HADOOP-13897-branch-2-006.patch,
> HADOOP-13897-branch-2-008.patch, HADOOP-13897-branch-2-009.patch,
> HADOOP-13897-branch-2-010.patch, HADOOP-13897-branch-2-012.patch,
> HADOOP-13897-branch-2-014.patch, HADOOP-13897-trunk-011.patch,
> HADOOP-13897-trunk-013.patch, HADOOP-14171-001.patch, S3-CSE Proposal.pdf
>
>          Time Spent: 14h
>  Remaining Estimate: 0h
>
> Expose the client-side encryption option documented in Amazon S3
> documentation -
> http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingClientSideEncryption.html
> When backporting, include HADOOP-17817

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-13887) Encrypt S3A data client-side with AWS SDK (S3-CSE)
[ https://issues.apache.org/jira/browse/HADOOP-13887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17282266#comment-17282266 ]

Mehakmeet Singh commented on HADOOP-13887:

Update: I was able to resolve the issue by setting the KMS region for the AWS KMS client (same as the bucket region).

> Encrypt S3A data client-side with AWS SDK (S3-CSE)
> --------------------------------------------------
>
>                 Key: HADOOP-13887
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13887
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.8.0
>            Reporter: Jeeyoung Kim
>            Assignee: Igor Mazur
>            Priority: Minor
>         Attachments: HADOOP-13887-002.patch, HADOOP-13887-007.patch,
> HADOOP-13887-branch-2-003.patch, HADOOP-13897-branch-2-004.patch,
> HADOOP-13897-branch-2-005.patch, HADOOP-13897-branch-2-006.patch,
> HADOOP-13897-branch-2-008.patch, HADOOP-13897-branch-2-009.patch,
> HADOOP-13897-branch-2-010.patch, HADOOP-13897-branch-2-012.patch,
> HADOOP-13897-branch-2-014.patch, HADOOP-13897-trunk-011.patch,
> HADOOP-13897-trunk-013.patch, HADOOP-14171-001.patch, S3-CSE Proposal.pdf
>
>
> Expose the client-side encryption option documented in Amazon S3
> documentation -
> http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingClientSideEncryption.html
> Currently this is not exposed in Hadoop but it is exposed as an option in AWS
> Java SDK, which Hadoop currently includes. It should be trivial to propagate
> this as a parameter passed to the S3client used in S3AFileSystem.java
[jira] [Commented] (HADOOP-13887) Encrypt S3A data client-side with AWS SDK (S3-CSE)
[ https://issues.apache.org/jira/browse/HADOOP-13887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280997#comment-17280997 ]

Mehakmeet Singh commented on HADOOP-13887:

I have been trying to add support for CSE using the AmazonS3EncryptionClientV2Builder in DefaultS3ClientFactory, but ran into region issues such as:
{code:java}
2021-02-08 15:37:01,675 [setup] WARN util.EC2MetadataUtils (EC2MetadataUtils.java:getItems(410)) - Unable to retrieve the requested metadata (/latest/dynamic/instance-identity/document). Failed to connect to service endpoint:
2021-02-08 15:37:01,675 [setup] WARN util.EC2MetadataUtils (EC2MetadataUtils.java:getItems(410)) - Unable to retrieve the requested metadata (/latest/dynamic/instance-identity/document). Failed to connect to service endpoint:
com.amazonaws.SdkClientException: Failed to connect to service endpoint:
    at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:100)
    at com.amazonaws.internal.EC2ResourceFetcher.doReadResource(EC2ResourceFetcher.java:70)
    at com.amazonaws.internal.InstanceMetadataServiceResourceFetcher.readResource(InstanceMetadataServiceResourceFetcher.java:75)
    at com.amazonaws.internal.EC2ResourceFetcher.readResource(EC2ResourceFetcher.java:66)
    at com.amazonaws.util.EC2MetadataUtils.getItems(EC2MetadataUtils.java:403)
    at com.amazonaws.util.EC2MetadataUtils.getData(EC2MetadataUtils.java:372)
    at com.amazonaws.util.EC2MetadataUtils.getData(EC2MetadataUtils.java:368)
    at com.amazonaws.util.EC2MetadataUtils.getEC2InstanceRegion(EC2MetadataUtils.java:283)
    at com.amazonaws.regions.InstanceMetadataRegionProvider.tryDetectRegion(InstanceMetadataRegionProvider.java:59)
    at com.amazonaws.regions.InstanceMetadataRegionProvider.getRegion(InstanceMetadataRegionProvider.java:50)
    at com.amazonaws.regions.AwsRegionProviderChain.getRegion(AwsRegionProviderChain.java:46)
    at com.amazonaws.client.builder.AwsClientBuilder.determineRegionFromRegionProvider(AwsClientBuilder.java:475)
    at com.amazonaws.client.builder.AwsClientBuilder.setRegion(AwsClientBuilder.java:458)
    at com.amazonaws.client.builder.AwsClientBuilder.configureMutableProperties(AwsClientBuilder.java:424)
    at com.amazonaws.client.builder.AwsSyncClientBuilder.build(AwsSyncClientBuilder.java:46)
    at com.amazonaws.services.s3.AmazonS3EncryptionClientV2.newAWSKMSClient(AmazonS3EncryptionClientV2.java:197)
    at com.amazonaws.services.s3.AmazonS3EncryptionClientV2.<init>(AmazonS3EncryptionClientV2.java:115)
    at com.amazonaws.services.s3.AmazonS3EncryptionClientV2Builder.build(AmazonS3EncryptionClientV2Builder.java:101)
    at com.amazonaws.services.s3.AmazonS3EncryptionClientV2Builder.build(AmazonS3EncryptionClientV2Builder.java:23)
    at com.amazonaws.client.builder.AwsSyncClientBuilder.build(AwsSyncClientBuilder.java:46)
    at org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.newAmazonS3EncryptionClient(DefaultS3ClientFactory.java:188)
    at org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.createS3Client(DefaultS3ClientFactory.java:126)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.bindAWSClient(S3AFileSystem.java:751)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:444)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3460)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:536)
    at org.apache.hadoop.fs.contract.AbstractBondedFSContract.init(AbstractBondedFSContract.java:72)
    at org.apache.hadoop.fs.contract.AbstractFSContractTestBase.setup(AbstractFSContractTestBase.java:187)
    at org.apache.hadoop.fs.s3a.AbstractS3ATestBase.setup(AbstractS3ATestBase.java:77)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
    at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
    at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.ConnectException: No route to host (connect failed)
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketIm
[jira] [Commented] (HADOOP-13887) Encrypt S3A data client-side with AWS SDK (S3-CSE)
[ https://issues.apache.org/jira/browse/HADOOP-13887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17110202#comment-17110202 ]

Steve Loughran commented on HADOOP-13887:

S3 now supports unpadded CSE, so client side encryption will be safe to use. This is what e
[jira] [Commented] (HADOOP-13887) Encrypt S3A data client-side with AWS SDK
[ https://issues.apache.org/jira/browse/HADOOP-13887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17036976#comment-17036976 ]

Hadoop QA commented on HADOOP-13887:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 9s{color} | {color:red} HADOOP-13887 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HADOOP-13887 |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/16763/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HADOOP-13887) Encrypt S3A data client-side with AWS SDK
[ https://issues.apache.org/jira/browse/HADOOP-13887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16760741#comment-16760741 ]

Hadoop QA commented on HADOOP-13887:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 8s{color} | {color:red} HADOOP-13887 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HADOOP-13887 |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/15888/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HADOOP-13887) Encrypt S3A data client-side with AWS SDK
[ https://issues.apache.org/jira/browse/HADOOP-13887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16690812#comment-16690812 ]

lqjacklee commented on HADOOP-13887:

/Users/liu/java/workspace-home/git/hadoop/patch/HADOOP-15870.patch
[jira] [Commented] (HADOOP-13887) Encrypt S3A data client-side with AWS SDK
[ https://issues.apache.org/jira/browse/HADOOP-13887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16656699#comment-16656699 ]

Hadoop QA commented on HADOOP-13887:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 7s{color} | {color:red} HADOOP-13887 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HADOOP-13887 |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/15396/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HADOOP-13887) Encrypt S3A data client-side with AWS SDK
[ https://issues.apache.org/jira/browse/HADOOP-13887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16656624#comment-16656624 ]

Steve Loughran commented on HADOOP-13887:

I'm looking at this in the context of HADOOP-14556, which adds the ability to serialize secrets over the wire inside a delegation token (DT). I don't want the change there to cut off the option of adding CSE later, so I'm going to
* add the CSE options to the enum sent around
* have the marshall/unmarshall code store a version ID so that, if we need to add a new field, any changes to the writable will be detected fast.
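The version-ID idea in the comment above can be sketched roughly as follows. This is a minimal illustration only, assuming plain `java.io` data streams instead of Hadoop's `Writable`; the names `EncryptionMethodSketch` and `EncryptionSecretsSketch`, the enum values, and the field layout are hypothetical and are not taken from the HADOOP-14556 patch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

/** Hypothetical encryption options, with a CSE entry reserved up front. */
enum EncryptionMethodSketch { NONE, SSE_S3, SSE_KMS, CSE_KMS }

/** Hypothetical secrets payload that writes a version ID before its fields. */
final class EncryptionSecretsSketch {
  /** Bump whenever the wire format gains or loses a field. */
  static final int VERSION = 1;

  final EncryptionMethodSketch method;
  final String key;

  EncryptionSecretsSketch(EncryptionMethodSketch method, String key) {
    this.method = method;
    this.key = key;
  }

  /** Serializes the version first, then the fields. */
  byte[] marshall() throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      out.writeInt(VERSION);
      out.writeUTF(method.name());
      out.writeUTF(key);
    }
    return bytes.toByteArray();
  }

  /** Rejects payloads written with a different version before reading any field. */
  static EncryptionSecretsSketch unmarshall(byte[] data) throws IOException {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
      int version = in.readInt();
      if (version != VERSION) {
        throw new IOException(
            "Unsupported secrets version " + version + "; expected " + VERSION);
      }
      EncryptionMethodSketch method = EncryptionMethodSketch.valueOf(in.readUTF());
      return new EncryptionSecretsSketch(method, in.readUTF());
    }
  }
}
```

Writing the version before any other field means a reader built against an older or newer layout fails on the first four bytes rather than misreading later fields.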
[jira] [Commented] (HADOOP-13887) Encrypt S3A data client-side with AWS SDK
[ https://issues.apache.org/jira/browse/HADOOP-13887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16287534#comment-16287534 ]

Steve Loughran commented on HADOOP-13887:

I've been thinking about this.
# Once a file is opened, its length is known (the initial getFileStatus() returns it, and it will come back in a header of the GET).
# Many of the uses of a file don't need to know the full length of the file until it's open. Specifically, when your code does an open(); seek(EOF-len(footer)); you don't need to know the EOF in advance. Partitioning does, though there a small diff in the length of the last partition is *probably* tractable. In C you can open a file, do an explicit seek(offset from EOF), and, if you want to know the file length, do an ftell() once you are there.
# We could add an interface + method + streamCapabilities() option to return the length of an open file, e.g. {{public abstract long size() throws IOException;}}. (You can get this from a raw local stream, BTW.)
# Then code could be moved to using it, starting with the internal classes.
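A rough sketch of the proposal in the comment above: a stream that reports its own length once open, so a footer can be read without the caller knowing EOF in advance. It uses a local file channel as a stand-in for an S3A input stream; the interface name `StreamLengthSketch`, the class `SizedLocalStream`, and the `readFooter` helper are hypothetical, and only the `long size() throws IOException` signature comes from the comment.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

/** Hypothetical capability: an open stream that can report its own length. */
interface StreamLengthSketch {
  long size() throws IOException;
}

/** Local-file stand-in for an object-store input stream; illustrative only. */
final class SizedLocalStream implements StreamLengthSketch, AutoCloseable {
  private final SeekableByteChannel channel;

  SizedLocalStream(Path path) throws IOException {
    this.channel = Files.newByteChannel(path, StandardOpenOption.READ);
  }

  /** The length is available from the open stream itself. */
  @Override
  public long size() throws IOException {
    return channel.size();
  }

  /** Seeks to (length - footerLen) and reads to the end of the stream. */
  byte[] readFooter(int footerLen) throws IOException {
    long length = size();
    long start = Math.max(0, length - footerLen);
    channel.position(start);
    ByteBuffer buf = ByteBuffer.allocate((int) (length - start));
    while (buf.hasRemaining() && channel.read(buf) >= 0) {
      // keep reading until the buffer is full or EOF is reached
    }
    // return only the bytes actually read
    return Arrays.copyOf(buf.array(), buf.position());
  }

  @Override
  public void close() throws IOException {
    channel.close();
  }
}
```

With client-side encryption the listed object length and the plaintext length can differ, which is why asking the open stream, rather than the directory listing, is the safer source for the seek offset.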
[jira] [Commented] (HADOOP-13887) Encrypt S3A data client-side with AWS SDK
[ https://issues.apache.org/jira/browse/HADOOP-13887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16261515#comment-16261515 ]

Steve Loughran commented on HADOOP-13887:

FWIW, Presto has this, and they get to see the prestofs issues:
* https://github.com/prestodb/presto/issues/7186 : Presto doesn't seem to be able to read encrypted Parquet data
* https://github.com/aws/aws-sdk-java/issues/1057 : EMRFS doesn't set the x-amz-unencrypted-content-length header

Presto does look for the header; it just gets burned by EMRFS-saved data, which doesn't set the header.

What does EMR do? From the issues:

bq. We had a chat with the EMR people to understand how Hive/Spark is able to read encrypted files when the x-amz-unencrypted-content-length is not set. The outcome is, EMR Hive/Spark reads the entire file in those cases to determine the unencrypted content length, which is something that we don't really want to do.
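The header lookup described in the comment above might look roughly like this. It is a sketch under stated assumptions: object headers are modelled as a plain `Map<String, String>` with lower-case keys, and the class and method names are hypothetical. Only the header name `x-amz-unencrypted-content-length` comes from the thread.

```java
import java.util.Map;
import java.util.OptionalLong;

/** Hypothetical helper for finding the plaintext length of a CSE object. */
final class PlaintextLengthSketch {
  static final String UNENCRYPTED_LENGTH_HEADER = "x-amz-unencrypted-content-length";

  private PlaintextLengthSketch() {
  }

  /**
   * Returns the plaintext length if the writer recorded it in the header.
   * An empty result means the caller has no cheap way to learn the length.
   */
  static OptionalLong plaintextLength(Map<String, String> headers) {
    String value = headers.get(UNENCRYPTED_LENGTH_HEADER);
    if (value == null) {
      return OptionalLong.empty();
    }
    try {
      long length = Long.parseLong(value.trim());
      return length >= 0 ? OptionalLong.of(length) : OptionalLong.empty();
    } catch (NumberFormatException e) {
      // treat a malformed header the same as a missing one
      return OptionalLong.empty();
    }
  }
}
```

An empty result corresponds to the case quoted above, where the only fallback reported for EMR is reading the entire object to determine the unencrypted length.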
[jira] [Commented] (HADOOP-13887) Encrypt S3A data client-side with AWS SDK
[ https://issues.apache.org/jira/browse/HADOOP-13887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234365#comment-16234365 ]

Steve Moist commented on HADOOP-13887:

>I can see the appeal of some form of support for this purely for some
>backup/restore process,

I agree, that's a scenario I am going to cover in the other proposal.

>People will end up encrypting their data, then be filing bugs/support calls
>trying to understand why their queries are all failing.

Oh yes they will.

>It also isn't going to interact with any other S3 client, which is a
>significant limitation

The AWS S3 CSE SDK also has that limitation. IIRC it is also written in Java, which makes portability a concern. At least the Hadoop KMS exposes REST endpoints to encrypt/decrypt keys, which makes it more platform independent. So while utilities don't currently integrate with it, nothing prevents them from doing so in the future. Even a lot of the AWS services don't integrate with the CSE SDK.

I created HADOOP-15006 and renamed this jira. I will let [~Igor Mazur] or [~steve_l] close the ticket, as I am unsure of how to do so.