[ 
https://issues.apache.org/jira/browse/HADOOP-14556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16621018#comment-16621018
 ] 

Steve Loughran commented on HADOOP-14556:
-----------------------------------------

Patch 007; still a WiP.

* For people who want to review this, the code is also [on 
github|https://github.com/steveloughran/hadoop/tree/s3/HADOOP-14556-delegation-token].

This fairly complex design is intended to 
* support different back-end token bindings
* leave it open for anyone who ever wants to add a Kerberos binding (as Wasb 
permits) to do so. 

Supported bindings
* Full: your normal AWS secrets. Should work with non-AWS S3 services.
* Session: session tokens are requested from STS.
* Role: this is the complex one, but the most significant. It asks for a 
restricted role, using a configured role ARN and a dynamically created role 
policy restricted purely to the bucket & DDB table used by the FS (there are 
some interfaces there to let them tell the token binding what those policies 
are).

Example:

{code}
2018-09-19 19:15:10,324 [JUnit-testDTFileSystem] DEBUG auth.STSClientFactory 
(STSClientFactory.java:requestRole(181)) - Requesting role 
arn:aws:iam::11111111111:role/stevel-s3guard with duration 21600; policy = {
  "Version" : "2012-10-17",
  "Statement" : [ {
    "Sid" : "7",
    "Effect" : "Allow",
    "Action" : [ "s3:GetBucketLocation", "s3:ListBucket" ],
    "Resource" : "arn:aws:s3:::hwdev-steve-ireland-new"
  }, {
    "Sid" : "8",
    "Effect" : "Allow",
    "Action" : [ "s3:Get*", "s3:PutObject", "s3:DeleteObject",
      "s3:AbortMultipartUpload", "s3:ListMultipartUploadParts", "s3:ListBucket*" ],
    "Resource" : "arn:aws:s3:::hwdev-steve-ireland-new/*"
  }, {
    "Sid" : "1",
    "Effect" : "Allow",
    "Action" : [ "kms:Decrypt", "kms:GenerateDataKey" ],
    "Resource" : "arn:aws:kms:*"
  }, {
    "Sid" : "9",
    "Effect" : "Allow",
    "Action" : [ "dynamodb:BatchGetItem", "dynamodb:BatchWriteItem",
      "dynamodb:DeleteItem", "dynamodb:DescribeTable", "dynamodb:GetItem",
      "dynamodb:PutItem", "dynamodb:Query", "dynamodb:UpdateItem" ],
    "Resource" : "arn:aws:dynamodb:eu-west-1:00000000000:table/hwdev-steve-ireland-new"
  } ]
}
{code}

This token can be passed on to a shared Hive/Spark cluster, knowing that the 
most any holder of that token can get is full R/W access to the destination 
bucket and any associated S3Guard table.
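
As a rough illustration of how a policy like the one in the log could be 
assembled, here is a minimal sketch. It is not the code in the patch: the 
class {{RolePolicySketch}} and its methods are hypothetical names made up for 
this example, the Sid values are arbitrary, and the JSON is built by hand 
purely to keep the example dependency-free.

{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration only; not a class from the patch. */
public class RolePolicySketch {

  /** Render a single "Allow" statement as compact JSON. */
  public static String statement(String sid, List<String> actions,
      String resource) {
    StringBuilder sb = new StringBuilder();
    sb.append("{\"Sid\":\"").append(sid)
        .append("\",\"Effect\":\"Allow\",\"Action\":[");
    for (int i = 0; i < actions.size(); i++) {
      if (i > 0) {
        sb.append(',');
      }
      sb.append('"').append(actions.get(i)).append('"');
    }
    sb.append("],\"Resource\":\"").append(resource).append("\"}");
    return sb.toString();
  }

  /**
   * Build a policy limited to one bucket and, optionally, one DynamoDB table.
   * @param bucket bucket name
   * @param tableArn ARN of the S3Guard table, or null if there is none
   */
  public static String restrictedPolicy(String bucket, String tableArn) {
    String bucketArn = "arn:aws:s3:::" + bucket;
    List<String> statements = new ArrayList<>();
    // bucket-level operations
    statements.add(statement("1",
        Arrays.asList("s3:GetBucketLocation", "s3:ListBucket"), bucketArn));
    // object-level read/write under the bucket
    statements.add(statement("2",
        Arrays.asList("s3:Get*", "s3:PutObject", "s3:DeleteObject",
            "s3:AbortMultipartUpload", "s3:ListMultipartUploadParts",
            "s3:ListBucket*"),
        bucketArn + "/*"));
    // KMS access for encrypted data, with the resource pattern from the log
    statements.add(statement("3",
        Arrays.asList("kms:Decrypt", "kms:GenerateDataKey"), "arn:aws:kms:*"));
    // table access only when a table is actually in use
    if (tableArn != null) {
      statements.add(statement("4",
          Arrays.asList("dynamodb:BatchGetItem", "dynamodb:BatchWriteItem",
              "dynamodb:DeleteItem", "dynamodb:DescribeTable",
              "dynamodb:GetItem", "dynamodb:PutItem", "dynamodb:Query",
              "dynamodb:UpdateItem"),
          tableArn));
    }
    return "{\"Version\":\"2012-10-17\",\"Statement\":["
        + String.join(",", statements) + "]}";
  }

  public static void main(String[] args) {
    // placeholder bucket name and account ID
    System.out.println(restrictedPolicy("example-bucket",
        "arn:aws:dynamodb:eu-west-1:000000000000:table/example-bucket"));
  }
}
{code}

The resulting string is what would be supplied as the policy when requesting 
the role; passing {{null}} for the table ARN leaves the DynamoDB statement out 
entirely.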

h3. Scale

There are some ILoad* tests to measure the sustainable rate at which STS 
session and role tokens can be issued. 

The TSV datasets [are available for 
download|https://github.com/steveloughran/datasets/releases/tag/tag_2018-09-17-aws]
 and analysis in your favourite notebook. Any analysis + different results from 
different locations would be great!

Key points:
# You can get about 500-1000 requests/second before calls get rejected.
# Calls to STS need to catch & retry on throttle events in case this does 
occur.
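
A minimal sketch of the catch-and-retry idea, using exponential backoff. 
Everything here is hypothetical and for illustration: {{ThrottleRetrySketch}}, 
{{ThrottledException}} and {{retryOnThrottle}} are made-up names, and a real 
client would catch whatever exception its SDK raises for throttling rather 
than this stand-in type.

{code:java}
import java.util.function.Supplier;

/** Hypothetical retry helper for illustration only. */
public class ThrottleRetrySketch {

  /** Stand-in for whatever exception a real client raises when throttled. */
  public static class ThrottledException extends RuntimeException {
    public ThrottledException(String message) {
      super(message);
    }
  }

  /**
   * Invoke the call, retrying with doubling delays while it is throttled.
   * Any other exception propagates immediately.
   * @param call operation to attempt
   * @param maxAttempts total number of attempts before giving up
   * @param initialDelayMillis sleep before the first retry
   */
  public static <T> T retryOnThrottle(Supplier<T> call, int maxAttempts,
      long initialDelayMillis) {
    long delay = initialDelayMillis;
    for (int attempt = 1; ; attempt++) {
      try {
        return call.get();
      } catch (ThrottledException e) {
        if (attempt >= maxAttempts) {
          // out of attempts: surface the throttling failure
          throw e;
        }
        try {
          Thread.sleep(delay);
        } catch (InterruptedException ie) {
          // restore the interrupt flag and stop retrying
          Thread.currentThread().interrupt();
          throw new RuntimeException("interrupted while backing off", ie);
        }
        delay *= 2;
      }
    }
  }

  public static void main(String[] args) {
    int[] calls = {0};
    // simulated request: throttled on the first two attempts
    String result = retryOnThrottle(() -> {
      if (++calls[0] < 3) {
        throw new ThrottledException("simulated throttle");
      }
      return "token";
    }, 5, 10L);
    System.out.println(result + " after " + calls[0] + " attempts");
  }
}
{code}

Doubling the delay keeps a burst of rejected callers from all retrying at the 
same moment; adding random jitter to the delay would spread them out further.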

For anyone planning to run those tests: you need to invoke them by name and set 
-Dscale. Other users in your AWS account using the same STS endpoint may have 
calls rejected for throttling too, which may be "observable". Test carefully by 
selecting an explicit location and/or doing it in quiet periods.

h3. TODO

* If that token really does contain user info (i.e. someone ever did Kerberos 
support), it should somehow be preserved. What to do?
* Docs, obviously.
* I now know more about role permissions; improve our docs there too.
* FileContext tests are failing due to port mismatches in "canonical" paths, 
hence the improved detail in the exception raised on failure ... issue is 
still outstanding.
* S3A FS to pick up encryption settings from DT; will permit SSE-C to propagate 
from client to shared service, in particular.
* Some downstream tests in Hive & Spark. These only seem to look for DTs if the 
user has Kerberos enabled.

> S3A to support Delegation Tokens
> --------------------------------
>
>                 Key: HADOOP-14556
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14556
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.2.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>         Attachments: HADOOP-14556-001.patch, HADOOP-14556-002.patch, 
> HADOOP-14556-003.patch, HADOOP-14556-004.patch, HADOOP-14556-005.patch, 
> HADOOP-14556-007.patch, HADOOP-14556.oath-002.patch, HADOOP-14556.oath.patch
>
>
> S3A to support delegation tokens where
> * an authenticated client can request a token via 
> {{FileSystem.getDelegationToken()}}
> * Amazon's token service is used to request short-lived session secret & id; 
> these will be saved in the token and  marshalled with jobs
> * A new authentication provider will look for a token for the current user 
> and authenticate the user if found
> This will not support renewals; the lifespan of a token will be limited to 
> the initial duration. Also, as you can't request an STS token from a 
> temporary session, IAM instances won't be able to issue tokens.


