[ 
https://issues.apache.org/jira/browse/HADOOP-19067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814511#comment-17814511
 ] 

Steve Loughran commented on HADOOP-19067:
-----------------------------------------

Have you seen the S3A auditing work? It can map HTTP requests to Kerberos 
principals, Spark job IDs, even fs commands.

The main issue there is that the HTTP referrer header doesn't get to 
CloudTrail. If you could express your need for that to anyone you know at AWS, 
that'd be great. I want to tie every single GET operation to the job and task 
which issues it. Mapping an assumed role to (principal, job, id) helps, but it 
is insufficient if multiple jobs with the same role are active at the same time.

As for adding tags:
* an option to add that referrer header would be good
* if you look at the fs.s3a.header design, something similar to that for 
assumed role tags would be welcome too.
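
To make the second point concrete, here is a minimal sketch of how a 
comma-separated list of key=value pairs could be turned into session tags. 
The class name, the input format, and the idea of reading it from a single 
configuration option are all assumptions for illustration; none of this is 
an existing S3A option or an agreed design.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical helper: parses "k1=v1,k2=v2" into an ordered map of tags.
 * The input format is an assumption, not an existing S3A convention.
 */
final class SessionTagParser {

    private SessionTagParser() {
    }

    static Map<String, String> parse(String raw) {
        Map<String, String> tags = new LinkedHashMap<>();
        if (raw == null || raw.trim().isEmpty()) {
            return tags;
        }
        for (String entry : raw.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate trailing or doubled commas
            }
            int eq = trimmed.indexOf('=');
            if (eq <= 0) {
                // no '=' at all, or an empty key
                throw new IllegalArgumentException(
                    "Expected key=value but got: " + trimmed);
            }
            String key = trimmed.substring(0, eq).trim();
            String value = trimmed.substring(eq + 1).trim();
            tags.put(key, value);
        }
        return tags;
    }
    // With the AWS SDK v2 on the classpath, each entry could then be turned
    // into Tag.builder().key(k).value(v).build() and the resulting list
    // passed to AssumeRoleRequest.builder().tags(...).
}
```

For example, `SessionTagParser.parse("notebook=sales-report, job=job-42")` 
yields a map with the two entries in declaration order. A real patch would 
also need to respect the STS limits on tag count and key/value length.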

Usual test process, as documented in testing.md. Thanks. Hadoop 3.4+ only, 
BTW: 3.3.x is feature frozen for s3a and gets just critical bug fixes, because 
the move to the v2 SDK makes backporting too hard.


> Allow tag passing to AWS Credential Provider
> --------------------------------------------
>
>                 Key: HADOOP-19067
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19067
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/s3
>    Affects Versions: 3.3.6
>            Reporter: Jason Martin
>            Priority: Minor
>
> [https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/auth/AssumedRoleCredentialProvider.java#L131-L133]
>  passes a session name and role arn to AssumeRoleRequest. The AWS AssumeRole 
> API also supports passing a list of tags: 
> [https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/services/sts/model/AssumeRoleRequest.html#tags()]
> These tags could be used by platforms to enhance the data encoded into 
> CloudTrail entries to provide better information about the client. For 
> example, a 'notebook' based platform could encode the notebook / jobname / 
> invoker-id in these tags, enabling more granular access controls and leaving 
> a richer breadcrumb-trail as to what operations are being performed.
> This is particularly useful in larger environments where jobs do not get 
> individual roles to assume, and there is a desire to track what 
> jobs/notebooks are reading a given set of files in S3.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
