steveloughran commented on pull request #2807:
URL: https://github.com/apache/hadoop/pull/2807#issuecomment-842340255


   +gist showing some log output during a terasort test
   https://gist.github.com/steveloughran/8e0aadb51c63f1c3538deda19ee952ae
   
   Some of the events (e.g. 
183c9826b45486e485693808f38e2c4071004bf5dfd4c3ab210f0a21a4235ef8 ) have the job 
ID in the referrer header: "ji=job_1620911577786_0006". This is only set during 
the FS operations the S3A committer performs for a task or job, as they're the 
only ones we know are explicitly related to a job. If we were confident that 
whichever thread called `Committer.setupTask()` was the only thread making 
FileSystem API calls for that task, then we could set it at the task level.
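
For anyone scanning the logs for these entries, here is a minimal sketch of pulling the `ji` value out of a referrer header. It assumes the referrer is a URL whose query string holds `key=value` pairs separated by `&`; the host and path in the sample are made up, the `ReferrerParams` class is hypothetical, and values are not URL-decoded.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper; not part of Hadoop. */
final class ReferrerParams {
    private ReferrerParams() {}

    /** Split the query part of a referrer URL into key/value pairs (no URL decoding). */
    static Map<String, String> parse(String referrer) {
        Map<String, String> params = new LinkedHashMap<>();
        String query = URI.create(referrer).getRawQuery();
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                params.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return params;
    }

    /** The job ID, if the referrer carries a "ji" parameter. */
    static Optional<String> jobId(String referrer) {
        return Optional.ofNullable(parse(referrer).get("ji"));
    }

    public static void main(String[] args) {
        // Assumed referrer layout; only the ji value comes from the log above.
        String sample =
            "https://audit.example.org/op_rename/?op=op_rename&ji=job_1620911577786_0006";
        System.out.println(jobId(sample).orElse("(no job ID)"));
    }
}
```

Entries with no `ji` parameter come back as an empty `Optional`, which matches the case where the operation was not issued by the committer.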
   
   The `org.apache.hadoop.fs.audit.CommonAuditContext` class provides global and 
thread local context maps to let apps attach such attributes; the new 
ManifestCommitter will be setting them so that once ABFS picks up the same 
auditing, the context info will come down.
   
   Modified versions of Hive, Spark etc. could use this API to set any of their 
context info when a specific thread is scheduled to work for a given query; 
the Hadoop committer isn't the right place to try to guess it.
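
To illustrate the pattern, here is a rough stand-in for global plus thread-local context maps, with a helper that attaches a job ID only while a thread is doing work for that job. This is not the `CommonAuditContext` API: `SketchAuditContext`, its method names and the `app` key are all hypothetical, and only the `ji` key is taken from the log above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical stand-in for a global + per-thread audit context. */
final class SketchAuditContext {
    // Attributes shared by every thread in the process.
    private static final Map<String, String> GLOBAL = new ConcurrentHashMap<>();
    // Attributes visible only to the thread that set them.
    private static final ThreadLocal<Map<String, String>> LOCAL =
        ThreadLocal.withInitial(HashMap::new);

    private SketchAuditContext() {}

    static void putGlobal(String key, String value) {
        GLOBAL.put(key, value);
    }

    static void put(String key, String value) {
        LOCAL.get().put(key, value);
    }

    static void remove(String key) {
        LOCAL.get().remove(key);
    }

    /** Merged view for the current thread; thread-local entries win. */
    static Map<String, String> snapshot() {
        Map<String, String> merged = new TreeMap<>(GLOBAL);
        merged.putAll(LOCAL.get());
        return merged;
    }

    /** Run work with "ji" set on this thread, clearing it afterwards. */
    static <T> T withJob(String jobId, Supplier<T> work) {
        put("ji", jobId);
        try {
            return work.get();
        } finally {
            remove("ji");
        }
    }

    public static void main(String[] args) {
        putGlobal("app", "terasort");
        System.out.println(withJob("job_1620911577786_0006", SketchAuditContext::snapshot));
        System.out.println(snapshot());
    }
}
```

The `try`/`finally` matters when threads are pooled: without it, a job ID set for one query would leak into whatever the thread runs next.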
   
   
   


