steveloughran commented on pull request #2807: URL: https://github.com/apache/hadoop/pull/2807#issuecomment-842340255
Here is a gist showing some log output during a terasort test: https://gist.github.com/steveloughran/8e0aadb51c63f1c3538deda19ee952ae

Some of the events (e.g. 183c9826b45486e485693808f38e2c4071004bf5dfd4c3ab210f0a21a4235ef8) have the job ID in the referrer header: "ji=job_1620911577786_0006". This is only set during the FS operations the S3A committer performs for tasks and jobs, as they're the only operations we know are explicitly related to a job. If we were confident that whichever thread called `Committer.setupTask()` was the only thread making FileSystem API calls for that task, then we could set it at the task level.

The `org.apache.hadoop.fs.audit.CommonAuditContext` class provides global and thread-local context maps to let apps attach such attributes; the new ManifestCommitter will be setting them so that once ABFS picks up the same auditing, the context info will come through there too. Modified versions of Hive, Spark etc. could use this API to set any of their own context info when a specific thread is scheduled to work on a given query; the Hadoop committer isn't the right place to try to guess it.
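To illustrate the idea of global plus thread-local context maps, here is a minimal, self-contained sketch. It is not the Hadoop `CommonAuditContext` class: `AuditContextSketch` and all of its method names are hypothetical stand-ins, the `app` key is invented for the example, and the `key=value&key=value` rendering is a simplification with no URL encoding. Only the `ji` key and the job ID come from the log output above.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a global + thread-local audit context. */
final class AuditContextSketch {
    // Entries shared by every thread in the process.
    private static final Map<String, String> GLOBAL = new ConcurrentHashMap<>();
    // Entries visible only to the thread that set them.
    private static final ThreadLocal<Map<String, String>> LOCAL =
        ThreadLocal.withInitial(ConcurrentHashMap::new);

    static void setGlobal(String key, String value) { GLOBAL.put(key, value); }

    static void put(String key, String value) { LOCAL.get().put(key, value); }

    static void remove(String key) { LOCAL.get().remove(key); }

    /** Merged view for the calling thread; thread-local entries win. */
    static Map<String, String> snapshot() {
        Map<String, String> merged = new TreeMap<>(GLOBAL);
        merged.putAll(LOCAL.get());
        return merged;
    }

    /** Render the merged view as key=value pairs joined by '&'. */
    static String toReferrerQuery() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : snapshot().entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        setGlobal("app", "example-app"); // hypothetical process-wide attribute

        Thread worker = new Thread(() -> {
            // The thread scheduled to work on a job attaches the job ID.
            put("ji", "job_1620911577786_0006");
            try {
                System.out.println(toReferrerQuery());
            } finally {
                // Clear it so a pooled thread does not leak the ID into other work.
                remove("ji");
            }
        });
        worker.start();
        worker.join();

        // The main thread never set "ji", so it only sees the global entry.
        System.out.println(toReferrerQuery());
    }
}
```

The sketch shows why the attribute has to be set by whoever knows which thread is doing the work: the job ID is only visible to FileSystem calls made on the thread that attached it, which is the reason the committer can only tag its own operations.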