[ https://issues.apache.org/jira/browse/FLINK-25029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452838#comment-17452838 ]
刘方奇 commented on FLINK-25029:
-----------------------------

[~arvid], hi, sorry to bother you, but if you could give me some advice today, I could do more work on this over the weekend :D. No rush, though; it depends on your time.

> Hadoop Caller Context Setting In Flink
> --------------------------------------
>
>                 Key: FLINK-25029
>                 URL: https://issues.apache.org/jira/browse/FLINK-25029
>             Project: Flink
>          Issue Type: Improvement
>          Components: FileSystems
>            Reporter: 刘方奇
>            Assignee: 刘方奇
>            Priority: Major
>              Labels: pull-request-available
>
> For a given HDFS operation (e.g. deleting a file), it is very helpful to track
> which upper-level job issued it. The upper-level callers may be specific
> Oozie tasks, MR jobs, or Hive queries. One scenario is that when the NameNode
> (NN) is abused/spammed, the operator may want to know immediately which MR
> job is to blame so that she can kill it. To this end, the caller context
> contains at least an application-dependent "tracking id".
> That is the main purpose of the caller context: the HDFS client sets the
> caller context, and the NameNode records it in its audit log, where it can be
> used for attribution.
> Spark and Hive already set a caller context to meet HDFS job-audit
> requirements.
> In my company, Flink jobs often cause problems for HDFS, so we implemented
> this to prevent such cases.
> If the feature is general enough, should we support it? Then I can submit a
> PR for this.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
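To illustrate the mechanism the issue describes: the client tags its RPCs with a per-job context string, and the NameNode appends that string to each audit-log entry. The sketch below uses hypothetical stand-in classes (the real API is Hadoop's `org.apache.hadoop.ipc.CallerContext`, set via `CallerContext.setCurrent(new CallerContext.Builder(...).build())`); the context format `FLINK_<jobName>_<jobId>` here is an assumption, not the format proposed in the PR.

```java
// Minimal sketch of the caller-context mechanism, NOT the real Hadoop API.
// Real clients use org.apache.hadoop.ipc.CallerContext, which is propagated
// over the RPC layer; here a ThreadLocal stands in for that propagation.
public class CallerContextSketch {

    // Analogous to CallerContext.setCurrent(...): one context per client thread.
    static final ThreadLocal<String> CALLER_CONTEXT = new ThreadLocal<>();

    static void setCallerContext(String ctx) {
        CALLER_CONTEXT.set(ctx);
    }

    // What the NameNode audit logger conceptually does: append the caller
    // context (when present) to the audit entry for each operation.
    static String auditLogEntry(String cmd, String path) {
        String ctx = CALLER_CONTEXT.get();
        return "cmd=" + cmd + " src=" + path
                + (ctx != null ? " callerContext=" + ctx : "");
    }

    public static void main(String[] args) {
        // A Flink job would tag itself once before issuing HDFS operations;
        // the naming scheme below is hypothetical.
        setCallerContext("FLINK_wordcount_a1b2c3");
        // prints: cmd=delete src=/tmp/output callerContext=FLINK_wordcount_a1b2c3
        System.out.println(auditLogEntry("delete", "/tmp/output"));
    }
}
```

With a context like this in every audit line, the operator described in the issue can grep the NameNode audit log for the tracking id and immediately see which job is hammering the NN.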