Hey,
I am not sure we can change this directly. Any change to the audit log
format is considered incompatible.

https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html#Audit_Log_Output

-Ayush

> On 10-Oct-2021, at 7:57 PM, tom lee <tomlees...@gmail.com> wrote:
> 
> Hi all,
> 
> In our production environment, we occasionally encounter a problem where a
> user submits an abnormal computation task that triggers a sudden flood of
> requests. The queueTime and processingTime of the NameNode then rise
> sharply, leaving a large backlog of tasks.
> 
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based
> on metrics and audit logs. Currently, IP and UGI are recorded in audit
> logs, but there is no port information, so it is sometimes difficult to
> locate the specific process. Therefore, I propose that we add the port
> information to the audit log, so that we can easily track the upstream
> process.
> 
> Currently, some projects contain port information in audit logs, such as
> HBase and Alluxio. I think it is also necessary to add port information for
> HDFS audit logs.
> 
> I submitted a PR (https://github.com/apache/hadoop/pull/3538), which has
> been tested in our test environment and takes effect for both RPC and HTTP
> requests. I look forward to your discussion of possible problems and
> suggestions for modification. I will actively update the PR.
> 
> Best Regards,
> Tom
