Github user Sherry302 commented on the issue: https://github.com/apache/spark/pull/14659

Hi, @srowen. Thank you so much for the review. Sorry for the test failure and the late update. The tests failed because "jobID" was None or there was no "spark.app.name" in SparkConf. I have updated the PR to set default values for "jobID" and "spark.app.name". When a real application runs on Spark, it will always have a "jobID" and a "spark.app.name".

What's the use case for this? When users run Spark applications on YARN against HDFS, Spark's caller contexts are written into hdfs-audit.log. The Spark caller context consists of JobID_stageID_stageAttemptId_taskID_attemptNumber plus the application's name. The caller context helps users diagnose and understand how specific applications impact parts of the Hadoop system and what problems they may be creating (e.g. overloading the NameNode). As mentioned in HDFS-9184, for a given HDFS operation, it is very helpful to track which upper-level job issued it.
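For illustration only, here is a minimal Scala sketch of how such a caller context string might be assembled and handed to HDFS through Hadoop's `org.apache.hadoop.ipc.CallerContext` API introduced by HDFS-9184. The method name and the exact string layout below are illustrative assumptions, not the PR's actual code.

```scala
import org.apache.hadoop.ipc.CallerContext

// Hypothetical helper: build a caller context from Spark task identifiers
// and attach it to the current thread. The HDFS client propagates the
// context with each RPC, and the NameNode records it in hdfs-audit.log.
def setTaskCallerContext(
    appName: String,
    jobId: Int,
    stageId: Int,
    stageAttemptId: Int,
    taskId: Long,
    attemptNumber: Int): Unit = {
  // e.g. "SPARK_MyApp_JobId_0_StageId_1_0_TaskId_7_0" (illustrative format)
  val context =
    s"SPARK_${appName}_JobId_${jobId}_StageId_${stageId}_${stageAttemptId}" +
      s"_TaskId_${taskId}_${attemptNumber}"
  CallerContext.setCurrent(new CallerContext.Builder(context).build())
}
```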