GitHub user Sherry302 commented on the issue:

    https://github.com/apache/spark/pull/14659
  
    Hi @srowen. Thank you so much for the review, and sorry for the test
    failure and the late update. The tests failed because `jobID` was `None`
    or `spark.app.name` was not set in the SparkConf. I have updated the PR
    to set default values for `jobID` and `spark.app.name` (roughly as
    sketched below). When a real application runs on Spark, it will always
    have a `jobID` and a `spark.app.name`.
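
    For reference, here is a minimal sketch of that kind of defaulting in
    Scala (the placeholder value "unknown" and the local variable names are
    illustrative, not the exact code in the PR):

        import org.apache.spark.SparkConf

        val conf = new SparkConf()
        // Fall back to a placeholder when no application name has been set;
        // in practice only test harnesses hit this path.
        val appName = conf.get("spark.app.name", "unknown")

        // Likewise, use a sentinel string when no job ID is available yet.
        val jobId: Option[Int] = None
        val jobIdStr = jobId.map(_.toString).getOrElse("unknown")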
    
    What's the use case for this?
    When users run Spark applications on YARN and those applications access
    HDFS, Spark's caller contexts are written into hdfs-audit.log. The Spark
    caller contexts are `JobID_stageID_stageAttemptId_taskID_attemptNumber`
    and the application's name.
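
    For context, here is a minimal sketch of how such a context string can be
    attached to HDFS calls through the `org.apache.hadoop.ipc.CallerContext`
    API associated with HDFS-9184 (available since Hadoop 2.8). The
    identifiers and separators below are illustrative only; the real string
    is built from the running task's state:

        import org.apache.hadoop.ipc.CallerContext

        // Illustrative values; in a running application these come from the
        // current job, stage, and task.
        val appName = "wordcount"
        val jobId = 0
        val stageId = 1
        val stageAttemptId = 0
        val taskId = 7L
        val attemptNumber = 0

        // Compose a JobID_stageID_stageAttemptId_taskID_attemptNumber style
        // string plus the application name, and register it so the NameNode
        // records it in hdfs-audit.log for every subsequent HDFS operation
        // made from this thread.
        val context =
          s"SPARK_${appName}_JobId_${jobId}_StageId_${stageId}_${stageAttemptId}" +
            s"_TID_${taskId}_${attemptNumber}"
        CallerContext.setCurrent(new CallerContext.Builder(context).build())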
    
    The caller context helps users better diagnose and understand how
    specific applications impact parts of the Hadoop system and what
    potential problems they may be creating (e.g. overloading the NameNode).
    As noted in HDFS-9184, for a given HDFS operation it is very helpful to
    track which upper-level job issued it.

