Github user klion26 commented on the issue:

    https://github.com/apache/spark/pull/19145
  
    Hi @jerryshao, thank you for your reply.
    
    # Problem
    the problem is that long-running jobs which run on **YARN with HA** end up being allocated more executors than they actually requested.
    
    # How to reproduce 
    1. start a Spark Streaming job on YARN
    2. mark one of the NodeManagers that runs a container of the Spark Streaming program as lost (this step takes about 10 minutes in my environment)
    3. the NodeManager that was lost in step 2 comes back
    4. restart the ResourceManager
    5. after the ResourceManager restarts, the application gets more resources than it requested (see the sketch below)
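    
    To make step 5 concrete, here is a minimal, hypothetical sketch (the `ContainerId` case class, `NaiveAllocator` and `pendingExecutorRequests` are mine, not Spark's actual `YarnAllocator` code) of an allocator that asks for one replacement executor per completed-container report; when the restarted ResourceManager replays the same container-lost event, a single real loss produces two replacement requests:
    
    ```scala
    // Hypothetical stand-in for YARN's ContainerId, just for this sketch.
    case class ContainerId(id: Long)
    
    // A deliberately naive allocator: every completed-container report
    // triggers a request for one replacement executor.
    class NaiveAllocator {
      var pendingExecutorRequests: Int = 0
    
      def onContainerCompleted(containerId: ContainerId): Unit = {
        // After a ResourceManager restart the same lost container can be
        // reported a second time, so one real loss yields two replacements.
        pendingExecutorRequests += 1
      }
    }
    
    object ReproSketch extends App {
      val allocator = new NaiveAllocator
      val lost = ContainerId(42L)
      allocator.onContainerCompleted(lost) // first report, before the RM restart
      allocator.onContainerCompleted(lost) // duplicate report, after the RM restart
      println(allocator.pendingExecutorRequests) // 2, although only one executor was lost
    }
    ```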
    
    
    # Question
    I have one question: should I use `completedContainerIdSet.remove(containerId)` instead of `completedContainerIdSet.contains(containerId)`? If the container-lost message will only ever be reported twice, we should use `remove` instead of `contains`, which would also clean the id out of the set.
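    
    For illustration only, here is a sketch of the two variants; `completedContainerIdSet` mirrors the identifier above, while the `DedupingAllocator` class, the `ContainerId` case class and the method names are hypothetical:
    
    ```scala
    import scala.collection.mutable
    
    // Hypothetical stand-in for YARN's ContainerId, just for this sketch.
    case class ContainerId(id: Long)
    
    class DedupingAllocator {
      var pendingExecutorRequests: Int = 0
    
      // Ids of containers whose completion has already been handled.
      private val completedContainerIdSet = mutable.HashSet[ContainerId]()
    
      // Variant A: `contains` + `add`. Any number of duplicate reports are
      // ignored, but the set keeps every id for the lifetime of the job.
      def handleWithContains(containerId: ContainerId): Unit = {
        if (!completedContainerIdSet.contains(containerId)) {
          completedContainerIdSet.add(containerId)
          pendingExecutorRequests += 1 // request exactly one replacement
        }
      }
    
      // Variant B: `remove`. A duplicate report is ignored and the id is
      // dropped from the set at the same time, so the set stays small; this
      // only works if each completion is reported at most twice.
      def handleWithRemove(containerId: ContainerId): Unit = {
        if (!completedContainerIdSet.remove(containerId)) {
          completedContainerIdSet.add(containerId)
          pendingExecutorRequests += 1 // request exactly one replacement
        }
      }
    }
    ```
    
    With `remove`, a duplicate report is skipped and the id is cleaned out of the set, which matters for a long-running job; with `contains`, any number of duplicates are ignored but the set only ever grows.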

