GitHub user tdas commented on the issue:

    https://github.com/apache/spark/pull/20622
  
    @jose-torres I had a long offline chat with @zsxwing, kudos to him for 
catching a corner case in the current solution. The following sequence of 
events may occur.
    
    - In the query thread, the epoch tracking thread is started.
    - Before the query thread actually starts the Spark job, the epoch tracking 
thread may detect some sort of reconfiguration and attempt to cancelJob while 
there is no job to cancel yet.
    - The query thread then starts the Spark job, gets blocked, and never 
terminates.
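
A minimal sketch of the interleaving above, replayed deterministically on one thread so the ordering is explicit. `FakeScheduler`, `startJob`, `isRunning`, and the job id are hypothetical stand-ins, not Spark APIs; the sketch assumes that a cancel only affects jobs that are already running.

```scala
object CancelBeforeStartSketch {
  // Hypothetical stand-in for the scheduler. Assumption: cancelling only
  // affects jobs that are already running; an early cancel is a no-op.
  final class FakeScheduler {
    private var running = Set.empty[String]

    def startJob(id: String): Unit = synchronized { running += id }

    // Returns true only if a running job was actually cancelled.
    def cancelJob(id: String): Boolean = synchronized {
      val wasRunning = running.contains(id)
      running -= id
      wasRunning
    }

    def isRunning(id: String): Boolean = synchronized { running.contains(id) }
  }

  // Replays the problematic order: cancel first, start second.
  // Returns (cancelHadEffect, jobStillRunning).
  def replayInterleaving(): (Boolean, Boolean) = {
    val scheduler = new FakeScheduler
    val jobId = "continuous-query" // hypothetical job id

    // Epoch tracking thread: detects reconfiguration and cancels too early.
    val cancelHadEffect = scheduler.cancelJob(jobId)

    // Query thread: starts the job only afterwards, so nothing cancels it.
    scheduler.startJob(jobId)

    (cancelHadEffect, scheduler.isRunning(jobId))
  }
}
```

Under that assumption the early cancel has no effect and the job stays running, which corresponds to the query thread blocking forever.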
    
    Fundamentally, it's not a great setup that one thread is starting the jobs 
and another thread is canceling them. Because of the async nature, we have no 
way of reasoning about which attempt wins, starting or canceling. Rather, let's 
make sure that we start and cancel in the same thread (then we can do some 
reasoning). Here is an alternate solution.
    - The epoch thread ONLY interrupts the query thread. It's not responsible 
for any Spark state management (other than the enum state).
    - The query thread cancels jobs and stops sources in the `finally` clause.
    
    Race conditions that end up not canceling the Spark job are less likely, 
as a single thread (the query thread) is responsible for all Spark state 
management.
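
The alternate solution could be sketched roughly as follows, using plain JVM threads rather than Spark. Everything here is an assumption for illustration: the `State` values, the thread names, and the recorded event strings are hypothetical, and `Thread.sleep` stands in for the blocking Spark job. The epoch thread only flips the state and interrupts; all cleanup happens in the query thread's `finally`.

```scala
import java.util.concurrent.atomic.AtomicReference
import scala.collection.mutable.ArrayBuffer

object SingleThreadCleanupSketch {
  // Hypothetical stand-in for the query's enum state.
  sealed trait State
  case object Active extends State
  case object Reconfiguring extends State

  // Runs one query attempt; returns the recorded events and the final state.
  def run(): (Vector[String], State) = {
    val state = new AtomicReference[State](Active)
    // Written only by the query thread and read after join(), so no locking.
    val events = ArrayBuffer.empty[String]

    val queryThread = new Thread(new Runnable {
      override def run(): Unit = {
        val self = Thread.currentThread()

        // The epoch thread ONLY updates the enum state and interrupts.
        val epochThread = new Thread(new Runnable {
          override def run(): Unit = {
            state.set(Reconfiguring)
            self.interrupt()
          }
        }, "epoch-thread")

        try {
          epochThread.start()
          events += "job started"
          // Stand-in for the blocking Spark job. If the interrupt arrived
          // before this call, sleep throws immediately, so both orders work.
          Thread.sleep(Long.MaxValue)
        } catch {
          case _: InterruptedException => events += "interrupted"
        } finally {
          // All cleanup happens here, in the thread that started the job.
          events += "jobs cancelled"
          events += "sources stopped"
        }
      }
    }, "query-thread")

    queryThread.start()
    queryThread.join()
    (events.toVector, state.get())
  }
}
```

Because the interrupt flag stays set until a blocking call observes it, the cleanup in `finally` runs whether the epoch thread fires before or after the job starts.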
    
    
    
    


