Ngone51 opened a new pull request #24699: [SPARK-27666][CORE] Stop 
PythonRunner's WriteThread immediately when task finishes
URL: https://github.com/apache/spark/pull/24699
 
 
   ## What changes were proposed in this pull request?
   
   PythonRunner executes a task asynchronously: WriteThread produces elements 
while another thread consumes them. When a child operator such as take() or 
first() does not consume all elements produced by WriteThread, the task 
finishes before WriteThread does and releases all of its locks on blocks. 
WriteThread, however, keeps pulling elements from the parent operator until 
the input is exhausted. At that point it tries to release the corresponding 
block and hits an AssertionError, because the task has already released that 
lock.
   
   #24542 previously fixed this by catching the AssertionError so that the 
executor does not fail.
   
   In this PR, we propose shutting down WriteThread immediately when the task 
finishes. This avoids the issue entirely and saves compute resources at the 
same time.
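   
   To illustrate the idea, here is a minimal, self-contained sketch. It is 
not the code in this patch and does not use Spark's APIs. `SketchWriterThread`, 
`stopOnTaskCompletion` and `WriterShutdownSketch` are hypothetical names, and 
interrupting the thread is only one assumed way to stop it. The "task" 
consumes three elements, like take(3), and then stops the writer instead of 
letting it drain an unbounded input.

```scala
import java.util.concurrent.LinkedBlockingQueue
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for the writer thread: it pulls elements from the
// parent iterator and hands them to a bounded queue read by the consumer.
final class SketchWriterThread(input: Iterator[Int], out: LinkedBlockingQueue[Int])
    extends Thread("sketch-writer") {

  val produced = new AtomicInteger(0)
  setDaemon(true)

  override def run(): Unit = {
    try {
      while (!Thread.currentThread().isInterrupted() && input.hasNext) {
        out.put(input.next()) // blocks while the queue is full; interruptible
        produced.incrementAndGet()
      }
    } catch {
      case _: InterruptedException => // the task finished first; stop quietly
    }
  }

  // Assumed hook, invoked once the consuming task has finished.
  def stopOnTaskCompletion(): Unit = interrupt()
}

object WriterShutdownSketch {
  // Returns the consumed elements, how many the writer produced, and
  // whether the writer is still alive after the shutdown request.
  def run(): (Seq[Int], Int, Boolean) = {
    val queue = new LinkedBlockingQueue[Int](4)
    val writer = new SketchWriterThread(Iterator.from(0), queue) // unbounded input
    writer.start()

    val taken = Seq.fill(3)(queue.take()) // consume only part of the input
    writer.stopOnTaskCompletion()         // stop producing right away
    writer.join(5000)

    (taken, writer.produced.get(), writer.isAlive())
  }
}
```

   Without the `stopOnTaskCompletion()` call, the writer in this sketch would 
never terminate on its own, because its input is unbounded. With a finite 
input it would keep producing until exhaustion, which is the situation the 
description above refers to.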
   
   ## How was this patch tested?
   
   N.A.
   

