[ https://issues.apache.org/jira/browse/SPARK-32091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17143924#comment-17143924 ]
Apache Spark commented on SPARK-32091:
--------------------------------------

User 'Ngone51' has created a pull request for this issue:
https://github.com/apache/spark/pull/28924

> Ignore timeout error when remove blocks on the lost executor
> ------------------------------------------------------------
>
>                 Key: SPARK-32091
>                 URL: https://issues.apache.org/jira/browse/SPARK-32091
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.4.0, 3.0.0
>            Reporter: wuyi
>            Priority: Major
>
> When removing blocks (e.g. RDD, broadcast, shuffle), BlockManagerMasterEndpoint
> makes RPC calls to each known BlockManagerSlaveEndpoint to remove the
> specified blocks. The RPC call can sometimes end in a timeout when the
> executor has been lost but the BlockManagerMasterEndpoint is only notified
> of the loss after the removal call has already been made. The timeout can
> therefore fail the whole query.
> In this case, we can simply ignore the error, since the blocks on the lost
> executor can be considered already removed.
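The behaviour proposed in the issue description can be illustrated with a small, self-contained sketch. This is not Spark code and not necessarily how the linked pull request implements the change: `RpcTimeout`, `FakeExecutorEndpoint`, `remove_blocks_everywhere` and the `is_executor_alive` callback are all hypothetical names invented for illustration. The sketch only assumes that a lost executor never replies (seen by the caller as a timeout) and that the caller can ask, at the moment the timeout surfaces, whether the executor is still considered alive.

```python
from dataclasses import dataclass, field


class RpcTimeout(Exception):
    """Hypothetical stand-in for an RPC timeout; not a Spark class."""


@dataclass
class FakeExecutorEndpoint:
    """Hypothetical stand-in for one executor's block manager endpoint."""
    executor_id: str
    blocks: set = field(default_factory=set)
    lost: bool = False

    def remove_blocks(self, block_ids):
        # A lost executor never answers, which the caller sees as a timeout.
        if self.lost:
            raise RpcTimeout(f"no reply from executor {self.executor_id}")
        removed = self.blocks & set(block_ids)
        self.blocks -= removed
        return len(removed)


def remove_blocks_everywhere(endpoints, block_ids, is_executor_alive):
    """Ask every known endpoint to remove block_ids; return the total removed.

    A timeout is ignored only when the executor is known to be lost by the
    time the timeout surfaces, because its blocks are gone anyway.
    Timeouts from executors that are still alive propagate to the caller.
    """
    total = 0
    for endpoint in endpoints:
        try:
            total += endpoint.remove_blocks(block_ids)
        except RpcTimeout:
            if is_executor_alive(endpoint.executor_id):
                raise
            # Executor is lost: treat its blocks as already removed.
    return total


# Executor "2" was lost after the removal request had already been sent.
endpoints = [
    FakeExecutorEndpoint("1", {"rdd_0_0", "rdd_0_1"}),
    FakeExecutorEndpoint("2", {"rdd_0_2"}, lost=True),
]
alive = {"1"}
removed = remove_blocks_everywhere(
    endpoints, ["rdd_0_0", "rdd_0_2"], lambda executor_id: executor_id in alive
)
```

The design point is that the liveness check happens when the timeout is observed rather than before the call is made, which matches the race described in the issue: the loss notification arrives only after the removal request is already in flight.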