[ https://issues.apache.org/jira/browse/SPARK-27666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wenchen Fan resolved SPARK-27666.
---------------------------------
    Resolution: Fixed
Fix Version/s: 3.0.0

Issue resolved by pull request 24699
[https://github.com/apache/spark/pull/24699]

> Do not release lock while TaskContext already completed
> -------------------------------------------------------
>
>                 Key: SPARK-27666
>                 URL: https://issues.apache.org/jira/browse/SPARK-27666
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Wenchen Fan
>            Priority: Major
>             Fix For: 3.0.0
>
> {code:java}
> Exception in thread "Thread-14" java.lang.AssertionError: assertion failed: Block rdd_0_0 is not locked for reading
> 	at scala.Predef$.assert(Predef.scala:223)
> 	at org.apache.spark.storage.BlockInfoManager.unlock(BlockInfoManager.scala:299)
> 	at org.apache.spark.storage.BlockManager.releaseLock(BlockManager.scala:1000)
> 	at org.apache.spark.storage.BlockManager.$anonfun$getLocalValues$5(BlockManager.scala:746)
> 	at org.apache.spark.util.CompletionIterator$$anon$1.completion(CompletionIterator.scala:47)
> 	at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:36)
> 	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
> 	at org.apache.spark.rdd.RDDSuite.$anonfun$new$265(RDDSuite.scala:1185)
> 	at java.lang.Thread.run(Thread.java:748)
> {code}
> We are facing the issue reported by SPARK-18406 and SPARK-25139. [https://github.com/apache/spark/pull/24542] bypassed it by capturing the assertion error to avoid failing the executor. However, even when not using PySpark, the issue still exists when a user implements a custom RDD (https://issues.apache.org/jira/browse/SPARK-18406?focusedCommentId=15969384&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15969384) or task (see the demo below) that spawns a separate thread to consume the iterator of a cached parent RDD.
> {code:java}
> val rdd0 = sc.parallelize(Range(0, 10), 1).cache()
> rdd0.collect()
> rdd0.mapPartitions { iter =>
>   val t = new Thread(new Runnable {
>     override def run(): Unit = {
>       while (iter.hasNext) {
>         println(iter.next())
>         Thread.sleep(1000)
>       }
>     }
>   })
>   t.setDaemon(false)
>   t.start()
>   Iterator(0)
> }.collect()
> {code}
> The issue is easy to reproduce with the demo above. If we prevent the separate thread from releasing the lock on a block when the TaskContext has already completed, we will not hit this issue again.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
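The guard idea described above can be sketched with simplified stand-ins. This is a minimal illustration, not Spark's actual implementation: `FakeTaskContext` and `FakeBlockLock` are hypothetical names that play the roles of Spark's `TaskContext` and `BlockInfoManager` bookkeeping. When a task completes, its read locks are torn down in bulk, so a stray thread that tries to unlock afterwards would trip the "Block ... is not locked for reading" assertion; checking completion first skips the redundant release.

{code:java}
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for Spark's TaskContext (not the real API).
class FakeTaskContext {
    private final AtomicBoolean completed = new AtomicBoolean(false);
    void markTaskCompleted() { completed.set(true); }
    boolean isCompleted() { return completed.get(); }
}

// Hypothetical stand-in for per-block read-lock bookkeeping.
class FakeBlockLock {
    private final AtomicBoolean lockedForReading = new AtomicBoolean(true);

    // Guarded release: if the owning task already completed, its locks were
    // released in bulk, so a separate thread must not unlock again.
    boolean release(FakeTaskContext ctx) {
        if (ctx.isCompleted()) {
            return false; // skip; avoids the double-release assertion
        }
        if (!lockedForReading.compareAndSet(true, false)) {
            throw new AssertionError("Block is not locked for reading");
        }
        return true;
    }
}

public class LockGuardDemo {
    public static void main(String[] args) {
        FakeTaskContext ctx = new FakeTaskContext();
        FakeBlockLock lock = new FakeBlockLock();
        ctx.markTaskCompleted();              // task finishes first...
        boolean released = lock.release(ctx); // ...then a stray thread unlocks
        System.out.println("released=" + released);
    }
}
{code}

Without the `isCompleted()` check, the second release path would hit the same `AssertionError` shown in the stack trace at the top of this issue.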