attilapiros commented on PR #44882:
URL: https://github.com/apache/spark/pull/44882#issuecomment-2644417087
For anyone interested in the original deadlock: I ran the test without the fix (and without the timeout).
The resulting thread dump is:
```
Found one Java-level deadlock:
=============================
"test-getCacheLocs-0":
  waiting to lock monitor 0x0000600000f638e0 (object 0x00000007048507d8, a scala.collection.mutable.HashMap),
  which is held by "test-getCacheLocs-1"
"test-getCacheLocs-1":
  waiting to lock monitor 0x0000600000f63810 (object 0x0000000706516030, a org.apache.spark.rdd.RDD$$anon$1),
  which is held by "test-getCacheLocs-0"

Java stack information for the threads listed above:
===================================================
"test-getCacheLocs-0":
	at org.apache.spark.scheduler.DAGScheduler.getCacheLocs(DAGScheduler.scala:399)
	- waiting to lock <0x00000007048507d8> (a scala.collection.mutable.HashMap)
	at org.apache.spark.scheduler.DAGScheduler.getPreferredLocsInternal(DAGScheduler.scala:2742)
	at org.apache.spark.scheduler.DAGScheduler.getPreferredLocs(DAGScheduler.scala:2721)
	at org.apache.spark.SparkContext.getPreferredLocs(SparkContext.scala:1938)
	at org.apache.spark.rdd.DefaultPartitionCoalescer.currPrefLocs(CoalescedRDD.scala:180)
	at org.apache.spark.rdd.DefaultPartitionCoalescer$PartitionLocations.$anonfun$getAllPrefLocs$1(CoalescedRDD.scala:198)
	at org.apache.spark.rdd.DefaultPartitionCoalescer$PartitionLocations$$Lambda$1113/0x000000012da71cb0.apply(Unknown Source)
	at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
	at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
	at org.apache.spark.rdd.DefaultPartitionCoalescer$PartitionLocations.getAllPrefLocs(CoalescedRDD.scala:197)
	at org.apache.spark.rdd.DefaultPartitionCoalescer$PartitionLocations.<init>(CoalescedRDD.scala:190)
	at org.apache.spark.rdd.DefaultPartitionCoalescer.coalesce(CoalescedRDD.scala:391)
	at org.apache.spark.rdd.CoalescedRDD.getPartitions(CoalescedRDD.scala:90)
	at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:292)
	- locked <0x0000000706516030> (a org.apache.spark.rdd.RDD$$anon$1)
	at org.apache.spark.rdd.RDD$$Lambda$1104/0x000000012da6e7b0.apply(Unknown Source)
	at scala.Option.getOrElse(Option.scala:189)
	at org.apache.spark.rdd.RDD.partitions(RDD.scala:288)
	at org.apache.spark.scheduler.DAGSchedulerSuite$$anon$3.run(DAGSchedulerSuite.scala:225)
	- locked <0x0000000706516030> (a org.apache.spark.rdd.RDD$$anon$1)
	at java.util.concurrent.Executors$RunnableAdapter.call([email protected]/Executors.java:539)
	at java.util.concurrent.FutureTask.run([email protected]/FutureTask.java:264)
	at java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1136)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:635)
	at java.lang.Thread.run([email protected]/Thread.java:840)
"test-getCacheLocs-1":
	at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:291)
	- waiting to lock <0x0000000706516030> (a org.apache.spark.rdd.RDD$$anon$1)
	at org.apache.spark.rdd.RDD$$Lambda$1104/0x000000012da6e7b0.apply(Unknown Source)
	at scala.Option.getOrElse(Option.scala:189)
	at org.apache.spark.rdd.RDD.partitions(RDD.scala:288)
	at org.apache.spark.scheduler.DAGScheduler.getCacheLocs(DAGScheduler.scala:402)
	- locked <0x00000007048507d8> (a scala.collection.mutable.HashMap)
	at org.apache.spark.scheduler.DAGSchedulerSuite$$anon$4.run(DAGSchedulerSuite.scala:235)
	at java.util.concurrent.Executors$RunnableAdapter.call([email protected]/Executors.java:539)
	at java.util.concurrent.FutureTask.run([email protected]/FutureTask.java:264)
	at java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1136)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:635)
	at java.lang.Thread.run([email protected]/Thread.java:840)

Found 1 deadlock.
```
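The dump shows a classic lock-order inversion: thread 0 holds the RDD's partition-evaluation monitor (`RDD$$anon$1`) and wants the `cacheLocs` HashMap, while thread 1 holds the HashMap and wants the monitor. As a minimal sketch independent of Spark (the `cacheLocs`/`stateLock` objects below are hypothetical stand-ins for the two monitors in the dump), the same shape can be reproduced and detected with `ThreadMXBean.findDeadlockedThreads`:

```scala
import java.lang.management.ManagementFactory
import java.util.concurrent.CountDownLatch

object DeadlockDemo {
  // Hypothetical stand-ins for the two monitors in the dump:
  // the cacheLocs HashMap and the RDD's partition-evaluation lock.
  private val cacheLocs = new Object
  private val stateLock = new Object

  def main(args: Array[String]): Unit = {
    // Both threads take their first lock, rendezvous, then try the other's.
    val bothHeld = new CountDownLatch(2)

    // Mirrors thread 0: holds the RDD monitor, then needs cacheLocs.
    val t0 = new Thread(() => {
      stateLock.synchronized {
        bothHeld.countDown(); bothHeld.await()
        cacheLocs.synchronized {}
      }
    }, "test-getCacheLocs-0")

    // Mirrors thread 1: holds cacheLocs, then needs the RDD monitor.
    val t1 = new Thread(() => {
      cacheLocs.synchronized {
        bothHeld.countDown(); bothHeld.await()
        stateLock.synchronized {}
      }
    }, "test-getCacheLocs-1")

    t0.setDaemon(true); t1.setDaemon(true)
    t0.start(); t1.start()

    Thread.sleep(500) // give both threads time to block on each other
    val deadlocked = ManagementFactory.getThreadMXBean.findDeadlockedThreads
    println(
      if (deadlocked != null) s"Found ${deadlocked.length} deadlocked threads"
      else "No deadlock")
  }
}
```

The fix in this PR avoids the inversion by not holding one of the two locks while acquiring the other; the equivalent in the sketch would be taking `stateLock` and `cacheLocs` in the same order on both threads.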
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]