[jira] [Commented] (SPARK-20882) Executor is waiting for ShuffleBlockFetcherIterator
[ https://issues.apache.org/jira/browse/SPARK-20882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16256578#comment-16256578 ]

ShuMing Li commented on SPARK-20882:
------------------------------------

Can I ask what the `Still have 1 requests outstanding when connection from 10.0.139.110:7337 is closed` error means? What is the real problem?

> Executor is waiting for ShuffleBlockFetcherIterator
> ---------------------------------------------------
>
>                 Key: SPARK-20882
>                 URL: https://issues.apache.org/jira/browse/SPARK-20882
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.1.0, 2.1.1
>            Reporter: cen yuhai
>         Attachments: executor_jstack, executor_log, screenshot-1.png, screenshot-2.png, screenshot-3.png
>
> This bug is like https://issues.apache.org/jira/browse/SPARK-19300,
> but I have updated my client netty version to 4.0.43.Final.
> The shuffle service handler is still 4.0.42.Final.
> spark.sql.adaptive.enabled is true.
> {code}
> "Executor task launch worker for task 4808985" #5373 daemon prio=5 os_prio=0 tid=0x7f54ef437000 nid=0x1aed0 waiting on condition [0x7f53aebfe000]
>    java.lang.Thread.State: WAITING (parking)
> 	at sun.misc.Unsafe.park(Native Method)
> 	- parking to wait for <0x000498c249c0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> 	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:189)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> 	at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> 	at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:332)
> 	at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:58)
> 	at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
> 	at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
> 	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> 	at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:32)
> 	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
> 	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> 	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> 	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> 	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> 	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> 	at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:199)
> 	at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:63)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:97)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
> 	at org.apache.spark.scheduler.Task.run(Task.scala:114)
> 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:323)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1147)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
> 	at java.lang.Thread.run(Thread.java:834)
> {code}
> {code}
> 17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: remainingBlocks: Set(shuffle_5_1431_805, shuffle_5_1431_808, shuffle_5_1431_806, shuffle_5_1431_809, shuffle_5_1431_807)
> 17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: remainingBlocks: Set(shuffle_5_1431_808, shuffle_5_1431_806, shuffle_5_1431_809, shuffle_5_1431_807)
> 17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: remainingBlocks: Set(shuffle_5_1431_808, shuffle_5_1431_809, shuffle_5_1431_807)
> 17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: remainingBlocks: Set(shuffle_5_1431_808, shuffle_5_1431_809)
> 17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: remainingBlocks: Set(shuffle_5_1431_809)
> 17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: remainingBlocks: Set()
> 17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 21
> 17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 20
> 17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 19
> 17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 18
> 17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 17
> 17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 16
> 17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 15
> 17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 14
> 17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 13
> {code}
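The jstack in the description shows the fetch thread parked inside `LinkedBlockingQueue.take()`, which waits with no timeout: if the last outstanding fetch is never completed or failed (for example because the connection died), nothing is ever enqueued and the thread parks forever. A minimal, hypothetical Java sketch (not Spark's actual code) of why a bounded `poll` lets the caller notice the stall where `take` cannot:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class FetchWaitSketch {
    public static void main(String[] args) throws InterruptedException {
        // Results are delivered by network callbacks; simulate a dead
        // connection by never enqueueing a result for the last request.
        LinkedBlockingQueue<String> results = new LinkedBlockingQueue<>();

        // results.take() here would park the thread forever, matching the
        // WAITING (parking) state in the jstack. A poll with a deadline
        // returns null instead, so the fetch can be failed explicitly.
        String result = results.poll(200, TimeUnit.MILLISECONDS);
        if (result == null) {
            System.out.println("no result within deadline; failing the fetch");
        }
    }
}
```

The deadline value here is illustrative; the point is only that a timed wait converts a silent hang into a detectable fetch failure.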
[jira] [Commented] (SPARK-20882) Executor is waiting for ShuffleBlockFetcherIterator
[ https://issues.apache.org/jira/browse/SPARK-20882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027743#comment-16027743 ]

cen yuhai commented on SPARK-20882:
-----------------------------------

[~zsxwing] It was my problem... it is none of Spark's business.
[jira] [Commented] (SPARK-20882) Executor is waiting for ShuffleBlockFetcherIterator
[ https://issues.apache.org/jira/browse/SPARK-20882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026595#comment-16026595 ]

Shixiong Zhu commented on SPARK-20882:
--------------------------------------

[~cenyuhai] Do you mind sharing what the root cause was?
[jira] [Commented] (SPARK-20882) Executor is waiting for ShuffleBlockFetcherIterator
[ https://issues.apache.org/jira/browse/SPARK-20882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16026040#comment-16026040 ]

cen yuhai commented on SPARK-20882:
-----------------------------------

Yes:

{code}
17/05/26 12:04:14 INFO TransportResponseHandler: Still have 1 requests outstanding when connection from 10.0.139.110:7337 is closed
17/05/26 12:04:14 INFO TransportResponseHandler: Still have 1 requests outstanding when connection from 10.0.139.110:7337 is closed
{code}

Before that, the number of requests in flight was 1, and remainingBlocks was already empty:

{code}
17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: remainingBlocks: Set(shuffle_5_1431_809)
17/05/26 12:04:06 DEBUG ShuffleBlockFetcherIterator: remainingBlocks: Set()
{code}
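The `Still have 1 requests outstanding when connection from ... is closed` message is logged by Spark's TransportResponseHandler when a channel goes inactive while fetches are still in flight; on that path every outstanding callback is supposed to be failed so that a thread blocked on the result queue is woken with an error instead of parking forever. An illustrative Java sketch of that pattern (hypothetical class and method names, not Spark's actual implementation):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

public class ResponseHandlerSketch {
    // Pending fetches keyed by request id; each callback delivers a
    // terminal result (success or failure) into the consumer's queue.
    private final Map<Long, Consumer<Throwable>> outstanding = new ConcurrentHashMap<>();
    private final LinkedBlockingQueue<String> results = new LinkedBlockingQueue<>();

    public void register(long requestId) {
        outstanding.put(requestId, err ->
            results.offer("failed request " + requestId + ": " + err.getMessage()));
    }

    // Invoked when the channel closes. Every outstanding request must be
    // failed here; otherwise a thread blocked on results.take() hangs.
    public void channelInactive() {
        outstanding.forEach((id, cb) ->
            cb.accept(new java.io.IOException("connection closed")));
        outstanding.clear();
    }

    public LinkedBlockingQueue<String> results() { return results; }
}
```

If this failure propagation is skipped, or the close races with the last delivery, the consumer sees neither a block nor an error, which matches the symptom in this ticket: an empty remainingBlocks set with the thread still parked.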
[jira] [Commented] (SPARK-20882) Executor is waiting for ShuffleBlockFetcherIterator
[ https://issues.apache.org/jira/browse/SPARK-20882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025588#comment-16025588 ]

Shixiong Zhu commented on SPARK-20882:
--------------------------------------

[~cenyuhai] Did you see this log: "Still have {} requests outstanding when connection from {} is closed"?
[jira] [Commented] (SPARK-20882) Executor is waiting for ShuffleBlockFetcherIterator
[ https://issues.apache.org/jira/browse/SPARK-20882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025581#comment-16025581 ]

cen yuhai commented on SPARK-20882:
-----------------------------------

I think the problem occurs when the connection to the NodeManager's shuffle service (port 7337) is closed. Will it hang forever?

{code}
17/05/26 02:01:55 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 1
17/05/26 02:02:03 WARN TransportChannelHandler: Exception in connection from bigdata-apache-hdp-132.xg01/10.0.132.58:7337
java.io.IOException: Connection reset by peer
{code}
[jira] [Commented] (SPARK-20882) Executor is waiting for ShuffleBlockFetcherIterator
[ https://issues.apache.org/jira/browse/SPARK-20882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025506#comment-16025506 ]

Shixiong Zhu commented on SPARK-20882:
--------------------------------------

[~cenyuhai] Is it possible to reproduce it and get logs of the shuffle service?

> Executor is waiting for ShuffleBlockFetcherIterator
> ---------------------------------------------------
>
> Key: SPARK-20882
> URL: https://issues.apache.org/jira/browse/SPARK-20882
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.1.0, 2.1.1
> Reporter: cen yuhai
> Attachments: executor_jstack, executor_log, screenshot-1.png, screenshot-2.png, screenshot-3.png
>
> This bug is similar to https://issues.apache.org/jira/browse/SPARK-19300,
> but I have updated my client netty version to 4.0.43.Final.
> The shuffle service handler is still 4.0.42.Final.
> spark.sql.adaptive.enabled is true.
> {code}
> "Executor task launch worker for task 4808985" #5373 daemon prio=5 os_prio=0 tid=0x7f54ef437000 nid=0x1aed0 waiting on condition [0x7f53aebfe000]
>    java.lang.Thread.State: WAITING (parking)
> 	at sun.misc.Unsafe.park(Native Method)
> 	- parking to wait for <0x000498c249c0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> 	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:189)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> 	at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> 	at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:332)
> 	at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:58)
> 	at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
> 	at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
> 	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> 	at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:32)
> 	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
> 	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> 	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> 	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> 	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> 	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> 	at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:199)
> 	at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:63)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:97)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
> 	at org.apache.spark.scheduler.Task.run(Task.scala:114)
> 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:323)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1147)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
> 	at java.lang.Thread.run(Thread.java:834)
> {code}
> {code}
> 17/05/26 02:01:55 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 3
> 17/05/26 02:01:55 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 2
> 17/05/26 02:01:55 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 1
> 17/05/26 02:01:55 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 1
> 17/05/26 02:01:55 DEBUG ShuffleBlockFetcherIterator: Number of requests in flight 1
> 17/05/26 02:02:03 WARN TransportChannelHandler: Exception in connection from bigdata-apache-hdp-132.xg01/10.0.132.58:7337
> java.io.IOException: Connection reset by peer
> 	at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> 	at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> 	at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
> 	at sun.nio.ch.IOUtil.read(IOUtil.java:192)
> 	at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
> 	at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:221)
> 	at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:899)
> 	at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:275)
> 	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
> 	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:643)
> 	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:566)
> 	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480)
> 	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442)
> {code}
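The jstack above shows the task thread parked in LinkedBlockingQueue.take() inside ShuffleBlockFetcherIterator.next(): it waits indefinitely for a fetch result that never arrives once the connection is reset. The sketch below (class and field names are illustrative, not Spark's actual implementation) shows the underlying pattern: an unbounded take() hangs forever if the network callback dies silently, whereas enqueueing an explicit failure result from the error path wakes the consumer so it can fail fast.

```java
import java.util.concurrent.LinkedBlockingQueue;

public class FetchResultDemo {
    // Illustrative stand-in for a fetch result: either a block or a failure.
    static final class FetchResult {
        final String block;  // non-null on success
        final String error;  // non-null on failure
        FetchResult(String block, String error) {
            this.block = block;
            this.error = error;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        LinkedBlockingQueue<FetchResult> results = new LinkedBlockingQueue<>();

        // Network callback: simulate a connection reset. If it died here
        // without enqueueing anything, the take() below would park forever,
        // which is exactly the WAITING state captured in the jstack.
        // Enqueueing an explicit failure result instead unblocks the consumer.
        Thread networkThread = new Thread(() ->
            results.offer(new FetchResult(null, "Connection reset by peer")));
        networkThread.start();

        // Consumer side, analogous to the blocking wait in
        // ShuffleBlockFetcherIterator.next():
        FetchResult r = results.take();
        if (r.error != null) {
            System.out.println("fetch failed fast: " + r.error);
        } else {
            System.out.println("got block: " + r.block);
        }
    }
}
```

Spark's ShuffleBlockFetcherIterator follows a similar design, enqueueing a failure result from the fetch listener's error callback; a permanent hang like the one reported here therefore suggests the failure callback was never invoked when the connection to the shuffle service closed.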
[jira] [Commented] (SPARK-20882) Executor is waiting for ShuffleBlockFetcherIterator
[ https://issues.apache.org/jira/browse/SPARK-20882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025496#comment-16025496 ]

cen yuhai commented on SPARK-20882:
-----------------------------------

Yes, I am using shuffle service.
[jira] [Commented] (SPARK-20882) Executor is waiting for ShuffleBlockFetcherIterator
[ https://issues.apache.org/jira/browse/SPARK-20882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025377#comment-16025377 ]

Shixiong Zhu commented on SPARK-20882:
--------------------------------------

NVM. I saw NettyBlockTransferService in the log.
[jira] [Commented] (SPARK-20882) Executor is waiting for ShuffleBlockFetcherIterator
[ https://issues.apache.org/jira/browse/SPARK-20882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16025227#comment-16025227 ]

Shixiong Zhu commented on SPARK-20882:
--------------------------------------

[~cenyuhai] are you using shuffle service?
[jira] [Commented] (SPARK-20882) Executor is waiting for ShuffleBlockFetcherIterator
[ https://issues.apache.org/jira/browse/SPARK-20882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024503#comment-16024503 ]

cen yuhai commented on SPARK-20882:
-----------------------------------

[~zsxwing]

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org