[ 
https://issues.apache.org/jira/browse/SPARK-19147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15837333#comment-15837333
 ] 

Genmao Yu edited comment on SPARK-19147 at 1/25/17 7:39 AM:
------------------------------------------------------------

After dig into code, this issue may occurs when executor is existing and at the 
same time some tasks are running. There are some scenes in which this NPE will 
be thrown:

1. AM re-registers after a failure, and this will resets the state of 
CoarseGrainedSchedulerBackend to the initial state, then old executors will be 
removed. This will only occurs in yarn-client mode.
2. Master may will remove executor when a work is lost, and at the same time 
some tasks may be running.
3. Some other scenes i do not know.

As far as I know, this NPE harms nothing, but may confuse users. Maybe, we 
should improve this exception message to tell users it is safe. What is your 
opinion? [~zsxwing]


was (Author: unclegen):
After dig into code, this issue may occurs when executor is existing and at the 
same time some tasks are running. There are some scenes in which this NPE will 
be thrown:

1. AM re-registers after a failure, and this will resets the state of 
CoarseGrainedSchedulerBackend to the initial state, then old executors will be 
removed. This will only occurs in yarn-client mode.
2. Master may will remove executor when a work is lost, and at the same time 
some tasks may be running.
3. Some other scenes i do not know.

As for as I know, this NPE harms nothing, but may confuse users. Maybe, we 
should improve this exception message to tell users it is safe. What is your 
opinion? [~zsxwing]

> netty throw NPE
> ---------------
>
>                 Key: SPARK-19147
>                 URL: https://issues.apache.org/jira/browse/SPARK-19147
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.1.0
>            Reporter: cen yuhai
>
> {code}
> 17/01/10 19:17:20 ERROR ShuffleBlockFetcherIterator: Failed to get block(s) 
> from bigdata-hdp-apache1828.xg01.diditaxi.com:7337
> java.lang.NullPointerException: group
>       at io.netty.bootstrap.AbstractBootstrap.group(AbstractBootstrap.java:80)
>       at 
> org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:203)
>       at 
> org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:181)
>       at 
> org.apache.spark.network.shuffle.ExternalShuffleClient$1.createAndStart(ExternalShuffleClient.java:105)
>       at 
> org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:140)
>       at 
> org.apache.spark.network.shuffle.RetryingBlockFetcher.start(RetryingBlockFetcher.java:120)
>       at 
> org.apache.spark.network.shuffle.ExternalShuffleClient.fetchBlocks(ExternalShuffleClient.java:114)
>       at 
> org.apache.spark.storage.ShuffleBlockFetcherIterator.sendRequest(ShuffleBlockFetcherIterator.scala:169)
>       at 
> org.apache.spark.storage.ShuffleBlockFetcherIterator.fetchUpToMaxBytes(ShuffleBlockFetcherIterator.scala:354)
>       at 
> org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:332)
>       at 
> org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:54)
>       at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
>       at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
>       at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
>       at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>       at 
> org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:32)
>       at 
> org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
>       at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>       at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.sort_addToSorter$(Unknown
>  Source)
>       at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown
>  Source)
>       at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>       at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377)
>       at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.findNextInnerJoinRows$(Unknown
>  Source)
>       at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown
>  Source)
>       at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>       at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$2.hasNext(WholeStageCodegenExec.scala:396)
>       at 
> org.apache.spark.sql.execution.columnar.InMemoryRelation$$anonfun$1$$anon$1.hasNext(InMemoryRelation.scala:138)
>       at 
> org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:215)
>       at 
> org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:957)
>       at 
> org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:948)
>       at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:888)
>       at 
> org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:948)
>       at 
> org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:694)
>       at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:334)
>       at org.apache.spark.rdd.RDD.iterator(RDD.scala:285)
>       at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>       at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>       at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>       at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>       at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>       at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>       at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>       at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>       at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>       at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
>       at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
>       at org.apache.spark.scheduler.Task.run(Task.scala:99)
>       at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to