[GitHub] spark pull request #19951: [SPARK-22760][CORE][YARN] When sc.stop() is calle...

2018-05-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19951


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #19951: [SPARK-22760][CORE][YARN] When sc.stop() is calle...

2017-12-12 Thread KaiXinXiaoLei
GitHub user KaiXinXiaoLei opened a pull request:

https://github.com/apache/spark/pull/19951

[SPARK-22760][CORE][YARN] When sc.stop() is called, set stopped to true 
before removing executors

## What changes were proposed in this pull request?

When the number of executors is large and YarnSchedulerBackend.stop() is 
running, an executor may stop before YarnSchedulerBackend.stopped is set to 
true. YarnSchedulerBackend.onDisconnected() is then called, which leads to 
the following error: 
{noformat}
17/12/12 15:34:45 INFO YarnClientSchedulerBackend: Asking each executor to shut down
17/12/12 15:34:45 INFO YarnClientSchedulerBackend: Disabling executor 63.
17/12/12 15:34:45 ERROR Inbox: Ignoring error
org.apache.spark.SparkException: Could not find CoarseGrainedScheduler or it has been stopped.
    at org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:163)
    at org.apache.spark.rpc.netty.Dispatcher.postOneWayMessage(Dispatcher.scala:133)
    at org.apache.spark.rpc.netty.NettyRpcEnv.send(NettyRpcEnv.scala:192)
    at org.apache.spark.rpc.netty.NettyRpcEndpointRef.send(NettyRpcEnv.scala:516)
    at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.reviveOffers(CoarseGrainedSchedulerBackend.scala:356)
    at org.apache.spark.scheduler.TaskSchedulerImpl.executorLost(TaskSchedulerImpl.scala:497)
    at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverEndpoint.disableExecutor(CoarseGrainedSchedulerBackend.scala:301)
    at org.apache.spark.scheduler.cluster.YarnSchedulerBackend$YarnDriverEndpoint$$anonfun$onDisconnected$1.apply(YarnSchedulerBackend.scala:121)
    at org.apache.spark.scheduler.cluster.YarnSchedulerBackend$YarnDriverEndpoint$$anonfun$onDisconnected$1.apply(YarnSchedulerBackend.scala:120)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.cluster.YarnSchedulerBackend$YarnDriverEndpoint.onDisconnected(YarnSchedulerBackend.scala:120)
    at org.apache.spark.rpc.netty.Inbox$$anonfun$process$1.apply$mcV$sp(Inbox.scala:142)
    at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:204)
    at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
    at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:217)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
{noformat}

So I changed the code: when removing an executor, 
YarnSchedulerBackend.onDisconnected() checks sc.isStopped. If sc.isStopped 
is true, the message is not sent.
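
The guard described above can be illustrated with a small, self-contained Java model. This is only a sketch of the idea, not the patch itself: the class `DisconnectGuardSketch`, its `stop()` and `onDisconnected()` methods, and the message strings are hypothetical stand-ins for `sc.stop()`, `sc.isStopped`, and `YarnSchedulerBackend.onDisconnected()`, and none of them are Spark classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical, simplified model of the guard; none of these types are Spark classes.
class DisconnectGuardSketch {
    // Stands in for sc.isStopped: set as soon as shutdown begins.
    private final AtomicBoolean stopped = new AtomicBoolean(false);
    // Records the messages that would have been sent to the scheduler endpoint.
    private final List<String> sentMessages = new ArrayList<>();

    // Stands in for sc.stop(): flip the flag before executors are asked to shut down.
    void stop() {
        stopped.set(true);
    }

    // Stands in for onDisconnected(): returns true only if a message was sent.
    boolean onDisconnected(String executorId) {
        if (stopped.get()) {
            // Shutdown is in progress, so the scheduler endpoint may already be gone.
            return false;
        }
        sentMessages.add("disable executor " + executorId);
        return true;
    }

    List<String> sentMessages() {
        return sentMessages;
    }

    public static void main(String[] args) {
        DisconnectGuardSketch backend = new DisconnectGuardSketch();
        backend.onDisconnected("62"); // before stop: the message is sent
        backend.stop();
        backend.onDisconnected("63"); // after stop: the message is skipped
        System.out.println(backend.sentMessages());
    }
}
```

In this model, a disconnect that arrives after `stop()` is silently ignored, which corresponds to skipping the send that raised the "Could not find CoarseGrainedScheduler" error in the log above.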

## How was this patch tested?
Run "spark-sql --master yarn -f query.sql" many times; the problem can be 
reproduced.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/KaiXinXiaoLei/spark pendingAdd11

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/19951.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #19951


commit c4dcc19ce8af02f99be18db8ddfe9b704086dd43
Author: hanghang <584620...@qq.com>
Date:   2017-12-11T23:53:52Z

change code



