[ https://issues.apache.org/jira/browse/SPARK-22760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
KaiXinXIaoLei updated SPARK-22760:
----------------------------------
    Attachment: 微信图片_20171212094100.jpg

> where driver is stopping, and some executors lost because of YarnSchedulerBackend.stop, then there is a problem,
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-22760
>                 URL: https://issues.apache.org/jira/browse/SPARK-22760
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, YARN
>    Affects Versions: 2.2.1
>            Reporter: KaiXinXIaoLei
>         Attachments: 微信图片_20171212094100.jpg
>
>
> Using SPARK-14228, I found a problem:
> 17/12/11 22:38:33 WARN YarnSchedulerBackend$YarnSchedulerEndpoint: Executor for container container_e02_1509517131757_0001_01_000002 exited because of a YARN event (e.g., pre-emption) and not because of an error in the running job.
> 17/12/11 22:38:33 ERROR YarnClientSchedulerBackend: Could not find CoarseGrainedScheduler or it has been stopped.
> org.apache.spark.SparkException: Could not find CoarseGrainedScheduler or it has been stopped.
>       at org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:163)
>       at org.apache.spark.rpc.netty.Dispatcher.postLocalMessage(Dispatcher.scala:128)
>       at org.apache.spark.rpc.netty.NettyRpcEnv.ask(NettyRpcEnv.scala:231)
>       at org.apache.spark.rpc.netty.NettyRpcEndpointRef.ask(NettyRpcEnv.scala:515)
>       at org.apache.spark.rpc.RpcEndpointRef.ask(RpcEndpointRef.scala:62)
>       at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.removeExecutor(CoarseGrainedSchedulerBackend.scala:392)
>       at org.apache.spark.scheduler.cluster.YarnSchedulerBackend$YarnSchedulerEndpoint$$anonfun$receive$1.applyOrElse(YarnSchedulerBackend.scala:259)
>       at org.apache.spark.rpc.netty.Inbox$$anonfun$process$1.apply$mcV$sp(Inbox.scala:116)
>       at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:204)
> I analyzed the cause.
> When the number of executors is large, YarnSchedulerBackend.stopped can
> still be false while YarnSchedulerBackend.stop() is running. If some
> executors are stopped during that window, YarnSchedulerBackend.onDisconnected()
> is called for them, and the error above occurs.
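
The window described in the analysis above can be modeled with a small, deterministic sketch. This is not Spark code. The class and method names (`Backend`, `DriverEndpoint`, `on_disconnected`, `remove_executor`) are hypothetical stand-ins, and the ordering inside `stop()` (endpoint gone first, flag set last) is an assumption taken from the reporter's description rather than from the Spark source. The `errors` list stands in for the ERROR lines in the log.

```python
class SchedulerStoppedError(RuntimeError):
    """Stands in for the SparkException seen in the log."""


class DriverEndpoint:
    """Hypothetical stand-in for the CoarseGrainedScheduler RPC endpoint."""

    def __init__(self):
        self.registered = True
        self.removed = []

    def remove_executor(self, executor_id):
        # Once the endpoint is unregistered, any message to it fails.
        if not self.registered:
            raise SchedulerStoppedError(
                "Could not find CoarseGrainedScheduler or it has been stopped."
            )
        self.removed.append(executor_id)


class Backend:
    """Hypothetical model of a backend whose `stopped` flag is set late."""

    def __init__(self):
        self.endpoint = DriverEndpoint()
        self.stopped = False
        self.errors = []

    def on_disconnected(self, executor_id):
        # The flag check only helps once `stopped` has actually been set.
        if self.stopped:
            return
        try:
            self.endpoint.remove_executor(executor_id)
        except SchedulerStoppedError as exc:
            self.errors.append((executor_id, str(exc)))

    def stop(self, disconnecting_during_stop=()):
        # Assumed ordering: the endpoint goes away first ...
        self.endpoint.registered = False
        # ... executors that exit in this window still reach on_disconnected ...
        for executor_id in disconnecting_during_stop:
            self.on_disconnected(executor_id)
        # ... and only then is the flag set.
        self.stopped = True


backend = Backend()
backend.on_disconnected("exec-1")  # before stop(): removed normally
backend.stop(disconnecting_during_stop=["exec-2", "exec-3"])
backend.on_disconnected("exec-4")  # after stop(): skipped by the flag

print(backend.endpoint.removed)
print([executor_id for executor_id, _ in backend.errors])
```

In this model only the executors that disconnect between the endpoint going away and the flag being set produce the error. The more executors there are, the more likely some of them land in that window, which matches the observation that the problem shows up when the number of executors is large.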