[jira] [Updated] (SPARK-16522) [MESOS] Spark application throws exception on exit

2016-08-09 Thread Michael Gummelt (JIRA)

 [ https://issues.apache.org/jira/browse/SPARK-16522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Gummelt updated SPARK-16522:

Fix Version/s: 2.0.1 (was: 2.1.0)

> [MESOS] Spark application throws exception on exit
> --
>
> Key: SPARK-16522
> URL: https://issues.apache.org/jira/browse/SPARK-16522
> Project: Spark
>  Issue Type: Bug
>  Components: Mesos
>Affects Versions: 2.0.0
>Reporter: Sun Rui
>Assignee: Sun Rui
> Fix For: 2.0.1
>

[jira] [Updated] (SPARK-16522) [MESOS] Spark application throws exception on exit

2016-07-14 Thread Reynold Xin (JIRA)

 [ https://issues.apache.org/jira/browse/SPARK-16522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reynold Xin updated SPARK-16522:

Description: 
Spark applications running on Mesos throw exception upon exit as follows:
{noformat}
16/07/13 15:20:46 WARN NettyRpcEndpointRef: Error sending message [message = RemoveExecutor(1,Executor finished with state FINISHED)] in 3 attempts
org.apache.spark.SparkException: Exception thrown in awaitResult
    at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
    at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
    at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:102)
    at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:78)
    at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.removeExecutor(CoarseGrainedSchedulerBackend.scala:412)
    at org.apache.spark.scheduler.cluster.mesos.MesosCoarseGrainedSchedulerBackend.executorTerminated(MesosCoarseGrainedSchedulerBackend.scala:555)
    at org.apache.spark.scheduler.cluster.mesos.MesosCoarseGrainedSchedulerBackend.statusUpdate(MesosCoarseGrainedSchedulerBackend.scala:495)
Caused by: org.apache.spark.SparkException: Could not find CoarseGrainedScheduler.
    at org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:152)
    at org.apache.spark.rpc.netty.Dispatcher.postLocalMessage(Dispatcher.scala:127)
    at org.apache.spark.rpc.netty.NettyRpcEnv.ask(NettyRpcEnv.scala:225)
    at org.apache.spark.rpc.netty.NettyRpcEndpointRef.ask(NettyRpcEnv.scala:508)
    at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:101)
    ... 4 more
Exception in thread "Thread-47" org.apache.spark.SparkException: Error notifying standalone scheduler's driver endpoint
    at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.removeExecutor(CoarseGrainedSchedulerBackend.scala:415)
    at org.apache.spark.scheduler.cluster.mesos.MesosCoarseGrainedSchedulerBackend.executorTerminated(MesosCoarseGrainedSchedulerBackend.scala:555)
    at org.apache.spark.scheduler.cluster.mesos.MesosCoarseGrainedSchedulerBackend.statusUpdate(MesosCoarseGrainedSchedulerBackend.scala:495)
Caused by: org.apache.spark.SparkException: Error sending message [message = RemoveExecutor(1,Executor finished with state FINISHED)]
    at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:119)
    at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:78)
    at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.removeExecutor(CoarseGrainedSchedulerBackend.scala:412)
    ... 2 more
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult
    at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
    at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
    at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
    at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:102)
    ... 4 more
Caused by: org.apache.spark.SparkException: Could not find CoarseGrainedScheduler.
    at org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:152)
    at org.apache.spark.rpc.netty.Dispatcher.postLocalMessage(Dispatcher.scala:127)
    at org.apache.spark.rpc.netty.NettyRpcEnv.ask(NettyRpcEnv.scala:225)
    at org.apache.spark.rpc.netty.NettyRpcEndpointRef.ask(NettyRpcEnv.scala:508)
    at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:101)
    ... 4 more
{noformat}

The application's result is not affected by this error.

This issue can be reproduced simply by launching a spark-shell and exiting after running the following commands:
{code}
val rdd = sc.parallelize(1 to 10, 10)
rdd.map(_ + 1).collect()
{code}

The root cause is that in SparkContext.stop(), MesosCoarseGrainedSchedulerBackend.stop() calls CoarseGrainedSchedulerBackend.stop(), which unregisters the driver's CoarseGrainedScheduler RPC endpoint while Mesos statusUpdate callbacks for the finished executors may still be in flight; when those callbacks then try to send RemoveExecutor to the already-removed endpoint, the ask fails with "Could not find CoarseGrainedScheduler" as shown in the trace above.
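
For illustration only, here is a minimal, self-contained Scala sketch of this kind of shutdown race. The names below (ShutdownRaceSketch, register, ask, the endpoint table) are hypothetical stand-ins, not Spark's actual dispatcher API; the point is just that unregistering an endpoint before late callbacks finish yields the same "Could not find ..." failure mode:
{code}
import scala.collection.concurrent.TrieMap

object ShutdownRaceSketch {
  case class RemoveExecutor(id: Int, reason: String)

  // Stand-in for the RPC dispatcher's endpoint table.
  private val endpoints = TrieMap[String, RemoveExecutor => Unit]()

  def register(name: String, handler: RemoveExecutor => Unit): Unit =
    endpoints.put(name, handler)

  // Sending to an endpoint that is no longer registered fails,
  // analogous to "Could not find CoarseGrainedScheduler."
  def ask(name: String, msg: RemoveExecutor): Unit =
    endpoints.get(name) match {
      case Some(handler) => handler(msg)
      case None => throw new IllegalStateException(s"Could not find $name.")
    }

  def main(args: Array[String]): Unit = {
    register("CoarseGrainedScheduler", msg => println(s"removed executor ${msg.id}"))
    // Shutdown unregisters the driver endpoint first ...
    endpoints.remove("CoarseGrainedScheduler")
    // ... but a late status-update callback still tries to notify it:
    ask("CoarseGrainedScheduler", RemoveExecutor(1, "Executor finished with state FINISHED"))
  }
}
{code}
Running the sketch prints nothing and throws the IllegalStateException from the final ask, mirroring how the real exception is harmless noise during teardown rather than a failure of the job itself.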