[jira] [Commented] (SPARK-10854) MesosExecutorBackend: Received launchTask but executor was null

2016-05-03 Thread Chris Heller (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-10854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15268635#comment-15268635
 ] 

Chris Heller commented on SPARK-10854:
--

I wanted to bump this issue. I'm running into it from time to time with Spark 
1.5.2, running on Mesos 0.22.2 and using Docker 1.8.3.

The stuck task will hold up an entire job from completion.

Would speculative tasks be an appropriate work around here? That feels like 
overkill to always need to use them, but as a temporary fix it would be ok.

> MesosExecutorBackend: Received launchTask but executor was null
> ---
>
> Key: SPARK-10854
> URL: https://issues.apache.org/jira/browse/SPARK-10854
> Project: Spark
>  Issue Type: Bug
>  Components: Mesos
>Affects Versions: 1.4.0
> Environment: Spark 1.4.0
> Mesos 0.23.0
> Docker 1.8.1
>Reporter: Kevin Matzen
>Priority: Minor
>
> Sometimes my tasks get stuck in staging.  Here's stdout from one such worker. 
>  I'm running mesos-slave inside a docker container with the host's docker 
> exposed and I'm using Spark's docker support to launch the worker inside its 
> own container.  Both containers are running.  I'm using pyspark.  I can see 
> mesos-slave and java running, but I do not see python running.
> {noformat}
> WARNING: Your kernel does not support swap limit capabilities, memory limited 
> without swap.
> Using Spark's default log4j profile: 
> org/apache/spark/log4j-defaults.properties
> 15/09/28 15:02:09 INFO MesosExecutorBackend: Registered signal handlers for 
> [TERM, HUP, INT]
> I0928 15:02:09.65854138 exec.cpp:132] Version: 0.23.0
> 15/09/28 15:02:09 ERROR MesosExecutorBackend: Received launchTask but 
> executor was null
> I0928 15:02:09.70295554 exec.cpp:206] Executor registered on slave 
> 20150928-044200-1140850698-5050-8-S190
> 15/09/28 15:02:09 INFO MesosExecutorBackend: Registered with Mesos as 
> executor ID 20150928-044200-1140850698-5050-8-S190 with 1 cpus
> 15/09/28 15:02:09 INFO SecurityManager: Changing view acls to: root
> 15/09/28 15:02:09 INFO SecurityManager: Changing modify acls to: root
> 15/09/28 15:02:09 INFO SecurityManager: SecurityManager: authentication 
> disabled; ui acls disabled; users with view permissions: Set(root); users 
> with modify permissions: Set(root)
> 15/09/28 15:02:10 INFO Slf4jLogger: Slf4jLogger started
> 15/09/28 15:02:10 INFO Remoting: Starting remoting
> 15/09/28 15:02:10 INFO Remoting: Remoting started; listening on addresses 
> :[akka.tcp://sparkExecutor@:56458]
> 15/09/28 15:02:10 INFO Utils: Successfully started service 'sparkExecutor' on 
> port 56458.
> 15/09/28 15:02:10 INFO DiskBlockManager: Created local directory at 
> /tmp/spark-28a21c2d-54cc-40b3-b0c2-cc3624f1a73c/blockmgr-f2336fec-e1ea-44f1-bd5c-9257049d5e7b
> 15/09/28 15:02:10 INFO MemoryStore: MemoryStore started with capacity 52.1 MB
> 15/09/28 15:02:11 WARN NativeCodeLoader: Unable to load native-hadoop library 
> for your platform... using builtin-java classes where applicable
> 15/09/28 15:02:11 INFO Executor: Starting executor ID 
> 20150928-044200-1140850698-5050-8-S190 on host 
> 15/09/28 15:02:11 INFO Utils: Successfully started service 
> 'org.apache.spark.network.netty.NettyBlockTransferService' on port 57431.
> 15/09/28 15:02:11 INFO NettyBlockTransferService: Server created on 57431
> 15/09/28 15:02:11 INFO BlockManagerMaster: Trying to register BlockManager
> 15/09/28 15:02:11 INFO BlockManagerMaster: Registered BlockManager
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-10854) MesosExecutorBackend: Received launchTask but executor was null

2015-12-17 Thread Dmitry Romanenko (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-10854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15062247#comment-15062247
 ] 

Dmitry Romanenko commented on SPARK-10854:
--

Any update on this? Maybe some suggestions? I keep seeing such issue from time 
to time.

> MesosExecutorBackend: Received launchTask but executor was null
> ---
>
> Key: SPARK-10854
> URL: https://issues.apache.org/jira/browse/SPARK-10854
> Project: Spark
>  Issue Type: Bug
>  Components: Mesos
>Affects Versions: 1.4.0
> Environment: Spark 1.4.0
> Mesos 0.23.0
> Docker 1.8.1
>Reporter: Kevin Matzen
>Priority: Minor
>
> Sometimes my tasks get stuck in staging.  Here's stdout from one such worker. 
>  I'm running mesos-slave inside a docker container with the host's docker 
> exposed and I'm using Spark's docker support to launch the worker inside its 
> own container.  Both containers are running.  I'm using pyspark.  I can see 
> mesos-slave and java running, but I do not see python running.
> {noformat}
> WARNING: Your kernel does not support swap limit capabilities, memory limited 
> without swap.
> Using Spark's default log4j profile: 
> org/apache/spark/log4j-defaults.properties
> 15/09/28 15:02:09 INFO MesosExecutorBackend: Registered signal handlers for 
> [TERM, HUP, INT]
> I0928 15:02:09.65854138 exec.cpp:132] Version: 0.23.0
> 15/09/28 15:02:09 ERROR MesosExecutorBackend: Received launchTask but 
> executor was null
> I0928 15:02:09.70295554 exec.cpp:206] Executor registered on slave 
> 20150928-044200-1140850698-5050-8-S190
> 15/09/28 15:02:09 INFO MesosExecutorBackend: Registered with Mesos as 
> executor ID 20150928-044200-1140850698-5050-8-S190 with 1 cpus
> 15/09/28 15:02:09 INFO SecurityManager: Changing view acls to: root
> 15/09/28 15:02:09 INFO SecurityManager: Changing modify acls to: root
> 15/09/28 15:02:09 INFO SecurityManager: SecurityManager: authentication 
> disabled; ui acls disabled; users with view permissions: Set(root); users 
> with modify permissions: Set(root)
> 15/09/28 15:02:10 INFO Slf4jLogger: Slf4jLogger started
> 15/09/28 15:02:10 INFO Remoting: Starting remoting
> 15/09/28 15:02:10 INFO Remoting: Remoting started; listening on addresses 
> :[akka.tcp://sparkExecutor@:56458]
> 15/09/28 15:02:10 INFO Utils: Successfully started service 'sparkExecutor' on 
> port 56458.
> 15/09/28 15:02:10 INFO DiskBlockManager: Created local directory at 
> /tmp/spark-28a21c2d-54cc-40b3-b0c2-cc3624f1a73c/blockmgr-f2336fec-e1ea-44f1-bd5c-9257049d5e7b
> 15/09/28 15:02:10 INFO MemoryStore: MemoryStore started with capacity 52.1 MB
> 15/09/28 15:02:11 WARN NativeCodeLoader: Unable to load native-hadoop library 
> for your platform... using builtin-java classes where applicable
> 15/09/28 15:02:11 INFO Executor: Starting executor ID 
> 20150928-044200-1140850698-5050-8-S190 on host 
> 15/09/28 15:02:11 INFO Utils: Successfully started service 
> 'org.apache.spark.network.netty.NettyBlockTransferService' on port 57431.
> 15/09/28 15:02:11 INFO NettyBlockTransferService: Server created on 57431
> 15/09/28 15:02:11 INFO BlockManagerMaster: Trying to register BlockManager
> 15/09/28 15:02:11 INFO BlockManagerMaster: Registered BlockManager
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-10854) MesosExecutorBackend: Received launchTask but executor was null

2015-12-03 Thread Matei Zaharia (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-10854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15038058#comment-15038058
 ] 

Matei Zaharia commented on SPARK-10854:
---

Just a note, I saw a log where this happened, and the sequence of events is 
that the executor logs a launchTask callback before registered(). It could be a 
synchronization thing or a problem in the Mesos library.

> MesosExecutorBackend: Received launchTask but executor was null
> ---
>
> Key: SPARK-10854
> URL: https://issues.apache.org/jira/browse/SPARK-10854
> Project: Spark
>  Issue Type: Bug
>  Components: Mesos
>Affects Versions: 1.4.0
> Environment: Spark 1.4.0
> Mesos 0.23.0
> Docker 1.8.1
>Reporter: Kevin Matzen
>Priority: Minor
>
> Sometimes my tasks get stuck in staging.  Here's stdout from one such worker. 
>  I'm running mesos-slave inside a docker container with the host's docker 
> exposed and I'm using Spark's docker support to launch the worker inside its 
> own container.  Both containers are running.  I'm using pyspark.  I can see 
> mesos-slave and java running, but I do not see python running.
> {noformat}
> WARNING: Your kernel does not support swap limit capabilities, memory limited 
> without swap.
> Using Spark's default log4j profile: 
> org/apache/spark/log4j-defaults.properties
> 15/09/28 15:02:09 INFO MesosExecutorBackend: Registered signal handlers for 
> [TERM, HUP, INT]
> I0928 15:02:09.65854138 exec.cpp:132] Version: 0.23.0
> 15/09/28 15:02:09 ERROR MesosExecutorBackend: Received launchTask but 
> executor was null
> I0928 15:02:09.70295554 exec.cpp:206] Executor registered on slave 
> 20150928-044200-1140850698-5050-8-S190
> 15/09/28 15:02:09 INFO MesosExecutorBackend: Registered with Mesos as 
> executor ID 20150928-044200-1140850698-5050-8-S190 with 1 cpus
> 15/09/28 15:02:09 INFO SecurityManager: Changing view acls to: root
> 15/09/28 15:02:09 INFO SecurityManager: Changing modify acls to: root
> 15/09/28 15:02:09 INFO SecurityManager: SecurityManager: authentication 
> disabled; ui acls disabled; users with view permissions: Set(root); users 
> with modify permissions: Set(root)
> 15/09/28 15:02:10 INFO Slf4jLogger: Slf4jLogger started
> 15/09/28 15:02:10 INFO Remoting: Starting remoting
> 15/09/28 15:02:10 INFO Remoting: Remoting started; listening on addresses 
> :[akka.tcp://sparkExecutor@:56458]
> 15/09/28 15:02:10 INFO Utils: Successfully started service 'sparkExecutor' on 
> port 56458.
> 15/09/28 15:02:10 INFO DiskBlockManager: Created local directory at 
> /tmp/spark-28a21c2d-54cc-40b3-b0c2-cc3624f1a73c/blockmgr-f2336fec-e1ea-44f1-bd5c-9257049d5e7b
> 15/09/28 15:02:10 INFO MemoryStore: MemoryStore started with capacity 52.1 MB
> 15/09/28 15:02:11 WARN NativeCodeLoader: Unable to load native-hadoop library 
> for your platform... using builtin-java classes where applicable
> 15/09/28 15:02:11 INFO Executor: Starting executor ID 
> 20150928-044200-1140850698-5050-8-S190 on host 
> 15/09/28 15:02:11 INFO Utils: Successfully started service 
> 'org.apache.spark.network.netty.NettyBlockTransferService' on port 57431.
> 15/09/28 15:02:11 INFO NettyBlockTransferService: Server created on 57431
> 15/09/28 15:02:11 INFO BlockManagerMaster: Trying to register BlockManager
> 15/09/28 15:02:11 INFO BlockManagerMaster: Registered BlockManager
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org