[ 
https://issues.apache.org/jira/browse/SPARK-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092376#comment-14092376
 ] 

Xu Zhongxing edited comment on SPARK-2204 at 8/11/14 6:49 AM:
--------------------------------------------------------------

I encountered this issue again when I use Spark 1.0.2, Mesos 0.18.1, 
spark-cassandra-connector master branch.

Maybe this is not fixed on some failure/exception paths.

I run spark in coarse-grained mode. There are some exceptions thrown at the 
executors. But the spark driver is waiting and printing repeatedly:

TRACE [spark-akka.actor.default-dispatcher-17] 2014-08-11 10:57:32,998 
Logging.scala (line 66) Checking for hosts with\
 no recent heart beats in BlockManagerMaster.

The mesos master WARNING log:
W0811 10:32:58.172175 1646 master.cpp:2103] Ignoring unknown exited executor 
20140808-113811-858302656-5050-1645-2 on slave 20140808-113811-858302656-505\
0-1645-2 (ndb9)
W0811 10:32:58.181217 1649 master.cpp:2103] Ignoring unknown exited executor 
20140808-113811-858302656-5050-1645-5 on slave 20140808-113811-858302656-505\
0-1645-5 (ndb5)
W0811 10:32:58.277014 1650 master.cpp:2103] Ignoring unknown exited executor 
20140808-113811-858302656-5050-1645-3 on slave 20140808-113811-858302656-505\
0-1645-3 (ndb6)
W0811 10:32:58.344130 1648 master.cpp:2103] Ignoring unknown exited executor 
20140808-113811-858302656-5050-1645-0 on slave 20140808-113811-858302656-505\
0-1645-0 (ndb0)
W0811 10:32:58.354117 1651 master.cpp:2103] Ignoring unknown exited executor 
20140804-095254-505981120-5050-20258-11 on slave 20140804-095254-505981120-5\
050-20258-11 (ndb2)
W0811 10:32:58.550233 1647 master.cpp:2103] Ignoring unknown exited executor 
20140804-172212-505981120-5050-26571-2 on slave 20140804-172212-505981120-50\
50-26571-2 (ndb3)
W0811 10:32:58.793258 1653 master.cpp:2103] Ignoring unknown exited executor 
20140804-095254-505981120-5050-20258-19 on slave 20140804-095254-505981120-5\
050-20258-19 (ndb1)
W0811 10:32:58.904842 1652 master.cpp:2103] Ignoring unknown exited executor 
20140804-172212-505981120-5050-26571-0 on slave 20140804-172212-505981120-50\
50-26571-0 (ndb4)

Some other logs are at: 
https://github.com/datastax/spark-cassandra-connector/issues/134



was (Author: xuzhongxing):
I encountered this issue again when I use Spark 1.0.2, Mesos 0.18.1, 
spark-cassandra-connector master branch.

I run spark in coarse-grained mode. There are some exceptions thrown at the 
executors. But the spark driver is waiting and printing repeatedly:

TRACE [spark-akka.actor.default-dispatcher-17] 2014-08-11 10:57:32,998 
Logging.scala (line 66) Checking for hosts with\
 no recent heart beats in BlockManagerMaster.

The mesos master WARNING log:
W0811 10:32:58.172175 1646 master.cpp:2103] Ignoring unknown exited executor 
20140808-113811-858302656-5050-1645-2 on slave 20140808-113811-858302656-505\
0-1645-2 (ndb9)
W0811 10:32:58.181217 1649 master.cpp:2103] Ignoring unknown exited executor 
20140808-113811-858302656-5050-1645-5 on slave 20140808-113811-858302656-505\
0-1645-5 (ndb5)
W0811 10:32:58.277014 1650 master.cpp:2103] Ignoring unknown exited executor 
20140808-113811-858302656-5050-1645-3 on slave 20140808-113811-858302656-505\
0-1645-3 (ndb6)
W0811 10:32:58.344130 1648 master.cpp:2103] Ignoring unknown exited executor 
20140808-113811-858302656-5050-1645-0 on slave 20140808-113811-858302656-505\
0-1645-0 (ndb0)
W0811 10:32:58.354117 1651 master.cpp:2103] Ignoring unknown exited executor 
20140804-095254-505981120-5050-20258-11 on slave 20140804-095254-505981120-5\
050-20258-11 (ndb2)
W0811 10:32:58.550233 1647 master.cpp:2103] Ignoring unknown exited executor 
20140804-172212-505981120-5050-26571-2 on slave 20140804-172212-505981120-50\
50-26571-2 (ndb3)
W0811 10:32:58.793258 1653 master.cpp:2103] Ignoring unknown exited executor 
20140804-095254-505981120-5050-20258-19 on slave 20140804-095254-505981120-5\
050-20258-19 (ndb1)
W0811 10:32:58.904842 1652 master.cpp:2103] Ignoring unknown exited executor 
20140804-172212-505981120-5050-26571-0 on slave 20140804-172212-505981120-50\
50-26571-0 (ndb4)

Some other logs are at: 
https://github.com/datastax/spark-cassandra-connector/issues/134


> Scheduler for Mesos in fine-grained mode launches tasks on wrong executors
> --------------------------------------------------------------------------
>
>                 Key: SPARK-2204
>                 URL: https://issues.apache.org/jira/browse/SPARK-2204
>             Project: Spark
>          Issue Type: Bug
>          Components: Mesos
>    Affects Versions: 1.0.0
>            Reporter: Sebastien Rainville
>            Assignee: Sebastien Rainville
>            Priority: Blocker
>             Fix For: 1.0.1, 1.1.0
>
>
> MesosSchedulerBackend.resourceOffers(SchedulerDriver, List[Offer]) is 
> assuming that TaskSchedulerImpl.resourceOffers(Seq[WorkerOffer]) is returning 
> task lists in the same order as the offers it was passed, but in the current 
> implementation TaskSchedulerImpl.resourceOffers shuffles the offers to avoid 
> assigning the tasks always to the same executors. The result is that the 
> tasks are launched on the wrong executors.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to