[GitHub] spark pull request: mesos executor ids now consist of the slave id...

2014-10-23 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1358#issuecomment-60327322 @tsliwowicz your fix seems good -- thanks for getting to the bottom of this! --- If your project is set up for it, you can reply to this email and have your reply appear

[GitHub] spark pull request: mesos executor ids now consist of the slave id...

2014-10-20 Thread tsliwowicz
Github user tsliwowicz commented on the pull request: https://github.com/apache/spark/pull/1358#issuecomment-59724452 @mateiz - @KashiErez and I took a different route. The killer issue was that there is a System.exit(1) in BlockManagerMasterActor, which was a huge robustness issue

[GitHub] spark pull request: mesos executor ids now consist of the slave id...

2014-09-17 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1358#issuecomment-55978678 I see, so maybe the problem is that an executor dies, and another is launched on the same Mesos machine with the same executor ID, which then breaks assumptions elsewhere

[GitHub] spark pull request: mesos executor ids now consist of the slave id...

2014-09-17 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1358#issuecomment-55978771 BTW the delta from the original pull request would be that we only increment our counter when the old executor fails. If you want to implement that, please create a JIRA
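
The idea in the comment above (an executor ID built from the slave ID plus a counter that is bumped only when the old executor fails) can be sketched roughly as follows. This is an illustrative sketch only, not the code from the pull request or from Spark: the class name `ExecutorIdAllocator`, its methods, and the `<slaveId>-<counter>` format are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ExecutorIdAllocator:
    """Hypothetical helper (not part of Spark): one counter per Mesos slave."""

    counters: Dict[str, int] = field(default_factory=dict)

    def current_id(self, slave_id: str) -> str:
        # Reuse the same ID for as long as the executor on this slave is alive.
        # The "<slaveId>-<counter>" format is an assumption for illustration.
        return f"{slave_id}-{self.counters.setdefault(slave_id, 0)}"

    def executor_failed(self, slave_id: str) -> None:
        # Bump the counter only on failure, so the replacement executor
        # launched on the same slave gets a fresh, non-colliding ID.
        self.counters[slave_id] = self.counters.get(slave_id, 0) + 1
```

With this shape, repeated launches on a healthy slave keep reusing one ID, and only a reported failure produces a new one, which is the delta from the original pull request described in the comment.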

[GitHub] spark pull request: mesos executor ids now consist of the slave id...

2014-09-15 Thread brndnmtthws
Github user brndnmtthws commented on the pull request: https://github.com/apache/spark/pull/1358#issuecomment-55659794 It seems that this is a symptom of the following issue: https://issues.apache.org/jira/browse/SPARK-3535

[GitHub] spark pull request: mesos executor ids now consist of the slave id...

2014-09-11 Thread brndnmtthws
Github user brndnmtthws commented on the pull request: https://github.com/apache/spark/pull/1358#issuecomment-55310679 Yep, also hitting this same problem. We're running Spark 1.0.2 and Mesos 0.20.0. From a quick analysis, it looks like a bug in Spark.

[GitHub] spark pull request: mesos executor ids now consist of the slave id...

2014-09-11 Thread KashiErez
Github user KashiErez commented on the pull request: https://github.com/apache/spark/pull/1358#issuecomment-55237569 I have encountered this issue: We have a Spark job running 24/7 on Mesos. It happens every 1-3 days. Here are 2 lines from my Driver log file:

[GitHub] spark pull request: mesos executor ids now consist of the slave id...

2014-08-20 Thread gmalouf
Github user gmalouf commented on the pull request: https://github.com/apache/spark/pull/1358#issuecomment-52836949 We've run into this issue a handful of times including once today - is it possible the bug is in Mesos?

[GitHub] spark pull request: mesos executor ids now consist of the slave id...

2014-07-28 Thread drexin
Github user drexin commented on the pull request: https://github.com/apache/spark/pull/1358#issuecomment-50318921 @mateiz: You are right. I don't see how an executor could be started more than once per slave, but it seems to happen sometimes (see the mailing list entry). I will close

[GitHub] spark pull request: mesos executor ids now consist of the slave id...

2014-07-28 Thread drexin
Github user drexin closed the pull request at: https://github.com/apache/spark/pull/1358

[GitHub] spark pull request: mesos executor ids now consist of the slave id...

2014-07-28 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1358#issuecomment-50413581 Sure, if you find it, let me know.

[GitHub] spark pull request: mesos executor ids now consist of the slave id...

2014-07-26 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1358#issuecomment-50250085 Jenkins, test this please

[GitHub] spark pull request: mesos executor ids now consist of the slave id...

2014-07-26 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1358#issuecomment-50250198 QA tests have started for PR 1358. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17229/consoleFull

[GitHub] spark pull request: mesos executor ids now consist of the slave id...

2014-07-26 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/1358#discussion_r15436238 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala --- @@ -250,7 +252,7 @@ private[spark] class

[GitHub] spark pull request: mesos executor ids now consist of the slave id...

2014-07-26 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1358#issuecomment-50250554 So I don't quite understand, how can multiple executors be launched for the same Spark application on the same node right now? I thought we always reuse our executor

[GitHub] spark pull request: mesos executor ids now consist of the slave id...

2014-07-11 Thread drexin
Github user drexin commented on the pull request: https://github.com/apache/spark/pull/1358#issuecomment-48706534 Hi Patrick, the problem is described in [this mailing list

[GitHub] spark pull request: mesos executor ids now consist of the slave id...

2014-07-11 Thread drexin
Github user drexin commented on the pull request: https://github.com/apache/spark/pull/1358#issuecomment-48707209 Created a JIRA issue here: https://issues.apache.org/jira/browse/SPARK-2445

[GitHub] spark pull request: mesos executor ids now consist of the slave id...

2014-07-10 Thread drexin
GitHub user drexin opened a pull request: https://github.com/apache/spark/pull/1358 mesos executor ids now consist of the slave id and a counter to fix duplicate id problems You can merge this pull request into a Git repository by running: $ git pull

[GitHub] spark pull request: mesos executor ids now consist of the slave id...

2014-07-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1358#issuecomment-48630513 Can one of the admins verify this patch?