[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-11 Thread tedyu
Github user tedyu commented on a diff in the pull request:

https://github.com/apache/spark/pull/11455#discussion_r55873954
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
@@ -368,6 +368,30 @@ private[deploy] class Master(
   if (canCompleteRecovery) { completeRecovery() }
 }
 
+case WorkerLatestState(workerId, executors, driverIds) =>
+  idToWorker.get(workerId) match {
+    case Some(worker) =>
+      for (exec <- executors) {
+        val executorMatches = worker.executors.exists {
+          case (_, e) => e.application.id == exec.appId && e.id == exec.execId
+        }
+        if (!executorMatches) {
+          // master doesn't recognize this executor. So just tell worker to kill it.
+          worker.endpoint.send(KillExecutor(masterUrl, exec.appId, exec.execId))
+        }
+      }
+
+      for (driverId <- driverIds) {
+        val driverMatches = worker.drivers.exists { case (id, _) => id == driverId }
+        if (!driverMatches) {
+          // master doesn't recognize this driver. So just tell worker to kill it.
+          worker.endpoint.send(KillDriver(driverId))
--- End diff --

Let me look at other parts of Master.scala and see if I can find anything.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-11 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/11455#discussion_r55867638
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
@@ -368,6 +368,30 @@ private[deploy] class Master(
   if (canCompleteRecovery) { completeRecovery() }
 }
 
+case WorkerLatestState(workerId, executors, driverIds) =>
+  idToWorker.get(workerId) match {
+    case Some(worker) =>
+      for (exec <- executors) {
+        val executorMatches = worker.executors.exists {
+          case (_, e) => e.application.id == exec.appId && e.id == exec.execId
+        }
+        if (!executorMatches) {
+          // master doesn't recognize this executor. So just tell worker to kill it.
+          worker.endpoint.send(KillExecutor(masterUrl, exec.appId, exec.execId))
+        }
+      }
+
+      for (driverId <- driverIds) {
+        val driverMatches = worker.drivers.exists { case (id, _) => id == driverId }
+        if (!driverMatches) {
+          // master doesn't recognize this driver. So just tell worker to kill it.
+          worker.endpoint.send(KillDriver(driverId))
--- End diff --

I don't think so. Which part of the code leads you to believe that?





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-11 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/11455#discussion_r55863392
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
@@ -368,6 +368,30 @@ private[deploy] class Master(
   if (canCompleteRecovery) { completeRecovery() }
 }
 
+case WorkerLatestState(workerId, executors, driverIds) =>
+  idToWorker.get(workerId) match {
+    case Some(worker) =>
+      for (exec <- executors) {
+        val executorMatches = worker.executors.exists {
+          case (_, e) => e.application.id == exec.appId && e.id == exec.execId
+        }
+        if (!executorMatches) {
+          // master doesn't recognize this executor. So just tell worker to kill it.
+          worker.endpoint.send(KillExecutor(masterUrl, exec.appId, exec.execId))
+        }
+      }
+
+      for (driverId <- driverIds) {
+        val driverMatches = worker.drivers.exists { case (id, _) => id == driverId }
+        if (!driverMatches) {
+          // master doesn't recognize this driver. So just tell worker to kill it.
+          worker.endpoint.send(KillDriver(driverId))
--- End diff --

I don't get it. Here we just compare them with the executors and drivers of the
worker stored in the master. If we find any mismatch, we just kill it.
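
For reference, the comparison described above can be sketched as a pure function. This is only a minimal illustration, not the code in the PR: `ReconcileSketch`, `ExecutorDesc`, `KillExec`, `KillDrv`, and `reconcile` are hypothetical stand-ins for the Master's internal state and for the `KillExecutor`/`KillDriver` messages.

```scala
// Hypothetical, simplified stand-ins for Spark's internal types; not the real classes.
object ReconcileSketch {
  final case class ExecutorDesc(appId: String, execId: Int)

  sealed trait KillCommand
  final case class KillExec(appId: String, execId: Int) extends KillCommand
  final case class KillDrv(driverId: String) extends KillCommand

  /** Returns one kill command per reported executor or driver the master has no record of. */
  def reconcile(
      knownExecutors: Set[ExecutorDesc],
      knownDrivers: Set[String],
      reportedExecutors: Seq[ExecutorDesc],
      reportedDrivers: Seq[String]): Seq[KillCommand] = {
    // Executors reported by the worker that the master does not know about.
    val execKills: Seq[KillCommand] =
      reportedExecutors.filterNot(knownExecutors.contains).map(e => KillExec(e.appId, e.execId))
    // Drivers are checked separately from executors.
    val driverKills: Seq[KillCommand] =
      reportedDrivers.filterNot(knownDrivers.contains).map(KillDrv(_))
    execKills ++ driverKills
  }
}
```

Because the two checks are independent in this sketch, a worker could be told to kill an unknown executor while keeping a driver the master still tracks, and the reverse.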





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-11 Thread tedyu
Github user tedyu commented on a diff in the pull request:

https://github.com/apache/spark/pull/11455#discussion_r55858792
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
@@ -368,6 +368,30 @@ private[deploy] class Master(
   if (canCompleteRecovery) { completeRecovery() }
 }
 
+case WorkerLatestState(workerId, executors, driverIds) =>
+  idToWorker.get(workerId) match {
+    case Some(worker) =>
+      for (exec <- executors) {
+        val executorMatches = worker.executors.exists {
+          case (_, e) => e.application.id == exec.appId && e.id == exec.execId
+        }
+        if (!executorMatches) {
+          // master doesn't recognize this executor. So just tell worker to kill it.
+          worker.endpoint.send(KillExecutor(masterUrl, exec.appId, exec.execId))
+        }
+      }
+
+      for (driverId <- driverIds) {
+        val driverMatches = worker.drivers.exists { case (id, _) => id == driverId }
+        if (!driverMatches) {
+          // master doesn't recognize this driver. So just tell worker to kill it.
+          worker.endpoint.send(KillDriver(driverId))
--- End diff --

Looks like there may be a scenario where an executor gets killed but the driver
is kept, or vice versa.

Is that desirable?





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/11455





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-10 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-195122621
  
Merged into master.





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-195121484
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52861/
Test PASSed.





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-195121483
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-195121283
  
**[Test build #52861 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52861/consoleFull)**
 for PR 11455 at commit 
[`51ac6dd`](https://github.com/apache/spark/commit/51ac6dd2f0e454f07e9cf0b2225e56c3aad841db).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-195082183
  
**[Test build #52861 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52861/consoleFull)**
 for PR 11455 at commit 
[`51ac6dd`](https://github.com/apache/spark/commit/51ac6dd2f0e454f07e9cf0b2225e56c3aad841db).





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-195077185
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-195077187
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52853/
Test FAILed.





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-195076982
  
**[Test build #52853 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52853/consoleFull)**
 for PR 11455 at commit 
[`51ac6dd`](https://github.com/apache/spark/commit/51ac6dd2f0e454f07e9cf0b2225e56c3aad841db).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-195036376
  
**[Test build #52853 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52853/consoleFull)**
 for PR 11455 at commit 
[`51ac6dd`](https://github.com/apache/spark/commit/51ac6dd2f0e454f07e9cf0b2225e56c3aad841db).





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-10 Thread zsxwing
Github user zsxwing commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-195035761
  
retest this please





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-195030239
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52838/
Test FAILed.





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-195030235
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-195029884
  
**[Test build #52838 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52838/consoleFull)**
 for PR 11455 at commit 
[`51ac6dd`](https://github.com/apache/spark/commit/51ac6dd2f0e454f07e9cf0b2225e56c3aad841db).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-10 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-194990436
  
**[Test build #52838 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52838/consoleFull)**
 for PR 11455 at commit 
[`51ac6dd`](https://github.com/apache/spark/commit/51ac6dd2f0e454f07e9cf0b2225e56c3aad841db).





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-09 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-194623126
  
LGTM, just style nits.





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-09 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/11455#discussion_r55625576
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/DeployMessage.scala ---
@@ -64,6 +64,11 @@ private[deploy] object DeployMessages {
   case class WorkerSchedulerStateResponse(id: String, executors: List[ExecutorDescription],
     driverIds: Seq[String])
 
+  case class WorkerLatestState(
--- End diff --

Can you add a comment here to explain when this is called, i.e. when the
worker reregisters with the master?
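
A documented message along these lines might look like the sketch below. The field names follow the pattern match and constructor call visible in the diffs elsewhere in this thread, but the field types, the simplified `ExecutorDescription` (with `state` as a plain `String`), and the enclosing `MessageSketch` object are assumptions made for illustration, not the definition in the PR.

```scala
// Hypothetical sketch; types are assumed, not taken from DeployMessage.scala.
object MessageSketch {
  // Simplified stand-in: the real class lives in Spark and may differ.
  final case class ExecutorDescription(appId: String, execId: Int, cores: Int, state: String)

  /**
   * Assumed to be sent by a worker to the master once the worker has
   * reregistered, so the master can compare the worker's running executors
   * and drivers against its own records.
   *
   * @param id        the worker id
   * @param executors the executors currently running on the worker
   * @param driverIds the ids of the drivers currently running on the worker
   */
  final case class WorkerLatestState(
      id: String,
      executors: Seq[ExecutorDescription],
      driverIds: Seq[String])
}
```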





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-09 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/11455#discussion_r55625550
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala ---
@@ -374,6 +374,10 @@ private[deploy] class Worker(
   }, CLEANUP_INTERVAL_MILLIS, CLEANUP_INTERVAL_MILLIS, TimeUnit.MILLISECONDS)
 }
 
+val execs = executors.values.
+  map(e => new ExecutorDescription(e.appId, e.execId, e.cores, e.state))
--- End diff --

style:
```
executors.values.map { e =>
  new ExecutorDescription(...)
}
```
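
Spelled out as a self-contained sketch, the suggested style could look like this. `MapStyleSketch`, `RunningExec`, and `describe` are hypothetical names, and the simplified `ExecutorDescription` only mirrors the four constructor arguments seen in the diff.

```scala
// Hypothetical sketch of the brace-and-newline style for a multi-line map.
object MapStyleSketch {
  // Stand-in for the worker's per-executor bookkeeping object.
  final case class RunningExec(appId: String, execId: Int, cores: Int, state: String)
  // Simplified stand-in for Spark's ExecutorDescription.
  final case class ExecutorDescription(appId: String, execId: Int, cores: Int, state: String)

  def describe(executors: Map[String, RunningExec]): List[ExecutorDescription] =
    executors.values.map { e =>
      new ExecutorDescription(e.appId, e.execId, e.cores, e.state)
    }.toList
}
```

The behavior is the same as the `.` + `map(e => ...)` form in the diff; only the layout changes.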





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-09 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/11455#discussion_r55625529
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
@@ -368,6 +368,27 @@ private[deploy] class Master(
   if (canCompleteRecovery) { completeRecovery() }
 }
 
+case WorkerLatestState(workerId, executors, driverIds) =>
+  idToWorker.get(workerId) match {
+    case Some(worker) =>
+      for (exec <- executors) {
+        if (!worker.executors.exists(
+          e => e._2.application.id == exec.appId && e._2.id == exec.execId)) {
--- End diff --

style: can you use `.exists { case (_, something) => something.application.id ... }` and store it in a variable? e.g.
```
for (exec <- executors) {
  val executorMatches = worker.executors.exists { ... }
  if (!executorMatches) {
    worker.endpoint.send(...)
  }
}
```
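
To make the difference concrete, here is a small sketch that puts the two forms side by side. `ExistsStyleSketch`, `App`, `ExecInfo`, `tupleStyle`, and `patternStyle` are hypothetical names chosen for the example; only the shape of the predicate comes from the diff.

```scala
// Hypothetical sketch comparing tuple accessors with a pattern-matching closure.
object ExistsStyleSketch {
  final case class App(id: String)
  final case class ExecInfo(id: Int, application: App)

  // Original form: positional access through e._2 hides what the value is.
  def tupleStyle(executors: Map[String, ExecInfo], appId: String, execId: Int): Boolean =
    executors.exists(e => e._2.application.id == appId && e._2.id == execId)

  // Suggested form: destructure the (key, value) pair and name the result.
  def patternStyle(executors: Map[String, ExecInfo], appId: String, execId: Int): Boolean = {
    val executorMatches = executors.exists {
      case (_, e) => e.application.id == appId && e.id == execId
    }
    executorMatches
  }
}
```

Both functions evaluate the same predicate; the second just reads more clearly at the call site.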





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-09 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/11455#discussion_r55625534
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
@@ -368,6 +368,27 @@ private[deploy] class Master(
   if (canCompleteRecovery) { completeRecovery() }
 }
 
+case WorkerLatestState(workerId, executors, driverIds) =>
+  idToWorker.get(workerId) match {
+    case Some(worker) =>
+      for (exec <- executors) {
+        if (!worker.executors.exists(
+          e => e._2.application.id == exec.appId && e._2.id == exec.execId)) {
+          // master doesn't recognize this executor. So just tell worker to kill it.
+          worker.endpoint.send(KillExecutor(masterUrl, exec.appId, exec.execId))
+        }
+      }
+
+      for (driverId <- driverIds) {
+        if (!worker.drivers.exists(_._1 == driverId)) {
--- End diff --

same here





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-08 Thread zsxwing
Github user zsxwing commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-194058366
  
ping @andrewor14 





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-191501100
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-191501101
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52339/
Test PASSed.





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-02 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-191500837
  
**[Test build #52339 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52339/consoleFull)**
 for PR 11455 at commit 
[`1b95f5b`](https://github.com/apache/spark/commit/1b95f5b4b04541aa2368c96f8431387c63df3c7a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `  case class WorkerLatestState(`





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-02 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-191445880
  
**[Test build #52339 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52339/consoleFull)**
 for PR 11455 at commit 
[`1b95f5b`](https://github.com/apache/spark/commit/1b95f5b4b04541aa2368c96f8431387c63df3c7a).





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-191037841
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52275/
Test PASSed.





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-191037838
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-01 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-191037651
  
**[Test build #52275 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52275/consoleFull)**
 for PR 11455 at commit 
[`97002e4`](https://github.com/apache/spark/commit/97002e4e82da99ddf4809c602fc05178ad9cb955).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-191032425
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-191032426
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52272/
Test PASSed.





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-01 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-191032127
  
**[Test build #52272 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52272/consoleFull)** for PR 11455 at commit [`6c13702`](https://github.com/apache/spark/commit/6c13702ea10973af27885c3ecaa4213f2f3f0892).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-191015418
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-01 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-191015410
  
**[Test build #52279 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52279/consoleFull)** for PR 11455 at commit [`7e0b2a2`](https://github.com/apache/spark/commit/7e0b2a277503cd7e573ee3e3ff594705f56cb630).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `  case class WorkerState(`





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-191015420
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52279/
Test FAILed.





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-01 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-191014184
  
**[Test build #52279 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52279/consoleFull)** for PR 11455 at commit [`7e0b2a2`](https://github.com/apache/spark/commit/7e0b2a277503cd7e573ee3e3ff594705f56cb630).





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-01 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-191002969
  
**[Test build #52275 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52275/consoleFull)** for PR 11455 at commit [`97002e4`](https://github.com/apache/spark/commit/97002e4e82da99ddf4809c602fc05178ad9cb955).





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-01 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/11455#discussion_r54663197
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/ApplicationInfo.scala ---
@@ -69,7 +69,11 @@ private[spark] class ApplicationInfo(
 appUIUrlAtHistoryServer = None
   }
 
-  private def newExecutorId(useID: Option[Int] = None): Int = {
+  /**
+   * If `useID` is empty, allocate and return a new executor id. Otherwise, update `nextExecutorId`
+   * if necessary and return `useID.get`.
+   */
+  private def getOrCreateExecutorId(useID: Option[Int] = None): Int = {
--- End diff --

Renamed this method because the old name was confusing: it does not always allocate a new id and may just return the executor id that was passed in.
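
For readers following along, the behavior described in the doc comment can be sketched roughly as below. This is a minimal illustration, not the actual `ApplicationInfo` code: the `ExecutorIdAllocator` class is hypothetical, and the exact rule for advancing `nextExecutorId` (keeping it strictly ahead of any supplied id) is an assumption about what "update if necessary" means.

```scala
// Hypothetical stand-in for the id bookkeeping inside ApplicationInfo.
// Only `useID`, `nextExecutorId`, and the method name come from the diff.
class ExecutorIdAllocator {
  private var nextExecutorId: Int = 0

  def getOrCreateExecutorId(useID: Option[Int] = None): Int = useID match {
    case Some(id) =>
      // Assumption: keep the counter ahead of any id supplied by the caller,
      // so that later allocations never collide with it.
      nextExecutorId = math.max(nextExecutorId, id + 1)
      id
    case None =>
      // No id supplied: allocate a fresh one.
      val id = nextExecutorId
      nextExecutorId += 1
      id
  }
}
```

With this shape, passing `Some(5)` returns 5 unchanged, and the next call without an argument returns a fresh id above it.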





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-01 Thread zsxwing
Github user zsxwing commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-190996138
  
cc @andrewor14 





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-01 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/11455#discussion_r54663122
  
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
@@ -346,19 +346,29 @@ private[deploy] class Master(
   logInfo("Worker has been re-registered: " + workerId)
   worker.state = WorkerState.ALIVE
 
-  val validExecutors = executors.filter(exec => idToApp.get(exec.appId).isDefined)
+  val (validExecutors, invalidExecutors) =
+executors.partition(exec => idToApp.get(exec.appId).isDefined)
   for (exec <- validExecutors) {
--- End diff --

`validExecutors` may contain executors launched after the master sends 
`RegisteredWorker`. In that case `app.addExecutor` and `worker.addExecutor` 
will be called twice for the same executor, so I made both methods idempotent.
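
To illustrate what idempotent means here, a rough sketch follows. It is not the code in this PR: `ExecutorDesc`, `WorkerBookkeeping`, and the `appId/execId` key format are hypothetical names chosen for the example, and only the idea of guarding `addExecutor` against a repeated call is taken from the comment above.

```scala
import scala.collection.mutable

// Hypothetical minimal description of an executor; not a Spark class.
final case class ExecutorDesc(appId: String, execId: Int, cores: Int)

// Hypothetical worker-side bookkeeping with an idempotent addExecutor.
class WorkerBookkeeping {
  private val executors = mutable.HashMap.empty[String, ExecutorDesc]
  var coresUsed: Int = 0

  def addExecutor(exec: ExecutorDesc): Unit = {
    val fullId = s"${exec.appId}/${exec.execId}"
    // Only account for the executor the first time it is seen, so a
    // second call with the same executor does not double-count cores.
    if (!executors.contains(fullId)) {
      executors(fullId) = exec
      coresUsed += exec.cores
    }
  }

  def executorCount: Int = executors.size
}
```

Calling `addExecutor` twice with the same `ExecutorDesc` leaves `coresUsed` and `executorCount` the same as after the first call, which is the property needed when a re-registering worker reports an executor the master already knows about.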





[GitHub] spark pull request: [SPARK-13604][Core]Sync worker's state after r...

2016-03-01 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11455#issuecomment-190995264
  
**[Test build #52272 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52272/consoleFull)** for PR 11455 at commit [`6c13702`](https://github.com/apache/spark/commit/6c13702ea10973af27885c3ecaa4213f2f3f0892).

