[GitHub] spark pull request: [SPARK-9439] [yarn] [wip] External shuffle ser...

2015-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7839#issuecomment-126948763
  
Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9439] [yarn] [wip] External shuffle ser...

2015-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7839#issuecomment-126948760
  
 Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9439] [yarn] [wip] External shuffle ser...

2015-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7839#issuecomment-126937407
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9439] [yarn] [wip] External shuffle ser...

2015-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7839#issuecomment-126936242
  
Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9439] [yarn] [wip] External shuffle ser...

2015-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7839#issuecomment-126918290
  
 Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9439] [yarn] [wip] External shuffle ser...

2015-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7839#issuecomment-12651
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9439] [yarn] [wip] External shuffle ser...

2015-08-01 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7839#issuecomment-12644
  
  [Test build #39336 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/39336/console)
 for   PR 7839 at commit 
[`23994ff`](https://github.com/apache/spark/commit/23994ff0fd84162b150b2cb379eea0f8c1d54e5d).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `  public static class AppExecId implements Serializable `
  * `public class ExecutorShuffleInfo implements Encodable, Serializable `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9439] [yarn] [wip] External shuffle ser...

2015-08-01 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7839#issuecomment-126877216
  
  [Test build #39336 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/39336/consoleFull)
 for   PR 7839 at commit 
[`23994ff`](https://github.com/apache/spark/commit/23994ff0fd84162b150b2cb379eea0f8c1d54e5d).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9439] [yarn] [wip] External shuffle ser...

2015-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7839#issuecomment-126877168
  
Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9439] [yarn] [wip] External shuffle ser...

2015-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7839#issuecomment-126877165
  
 Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9439] [yarn] [wip] External shuffle ser...

2015-07-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7839#issuecomment-126867088
  
  [Test build #39329 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/39329/console)
 for   PR 7839 at commit 
[`a36729c`](https://github.com/apache/spark/commit/a36729c7b7dbb9120bc7c57d08c48766c30ddc65).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `  public static class AppExecId implements Serializable `
  * `public class ExecutorShuffleInfo implements Encodable, Serializable `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9439] [yarn] [wip] External shuffle ser...

2015-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7839#issuecomment-126867089
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9439] [yarn] [wip] External shuffle ser...

2015-07-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7839#issuecomment-126866744
  
  [Test build #39329 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/39329/consoleFull)
 for   PR 7839 at commit 
[`a36729c`](https://github.com/apache/spark/commit/a36729c7b7dbb9120bc7c57d08c48766c30ddc65).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9439] [yarn] [wip] External shuffle ser...

2015-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7839#issuecomment-126865864
  
Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9439] [yarn] [wip] External shuffle ser...

2015-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7839#issuecomment-126865860
  
 Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9439] [yarn] [wip] External shuffle ser...

2015-07-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7839#issuecomment-126817177
  
  [Test build #39275 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/39275/console)
 for   PR 7839 at commit 
[`0e9d69b`](https://github.com/apache/spark/commit/0e9d69b9fc4bdebd5d226778c336f3157f3b065f).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `public class ExecutorShuffleInfo implements Encodable, Serializable `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9439] [yarn] [wip] External shuffle ser...

2015-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7839#issuecomment-126817183
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9439] [yarn] [wip] External shuffle ser...

2015-07-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7839#issuecomment-126816790
  
  [Test build #39275 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/39275/consoleFull)
 for   PR 7839 at commit 
[`0e9d69b`](https://github.com/apache/spark/commit/0e9d69b9fc4bdebd5d226778c336f3157f3b065f).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9439] [yarn] [wip] External shuffle ser...

2015-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7839#issuecomment-126815975
  
Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9439] [yarn] [wip] External shuffle ser...

2015-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7839#issuecomment-126815953
  
 Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9439] [yarn] [wip] External shuffle ser...

2015-07-31 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/7839#issuecomment-126805317
  
Cool!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9439] [yarn] [wip] External shuffle ser...

2015-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7839#issuecomment-126804196
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9439] [yarn] [wip] External shuffle ser...

2015-07-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7839#issuecomment-126804189
  
  [Test build #39265 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/39265/console)
 for   PR 7839 at commit 
[`0b588bd`](https://github.com/apache/spark/commit/0b588bd5e0dd4ee81f9b4f856f9a57f4a4c72151).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `public class ExecutorShuffleInfo implements Encodable, Serializable `



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9439] [yarn] [wip] External shuffle ser...

2015-07-31 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/7839#issuecomment-126803874
  
  [Test build #39265 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/39265/consoleFull)
 for   PR 7839 at commit 
[`0b588bd`](https://github.com/apache/spark/commit/0b588bd5e0dd4ee81f9b4f856f9a57f4a4c72151).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9439] [yarn] [wip] External shuffle ser...

2015-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7839#issuecomment-126803595
  
Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9439] [yarn] [wip] External shuffle ser...

2015-07-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/7839#issuecomment-126803569
  
 Merged build triggered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9439] [yarn] [wip] External shuffle ser...

2015-07-31 Thread squito
GitHub user squito opened a pull request:

https://github.com/apache/spark/pull/7839

[SPARK-9439] [yarn] [wip] External shuffle service should be robust to NM 
restarts

https://issues.apache.org/jira/browse/SPARK-9439

In general, Yarn apps should be robust to NodeManager restarts.  However, 
if you run spark with the external shuffle service on, after a NM restart all 
shuffles fail, b/c the shuffle service has lost some state with info on each 
executor.  (Note the shuffle data is perfectly fine on disk across a NM 
restart, the problem is we've lost the small bit of state that lets us *find* 
those files.)

The solution proposed here is that the external shuffle service can write 
out its state to a file every time an executor is added.  When running with 
yarn, that file is in the NM's local dir.  Whenever the service is started, it 
looks for that file, and if it exists, it reads the file and re-registers all 
executors there.

Nothing is changed in non-yarn modes with this patch.  The service is not 
given a place to save the state to, so it operates the same as before.  This 
should make it easy to update other cluster managers as well, but just 
supplying the right file, but I wasn't familiar enough w/ the other modes to 
know where they should stick that file.

The current code needs a lot of polish & tests, but thought I'd put it up 
now to get opinions on the overall approach.  Also I'm pretty sure there are 
some potential leaks, if the NM restarts right when an app ends, and then we 
never remove executors or something like that, need to think about that more.  
But, I have tested this on a small cluster, killed the NM on a running spark 
app, and it works.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/squito/spark 
external_shuffle_service_NM_restart

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/7839.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #7839


commit b9d2ced92a9712edf7612f4cee1ac929e5fa3dab
Author: Imran Rashid 
Date:   2015-07-28T14:17:15Z

incomplete setup for external shuffle service tests

commit 36127d38750e75d682c8858b7128bd634732be71
Author: Imran Rashid 
Date:   2015-07-28T18:27:02Z

wip

commit bb3ba494be05a11ab8105ce4ce570f60d1d33e51
Author: Imran Rashid 
Date:   2015-07-28T18:27:12Z

minor cleanup

commit c69f46b98b6b9da16bfc7fc3a15deeade3e229ed
Author: Imran Rashid 
Date:   2015-07-30T20:08:49Z

maybe working version, needs tests & cleanup ...

commit 5e5a7c36c3f6ac2c0f0dd6738eb4bb66605f8428
Author: Imran Rashid 
Date:   2015-07-31T15:37:14Z

fix build

commit ad122ef6b07eba0326929057704ef827a7eb26d6
Author: Imran Rashid 
Date:   2015-07-31T15:40:00Z

more fixes

commit 0b588bd5e0dd4ee81f9b4f856f9a57f4a4c72151
Author: Imran Rashid 
Date:   2015-07-31T15:52:57Z

more fixes ...




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org