[GitHub] spark pull request: [SPARK-1146] Vagrant support for Spark

2014-02-27 Thread ngbinh
GitHub user ngbinh reopened a pull request: https://github.com/apache/spark/pull/26 [SPARK-1146] Vagrant support for Spark This PR uses Vagrant to create a cluster of three VMs, one master and two workers. It allows running/testing Spark cluster mode on one machine. My ini

[GitHub] spark pull request: [SPARK-1146] Vagrant support for Spark

2014-02-27 Thread ngbinh
Github user ngbinh closed the pull request at: https://github.com/apache/spark/pull/26 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enable

[GitHub] spark pull request: [SPARK-1146] Vagrant support for Spark

2014-02-27 Thread ngbinh
Github user ngbinh commented on the pull request: https://github.com/apache/spark/pull/26#issuecomment-36275658 Thanks for reminding me.

[GitHub] spark pull request: [SPARK-1146] Vagrant support for Spark

2014-02-27 Thread hsaputra
Github user hsaputra commented on the pull request: https://github.com/apache/spark/pull/26#issuecomment-36275259 Hi @ngbinh, do you mind tagging the PR with "[WIP]" prefix to help indicate you are still working on this? Thx!

[GitHub] spark pull request: [SPARK-1146] Vagrant support for Spark

2014-02-27 Thread ngbinh
Github user ngbinh commented on the pull request: https://github.com/apache/spark/pull/26#issuecomment-36226895 One of the main reasons I work on this PR is that I have often found that working in Spark local mode doesn't expose problems that appear when deployed on a cluster. This PR should allow Spark dev

[GitHub] spark pull request: [SPARK-1146] Vagrant support for Spark

2014-02-27 Thread ngbinh
Github user ngbinh commented on the pull request: https://github.com/apache/spark/pull/26#issuecomment-36226574 I can argue that having EC2 and standalone cluster scripts inside the core repo is important for Spark adoption. @markhamstra I agree. My feeling is the benefit is st

[GitHub] spark pull request: [SPARK-1146] Vagrant support for Spark

2014-02-27 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/26#issuecomment-36226183 Yes, they definitely have value, but putting them directly into Spark also has costs and imposes responsibilities on the maintainers. The question is how to get the be

[GitHub] spark pull request: [SPARK-1146] Vagrant support for Spark

2014-02-27 Thread ngbinh
Github user ngbinh commented on the pull request: https://github.com/apache/spark/pull/26#issuecomment-36226015 I agree that they need not be a part of Spark core, because there are usually no direct dependencies between them. But I feel like they make Spark more accessib

[GitHub] spark pull request: [SPARK-1146] Vagrant support for Spark

2014-02-27 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/26#issuecomment-36225852 FWIW I agree. The tendency is almost always to include a bunch of modules that are really separate, slightly-downstream projects. You could make similar arguments for even m

[GitHub] spark pull request: [SPARK-1146] Vagrant support for Spark

2014-02-27 Thread jyotiska
Github user jyotiska commented on the pull request: https://github.com/apache/spark/pull/26#issuecomment-36225645 +1 It would be better if these projects were made separate from the core Spark project and grown as independent projects. This keeps the core project lean and helps to grow the

[GitHub] spark pull request: [SPARK-1146] Vagrant support for Spark

2014-02-27 Thread markhamstra
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/26#issuecomment-36224757 I'm bothered by the idea of vagrant, docker, ec2, and potentially other virtualization and cloud environments (EMR, etc.) all becoming supported and maintained parts of

[GitHub] spark pull request: [SPARK-1146] Vagrant support for Spark

2014-02-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/26#issuecomment-36222794 Merged build finished.

[GitHub] spark pull request: [SPARK-1146] Vagrant support for Spark

2014-02-27 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/26#issuecomment-36222795 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12911/