Re: Should spark-ec2 get its own repo?

2015-08-03 Thread Shivaram Venkataraman
I sent a note to the Mesos developers and created https://github.com/apache/spark/pull/7899 to change the repository pointer. There are 3-4 open PRs right now in the mesos/spark-ec2 repository and I'll work on migrating them to amplab/spark-ec2 later today. My thoughts on moving the python script

Re: Should spark-ec2 get its own repo?

2015-08-02 Thread Nicholas Chammas
On Sat, Aug 1, 2015 at 1:09 PM Matt Goodman meawo...@gmail.com wrote: I am considering porting some of this to a more general spark-cloud launcher, including google/aliyun/rackspace. It shouldn't be hard at all given the current approach for setup/install. FWIW, there are already some tools

Re: Should spark-ec2 get its own repo?

2015-08-01 Thread Matt Goodman
I think that is a good idea, and slated to happen. At the very least a README or some such. Is this a use case for git submodules? I am considering porting some of this to a more general spark-cloud launcher, including google/aliyun/rackspace. It shouldn't be hard at all given the current

Re: Should spark-ec2 get its own repo?

2015-08-01 Thread Josh Rosen
I don't think that using git submodules is a good idea here: - The extra `git submodule init && git submodule update` step can lead to confusing problems in certain workflows. - We'd wind up with many commits that serve only to bump the submodule SHA; these commits will be hard to

Re: Should spark-ec2 get its own repo?

2015-07-31 Thread Patrick Wendell
Hey All, I've mostly kept quiet since I am not very active in maintaining this code anymore. However, it is a bit odd that the project is split-brained, with a lot of the code being on GitHub and some in the Spark repo. If the consensus is to migrate everything to GitHub, that seems okay with me.

Re: Should spark-ec2 get its own repo?

2015-07-31 Thread Shivaram Venkataraman
Yes - it is still in progress, but I just have not had time to get to this. I think getting the repo moved from mesos to amplab in the codebase by 1.5 should be possible. Thanks Shivaram On Fri, Jul 31, 2015 at 3:08 AM, Sean Owen so...@cloudera.com wrote: PS is this still in progress? It

Re: Should spark-ec2 get its own repo?

2015-07-31 Thread Sean Owen
PS is this still in progress? It feels like something that would be good to do before 1.5.0, if it's going to happen soon. On Wed, Jul 22, 2015 at 6:59 AM, Shivaram Venkataraman shiva...@eecs.berkeley.edu wrote: Yeah I'll send a note to the mesos dev list just to make sure they are informed.

Re: Should spark-ec2 get its own repo?

2015-07-22 Thread Shivaram Venkataraman
Yeah I'll send a note to the mesos dev list just to make sure they are informed. Shivaram On Tue, Jul 21, 2015 at 11:47 AM, Sean Owen so...@cloudera.com wrote: I agree it's worth informing Mesos devs and checking that there are no big objections. I presume Shivaram is plugged in enough to

Re: Should spark-ec2 get its own repo?

2015-07-21 Thread Shivaram Venkataraman
There is technically no PMC for the spark-ec2 project (I guess we are kind of establishing one right now). I haven't heard anything from the Spark PMC on the dev list that might suggest a need for a vote so far. I will send another round of email notification to the dev list when we have a JIRA /

Re: Should spark-ec2 get its own repo?

2015-07-21 Thread Mridul Muralidharan
If I am not wrong, since the code was hosted within the Mesos project repo, I assume (at least part of) it is owned by the Mesos project and so its PMC? - Mridul On Tue, Jul 21, 2015 at 9:22 AM, Shivaram Venkataraman shiva...@eecs.berkeley.edu wrote: There is technically no PMC for the spark-ec2

Re: Should spark-ec2 get its own repo?

2015-07-21 Thread Shivaram Venkataraman
That's part of the confusion we are trying to fix here -- the repository used to live in the mesos GitHub account but was never a part of the Apache Mesos project. It was a remnant part of Spark from when Spark used to live at github.com/mesos/spark. Shivaram On Tue, Jul 21, 2015 at 11:03 AM,

Re: Should spark-ec2 get its own repo?

2015-07-21 Thread Mridul Muralidharan
That sounds good. Thanks for clarifying! Regards, Mridul On Tue, Jul 21, 2015 at 11:09 AM, Shivaram Venkataraman shiva...@eecs.berkeley.edu wrote: That's part of the confusion we are trying to fix here -- the repository used to live in the mesos GitHub account but was never a part of the

Re: Should spark-ec2 get its own repo?

2015-07-20 Thread Shivaram Venkataraman
I've created https://github.com/amplab/spark-ec2 and added an initial set of committers. Note that this is not a fork of the existing github.com/mesos/spark-ec2 and users will need to fork from here. This is mostly to avoid the base-fork in pull requests being set incorrectly etc. I'll be

Re: Should spark-ec2 get its own repo?

2015-07-20 Thread Mridul Muralidharan
Might be a good idea to get the PMCs of both projects to sign off to prevent future issues with Apache. Regards, Mridul On Mon, Jul 20, 2015 at 12:01 PM, Shivaram Venkataraman shiva...@eecs.berkeley.edu wrote: I've created https://github.com/amplab/spark-ec2 and added an initial set of

Re: Should spark-ec2 get its own repo?

2015-07-17 Thread Shivaram Venkataraman
Some replies inline On Wed, Jul 15, 2015 at 1:08 AM, Sean Owen so...@cloudera.com wrote: The code can continue to be a good reference implementation, no matter where it lives. In fact, it can be a better, more complete one, and easier to update. I agree that ec2/ needs to retain some kind of

Re: Should spark-ec2 get its own repo?

2015-07-17 Thread Sean Owen
On Fri, Jul 17, 2015 at 6:58 PM, Shivaram Venkataraman shiva...@eecs.berkeley.edu wrote: I am not sure why the ASF JIRA can be only used to track one set of artifacts that are packaged and released together. I agree that marking a fix version as 1.5 for a change in another repo doesn't make a

Re: Should spark-ec2 get its own repo?

2015-07-15 Thread Sean Owen
The code can continue to be a good reference implementation, no matter where it lives. In fact, it can be a better, more complete one, and easier to update. I agree that ec2/ needs to retain some kind of pointer to the new location. Yes, maybe a script as well that does the checkout as you say. We

Re: Should spark-ec2 get its own repo?

2015-07-14 Thread Matt Goodman
I concur with the things Sean said about keeping the same JIRA. Frankly, it's a pretty small part of Spark, and as mentioned by Nicholas, a reference implementation of getting Spark running in EC2. I can see wanting to grow it to a little more general tool that implements launchers for other

Re: Should spark-ec2 get its own repo?

2015-07-13 Thread Shivaram Venkataraman
Moving the repo location and re-organizing the Python code to handle dependencies, testing etc. sounds good to me. However, there are a couple of things I am not sure about: 1. I strongly believe that we should preserve existing command-line in ec2/spark-ec2 (i.e. the shell

Re: Should spark-ec2 get its own repo?

2015-07-13 Thread Nicholas Chammas
At a high level I see the spark-ec2 scripts as an effort to provide a reference implementation for launching EC2 clusters with Apache Spark. On a side note, this is precisely how I used spark-ec2 for a personal project that does something similar: reference implementation. Nick On Mon, Jul 13, 2015

Re: Should spark-ec2 get its own repo?

2015-07-12 Thread Sean Owen
I agree with these points. The ec2 support is substantially a separate project, and would likely be better managed as one. People can much more rapidly iterate on it and release it. I suggest: 1. Pick a new repo location. amplab/spark-ec2 ? spark-ec2/spark-ec2 ? 2. Add interested parties as

Re: Should spark-ec2 get its own repo?

2015-07-11 Thread Matt Goodman
I wanted to revive the conversation about the spark-ec2 tools, as it seems to have been lost in the 1.4.1 release voting spree. I think that splitting it into its own repository is a really good move, and I would also be happy to help with this transition, as well as help maintain the resulting

Re: Should spark-ec2 get its own repo?

2015-07-03 Thread Sean Owen
I'll render an opinion although I'm only barely qualified by having just had a small discussion on this -- It does seem like mesos/spark-ec2 is in the wrong place, although really, that is at best an issue for Mesos. But it does highlight that the Spark EC2 support doesn't entirely live with and

Re: Should spark-ec2 get its own repo?

2015-07-03 Thread Shivaram Venkataraman
As the person maintaining the mesos/spark-ec2 repo, here are my 2 cents - I don't think it makes sense to put the scripts in the Spark repo itself. Cloning the scripts on the EC2 instances is an intentional design which allows us to make minor config changes in EC2 launches without needing a new
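
For readers unfamiliar with the design described above, here is a minimal sketch of what "cloning the scripts on the EC2 instances" with a configurable repository pointer could look like. It is not the actual spark-ec2 code: the constant names, the `clone_command` helper, the `master` branch default and the `spark-ec2` destination directory are all assumptions made for illustration. Only the repository URLs come from the thread.

```python
import shlex

# Hypothetical defaults. The thread names the repositories, but these
# constants and the branch name are assumptions for this sketch.
DEFAULT_REPO = "https://github.com/amplab/spark-ec2"
DEFAULT_BRANCH = "master"


def clone_command(repo=DEFAULT_REPO, branch=DEFAULT_BRANCH, dest="spark-ec2"):
    """Build the git command a launcher could run on a freshly started instance.

    Because the repository and branch are plain parameters, pointing the
    launcher at a different location is a one-line change on the caller's side.
    """
    return ["git", "clone", "--branch", branch, "--single-branch", repo, dest]


if __name__ == "__main__":
    # Print the command as a shell-quoted string instead of running it.
    print(" ".join(shlex.quote(part) for part in clone_command()))
```

Under these assumptions, switching from mesos/spark-ec2 to amplab/spark-ec2 only means changing the default URL (or passing a different `repo`), which is the kind of "repository pointer" change discussed in this thread.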