Thanks for posting that script, Patrick. It looks like a good place to start.
Regarding Docker vs. Packer: as I understand it, you can use Packer to create
Docker containers at the same time as AMIs and other image types.

Nick

On Sat, Oct 4, 2014 at 2:49 AM, Patrick Wendell <pwend...@gmail.com> wrote:
> Hey All,
>
> Just a couple of notes. I recently posted a shell script for creating the
> AMIs from a clean Amazon Linux AMI:
>
> https://github.com/mesos/spark-ec2/blob/v3/create_image.sh
>
> I think I will update the AMIs soon to get the most recent security
> updates. For spark-ec2's purposes this is probably sufficient (we'll
> only need to re-create them every few months).
>
> However, it would be cool if someone wanted to tackle providing a more
> general mechanism for defining Spark-friendly "images" that could be
> used more broadly. I had thought that Docker might be a good way to go
> for something like this, but maybe this Packer thing is good too.
>
> For one thing, if we had a standard image we could use it to create
> containers for running Spark's unit tests, which would be really cool.
> This would help a lot with the random issues around port and filesystem
> contention we have in unit tests.
>
> I'm not sure whether the long-term place for this would be inside the
> Spark codebase or a community library. But it would definitely be very
> valuable to have if someone wanted to take it on.
>
> - Patrick
>
> On Fri, Oct 3, 2014 at 5:20 PM, Nicholas Chammas
> <nicholas.cham...@gmail.com> wrote:
>> FYI: There is an existing issue -- SPARK-3314
>> <https://issues.apache.org/jira/browse/SPARK-3314> -- about scripting
>> the creation of Spark AMIs.
>>
>> With Packer, it looks like we may be able to script the creation of
>> multiple image types (VMware, GCE, AMI, Docker, etc.) at once from a
>> single Packer template. That's very cool.
>>
>> I'll be looking into this.
>>
>> Nick
>>
>> On Thu, Oct 2, 2014 at 8:23 PM, Nicholas Chammas
>> <nicholas.cham...@gmail.com> wrote:
>>> Thanks for the update, Nate.
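For reference, the single-template, multi-builder setup being discussed above might look roughly like the sketch below. This is only an illustration: the region, source AMI id, base Docker image, and the reuse of create_image.sh as a provisioner are all placeholder assumptions, not a tested configuration.

```shell
# Sketch of one Packer template with two builders, so a single
# `packer build` run can produce both an AMI and a Docker image.
# All concrete values (region, ami-XXXXXXXX, base image) are placeholders.
cat > spark-image.json <<'EOF'
{
  "builders": [
    {
      "type": "amazon-ebs",
      "region": "us-east-1",
      "source_ami": "ami-XXXXXXXX",
      "instance_type": "m3.large",
      "ssh_username": "ec2-user",
      "ami_name": "spark-ec2-{{timestamp}}"
    },
    {
      "type": "docker",
      "image": "base-linux-image",
      "commit": true
    }
  ],
  "provisioners": [
    {
      "type": "shell",
      "script": "create_image.sh"
    }
  ]
}
EOF

# Validate the template if Packer happens to be installed locally.
if command -v packer >/dev/null 2>&1; then
  packer validate spark-image.json
fi
```

The provisioners run against every builder, which is what would keep the AMI and the Docker image in sync from one definition.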
>>> I'm looking forward to seeing how these projects turn out.
>>>
>>> David, Packer looks very, very interesting. I'm going to look into it
>>> more next week.
>>>
>>> Nick
>>>
>>> On Thu, Oct 2, 2014 at 8:00 PM, Nate D'Amico <n...@reactor8.com> wrote:
>>>> A bit of progress on our end, and a bit of lagging as well. The person
>>>> leading the effort got a little bogged down on a client project
>>>> updating a Hive/SQL testbed to the latest Spark/Spark SQL, and we're
>>>> also launching a public service, so we have been a bit scattered
>>>> recently.
>>>>
>>>> We will have some more updates, probably after next week. We are
>>>> planning on taking our client work around Hive/Spark, plus taking over
>>>> the Bigtop automation work, to modernize it and get it fit for human
>>>> consumption outside our org. All our work and Puppet modules will be
>>>> open sourced and documented; hopefully we can start to rally some
>>>> other folks around the effort who find it useful.
>>>>
>>>> Side note: another effort we are looking into is Gradle tests/support.
>>>> We have been leveraging serverspec for some basic infrastructure
>>>> tests, but with Bigtop switching over to a Gradle build/testing setup
>>>> in 0.8 we want to include support for that in our own efforts.
>>>> There is probably some stuff there that can be learned and leveraged
>>>> in the Spark world for repeatable/tested infrastructure.
>>>>
>>>> If anyone has automation questions specific to your environment, you
>>>> can drop me a line directly; I will try to help out as best I can.
>>>> Otherwise I will post an update to the dev list once we get on top of
>>>> our own product release and the Bigtop work.
>>>>
>>>> Nate
>>>>
>>>> -----Original Message-----
>>>> From: David Rowe [mailto:davidr...@gmail.com]
>>>> Sent: Thursday, October 02, 2014 4:44 PM
>>>> To: Nicholas Chammas
>>>> Cc: dev; Shivaram Venkataraman
>>>> Subject: Re: EC2 clusters ready in launch time + 30 seconds
>>>>
>>>> I think this is exactly what Packer is for. See, e.g.,
>>>> http://www.packer.io/intro/getting-started/build-image.html
>>>>
>>>> On a related note, the current AMI for HVM systems (e.g. m3.*, r3.*)
>>>> has a bad package for httpd, which causes Ganglia not to start. For
>>>> some reason I can't get access to the raw AMI to fix it.
>>>>
>>>> On Fri, Oct 3, 2014 at 9:30 AM, Nicholas Chammas
>>>> <nicholas.cham...@gmail.com> wrote:
>>>>> Is there perhaps a way to define an AMI programmatically? Like, a
>>>>> collection of a base AMI id + a list of required stuff to be
>>>>> installed + a list of required configuration changes. I'm guessing
>>>>> that's what people use things like Puppet, Ansible, or maybe also
>>>>> AWS CloudFormation for, right?
>>>>>
>>>>> If we could do something like that, then with every new release of
>>>>> Spark we could quickly and easily create new AMIs that have
>>>>> everything we need. spark-ec2 would only have to bring up the
>>>>> instances and do a minimal amount of configuration, and the only
>>>>> things we'd need to track in the Spark repo are the code that
>>>>> defines what goes on the AMI and a list of the AMI ids specific to
>>>>> each release.
>>>>>
>>>>> I'm just thinking out loud here. Does this make sense?
>>>>>
>>>>> Nate,
>>>>>
>>>>> Any progress on your end with this work?
>>>>>
>>>>> Nick
>>>>>
>>>>> On Sun, Jul 13, 2014 at 8:53 PM, Shivaram Venkataraman
>>>>> <shiva...@eecs.berkeley.edu> wrote:
>>>>>> It should be possible to improve cluster launch time if we are
>>>>>> careful about what commands we run during setup. One way to do
>>>>>> this would be to walk down the list of things we do for cluster
>>>>>> initialization and see if there is anything we can do to make
>>>>>> things faster. Unfortunately this might be pretty time consuming,
>>>>>> but I don't know of a better strategy.
>>>>>> The place to start would be the setup.sh file at
>>>>>> https://github.com/mesos/spark-ec2/blob/v3/setup.sh
>>>>>>
>>>>>> Here are some things that take a lot of time and could be improved:
>>>>>>
>>>>>> 1. Creating swap partitions on all machines. We could check if
>>>>>>    there is a way to get EC2 to always mount a swap partition.
>>>>>> 2. Copying / syncing things across slaves. The copy-dir script is
>>>>>>    called too many times right now, and each time it pauses for a
>>>>>>    few milliseconds between slaves [1]. This could be improved by
>>>>>>    removing unnecessary copies.
>>>>>> 3. We could make less frequently used modules like Tachyon and
>>>>>>    persistent HDFS not part of the default setup.
>>>>>>
>>>>>> [1] https://github.com/mesos/spark-ec2/blob/v3/copy-dir.sh#L42
>>>>>>
>>>>>> Thanks,
>>>>>> Shivaram
>>>>>>
>>>>>> On Sat, Jul 12, 2014 at 7:02 PM, Nicholas Chammas
>>>>>> <nicholas.cham...@gmail.com> wrote:
>>>>>>> On Thu, Jul 10, 2014 at 8:10 PM, Nate D'Amico <n...@reactor8.com>
>>>>>>> wrote:
>>>>>>>> Starting to work through some automation/config stuff for the
>>>>>>>> Spark stack on EC2 with a project. We will be focusing the work
>>>>>>>> through the Apache Bigtop effort to start, and can then share
>>>>>>>> with the Spark community directly as things progress, if people
>>>>>>>> are interested.
>>>>>>>
>>>>>>> Let us know how that goes. I'm definitely interested in hearing
>>>>>>> more.
>>>>>>>
>>>>>>> Nick