If you aren't going to use EMR, possibly because of Hadoop version issues, then I strongly second the recommendation of the Alestic instances.
All of these include a start script that is downloaded from what is called a "user-data file", which can be up to 16K in length. I used that script to customize my instances with additional software such as Hadoop, Java and our own software, as well as to reconfigure the instance as necessary: mounting elastic block volumes and tweaking the DHCP configuration to add an override that avoids a few gotchas. Total boot time was still typically < 40 s, and I hear that it has gotten faster since then.

On Sun, Jan 10, 2010 at 3:18 PM, Sean Owen <[email protected]> wrote:

> I like the Alestic instances, though they don't have Java (IIRC).
> http://alestic.com/
>
> On Sun, Jan 10, 2010 at 11:16 PM, Grant Ingersoll <[email protected]>
> wrote:
> > Anyone have recs on a good AMI to start with on EC2 to load with Mahout?
> > Preferably Linux and already has Java 1.6 installed.
> >
> > Thanks,
> > Grant
>

--
Ted Dunning, CTO
DeepDyve
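
P.S. For anyone who wants a starting point, here is a rough sketch of how such a user-data start script could be assembled and checked against the 16K limit before launching. It is not the script I actually used. The package name, device and mount point are placeholders, and it assumes a Debian/Ubuntu style image where apt-get is available.

```python
# Hypothetical sketch: build a user-data boot script and check its size.
# Package names, device and mount point are illustrative placeholders.

USER_DATA_LIMIT_BYTES = 16 * 1024  # the "up to 16K" limit on user-data


def build_user_data(packages, volume_device, mount_point):
    """Return the text of a shell script to pass as instance user-data."""
    lines = [
        "#!/bin/bash",
        "set -e",
        "# Install additional software (assumes an apt-based image).",
        "apt-get update",
        "apt-get install -y " + " ".join(packages),
        "# Mount an elastic block volume that is already attached.",
        f"mkdir -p {mount_point}",
        f"mount {volume_device} {mount_point}",
    ]
    script = "\n".join(lines) + "\n"

    # Refuse to return a script that would exceed the user-data limit.
    size = len(script.encode("utf-8"))
    if size > USER_DATA_LIMIT_BYTES:
        raise ValueError(
            f"user-data is {size} bytes, limit is {USER_DATA_LIMIT_BYTES}"
        )
    return script


if __name__ == "__main__":
    print(build_user_data(["openjdk-6-jdk"], "/dev/sdf", "/mnt/data"))
```

The resulting text is what you would supply as the user-data when starting the instance. Anything too large to fit (Hadoop tarballs, our own jars) has to be fetched by the script at boot rather than embedded in it.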
