On Jan 18, 2010, at 10:31 AM, Sean Owen wrote:

> AFAIK AMIs are fixed. You make your instance as you like it, then run
> some special voodoo to save it off as an AMI. Later you can run the
> AMI, change it, build a new one, but that's a new one. Yeah anyone can
> do it.

Right, I mostly just want a way for others, presumably committers, to be able 
to edit the same image, so that we aren't duplicating efforts or spinning off a 
bunch of different AMIs that confuse people.


> 
> I think this came up before and my only question is, what's the use
> case for this we're trying to answer? So far it sounds like a regular
> instance with a copy of a Mahout .jar. Is this meaningfully more
> useful for someone than simply providing the .jar? I can't exactly
> migrate from one Mahout AMI to another in any sense, when upgrades are
> provided -- AMIs aren't a mechanism for distributing a library.
> 
> We're also not talking about providing a ready-to-go Hadoop cluster.
> And shouldn't. This is something Elastic Mapreduce is already great
> for.
> 

Except EMR is on Hadoop 0.18.3.  So, yes, I am interested in a ready-to-go 
Hadoop cluster along w/ a suite of data sets that we can use to benchmark 
Mahout trunk and that make it easier for people to try out Mahout or even run 
it in production.  So while I would agree AMIs aren't a mechanism for 
distributing a library, they are very useful for getting people up and running 
very quickly.

At any rate, I think the bigger takeaway from your point is this doesn't have 
to be some officially supported thing and it isn't required of releases.

Right now, I mostly have a need to benchmark Mahout's clustering capabilities 
and thus need a Hadoop cluster.  Rather than do a one-off like many others have 
done, I'd like to share my efforts w/ others so that we all, hopefully, 
benefit.  I can definitely say that if there were an AMI that was already 
preconfigured for me w/ Mahout trunk and Hadoop ready to go, I'd use it and I 
bet others would too.

So far, I have everything on an instance (mvn, svn, java, Mahout, etc.) except 
the Hadoop cluster stuff.  I've already run mvn install on Mahout.  In other 
words, it's pretty ready to go.


> Once upon a time I wrote an AMI that would fire up, automatically
> download data from a location, run recommendations, upload them, and
> quit. Pretty simple, pretty nice. *That* kind of thing I think is
> really useful. The AMI is like one big remote method invocation.

+1.  

> 
> On Mon, Jan 18, 2010 at 3:26 PM, Grant Ingersoll <[email protected]> wrote:
>> 
>> On Jan 18, 2010, at 10:20 AM, Robin Anil wrote:
>> 
>>> Perfect! We can have two AMIs: Mahout trunk and the Mahout release version.
>> 
>> Cool.  I'll get my base AMI up (just as soon as I figure out the security 
>> stuff) and then we can coordinate.  Is it possible to have multiple people 
>> "manage" an AMI so that the Mahout committers can reasonably take on keeping 
>> them up to date?
>> 
>> -Grant

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem using Solr/Lucene: 
http://www.lucidimagination.com/search
