On May 19, 2009, at 6:59 AM, Tim Bass wrote:
Dear All,
A few months ago (on the developers' list) we briefly touched on the
idea of building a public Mahout AMI on EC2.
Subsequently, Amazon released EMR and a number of folks have
experimented with running sample Mahout jobs on EMR.
What are the pros and cons of creating a public Mahout AMI with Hadoop
and MapReduce configured with the versions that
are supported by the developers, in addition to Amazon's EMR
implementation?
AFAICT, one issue seems to be that EMR locks you into a specific
Hadoop version. Not sure if "locks" is too strong; maybe I should
say it "encourages" you to use a specific version?
As Ted and others pointed out, I think we would benefit from tools
that make it easy to add Mahout to an AMI.
-Grant