Add Mahout as a service
-----------------------

                 Key: WHIRR-384
                 URL: https://issues.apache.org/jira/browse/WHIRR-384
             Project: Whirr
          Issue Type: New Feature
          Components: new service
    Affects Versions: 0.7.0
            Reporter: Frank Scholten
             Fix For: 0.7.0


Here is an initial patch to support Mahout as a Whirr service.

I created the role 'mahout-home', which can be used to install the binary 
Mahout distribution on a Hadoop namenode.
By combining this role with the configuration for a Hadoop cluster, you can SSH 
into the namenode, su to root, and start running Mahout jobs via the mahout 
script immediately.

The 'mahout-home' role has two properties:

Description                                     Property
Mahout version                                  whirr.mahout.version
URL of the Mahout binary distribution tarball   whirr.mahout.tarball.url

Note that I used a snapshot version of Mahout for testing, revision 1169784, 
because there were some problems with the Mahout script in 0.5 that have been 
fixed on trunk; see MAHOUT-680. To test, you can set the tarball property to 
this link:
http://dl.dropbox.com/u/13436484/mahout-distribution-0.6-SNAPSHOT.tar.gz
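
For illustration, a cluster definition that combines the new role with a 
Hadoop cluster might look like the sketch below. The 'mahout-home' role and the 
two whirr.mahout.* properties come from this patch; the cluster name, provider, 
credential placeholders, instance counts, and the 0.6-SNAPSHOT version string 
are assumptions chosen for the example and may need adjusting.

```properties
# Hypothetical cluster name, chosen for this example only.
whirr.cluster-name=mahout-test

# 'mahout-home' is placed on the same instance as the namenode, as described
# above. The instance counts are assumptions.
whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker+mahout-home,2 hadoop-datanode+hadoop-tasktracker

# Provider and credentials are placeholders (assumed: EC2, with keys read
# from environment variables).
whirr.provider=aws-ec2
whirr.identity=${env:AWS_ACCESS_KEY_ID}
whirr.credential=${env:AWS_SECRET_ACCESS_KEY}

# Properties introduced by this patch. The version value is an assumption
# derived from the tarball file name.
whirr.mahout.version=0.6-SNAPSHOT
whirr.mahout.tarball.url=http://dl.dropbox.com/u/13436484/mahout-distribution-0.6-SNAPSHOT.tar.gz
```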

I used configure actions and the onBeforeConfigure() method. If there is a 
better way to express this with the Whirr API, let me know.

Currently I am investigating a 'mahout-jar' role, which installs the Mahout 
examples job jar under $HADOOP_HOME/lib on a tasktracker node. I already have 
some code for putting the jar in place, but when running a job from my local 
machine I still get ClassNotFoundExceptions. I believe this is because Hadoop 
has already started before the jar is put in the lib dir, so the jar is not 
picked up, but I have to investigate some more. From WHIRR-221 I understood 
that there is no support (yet?) for ordering of services, but if you have an 
idea on how to fix this, let me know.

Comments and suggestions welcome!

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
