[ 
https://issues.apache.org/jira/browse/MAHOUT-185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797233#action_12797233
 ] 

Jake Mannix commented on MAHOUT-185:
------------------------------------

As a note on this: one thing I've sometimes done (and that we do for managing 
our Hadoop jobs at LinkedIn) to make messy CLI handling more manageable is to 
also allow Properties files with default arguments for the various jobs. This 
makes results much more easily reproducible, and it's self-documenting: just 
have "mahout classify" look first in classify.props to see whether default 
args are defined, and go from there.
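
To make the idea concrete, here is a minimal sketch in plain Java of defaults 
read from a properties source and then overridden by command-line arguments. 
Everything in it is an assumption for illustration only: the class name 
PropsDefaults, the "-key value" argument convention, and the keys "algorithm" 
and "input" are hypothetical, not an existing Mahout API.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of Mahout: layers CLI args over file defaults.
class PropsDefaults {

    /** Loads default arguments from a properties source such as classify.props. */
    static Properties loadDefaults(Reader source) {
        Properties defaults = new Properties();
        try {
            defaults.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return defaults;
    }

    /**
     * Overlays arguments of the assumed form "-key value" on top of the
     * defaults. Keys absent from the command line fall back to the defaults.
     */
    static Properties merge(Properties defaults, String[] args) {
        if (args.length % 2 != 0) {
            throw new IllegalArgumentException("Arguments must come in -key value pairs");
        }
        // Properties(defaults) makes getProperty() fall back to the defaults.
        Properties merged = new Properties(defaults);
        for (int i = 0; i < args.length; i += 2) {
            if (!args[i].startsWith("-")) {
                throw new IllegalArgumentException("Expected -key, got: " + args[i]);
            }
            merged.setProperty(args[i].substring(1), args[i + 1]);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Stand-in for the contents of classify.props; a real launcher would
        // open the file with a FileReader instead.
        String fileContents = "algorithm=bayes\ninput=/data/train\n";
        Properties effective = merge(loadDefaults(new StringReader(fileContents)), args);
        for (String key : effective.stringPropertyNames()) {
            System.out.println(key + "=" + effective.getProperty(key));
        }
    }
}
```

Checking the defaults file in next to the job is what gives the reproducibility: 
the command line then only carries whatever differs from the recorded defaults.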

If the drivers implement something like Hadoop's Tool interface, you can 
leverage ToolRunner and GenericOptionsParser as well, and hooking in a 
Properties-based way to run them on top of that makes it pretty flexible.

It would be really nice to consolidate all of our Driver/Job classes as part 
of this issue, so that the launching logic is a) not duplicated and b) all in 
one place.

This issue should get some priority: it will seriously help our usability if 
there's an easy way to launch all the various tasks from one simple place. 
I'd love to have a little JRuby script to run some of this stuff too, because 
when I was first writing decomposer, I found it invaluable to be able to just 
drop into jirb's REPL and start issuing Java calls to run the various Hadoop 
jobs I was testing.

> Add mahout shell script for easy launching of various algorithms
> ----------------------------------------------------------------
>
>                 Key: MAHOUT-185
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-185
>             Project: Mahout
>          Issue Type: New Feature
>    Affects Versions: 0.2
>         Environment: linux, bash
>            Reporter: Robin Anil
>             Fix For: 0.3
>
>
> Currently, each algorithm has a different point of entry, and it is too 
> complicated to understand and launch each one.  A mahout shell script needs 
> to be made in the bin directory which does something like the following:
> mahout classify -algorithm bayes [OPTIONS]
> mahout cluster -algorithm canopy  [OPTIONS]
> mahout fpm -algorithm pfpgrowth [OPTIONS]
> mahout taste -algorithm slopeone [OPTIONS] 
> mahout misc -algorithm createVectorsFromText [OPTIONS]
> mahout examples WikipediaExample

