Agreed, and the need for two constants (FOO_OPTION and FOO_OPTION_KEY) is tedious. I will look at moving the argMap up to AbstractJob where it can be accessed by a single key constant. I'm working on ClusterDumper right now and will post a patch in this direction before I head out this morning.
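
A minimal sketch of the single-constant idea, not the actual AbstractJob code: the base class owns the argMap and derives the map key from the option name, so a job declares only one constant per option. The class names, the "--" key prefix, and the helper methods are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a shared job base class; not Mahout's actual AbstractJob.
abstract class SketchJob {
  // Parsed options, keyed by "--" + option name (the prefix is an assumption).
  private final Map<String, String> argMap = new HashMap<String, String>();

  // Derives the map key from the option name so callers need only one constant.
  static String keyFor(String optionName) {
    return "--" + optionName;
  }

  protected void putOption(String optionName, String value) {
    argMap.put(keyFor(optionName), value);
  }

  protected String getOption(String optionName) {
    return argMap.get(keyFor(optionName));
  }
}

// Hypothetical job class used only to show the call sites.
class DumperSketch extends SketchJob {
  // One constant per option instead of FOO_OPTION plus FOO_OPTION_KEY.
  static final String OUTPUT_OPTION = "output";

  // Stores a value under the option name and reads it back through the same constant.
  String configuredOutput(String value) {
    putOption(OUTPUT_OPTION, value);
    return getOption(OUTPUT_OPTION);
  }
}
```

The key derivation lives in one place, so changing the key format later would not touch the individual jobs.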

On 7/16/10 7:47 AM, Drew Farris (JIRA) wrote:
     [ https://issues.apache.org/jira/browse/MAHOUT-294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12889199#action_12889199 ]

Drew Farris commented on MAHOUT-294:
------------------------------------

bq. Testability of the command line option processing is complicated by the 
fact that run() bundles this in with running the job so the command line stuff 
cannot be tested in isolation. This makes testing all of the argument 
corner-cases tedious and unnecessarily time-consuming. I'm going to look at 
factoring the options parsing out of run.

This will most definitely be a good thing to do. I've always wondered if more 
of the option processing can be pushed up to AbstractJob. Once the arguments 
hash is obtained, it still seems that there is a lot of work to be done prior to 
launching each job.
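
One way the factoring could look, as a rough sketch rather than the real driver code: parsing lives in its own method that returns a map (or null on bad input), and run() only consumes the result. The class name, the "--name value" argument format, and the required "--input" option are assumptions chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical job; the option names and parsing rules are assumptions for illustration.
class ParsingSketchJob {

  // Parses "--name value" pairs only and returns null on malformed input.
  // No job is launched here, so argument corner cases can be tested cheaply.
  static Map<String, String> parseArguments(String[] args) {
    Map<String, String> parsed = new HashMap<String, String>();
    for (int i = 0; i < args.length; i += 2) {
      boolean isFlag = args[i].startsWith("--") && args[i].length() > 2;
      if (!isFlag || i + 1 >= args.length) {
        return null;
      }
      parsed.put(args[i], args[i + 1]);
    }
    return parsed;
  }

  // run() delegates parsing, then would do the job-specific work.
  int run(String[] args) {
    Map<String, String> parsed = parseArguments(args);
    if (parsed == null || !parsed.containsKey("--input")) {
      return -1;
    }
    // ... job setup and launch would go here ...
    return 0;
  }
}
```

A unit test can then call parseArguments directly with missing values or malformed flags without ever starting a job.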

Uniform API behavior for Jobs
-----------------------------

                 Key: MAHOUT-294
                 URL: https://issues.apache.org/jira/browse/MAHOUT-294
             Project: Mahout
          Issue Type: Improvement
          Components: Classification, Clustering, Collaborative Filtering, 
Frequent Itemset/Association Rule Mining, Genetic Algorithms, Math, Utils
    Affects Versions: 0.4
            Reporter: Robin Anil
             Fix For: 0.4

         Attachments: MAHOUT-294.patch, MAHOUT-294.patch


* Move AbstractJob to common and convert all the Driver classes to extend that.
    One suggestion is:
    AlgorithmParams params = ParamsBuilder.build().withParam("-i", input).withParam("-o", output)....
    MyAlgorithm.runJob(params) throws ParameterMissingException;
* Give uniform command-line parameters for various algorithms.
    e.g., the distance measure is currently -d, -dm, or -m in different places 
in clustering
* Add a temp directory as a parameter 
http://www.lucidimagination.com/search/document/28a979aa62c02a1/who_owns_mahout_bucket_on_s3#ddb5855e8bdace45
This issue will keep track of all discussion/patches related to the design and 
cleanup of the Mahout API.
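
For illustration only, the params-builder suggestion above might be fleshed out along these lines. None of these types exist in Mahout as far as this thread shows; AlgorithmParams, ParamsBuilder, ParameterMissingException, and runJob take their names from the suggestion, while the require() helper and the placeholder return value are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// All types here are hypothetical, named after the suggestion in the issue text.
class ParameterMissingException extends Exception {
  ParameterMissingException(String name) {
    super("Missing required parameter: " + name);
  }
}

class AlgorithmParams {
  private final Map<String, String> values = new HashMap<String, String>();

  // Fluent setter so calls can be chained as in the suggestion.
  AlgorithmParams withParam(String name, String value) {
    values.put(name, value);
    return this;
  }

  // Returns the value, or fails loudly when the parameter was never set.
  String require(String name) throws ParameterMissingException {
    String value = values.get(name);
    if (value == null) {
      throw new ParameterMissingException(name);
    }
    return value;
  }
}

class ParamsBuilder {
  // Static entry point matching ParamsBuilder.build() in the suggestion.
  static AlgorithmParams build() {
    return new AlgorithmParams();
  }
}

class MyAlgorithm {
  // Validates required parameters up front; the return value is a placeholder for real work.
  static String runJob(AlgorithmParams params) throws ParameterMissingException {
    String input = params.require("-i");
    String output = params.require("-o");
    return input + " -> " + output;
  }
}
```

With this shape, a driver that forgets "-o" fails with a ParameterMissingException before any job is submitted.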
