[ 
https://issues.apache.org/jira/browse/MAHOUT-520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe Prasanna Kumar updated MAHOUT-520:
--------------------------------------

    Attachment: MAHOUT-520-syntheticcontrol.patch

The attached patch contains a script for running the clustering algorithms on 
synthetic control data.

The script runs in two modes:
1. Interactive mode (the default): from the MAHOUT_HOME directory, run 
examples/bin/build-cluster-syntheticcontrol.sh
2. Non-interactive mode: examples/bin/build-cluster-syntheticcontrol.sh -ni 
This mode can be used by the Hudson script for automated testing.
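
As a rough illustration of how the two modes could be told apart, here is a minimal sketch. It is not taken from the patch: the function name parse_mode is hypothetical, and only the -ni flag comes from the description above.

```shell
# Hypothetical helper (not from the patch): prints "non-interactive" when
# -ni appears among the arguments, otherwise "interactive" (the default).
parse_mode() {
  mode="interactive"
  for arg in "$@"; do
    if [ "$arg" = "-ni" ]; then
      mode="non-interactive"
    fi
  done
  echo "$mode"
}
```

A caller might write `mode=$(parse_mode "$@")` and skip any prompts when the result is "non-interactive".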

The script:
1. checks whether HADOOP_HOME is set; if not, it reports an error and halts
2. checks the health of DFS by invoking $HADOOP_HOME/bin/hadoop fs -ls; if DFS 
is not healthy, it reports an error and halts
3. uploads synthetic_control.data to HDFS
4. asks the user which clustering algorithm to use
5. runs the algorithm corresponding to the number the user chooses
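
The steps above could be sketched roughly as follows. This is an illustration under assumptions, not the contents of the patch: the function names, the error messages, and the menu entries and their order are all hypothetical. Only the HADOOP_HOME check and the `hadoop fs -ls` health probe come from the description.

```shell
# Sketch only: names, messages and menu order are assumptions, not the patch.

# Step 1: fail if HADOOP_HOME is unset or empty.
check_hadoop_home() {
  if [ -z "${HADOOP_HOME:-}" ]; then
    echo "HADOOP_HOME is not set" >&2
    return 1
  fi
}

# Step 2: treat a failing `hadoop fs -ls` as an unhealthy DFS.
check_dfs() {
  if ! "$HADOOP_HOME/bin/hadoop" fs -ls >/dev/null 2>&1; then
    echo "DFS is not healthy" >&2
    return 1
  fi
}

# Steps 4-5: map the number the user picked to an algorithm name.
# The entries and their numbering here are illustrative.
choose_algorithm() {
  case "$1" in
    1) echo "canopy" ;;
    2) echo "kmeans" ;;
    3) echo "fuzzykmeans" ;;
    *) echo "unknown choice: $1" >&2; return 1 ;;
  esac
}
```

A driver would call these in order and stop at the first failure, for example `check_hadoop_home && check_dfs || exit 1`, before uploading the data and running the chosen algorithm.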

I have tested both the failure and success scenarios on my end. If 
someone else wants to verify, that would be wonderful.

regards
Joe.


> Add example scripts / integration tests for various algorithms.
> ---------------------------------------------------------------
>
>                 Key: MAHOUT-520
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-520
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Classification
>    Affects Versions: 0.4
>            Reporter: Drew Farris
>            Assignee: Drew Farris
>            Priority: Minor
>         Attachments: MAHOUT-520-syntheticcontrol.patch, MAHOUT-520.patch
>
>
> Scripts like build-reuters.sh are useful in that they both demonstrate 
> typical usage of Mahout from the command line and serve as integration 
> tests. We should add additional scripts that drive the algorithms so new 
> users can quickly run the examples. 
> Perhaps these can also be run from Hudson as part of the nightly builds and 
> can serve as integration tests.
> As a start towards this goal, provide a build-20news-bayes.sh example (in the 
> same vein as build-reuters.sh) that follows 
> https://cwiki.apache.org/MAHOUT/twenty-newsgroups.html

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
