[ 
https://issues.apache.org/jira/browse/MAHOUT-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658697#comment-13658697
 ] 

Hudson commented on MAHOUT-1181:
--------------------------------

Integrated in Mahout-Quality #1997 (See 
[https://builds.apache.org/job/Mahout-Quality/1997/])
    MAHOUT-1181: Adding StreamingKMeans MapReduce classes

These classes implement the MapReduce version of StreamingKMeans, add a driver
and a new command line tool. (Revision 1482907)

     Result = SUCCESS
dfilimon : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1482907
Files : 
* /mahout/trunk/CHANGELOG
* /mahout/trunk/core/pom.xml
* /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/streaming/mapreduce
* /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/streaming/mapreduce/CentroidWritable.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/streaming/mapreduce/StreamingKMeansDriver.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/streaming/mapreduce/StreamingKMeansMapper.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/streaming/mapreduce/StreamingKMeansReducer.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/streaming/mapreduce/StreamingKMeansThread.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/streaming/mapreduce/StreamingKMeansUtilsMR.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/clustering/streaming/mapreduce
* /mahout/trunk/core/src/test/java/org/apache/mahout/clustering/streaming/mapreduce/StreamingKMeansTestMR.java
* /mahout/trunk/src/conf/driver.classes.default.props

                
> Adding StreamingKMeans MapReduce classes
> ----------------------------------------
>
>                 Key: MAHOUT-1181
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1181
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Clustering
>    Affects Versions: 0.8
>            Reporter: Dan Filimon
>             Fix For: 0.8
>
>         Attachments: MAHOUT_1181.patch, MAHOUT_1181_props.patch, 
> MAHOUT_1181_test.patch
>
>
> This patch implements the MapReduce version of StreamingKMeans for 
> MAHOUT-1154.
> It adds 5 new classes:
> - CentroidWritable: class representing a centroid that can be written to a 
> SeqFile
> - StreamingKMeansDriver: class extending AbstractJob that is the entry 
> point to the MapReduce job
> - StreamingKMeansMapper: mapper that runs StreamingKMeans (see MAHOUT-1162), 
> clustering the points one by one
> - StreamingKMeansReducer: reducer that runs BallKMeans (see MAHOUT-1162) a 
> number of times and picks the clustering with the lowest total clustering 
> cost.
> The cost is determined by randomly splitting the incoming centroids into a 
> "training" and "test" set, computing the centroids on the training set and 
> the cost on the test set. The intent is to see whether the centroids actually 
> describe the distribution of the points or not.
> - StreamingKMeansUtilsMR: helper class with a method to instantiate a 
> searcher from a Configuration.
> Additionally, there is a test class, StreamingKMeansTestMR, that uses MRUnit 
> to test the mapper, the reducer, and the two together.
> !!!
> The tests require MRUnit, so the core pom.xml file must add it as a 
> dependency. We depend on snapshot 1.0, which is not yet released (it will be 
> very soon), hence the updated pom.xml is not provided for now.
> !!!
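
To make "a centroid that can be written to a SeqFile" concrete, here is a minimal sketch of the write/readFields pair that Hadoop's Writable interface asks for. It uses only java.io so that it runs without Hadoop on the classpath. The class name CentroidRecordSketch and the field layout (integer key, weight, dense vector) are assumptions made for illustration; this is not the CentroidWritable code from the patch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a writable centroid; the field layout is assumed.
final class CentroidRecordSketch {
    int key;
    double weight;
    double[] vector = new double[0];

    // No-argument constructor, so an empty instance can be filled by readFields.
    CentroidRecordSketch() { }

    CentroidRecordSketch(int key, double weight, double[] vector) {
        this.key = key;
        this.weight = weight;
        this.vector = vector.clone();
    }

    // Same signature as Writable.write: serialize every field in a fixed order.
    void write(DataOutput out) throws IOException {
        out.writeInt(key);
        out.writeDouble(weight);
        out.writeInt(vector.length);
        for (double v : vector) {
            out.writeDouble(v);
        }
    }

    // Same signature as Writable.readFields: read the fields back in the same order.
    void readFields(DataInput in) throws IOException {
        key = in.readInt();
        weight = in.readDouble();
        vector = new double[in.readInt()];
        for (int i = 0; i < vector.length; i++) {
            vector[i] = in.readDouble();
        }
    }

    // Convenience helper for the round-trip check: serialize to a byte array.
    static byte[] toBytes(CentroidRecordSketch c) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            c.write(new DataOutputStream(buffer));
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Convenience helper for the round-trip check: rebuild a record from bytes.
    static CentroidRecordSketch fromBytes(byte[] bytes) {
        try {
            CentroidRecordSketch c = new CentroidRecordSketch();
            c.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
            return c;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A round trip through toBytes and fromBytes should give back the same key, weight and vector. In a real job the class would declare "implements Writable" and be used as the value type of the SequenceFile.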

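The reducer's model selection described above (cluster on a "training" subset, score on a held-out "test" subset, keep the cheapest run) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the patch: the clusterer is a plain random-seed Lloyd's k-means standing in for BallKMeans, centroid weights are ignored, the split is done once and shared by all runs, and every name here (HeldOutCostSketch, bestOfRuns, trainFraction) is hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of "best of several runs, scored on held-out points".
final class HeldOutCostSketch {
    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Index of the center closest to the given point. */
    static int nearest(double[] point, List<double[]> centers) {
        int best = 0;
        for (int i = 1; i < centers.size(); i++) {
            if (squaredDistance(point, centers.get(i)) < squaredDistance(point, centers.get(best))) {
                best = i;
            }
        }
        return best;
    }

    /** Total cost: sum of squared distances from each point to its nearest center. */
    static double cost(List<double[]> points, List<double[]> centers) {
        double total = 0.0;
        for (double[] p : points) {
            total += squaredDistance(p, centers.get(nearest(p, centers)));
        }
        return total;
    }

    /** Stand-in clusterer (NOT BallKMeans): k random seeds, then 10 Lloyd iterations. */
    static List<double[]> clusterOnce(List<double[]> train, int k, Random rng) {
        List<double[]> shuffled = new ArrayList<>(train);
        Collections.shuffle(shuffled, rng);
        List<double[]> centers = new ArrayList<>();
        for (int i = 0; i < k; i++) {
            centers.add(shuffled.get(i).clone());
        }
        int dim = train.get(0).length;
        for (int iter = 0; iter < 10; iter++) {
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (double[] p : train) {
                int c = nearest(p, centers);
                counts[c]++;
                for (int d = 0; d < dim; d++) {
                    sums[c][d] += p[d];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // an empty cluster keeps its previous center
                }
                for (int d = 0; d < dim; d++) {
                    centers.get(c)[d] = sums[c][d] / counts[c];
                }
            }
        }
        return centers;
    }

    /** Splits once, clusters the training part 'runs' times, keeps the lowest test cost.
     *  Requires k to be no larger than the training set size. */
    static List<double[]> bestOfRuns(List<double[]> points, int k, int runs,
                                     double trainFraction, long seed) {
        Random rng = new Random(seed);
        List<double[]> shuffled = new ArrayList<>(points);
        Collections.shuffle(shuffled, rng);
        int trainSize = (int) Math.round(trainFraction * shuffled.size());
        List<double[]> train = shuffled.subList(0, trainSize);
        List<double[]> test = shuffled.subList(trainSize, shuffled.size());

        List<double[]> best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        for (int run = 0; run < runs; run++) {
            List<double[]> centers = clusterOnce(train, k, rng);
            double heldOutCost = cost(test, centers); // scored on points the run never saw
            if (heldOutCost < bestCost) {
                bestCost = heldOutCost;
                best = centers;
            }
        }
        return best;
    }
}
```

Scoring on the held-out part rather than on the training part is what lets the comparison reflect whether the centers describe the overall distribution, as the description puts it, and not only the points they were fitted to.
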
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
