[ https://issues.apache.org/jira/browse/MAHOUT-1440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973994#comment-13973994 ]
Hudson commented on MAHOUT-1440:
--------------------------------
SUCCESS: Integrated in Mahout-Quality #2576 (See
[https://builds.apache.org/job/Mahout-Quality/2576/])
MAHOUT-1440 Add option to set the RNG seed for initial cluster generation in
Kmeans/fKmeans (ssc: rev 1588439)
* /mahout/trunk/CHANGELOG
* /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/fuzzykmeans/FuzzyKMeansDriver.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/kmeans/KMeansDriver.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/kmeans/RandomSeedGenerator.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/common/commandline/DefaultOptionCreator.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/clustering/kmeans/TestRandomSeedGenerator.java
* /mahout/trunk/src/conf/fkmeans.props
* /mahout/trunk/src/conf/kmeans.props
> Add option to set the RNG seed for initial cluster generation in Kmeans/fKmeans
> -------------------------------------------------------------------------------
>
> Key: MAHOUT-1440
> URL: https://issues.apache.org/jira/browse/MAHOUT-1440
> Project: Mahout
> Issue Type: Improvement
> Components: CLI, Clustering
> Affects Versions: 1.0
> Reporter: Andrew Palumbo
> Assignee: Sebastian Schelter
> Priority: Minor
> Labels: reproducibility
> Fix For: 1.0
>
> Attachments: MAHOUT-1440.patch
>
>
> It was noted recently that there should be a way to set a static seed for
> the initial clusters of KMeans. In the interests of reproducibility and
> benchmarking, this patch adds an option to set the seed of the RNG used in
> the RandomSeedGenerator.buildRandom() method, which is called from
> KMeansDriver and FuzzyKMeansDriver.
> I've added a CLI option, -setRandomSeed, that, when set to the same value
> across runs (with the -k option set), produces reproducible results from
> kmeans and fkmeans.
> This patch allows the user to set a value. It may make more sense to just
> have an option to set a flag to use the STANDARD_SEED from RandomWrapper.
> I am still feeling my way around the codebase, so if this would be useful
> and any changes are needed, let me know.
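
For readers skimming the archive, here is a minimal sketch of the idea described in the issue: seeding the RNG so that the randomly chosen initial clusters are the same on every run. This is not the code from MAHOUT-1440.patch and does not use Mahout's RandomSeedGenerator or RandomWrapper. The class SeededInitialClusters, the method pickInitial, and the use of java.util.Random with reservoir sampling are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not Mahout's RandomSeedGenerator.
class SeededInitialClusters {

    // Picks k initial centers from points by reservoir sampling.
    // A non-null seed makes the selection repeatable across runs;
    // a null seed falls back to an unseeded (non-reproducible) RNG.
    static List<double[]> pickInitial(List<double[]> points, int k, Long seed) {
        if (k <= 0 || k > points.size()) {
            throw new IllegalArgumentException("k must be in [1, points.size()]");
        }
        Random rng = (seed == null) ? new Random() : new Random(seed);
        List<double[]> chosen = new ArrayList<>(k);
        for (int i = 0; i < points.size(); i++) {
            if (i < k) {
                // Fill the reservoir with the first k points.
                chosen.add(points.get(i));
            } else {
                // Replace a reservoir slot with probability k / (i + 1).
                int j = rng.nextInt(i + 1);
                if (j < k) {
                    chosen.set(j, points.get(i));
                }
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<double[]> points = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            points.add(new double[] {i, i * 2.0});
        }
        // Two runs with the same seed select the same initial centers.
        List<double[]> first = pickInitial(points, 3, 42L);
        List<double[]> second = pickInitial(points, 3, 42L);
        for (int i = 0; i < 3; i++) {
            System.out.println(Arrays.toString(first.get(i))
                + " same as second run: "
                + Arrays.equals(first.get(i), second.get(i)));
        }
    }
}
```

Passing the seed as a nullable value keeps the default behaviour random while letting a benchmark pin the seed, which is roughly the trade-off the description raises between a user-supplied value and a fixed standard seed.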