-1
The cluster-reuters example produces zero clusters when streaming k-means
is chosen. The other steps, unpacking and building, do work.
I see this stack trace:
INFO: Number of Centroids: 0
Jan 19, 2014 3:51:08 PM org.apache.hadoop.mapred.LocalJobRunner$Job run
WARNING: job_local797072544_0001
java.lang.IllegalArgumentException: Must have nonzero number of training and test vectors. Asked for %.1f %% of %d vectors for test [10.000000149011612, 0]
        at com.google.common.base.Preconditions.checkArgument(Preconditions.java:120)
        at org.apache.mahout.clustering.streaming.cluster.BallKMeans.splitTrainTest(BallKMeans.java:176)
        at org.apache.mahout.clustering.streaming.cluster.BallKMeans.cluster(BallKMeans.java:192)
        at org.apache.mahout.clustering.streaming.mapreduce.StreamingKMeansReducer.getBestCentroids(StreamingKMeansReducer.java:107)
        at org.apache.mahout.clustering.streaming.mapreduce.StreamingKMeansReducer.reduce(StreamingKMeansReducer.java:73)
        at org.apache.mahout.clustering.streaming.mapreduce.StreamingKMeansReducer.reduce(StreamingKMeansReducer.java:37)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:177)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:418)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:398)
Num clusters: 0; maxDistance: 0.000000
[Dunn Index] First: Infinity
[Davies-Bouldin Index] First: NaN
Jan 19, 2014 3:51:09 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Program took 278 ms (Minutes: 0.004633333333333333)
cluster,distance.mean,distance.sd,distance.q0,distance.q1,distance.q2,distance.q3,distance.q4,count,is.train
Here is the full log: http://pastebin.com/TxLV0rDr
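To make the failure concrete, here is a minimal sketch of how a train/test
split check would reject an empty input. It assumes the check refuses any
split where either side would be empty; the class SplitCheckSketch and the
method splitSizes are hypothetical and are not taken from the Mahout source.
As a side note, Guava's checkArgument only substitutes %s placeholders, which
is why the message in the log still shows %.1f and %d with the arguments
appended in brackets.

```java
import java.util.Locale;

// Hypothetical stand-in for the check behind BallKMeans.splitTrainTest.
// The real condition in Mahout may differ; this only illustrates the idea.
class SplitCheckSketch {

    // Returns {numTrain, numTest}, or throws if either side would be empty.
    static int[] splitSizes(int numVectors, double testProbability) {
        int numTest = (int) (testProbability * numVectors);
        int numTrain = numVectors - numTest;
        if (numTest <= 0 || numTrain <= 0) {
            throw new IllegalArgumentException(String.format(Locale.ROOT,
                "Must have nonzero number of training and test vectors. "
                    + "Asked for %.1f %% of %d vectors for test",
                testProbability * 100, numVectors));
        }
        return new int[] {numTrain, numTest};
    }

    public static void main(String[] args) {
        // A non-empty input splits normally.
        int[] sizes = splitSizes(100, 0.1);
        System.out.println("train=" + sizes[0] + ", test=" + sizes[1]);

        // Zero centroids reaching the reducer corresponds to numVectors == 0.
        try {
            splitSizes(0, 0.1);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

If this reading is right, the exception is only a symptom: the question is
why the reducer receives zero centroids in the first place.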
I am not yet familiar with the streaming k-means code or the
algorithms behind it. If anyone has suggestions about what is going wrong
in the code, I am happy to help where I can.
Frank
On Sun, Jan 19, 2014 at 10:55 AM, Suneel Marthi <[email protected]> wrote:
> Thanks Grant.
>
> Not sure if I can vote given my role as the BuildMeister/ReleaseMeister
> for 0.9.
> Here's my +1 FWIW.
>
> a) Attached is the draft of the Release notes for 0.9, would definitely
> appreciate feedback on that.
>
> b) The vote is open until Monday, Jan 20, 2014 11:59PM EST and passes if a
> majority of at least 3 +1 PMC votes are cast.
>
> The release files, including signatures, digests, etc can be found at:
>
> https://repository.apache.org/content/repositories/orgapachemahout-1002/org/apache/mahout/mahout-distribution/0.9/
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachemahout-1002
>
> Release artifacts have been signed with the following key:
> https://people.apache.org/keys/committer/smarthi.asc
>
> On Saturday, January 18, 2014 12:27 PM, Grant Ingersoll <
> [email protected]> wrote:
> Ran the tests, verified sigs, tried out a few of the examples.
>
> +1 (binding)
>
> On Jan 16, 2014, at 9:41 AM, Suneel Marthi <[email protected]>
> wrote:
>
> > Third time's a Charm!!!
> >
> >
> > Here's the new URL for Mahout 0.9 Release:
> >
> > https://repository.apache.org/content/repositories/orgapachemahout-1002/org/apache/mahout/mahout-distribution/0.9/
> >
> > For those volunteering to test this, some of the things to be verified:
> >
> > a) Verify that you can unpack the release (tar or zip)
> > b) Verify you are able to compile the distro
> > c) Run through the unit tests: mvn clean test
> > d) Run the example scripts under $MAHOUT_HOME/examples/bin. Please run
> > through all the different options in each script.
> >
> >
> > Committers and PMC members:
> > ---------------------------------------
> >
> > Need 'at least 3 +1 votes' for the Release to pass.
> >
> >
> > Thanks and Regards.