Oh that fixes it! +Ted for the Mahout in Action example. I'll fix it in the wiki if I can find it.
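
For anyone who finds this thread later: as far as I can tell, the cast fails because the top-level seq2sparse output folder also holds non-vector sequence files (the dictionary, for one, which I believe stores IntWritable values), so the clustering drivers trip over them when handed the whole folder. Below is a minimal sketch in plain Java of picking the right input folder. The class and method names (VectorInputResolver, resolveVectorInput) are made up for illustration and are not part of Mahout; only the tfidf-vectors folder name comes from Mark's reply quoted below.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not a Mahout class: chooses the folder to pass
// as clustering input, given the top-level seq2sparse output folder.
class VectorInputResolver {

    // Returns <output>/tfidf-vectors when that subfolder exists,
    // otherwise returns the given path unchanged.
    static Path resolveVectorInput(Path seq2sparseOutput) {
        Path tfidf = seq2sparseOutput.resolve("tfidf-vectors");
        return Files.isDirectory(tfidf) ? tfidf : seq2sparseOutput;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: VectorInputResolver <seq2sparse-output-dir>");
            return;
        }
        // Print the folder that should be handed to the clustering job.
        System.out.println(resolveVectorInput(Paths.get(args[0])));
    }
}
```

This only checks the local filesystem; on a cluster the same check would have to go through the Hadoop FileSystem API instead.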

On Thu, Jun 9, 2011 at 9:23 AM, Mark <[email protected]> wrote:

> Hector
>
> Try using the reuters-vectors/tfidf-vectors folder as input, not the top
> level reuters-vectors.
>
>
> On 6/9/11 8:45 AM, Hector Yee wrote:
>
>> I was following the book examples and k means, dirichlet and lda all have
>> this casting problem. It may be a Mac issue, not sure. I suspect it may be
>> seq2sparse messing up the inputs, maybe wrong version. It outputs the
>> regular part-r-* but the lda driver expects a file called data.
>>
>> Sent from my iPad
>>
>> On Jun 9, 2011, at 7:40 AM, Mark<[email protected]> wrote:
>>
>> Forgot to mention... great book :)
>>>
>>> On 6/9/11 7:30 AM, Mark wrote:
>>>
>>>> KMeans is busted? What do you mean by this? The algorithm simply won't
>>>> work or just the reuters example?
>>>>
>>>> Thanks
>>>>
>>>> On 6/9/11 12:28 AM, Sean Owen wrote:
>>>>
>>>>> (Assuming you are on HEAD,) I think KMeans is busted -- this has come
>>>>> up before. I don't know if it is being maintained. Anyone who's willing
>>>>> to step up and fix it is also welcome to overhaul it IMHO.
>>>>>
>>>>> On Thu, Jun 9, 2011 at 12:03 AM, Hector Yee<[email protected]>
>>>>> wrote:
>>>>>
>>>>> I got a slightly different error on the next line of KMeansDriver.java
>>>>>> (running on OS X Snow Leopard)
>>>>>>
>>>>>> 11/06/08 16:02:12 INFO compress.CodecPool: Got brand-new compressor
>>>>>> Exception in thread "main" java.lang.ClassCastException:
>>>>>> org.apache.hadoop.io.IntWritable cannot be cast to
>>>>>> org.apache.mahout.math.VectorWritable
>>>>>> at org.apache.mahout.clustering.kmeans.RandomSeedGenerator.buildRandom(RandomSeedGenerator.java:90)
>>>>>> at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:102)
>>>>>>
>>>>>>
>>>>>> On Sun, Jun 5, 2011 at 9:31 PM, Jeff Eastman<[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>> IIRC, Reuters used to run on a cluster but no longer does due to some
>>>>>>> obscure Lucene changes. In 0.5 it only works in local mode. I really
>>>>>>> hope this can be repaired by 0.6 as Reuters is a key entry point into
>>>>>>> Mahout clustering for many users.
>>>>>>>
>>>>>>>

--
Yee Yang Li Hector
http://hectorgon.blogspot.com/ (tech + travel)
http://hectorgon.com (book reviews)
