MAHOUT-1442 has been created. Will submit the patch too.
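For reference, fixes 2 and 3 from the quoted message below can be sketched with the JDK alone. This is only a minimal sketch, not the patch itself: the class name Guava11Workarounds and both method names are made up for illustration, and writeUTF/readUTF stand in for PolymorphicWritable.write/read so that it compiles without Mahout or Guava on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical helper class: the name and both methods are illustrative only.
final class Guava11Workarounds {

  private Guava11Workarounds() {}

  // Stand-in for Iterators.advance: skips up to numberToAdvance elements
  // and returns how many elements were actually skipped.
  static int advance(Iterator<?> iterator, int numberToAdvance) {
    if (numberToAdvance < 0) {
      throw new IllegalArgumentException("numberToAdvance must be >= 0");
    }
    int advanced = 0;
    while (advanced < numberToAdvance && iterator.hasNext()) {
      iterator.next();
      advanced++;
    }
    return advanced;
  }

  // Stand-in for the Closer-based test code: writes a value, then reads it back.
  // Assumption: writeUTF/readUTF replace PolymorphicWritable.write/read here.
  static String roundTrip(String value) throws IOException {
    byte[] output;
    // Both streams are closed automatically, in reverse order of declaration.
    try (ByteArrayOutputStream bytesOut = new ByteArrayOutputStream();
         DataOutputStream dataOut = new DataOutputStream(bytesOut)) {
      dataOut.writeUTF(value);
      dataOut.flush();
      output = bytesOut.toByteArray();
    }
    try (ByteArrayInputStream bytesIn = new ByteArrayInputStream(output);
         DataInputStream dataIn = new DataInputStream(bytesIn)) {
      return dataIn.readUTF();
    }
  }

  public static void main(String[] args) throws IOException {
    Iterator<String> it = Arrays.asList("a", "b", "c").iterator();
    System.out.println("skipped: " + advance(it, 1));
    System.out.println("next: " + it.next());
    System.out.println("round trip: " + roundTrip("hello"));
  }
}
```

advance mirrors the loop from the LuceneIterableTest change and stops early if the iterator runs out; roundTrip follows the same two-block shape as the OnlineLogisticRegressionTest change. try-with-resources needs Java 1.7 or later.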
On Sun, Mar 9, 2014 at 9:03 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:

> Can you file a JIRA and attach your patch?
>
> On Sun, Mar 9, 2014 at 8:03 AM, Bikash Gupta <bikash.gupt...@gmail.com> wrote:
>
>> Info for everyone
>>
>> I have successfully forced Mahout to build with Guava 11.0.2. The errors
>> and fixes are listed below.
>>
>> 1. Class: org.apache.mahout.math.stats.GroupTree
>>    - Change line 171 to: stack = new ArrayDeque<GroupTree>();
>>    - Import java.util.ArrayDeque;
>>
>> 2. Class: org.apache.mahout.classifier.sgd.OnlineLogisticRegressionTest
>>    - Guava 11.0.2 doesn't have Closer in com.google.common.io, so I used
>>      try-with-resources
>>    - Changed the Java version to 1.7
>>    - Code changed as shown below
>>
>>    try (ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
>>         DataOutputStream dataOutputStream = new DataOutputStream(byteArrayOutputStream)) {
>>      PolymorphicWritable.write(dataOutputStream, lr);
>>      output = byteArrayOutputStream.toByteArray();
>>    }
>>
>>    OnlineLogisticRegression read;
>>
>>    try (ByteArrayInputStream byteArrayInputStream = new ByteArrayInputStream(output);
>>         DataInputStream dataInputStream = new DataInputStream(byteArrayInputStream)) {
>>      read = PolymorphicWritable.read(dataInputStream, OnlineLogisticRegression.class);
>>    }
>>
>> 3. Class: org.apache.mahout.utils.vectors.lucene.LuceneIterableTest
>>    - Iterators.advance is not present in 11.0.2, so I inlined the
>>      equivalent code. Sample shown below
>>
>>    int numberToAdvance = 1;
>>    int iterateNumberToAdvance;
>>    for (iterateNumberToAdvance = 0;
>>         iterateNumberToAdvance < numberToAdvance && iterator.hasNext();
>>         iterateNumberToAdvance++) {
>>      iterator.next();
>>    }
>>
>> If anyone has a better suggestion, please flag it.
>>
>> @Suneel,
>>
>> Going back to my original question. I was able to call ClusteringUtils for
>> KMeans, however I cannot use ClusterQualitySummarizer because it doesn't
>> support WeightedPropertyVectorWritable.
>>
>> On Sun, Mar 9, 2014 at 6:28 PM, Bikash Gupta <bikash.gupt...@gmail.com> wrote:
>>
>>> Just FYI... downgrading Guava to 11.0.2 has fixed the build error in
>>> mahout-math as suggested by Ted, however it is causing some other build
>>> errors in mahout-core
>>>
>>> [INFO] -------------------------------------------------------------
>>> [ERROR] /mahout-trunk/core/src/test/java/org/apache/mahout/classifier/sgd/OnlineLogisticRegressionTest.java:[24,28] cannot find symbol
>>>   symbol: class Closer
>>>   location: package com.google.common.io
>>> [ERROR] /mahout-trunk/core/src/test/java/org/apache/mahout/classifier/sgd/OnlineLogisticRegressionTest.java:[289,5] cannot find symbol
>>>   symbol: class Closer
>>>   location: class org.apache.mahout.classifier.sgd.OnlineLogisticRegressionTest
>>> [ERROR] /mahout-trunk/core/src/test/java/org/apache/mahout/classifier/sgd/OnlineLogisticRegressionTest.java:[289,21] cannot find symbol
>>>   symbol: variable Closer
>>>   location: class org.apache.mahout.classifier.sgd.OnlineLogisticRegressionTest
>>>
>>> On Sun, Mar 9, 2014 at 3:45 PM, Suneel Marthi <suneel_mar...@yahoo.com> wrote:
>>>
>>>> Darn. You're the second guy to report that this week. Change that line
>>>> to what Ted suggested. The issue is Guava's incompatibility with
>>>> Hadoop's antiquated Guava version.
>>>>
>>>> On Mar 9, 2014, at 6:10 AM, Bikash Gupta <bikash.gupt...@gmail.com> wrote:
>>>>
>>>> I am able to run ClusteringUtils on KMeans successfully (I still need to
>>>> check the scenario you mentioned). However, I am getting an error from
>>>> the TDigest class
>>>>
>>>> Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.collect.Queues.newArrayDeque()Ljava/util/ArrayDeque;
>>>>   at org.apache.mahout.math.stats.GroupTree$1.<init>(GroupTree.java:171)
>>>>   at org.apache.mahout.math.stats.GroupTree.iterator(GroupTree.java:169)
>>>>   at org.apache.mahout.math.stats.GroupTree.access$300(GroupTree.java:14)
>>>>   at org.apache.mahout.math.stats.GroupTree$2.iterator(GroupTree.java:317)
>>>>   at org.apache.mahout.math.stats.TDigest.add(TDigest.java:105)
>>>>   at org.apache.mahout.math.stats.TDigest.add(TDigest.java:88)
>>>>   at org.apache.mahout.math.stats.TDigest.add(TDigest.java:76)
>>>>   at org.apache.mahout.math.stats.OnlineSummarizer.add(OnlineSummarizer.java:57)
>>>>   at org.apache.mahout.clustering.ClusteringUtils.summarizeClusterDistances(ClusteringUtils.java:65)
>>>>
>>>> A few days ago I saw a post where a user hit a similar issue with the
>>>> TDigest class. Ted suggested replacing the line with the code below
>>>>
>>>> stack = new ArrayDeque<GroupTree>();
>>>>
>>>> Let me know if I am correct.
>>>>
>>>> On Sun, Mar 9, 2014 at 3:18 PM, Suneel Marthi <suneel_mar...@yahoo.com> wrote:
>>>>
>>>>> You could call ClusterQualitySummarizer, which then calls
>>>>> ClusteringUtils to output the different metrics you specified.
>>>>> For an example, see the Streaming KMeans section in
>>>>> examples/bin/cluster-reuters.sh.
>>>>>
>>>>> It calls 'qualcluster' with options -i <tf-idf vectors generated from
>>>>> seq2sparse> -c <output of Kmeans> -o <output file generated with the
>>>>> metrics>
>>>>>
>>>>> I have not tried this on KMeans, and since the output format of KMeans
>>>>> is different from that of Streaming KMeans, this might just fall flat.
>>>>> It may also fail to read some of the clusters if they contain only a
>>>>> single clustered point. This is due to the new TDigest summarizer,
>>>>> which expects at least 2 points in order to calculate max, quartiles,
>>>>> and mean.
>>>>>
>>>>> On Sunday, March 9, 2014 4:19 AM, Bikash Gupta <bikash.gupt...@gmail.com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I want to use ClusteringUtils on KMeans clusteredPoints to get
>>>>> summarizeClusterDistances, daviesBouldinIndex & dunnIndex.
>>>>>
>>>>> Is there any sample or example of how to use these features?
>>>>> --
>>>>> Thanks & Regards
>>>>> Bikash Kumar Gupta
>>>>
>>>> --
>>>> Thanks & Regards
>>>> Bikash Kumar Gupta
>>>
>>> --
>>> Thanks & Regards
>>> Bikash Kumar Gupta
>>
>> --
>> Thanks & Regards
>> Bikash Kumar Gupta

--
Thanks & Regards
Bikash Kumar Gupta