I have created MAHOUT-1442 and will submit the patch as well.

On Sun, Mar 9, 2014 at 9:03 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:

> Can you file a JIRA and attach your patch?
>
>
> On Sun, Mar 9, 2014 at 8:03 AM, Bikash Gupta <bikash.gupt...@gmail.com> wrote:
>
> > For everyone's information:
> >
> > I have successfully forced Mahout to build with Guava 11.0.2. The errors
> > and their fixes are listed below.
> >
> > 1. Class: org.apache.mahout.math.stats.GroupTree
> > - Changed line 171 to: stack = new ArrayDeque<GroupTree>();
> > - Added the import java.util.ArrayDeque;
> >
> > 2. Class: org.apache.mahout.classifier.sgd.OnlineLogisticRegressionTest
> > - Guava 11.0.2 doesn't have Closer in com.google.common.io, so I used
> > try-with-resources instead
> > - Changed the Java version to 1.7
> > - Code changed as shown below
> >
> >     try (ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
> >          DataOutputStream dataOutputStream = new DataOutputStream(byteArrayOutputStream)) {
> >       PolymorphicWritable.write(dataOutputStream, lr);
> >       output = byteArrayOutputStream.toByteArray();
> >     }
> >
> >     OnlineLogisticRegression read;
> >
> >     try (ByteArrayInputStream byteArrayInputStream = new ByteArrayInputStream(output);
> >          DataInputStream dataInputStream = new DataInputStream(byteArrayInputStream)) {
> >       read = PolymorphicWritable.read(dataInputStream, OnlineLogisticRegression.class);
> >     }
> >
> > 3. Class: org.apache.mahout.utils.vectors.lucene.LuceneIterableTest
> > - Iterators.advance is not present in Guava 11.0.2, so I inlined the
> > equivalent loop. Sample shown below
> >     int numberToAdvance = 1;
> >     int iterateNumberToAdvance;
> >     for (iterateNumberToAdvance = 0;
> >          iterateNumberToAdvance < numberToAdvance && iterator.hasNext();
> >          iterateNumberToAdvance++) {
> >       iterator.next();
> >     }
> >
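The inlined loop from item 3 could also be pulled into a small helper so the test does not depend on any particular Guava release. This is only a sketch: the class name IteratorAdvance and the method advance are made up for illustration and are not part of Mahout or Guava.

```java
import java.util.Arrays;
import java.util.Iterator;

public class IteratorAdvance {
    /**
     * Skips up to numberToAdvance elements of the iterator and returns how
     * many were actually skipped (fewer if the iterator runs out first).
     * Hypothetical helper, written with plain JDK classes only.
     */
    static int advance(Iterator<?> iterator, int numberToAdvance) {
        if (numberToAdvance < 0) {
            throw new IllegalArgumentException("numberToAdvance must be non-negative");
        }
        int skipped = 0;
        while (skipped < numberToAdvance && iterator.hasNext()) {
            iterator.next();
            skipped++;
        }
        return skipped;
    }

    public static void main(String[] args) {
        Iterator<String> it = Arrays.asList("a", "b", "c").iterator();
        int skipped = advance(it, 2);
        // Prints "2 skipped, next = c"
        System.out.println(skipped + " skipped, next = " + it.next());
    }
}
```

Returning the number of skipped elements lets a test assert that the iterator really had enough entries.
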
> > If anyone has a better suggestion, please let me know.
> >
> > @Suneel,
> >
> > Going back to my original question: I was able to call ClusteringUtils for
> > KMeans. However, I cannot use ClusterQualitySummarizer because it doesn't
> > support WeightedPropertyVectorWritable.
> >
> >
> >
> > On Sun, Mar 9, 2014 at 6:28 PM, Bikash Gupta <bikash.gupt...@gmail.com> wrote:
> >
> > > Just FYI: downgrading Guava to 11.0.2 fixed the build error in
> > > mahout-math, as Ted suggested, but it causes other build errors in
> > > mahout-core:
> > >
> > > [INFO] -------------------------------------------------------------
> > > [ERROR] /mahout-trunk/core/src/test/java/org/apache/mahout/classifier/sgd/OnlineLogisticRegressionTest.java:[24,28]
> > > cannot find symbol
> > >   symbol:   class Closer
> > >   location: package com.google.common.io
> > > [ERROR] /mahout-trunk/core/src/test/java/org/apache/mahout/classifier/sgd/OnlineLogisticRegressionTest.java:[289,5]
> > > cannot find symbol
> > >   symbol:   class Closer
> > >   location: class org.apache.mahout.classifier.sgd.OnlineLogisticRegressionTest
> > > [ERROR] /mahout-trunk/core/src/test/java/org/apache/mahout/classifier/sgd/OnlineLogisticRegressionTest.java:[289,21]
> > > cannot find symbol
> > >   symbol:   variable Closer
> > >   location: class org.apache.mahout.classifier.sgd.OnlineLogisticRegressionTest
> > >
> > >
> > > On Sun, Mar 9, 2014 at 3:45 PM, Suneel Marthi <suneel_mar...@yahoo.com> wrote:
> > >
> > >> Darn. You are the second person to report that this week. Change that
> > >> line to what Ted suggested. The issue is an incompatibility with
> > >> Hadoop's antiquated Guava version.
> > >>
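When a NoSuchMethodError like the one quoted further down in this thread shows up, it can help to confirm which jar a class is actually being loaded from at runtime. The following is a minimal sketch using only JDK calls; the class name WhichJar and the method locationOf are hypothetical, and the default class name is simply the Guava class from the stack trace.

```java
import java.security.CodeSource;

public class WhichJar {
    /**
     * Returns the location (usually a jar or directory URL) the given class
     * was loaded from, or a note when the JVM reports no code source, which
     * is the case for classes loaded by the bootstrap loader.
     */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        return source == null
            ? "(no code source: bootstrap/JDK class)"
            : String.valueOf(source.getLocation());
    }

    public static void main(String[] args) {
        // Class to inspect; defaults to the Guava class from the stack trace.
        String name = args.length > 0 ? args[0] : "com.google.common.collect.Queues";
        try {
            // Load without initializing, using this class's own loader.
            Class<?> type = Class.forName(name, false, WhichJar.class.getClassLoader());
            System.out.println(name + " <- " + locationOf(type));
        } catch (ClassNotFoundException e) {
            System.out.println(name + " is not on the classpath");
        }
    }
}
```

Running it with the same classpath as the failing job should show whether an older Guava jar is picked up ahead of the one Mahout was built against.
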
> > >> Sent from my iPhone
> > >>
> > >> On Mar 9, 2014, at 6:10 AM, Bikash Gupta <bikash.gupt...@gmail.com>
> > >> wrote:
> > >>
> > >> I am able to run ClusteringUtils on KMeans successfully (I still need to
> > >> check the scenario you mentioned). However, I am getting an error from
> > >> the TDigest class:
> > >>
> > >> Exception in thread "main" java.lang.NoSuchMethodError:
> > >> com.google.common.collect.Queues.newArrayDeque()Ljava/util/ArrayDeque;
> > >>     at org.apache.mahout.math.stats.GroupTree$1.<init>(GroupTree.java:171)
> > >>     at org.apache.mahout.math.stats.GroupTree.iterator(GroupTree.java:169)
> > >>     at org.apache.mahout.math.stats.GroupTree.access$300(GroupTree.java:14)
> > >>     at org.apache.mahout.math.stats.GroupTree$2.iterator(GroupTree.java:317)
> > >>     at org.apache.mahout.math.stats.TDigest.add(TDigest.java:105)
> > >>     at org.apache.mahout.math.stats.TDigest.add(TDigest.java:88)
> > >>     at org.apache.mahout.math.stats.TDigest.add(TDigest.java:76)
> > >>     at org.apache.mahout.math.stats.OnlineSummarizer.add(OnlineSummarizer.java:57)
> > >>     at org.apache.mahout.clustering.ClusteringUtils.summarizeClusterDistances(ClusteringUtils.java:65)
> > >>
> > >> A few days ago I saw a post where a user hit a similar issue in the
> > >> TDigest class. Ted suggested replacing the line with the code below:
> > >>
> > >> stack = new ArrayDeque<GroupTree>();
> > >>
> > >> Let me know if I am correct.
> > >>
> > >>
> > >> On Sun, Mar 9, 2014 at 3:18 PM, Suneel Marthi <suneel_mar...@yahoo.com> wrote:
> > >>
> > >>> You could call ClusterQualitySummarizer, which then calls
> > >>> ClusteringUtils to output the metrics you specified.
> > >>> For an example, see the Streaming KMeans section in
> > >>> examples/bin/cluster-reuters.sh.
> > >>>
> > >>> It calls 'qualcluster' with options -i <tf-idf vectors generated from
> > >>> seq2sparse> -c <output of KMeans> -o <output file generated with the
> > >>> metrics>
> > >>>
> > >>>
> > >>> I have not tried this on KMeans, and since the output format of KMeans
> > >>> is different from that of Streaming KMeans, this might just fall flat.
> > >>> It may also fail to read some of the clusters if a cluster has only a
> > >>> single clustered point, because the new TDigest summarizer expects at
> > >>> least 2 points in order to calculate the max, quartiles, and mean.
> > >>>
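For readers unfamiliar with the metrics mentioned above, here is a rough, self-contained sketch of one common textbook variant of the Dunn index: the smallest centroid-to-centroid distance divided by the largest point-to-centroid distance. It is not Mahout's ClusteringUtils implementation, which may define the intra-cluster term differently; the class and method names are hypothetical and plain double arrays stand in for Mahout vectors.

```java
public class DunnIndexSketch {
    /** Euclidean distance between two points of equal dimension. */
    static double euclidean(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * Dunn index variant: min inter-centroid distance divided by the max
     * distance from any point to its own centroid.
     * clusters[k] holds the points assigned to centroids[k].
     */
    static double dunnIndex(double[][] centroids, double[][][] clusters) {
        double minInter = Double.POSITIVE_INFINITY;
        for (int i = 0; i < centroids.length; i++) {
            for (int j = i + 1; j < centroids.length; j++) {
                minInter = Math.min(minInter, euclidean(centroids[i], centroids[j]));
            }
        }
        double maxIntra = 0;
        for (int k = 0; k < clusters.length; k++) {
            for (double[] point : clusters[k]) {
                maxIntra = Math.max(maxIntra, euclidean(point, centroids[k]));
            }
        }
        return minInter / maxIntra;
    }

    public static void main(String[] args) {
        double[][] centroids = {{0, 0}, {10, 0}};
        double[][][] clusters = {
            {{-1, 0}, {1, 0}},   // points around the first centroid
            {{9, 0}, {12, 0}}    // points around the second centroid
        };
        // Centroids are 10 apart and the farthest point is 2 from its centroid.
        System.out.println(dunnIndex(centroids, clusters)); // prints 5.0
    }
}
```

Higher values indicate clusters that are compact relative to how far apart they are.
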
> > >>>
> > >>> On Sunday, March 9, 2014 4:19 AM, Bikash Gupta <bikash.gupt...@gmail.com> wrote:
> > >>>
> > >>> Hi,
> > >>>
> > >>> I want to use ClusteringUtils on KMeans clusteredPoints to get
> > >>> summarizeClusterDistances, daviesBouldinIndex, and dunnIndex.
> > >>>
> > >>> Is there a sample or example showing how to use these features?
> > >>> --
> > >>> Thanks & Regards
> > >>> Bikash Kumar Gupta
> > >>>
>



-- 
Thanks & Regards
Bikash Kumar Gupta
