Should I give the exclusion list a try anyway?

Let me know, so that I can provide a patch.
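
Before deciding, it may help to check which Guava jar is actually picked up at runtime, since Ted's point below is that the job runs with Hadoop's class path. A minimal sketch, assuming only the JDK; the class name WhichJar and the method locate are hypothetical names for illustration, not part of Mahout or Hadoop:

```java
import java.security.CodeSource;

public class WhichJar {

    /**
     * Returns the location (usually a jar URL) a class was loaded from,
     * or a short explanation when no location is available.
     */
    static String locate(String className) {
        try {
            // Load without running static initializers; we only need the Class object.
            Class<?> c = Class.forName(className, false, WhichJar.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                // Typical for classes that ship with the JDK itself.
                return "(bootstrap or unknown source)";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "(not on class path)";
        }
    }

    public static void main(String[] args) {
        // Default to the Guava class named in the NoSuchMethodError quoted below.
        String name = args.length > 0 ? args[0] : "com.google.common.collect.Queues";
        System.out.println(name + " -> " + locate(name));
    }
}
```

Running this with the same class path as the failing job should show whether the Guava classes come from Hadoop's lib directory or from the jar that Mahout was built against.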


On Mon, Mar 10, 2014 at 4:53 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:

> Exclusion sadly doesn't work because the resulting program will be running
> with the class path of Hadoop unless you build a jar with dependencies.
>
>
>
>
> On Sun, Mar 9, 2014 at 3:37 PM, Suneel Marthi <suneel_mar...@yahoo.com> wrote:
>
> > Thinking out loud here. If this is indeed a build error that you are seeing,
> > a better fix would be to exclude Hadoop's Guava 11 transitive dependency in
> > the pom, as opposed to downgrading Mahout code to be Guava 11
> > compatible.
> >
> > We might have missed excluding Hadoop's Guava 11 jar during the recent
> > patch for Hadoop 2 (this needs to be done for both the Hadoop 1 and 2
> > profiles), if that indeed fixes the issue.
> >
> >
> >
> >
> >
> >
> >
> >
> > On Sunday, March 9, 2014 2:14 PM, Bikash Gupta <bikash.gupt...@gmail.com> wrote:
> >
> > MAHOUT-1442 has been created. Will submit the patch too.
> >
> >
> > On Sun, Mar 9, 2014 at 9:03 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
> >
> > > Can you file a JIRA and attach your patch?
> > >
> > >
> > > On Sun, Mar 9, 2014 at 8:03 AM, Bikash Gupta <bikash.gupt...@gmail.com> wrote:
> > >
> > > > Info for everyone
> > > >
> > > > I have successfully forced Mahout to build with Guava 11.0.2. The errors
> > > > and fixes are listed below.
> > > >
> > > > 1. Class: org.apache.mahout.math.stats.GroupTree
> > > > - Change line 171 to: stack = new ArrayDeque<GroupTree>();
> > > > - Add import java.util.ArrayDeque;
> > > >
> > > > 2. Class: org.apache.mahout.classifier.sgd.OnlineLogisticRegressionTest
> > > > - Guava 11.0.2 doesn't have Closer in com.google.common.io, hence I have
> > > > used try-with-resources
> > > > - Changed Java to 1.7
> > > > - Code changed as shown below
> > > >
> > > >     try (ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
> > > >          DataOutputStream dataOutputStream = new DataOutputStream(byteArrayOutputStream)) {
> > > >       PolymorphicWritable.write(dataOutputStream, lr);
> > > >       output = byteArrayOutputStream.toByteArray();
> > > >     }
> > > >
> > > >     OnlineLogisticRegression read;
> > > >
> > > >     try (ByteArrayInputStream byteArrayInputStream = new ByteArrayInputStream(output);
> > > >          DataInputStream dataInputStream = new DataInputStream(byteArrayInputStream)) {
> > > >       read = PolymorphicWritable.read(dataInputStream, OnlineLogisticRegression.class);
> > > >     }
> > > >
> > > > 3. Class: org.apache.mahout.utils.vectors.lucene.LuceneIterableTest
> > > > - Iterators.advance is not present in Guava 11.0.2, hence I just added the
> > > > equivalent code inline. Sample shown below:
> > > >
> > > >     int numberToAdvance = 1;
> > > >     int iterateNumberToAdvance;
> > > >     for (iterateNumberToAdvance = 0;
> > > >          iterateNumberToAdvance < numberToAdvance && iterator.hasNext();
> > > >          iterateNumberToAdvance++) {
> > > >       iterator.next();
> > > >     }
> > > >
> > > > If anyone has a better suggestion, please flag it.
> > > >
> > > > @Suneel,
> > > >
> > > > Going back to my original question: I was able to call ClusteringUtils
> > > > for KMeans; however, I cannot use ClusterQualitySummarizer because it
> > > > doesn't support WeightedPropertyVectorWritable.
> > > >
> > > >
> > > >
> > > > On Sun, Mar 9, 2014 at 6:28 PM, Bikash Gupta <bikash.gupt...@gmail.com> wrote:
> > > >
> > > > > Just FYI: downgrading Guava to 11.0.2 has fixed the build error in
> > > > > mahout-math, as suggested by Ted; however, it is causing another build
> > > > > error in mahout-core
> > > > >
> > > > > [INFO] -------------------------------------------------------------
> > > > > [ERROR] /mahout-trunk/core/src/test/java/org/apache/mahout/classifier/sgd/OnlineLogisticRegressionTest.java:[24,28] cannot find symbol
> > > > >   symbol:   class Closer
> > > > >   location: package com.google.common.io
> > > > > [ERROR] /mahout-trunk/core/src/test/java/org/apache/mahout/classifier/sgd/OnlineLogisticRegressionTest.java:[289,5] cannot find symbol
> > > > >   symbol:   class Closer
> > > > >   location: class org.apache.mahout.classifier.sgd.OnlineLogisticRegressionTest
> > > > > [ERROR] /mahout-trunk/core/src/test/java/org/apache/mahout/classifier/sgd/OnlineLogisticRegressionTest.java:[289,21] cannot find symbol
> > > > >   symbol:   variable Closer
> > > > >   location: class org.apache.mahout.classifier.sgd.OnlineLogisticRegressionTest
> > > > >
> > > > >
> > > > > On Sun, Mar 9, 2014 at 3:45 PM, Suneel Marthi <suneel_mar...@yahoo.com> wrote:
> > > > >
> > > > > >> Darn. You are the second person to report that this week. Change that
> > > > > >> line to what Ted suggested. The issue is a Guava incompatibility with
> > > > > >> Hadoop's antiquated Guava version.
> > > > >>
> > > > >> Sent from my iPhone
> > > >
> >  >>
> > > >
> >  >> On Mar 9, 2014, at 6:10 AM, Bikash Gupta <bikash.gupt...@gmail.com>
> > > > >> wrote:
> > > > >>
> > > > > >> I am able to run ClusteringUtils on KMeans successfully (I still need to
> > > > > >> check the scenario which you have mentioned). However, I am getting an
> > > > > >> error from the TDigest class:
> > > > >>
> > > > > >> Exception in thread "main" java.lang.NoSuchMethodError:
> > > > > >> com.google.common.collect.Queues.newArrayDeque()Ljava/util/ArrayDeque;
> > > > > >>     at org.apache.mahout.math.stats.GroupTree$1.<init>(GroupTree.java:171)
> > > > > >>     at org.apache.mahout.math.stats.GroupTree.iterator(GroupTree.java:169)
> > > > > >>     at org.apache.mahout.math.stats.GroupTree.access$300(GroupTree.java:14)
> > > > > >>     at org.apache.mahout.math.stats.GroupTree$2.iterator(GroupTree.java:317)
> > > > > >>     at org.apache.mahout.math.stats.TDigest.add(TDigest.java:105)
> > > > > >>     at org.apache.mahout.math.stats.TDigest.add(TDigest.java:88)
> > > > > >>     at org.apache.mahout.math.stats.TDigest.add(TDigest.java:76)
> > > > > >>     at org.apache.mahout.math.stats.OnlineSummarizer.add(OnlineSummarizer.java:57)
> > > > > >>     at org.apache.mahout.clustering.ClusteringUtils.summarizeClusterDistances(ClusteringUtils.java:65)
> > > > >>
> > > > > >> A few days ago I saw a post where a user hit a similar issue in the
> > > > > >> TDigest class. Ted suggested replacing the line with the code below:
> > > > >>
> > > > >> stack = new ArrayDeque<GroupTree>();
> > > > >>
> > > > >> Let me know if I am correct.
> > > > >>
> > > > >>
> > > > > >> On Sun, Mar 9, 2014 at 3:18 PM, Suneel Marthi <suneel_mar...@yahoo.com> wrote:
> > > > >>
> > > > > >>> You could call ClusterQualitySummarizer, which then calls ClusteringUtils
> > > > > >>> to output the different metrics you specified.
> > > > > >>> For an example, see the Streaming KMeans section in
> > > > > >>> examples/bin/cluster-reuters.sh.
> > > > > >>>
> > > > > >>> It calls 'qualcluster' with options -i <tf-idf vectors generated from
> > > > > >>> seq2sparse> -c <output of KMeans> -o <output file generated with the
> > > > > >>> metrics>
> > > > > >>>
> > > > > >>> I have not tried this on KMeans, and since the output format of KMeans is
> > > > > >>> different from that of Streaming KMeans, this might just fall flat.
> > > > > >>> It may also fail to read some of the clusters if a cluster has only
> > > > > >>> a single clustered point; this is because the new TDigest summarizer
> > > > > >>> expects at least 2 points in order to calculate the max, quartiles, and mean.
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > > >>> On Sunday, March 9, 2014 4:19 AM, Bikash Gupta <bikash.gupt...@gmail.com> wrote:
> > > > >>>
> > > > >>> Hi,
> > > > >>>
> > > > > >>> I want to use ClusteringUtils on KMeans clusteredPoints to get
> > > > > >>> summarizeClusterDistances, daviesBouldinIndex and dunnIndex.
> > > > > >>>
> > > > > >>> Is there any sample or example of how to use these features?
> > > > >>> --
> > > > >>> Thanks & Regards
> > > > >>> Bikash Kumar Gupta
> >
> > > > >>>
> > > > >>
> > > > >>
> > > > >>
> > > > >> --
> > > > >> Thanks & Regards
> > > > >> Bikash Kumar Gupta
> > > > >>
> > > > >>
> > > > >
> > > > >
> > > > > --
> > > > > Thanks & Regards
> > > > > Bikash Kumar Gupta
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Thanks & Regards
> > > > Bikash Kumar Gupta
> > > >
> > >
> >
> >
> >
> > --
> > Thanks & Regards
> > Bikash Kumar Gupta
>



-- 
Thanks & Regards
Bikash Kumar Gupta
