I'm not sure about this either, but I think these are all the changes to Mahout in CDH 4.6.0:
http://archive.cloudera.com/cdh4/cdh/4/mahout-0.7-cdh4.6.0.CHANGES.txt
MAHOUT-1291
MAHOUT-1033
MAHOUT-1142

On Wed, Mar 5, 2014 at 8:30 AM, Suneel Marthi <suneel_mar...@yahoo.com> wrote:

> Not sure if the CDH4 patches on top of 0.7 has fixes for M-1067 and M-1098
> which address the issues u r seeing.
>
> The second part of the issue u r seeing with Mahout 0.9 distro seems to be
> related to how u set it up on CDH4. I apologize for not being helpful here
> as I am not a CDH4 user or expert.
>
> Sean?
>
> On Wednesday, March 5, 2014 10:23 AM, Kevin Moulart <kevinmoul...@gmail.com> wrote:
>
> Previous mail sent only to Suneel : (my bad sorry)
>
> According to my stacktrace it seems that I am running mahout 0.7 indeed.
> That's the version provided by Cloudera when I install mahout using yum.
> But according to Sean Owen, it really is a 0.8 inside...
>
> Anyway I tried with the compiled version and it didn't work :
>
> Running on hadoop, using /opt/cloudera/parcels/CDH/lib/hadoop/bin/hadoop and HADOOP_CONF_DIR=
> Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.util.ProgramDriver.driver([Ljava/lang/String;)V
>     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:122)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
> MAHOUT-JOB: /home/cacf/Downloads/mahout-distribution-0.9/mahout-examples-0.9-job.jar
>
> And now I changed the conf directory of mahout 0.9 to be linked to the one
> used by the existing working mahout and the trace changes :
>
> MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
> Running on hadoop, using /opt/cloudera/parcels/CDH/lib/hadoop/bin/hadoop and HADOOP_CONF_DIR=/etc/hadoop/conf
> MAHOUT-JOB: /home/myCompany/Downloads/mahout-distribution-0.9/mahout-examples-0.9-job.jar
> 14/03/05 16:16:23 WARN driver.MahoutDriver: Unable to add class: org.apache.mahout.clustering.meanshift.MeanShiftCanopyDriver
> java.lang.ClassNotFoundException: org.apache.mahout.clustering.meanshift.MeanShiftCanopyDriver
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:190)
>     at org.apache.mahout.driver.MahoutDriver.addClass(MahoutDriver.java:237)
>     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:118)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
> 14/03/05 16:16:23 WARN driver.MahoutDriver: Unable to add class: org.apache.mahout.clustering.spectral.eigencuts.EigencutsDriver
> java.lang.ClassNotFoundException: org.apache.mahout.clustering.spectral.eigencuts.EigencutsDriver
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:190)
>     at org.apache.mahout.driver.MahoutDriver.addClass(MahoutDriver.java:237)
>     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:118)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
> 14/03/05 16:16:23 WARN driver.MahoutDriver: Unable to add class: org.apache.mahout.clustering.minhash.MinHashDriver
> java.lang.ClassNotFoundException: org.apache.mahout.clustering.minhash.MinHashDriver
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:190)
>     at org.apache.mahout.driver.MahoutDriver.addClass(MahoutDriver.java:237)
>     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:118)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
> 14/03/05 16:16:23 WARN driver.MahoutDriver: Unable to add class: org.apache.mahout.clustering.dirichlet.DirichletDriver
> java.lang.ClassNotFoundException: org.apache.mahout.clustering.dirichlet.DirichletDriver
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:190)
>     at org.apache.mahout.driver.MahoutDriver.addClass(MahoutDriver.java:237)
>     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:118)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
> Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.util.ProgramDriver.driver([Ljava/lang/String;)V
>     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:122)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
>
> Changing the hadoop home to
> /opt/cloudera/parcels/CDH/lib/hadoop/bin/hadoop-mapreduce doesn't change
> the output, nor does
> /opt/cloudera/parcels/CDH/lib/hadoop/bin/hadoop-0.20-mapreduce
>
> Any idea now ?
>
> 2014-03-05 15:45 GMT+01:00 Suneel Marthi <suneel_mar...@yahoo.com>:
> > Are u using Mahout 0.7 ?
> >
> > From this line in ur stacktrace that seems to be the case:
> > MAHOUT-JOB: /usr/lib/mahout/mahout-examples-0.7-cdh4.5.0-job.jar
> >
> > You could build Mahout outside of CDH from Mahout trunk and put the jars
> > onto CDH5.
> > I am no Cloudera expert or CDH5 user to help with CDHx build.
> >
> > On Wednesday, March 5, 2014 9:30 AM, Kevin Moulart <kevinmoul...@gmail.com> wrote:
> > Hi and thanks for your help!
> >
> > I had been told that the version of mahout used by Cloudera (CDH 4.6) was
> > in fact 0.8 with a patch for mr2 support.
> > (http://mail-archives.apache.org/mod_mbox/mahout-user/201402.mbox/%3CCAEccTywqSAKA_HeX4vTZ-5XPmKtj5b8zMGQUfn5qRsiq=7o=u...@mail.gmail.com%3E)
> >
> > But I tried to install 0.9 on my own, by compiling it with mvn after I
> > changed the pom.xml :
> >
> > - Added cloudera repository :
> >
> >   <repository>
> >     <id>cloudera-repo</id>
> >     <name>Cloudera Repository</name>
> >     <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
> >   </repository>
> >
> > - Changed the version of hadoop to use :
> >   <hadoop.1.version>2.0.0-mr1-cdh4.6.0</hadoop.1.version>
> > - I tried adding this one too :
> >   <hadoop2.version>2.0.0-cdh4.6.0</hadoop2.version>
> >
> > But then I get a lot of errors when Maven begins to compile the core
> > package :
> > https://gist.github.com/kmoulart/9368193
> >
> > Could you tell me what I did wrong ?
> >
> > 2014-03-04 19:02 GMT+01:00 Suneel Marthi <suneel_mar...@yahoo.com>:
> >
> > The -us option was fixed for Mahout 0.8, seems like u r using Mahout 0.7
> > which had this issue (from ur stacktrace, it's apparent u r using Mahout
> > 0.7). Please upgrade to the latest mahout version.
> >
> > On Tuesday, March 4, 2014 8:54 AM, Kevin Moulart <kevinmoul...@gmail.com> wrote:
> >
> > Hi,
> >
> > I'm trying to apply a PCA to reduce the dimension of a matrix of 1603
> > columns and 100.000 to 30.000.000 lines using ssvd with the pca option,
> > and I always get a StackOverflowError :
> >
> > Here is my command line :
> > mahout ssvd -i /user/myUser/Echant100k -o /user/myUser/Echant/SVD100 -k 100 -pca "true" -U "false" -V "false" -t 3 -ow
> >
> > I also tried to put "-us true" as mentioned in
> > https://cwiki.apache.org/confluence/download/attachments/27832158/SSVD-CLI.pdf?version=18&modificationDate=1381347063000&api=v2
> > but the option is not available anymore.
> >
> > The output of the previous command is :
> > MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
> > Running on hadoop, using /opt/cloudera/parcels/CDH/lib/hadoop/bin/hadoop and HADOOP_CONF_DIR=/etc/hadoop/conf
> > MAHOUT-JOB: /usr/lib/mahout/mahout-examples-0.7-cdh4.5.0-job.jar
> > 14/03/04 14:45:16 INFO common.AbstractJob: Command line arguments:
> > {--abtBlockHeight=[200000], --blockHeight=[10000], --broadcast=[true],
> > --computeU=[false], --computeV=[false], --endPhase=[2147483647],
> > --input=[/user/myUser/Echant100k], --minSplitSize=[-1],
> > --outerProdBlockHeight=[30000], --output=[/user/myUser/Echant/SVD100],
> > --oversampling=[15], --overwrite=null, --pca=[true], --powerIter=[0],
> > --rank=[100], --reduceTasks=[3], --startPhase=[0], --tempDir=[temp],
> > --uHalfSigma=[false], --vHalfSigma=[false]}
> > Exception in thread "main" java.lang.StackOverflowError
> >     at org.apache.mahout.math.hadoop.MatrixColumnMeansJob.run(MatrixColumnMeansJob.java:55)
> >     at org.apache.mahout.math.hadoop.MatrixColumnMeansJob.run(MatrixColumnMeansJob.java:55)
> >     at org.apache.mahout.math.hadoop.MatrixColumnMeansJob.run(MatrixColumnMeansJob.java:55)
> > ...
> >
> > I searched online and didn't find a solution to my problem.
> >
> > Can you help me ?
> >
> > Thanks in advance,
> >
> > --
> > Kévin Moulart
> >
> > --
> > Kévin Moulart
> > Mobile (France) : +33 7 81 06 10 10
> > Mobile (Belgium) : +32 473 85 23 85
> > Landline : +32 2 771 88 45
>
> --
> Kévin Moulart