On Aug 3, 2009, at 2:33 PM, tigertail wrote:
Got to ask this again. I installed and started Hadoop-0.20.0 properly on a cluster with two boxes. Then I just followed the steps Paul gave to install Mahout on the master node. After that I can run canopy with no problem, but I cannot run kmeans. There is always the error java.lang.NoClassDefFoundError: com/google/gson/reflect/TypeToken. Can Paul or Grant help me out with this, please? Thanks!

tigertail wrote:
Hi Paul,

Sorry for the naive question, but can you show me how to "flatten the JOB with all classes in the same JAR"? And has this error been fixed in the new SVN version?

Paul Ingles-4 wrote:
I've flattened the JOB with all classes in the same JAR and that works successfully. Steps:

1) svn co http://svn.apache.org/repos/asf/lucene/mahout/trunk mahout-trunk
2) cd mahout-trunk
3) mvn install
4) hadoop jar examples/target/mahout-examples-0.2-SNAPSHOT.job org.apache.mahout.clustering.syntheticcontrol.kmeans.Job -libjars examples/target/dependency/gson-1.3.jar

As for setting up Hadoop in pseudo-distributed mode, that was done following the guide on the site, but I'll check that again if it's been updated recently.

Thanks again for all the help,
Paul

On 17 Jul 2009, at 13:39, Grant Ingersoll wrote:
Have you tried flattening the JOB so all the classes are packed in a single JAR? Also, can you give the full list of steps you are doing, because I am able to run this in pseudo-distro without getting this error.
Also, have you checked the Hadoop logs ($HADOOP/logs, I believe)? I also notice that the Hadoop quick start has different configuration settings now due to 0.20.

-Grant

On Jul 17, 2009, at 5:00 AM, Paul Ingles wrote:
I've tried re-running specifically adding the gson jar as follows:

$ hadoop jar examples/target/mahout-examples-0.2-SNAPSHOT.job org.apache.mahout.clustering.syntheticcontrol.kmeans.Job -libjars examples/target/dependency/gson-1.3.jar

Unfortunately, I get the same errors as before:

09/07/17 09:53:50 INFO kmeans.KMeansDriver: Clustering
09/07/17 09:53:50 INFO kmeans.KMeansDriver: Running Clustering
09/07/17 09:53:50 INFO kmeans.KMeansDriver: Input: output/data Clusters In: output/clusters-4 Out: output/points Distance: org.apache.mahout.utils.EuclideanDistanceMeasure
09/07/17 09:53:50 INFO kmeans.KMeansDriver: convergence: 0.5 Input Vectors: org.apache.mahout.matrix.SparseVector
09/07/17 09:53:50 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
09/07/17 09:53:50 INFO mapred.FileInputFormat: Total input paths to process : 2
09/07/17 09:53:51 INFO mapred.JobClient: Running job: job_200907161209_0018
09/07/17 09:53:52 INFO mapred.JobClient: map 0% reduce 0%
09/07/17 09:54:06 INFO mapred.JobClient: Task Id : attempt_200907161209_0018_m_000000_0, Status : FAILED
java.lang.NoClassDefFoundError: com/google/gson/reflect/TypeToken
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:703)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:124)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:260)
    at java.net.URLClassLoader.access$000(URLClassLoader.java:56)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:195)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:319)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:330)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:254)
    at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:402)
    at org.apache.mahout.matrix.AbstractVector.asFormatString(AbstractVector.java:374)
    at org.apache.mahout.clustering.kmeans.Cluster.outputPointWithClusterInfo(Cluster.java:198)
    at org.apache.mahout.clustering.kmeans.KMeansClusterMapper.map(KMeansClusterMapper.java:39)
    at org.apache.mahout.clustering.kmeans.KMeansClusterMapper.map(KMeansClusterMapper.java:32)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:356)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.ClassNotFoundException: com.google.gson.reflect.TypeToken
    at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:319)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:330)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:254)
    at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:402)
    ... 20 more

This is running pseudo-distributed on my laptop.

On 16 Jul 2009, at 18:57, Adil Aijaz wrote:
My basic understanding of the class loader stuff is:

1. Any jars that need to be available to map/reduce jobs should be specified through -libjars (e.g. hadoop --config ... -libjars gson.jar jar <path to my jar> ...)
2. Any jars that need to be available to the main class should be specified through lib/*.jar (that is, in the mahout-examples-0.2-SNAPSHOT/lib/*.jar), unless of course, as Jeff is saying, one ends up flattening the lib/*.jar into top level classes.
Adil

Jeff Eastman wrote:
Isn't this the same old problem that our Job jar file has a lib directory with the Mahout code in it, and the way Hadoop loads the jar it sometimes cannot resolve classes in it? IIRC, one needs to smash the job jar file into a single jar in order for Dirichlet (at least, and any other examples which contain non-core classes) to run. I confess I do not understand the class loader stuff enough to be more specific.

I have duplicated the CNF exception by defining and using a user-defined distance measure in the Job file and running KMeans with it, so it is not specific to Dirichlet.

Grant Ingersoll wrote:
Hmm, I'm not seeing the ClassNotFound problem but am getting fetch failures. Will look later.

-Grant

On Jul 16, 2009, at 11:32 AM, Paul Ingles wrote:
I've just tried setting up a brand new machine (Ubuntu 8.04 Virtual Machine) with Hadoop 0.20.0 and running the compiled jobs against it. I get the same problems as before... still scratching my head :(

On 16 Jul 2009, at 12:15, Paul Ingles wrote:
Sure,

I'm running (currently) on my MacBook Air, running OSX Leopard.

JDK:
java version "1.6.0_13"
Java(TM) SE Runtime Environment (build 1.6.0_13-b03-211)
Java HotSpot(TM) 64-Bit Server VM (build 11.3-b02-83, mixed mode)

Hadoop is: 0.20.0, r763504

I'm compiling mahout from trunk (r794023) as follows (in the root of the project directory):

% mvn install
% hadoop jar examples/target/mahout-examples-0.2-SNAPSHOT.job org.apache.mahout.clustering.syntheticcontrol.kmeans.Job

The only difference (for dirichlet) is the different class to run.

Thanks,
Paul

On 16 Jul 2009, at 11:33, Grant Ingersoll wrote:
Can you share how you built and how you are running, as in command line options, etc.? Also, JDK version, Hadoop version, etc.

On Jul 16, 2009, at 6:21 AM, Paul Ingles wrote:
Hi,

Thank you for the suggestion. Unfortunately, when I tried that I received the same error.
I've also tried copying the gson jar directly into $HADOOP_HOME/lib (when I was running a single node pseudo-distributed) and get the same error still. Weirdly enough, if I try and run the Dirichlet example on the cluster I receive another ClassNotFoundException:

09/07/16 10:27:54 INFO mapred.JobClient: Task Id : attempt_200907161026_0002_m_000001_0, Status : FAILED
java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:352)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 5 more
Caused by: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
    ... 10 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 13 more
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.mahout.clustering.syntheticcontrol.dirichlet.NormalScModelDistribution
    at org.apache.mahout.clustering.dirichlet.DirichletMapper.getDirichletState(DirichletMapper.java:95)
    at org.apache.mahout.clustering.dirichlet.DirichletMapper.configure(DirichletMapper.java:60)
    ... 18 more
Caused by: java.lang.ClassNotFoundException: org.apache.mahout.clustering.syntheticcontrol.dirichlet.NormalScModelDistribution
    at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:316)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:288)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
    at org.apache.mahout.clustering.dirichlet.DirichletDriver.createState(DirichletDriver.java:121)
    at org.apache.mahout.clustering.dirichlet.DirichletMapper.getDirichletState(DirichletMapper.java:71)
    ... 19 more

Hoping this sparks some other suggestions :)

Thanks,
Paul

On Wed Jul 15 22:08:09 UTC 2009, Adil Aijaz <a...@yahoo-inc.com> wrote:
try

hadoop --config <hod-cluster-dir> jar -libjars <path to gson.jar> <your job/jar file> <your class> <arguments>

Adil

Paul Ingles wrote:
Hi,

Apologies for the cross-posting (I also sent this to the Hadoop user list) but I'm still getting errors if I try and run the KMeans examples on a cluster, whether that be my single-node Mac Pro, or our cluster. I've attached the stack trace at the bottom of the email.

The gson jar is definitely included in the packaged .job, and is also in the temporary directory when the task tracker picks up the work. The gson jar also includes TypeToken.class in the expected path.

Again, really appreciate people's help in getting this going!

----snip----
09/07/15 17:06:38 INFO mapred.JobClient: Task Id : attempt_200907151617_0010_m_000000_0, Status : FAILED
java.lang.NoClassDefFoundError: com/google/gson/reflect/TypeToken
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:703)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:124)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:260)
    at java.net.URLClassLoader.access$000(URLClassLoader.java:56)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:195)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:319)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:330)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:254)
    at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:402)
    at org.apache.mahout.matrix.AbstractVector.asFormatString(AbstractVector.java:374)
    at org.apache.mahout.clustering.kmeans.Cluster.outputPointWithClusterInfo(Cluster.java:198)
    at org.apache.mahout.clustering.kmeans.KMeansClusterMapper.map(KMeansClusterMapper.java:39)
    at org.apache.mahout.clustering.kmeans.KMeansClusterMapper.map(KMeansClusterMapper.java:32)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:356)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.ClassNotFoundException: com.google.gson.reflect.TypeToken
    at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:319)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:330)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:254)
    at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:402)
    ... 20 more
----snip----

Incidentally, as part of this work I've also implemented a Pearson distance measure. If people think it would be useful to be folded in, I'd be happy to get the SVN patch with tests and implementation together.

Thanks,
Paul

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:
http://www.lucidimagination.com/search

--
View this message in context: http://www.nabble.com/ClassNotFoundException-with-pseudo-distributed-run-of-KMeans-tp24505889p24795839.html
Sent from the Mahout User List mailing list archive at Nabble.com.
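
One way to double-check the statement earlier in the thread that the gson jar and TypeToken.class really are inside the packaged .job is to list where a class lives in the archive. What follows is a minimal Python sketch, not anything shipped with Mahout or Hadoop: locate_class is a hypothetical helper, and it assumes the .job file is an ordinary zip archive with its dependencies stored as nested jars (as the lib/*.jar discussion in this thread suggests).

```python
import io
import zipfile


def locate_class(archive_path, class_name):
    """Return the places inside a .job/.jar archive where class_name is found.

    Each hit is either "<top level>" (the class file sits directly in the
    archive) or the name of a nested jar such as "lib/gson-1.3.jar".
    """
    # A class a.b.C is stored in a jar as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    with zipfile.ZipFile(archive_path) as outer:
        names = outer.namelist()
        if entry in names:
            hits.append("<top level>")
        for name in names:
            if name.endswith(".jar"):
                # Nested jars are zip files too; open them from memory.
                with zipfile.ZipFile(io.BytesIO(outer.read(name))) as inner:
                    if entry in inner.namelist():
                        hits.append(name)
    return hits
```

For example, locate_class("examples/target/mahout-examples-0.2-SNAPSHOT.job", "com.google.gson.reflect.TypeToken") would list the nested jar or jars carrying the class. This only shows that the class is packaged. It says nothing about whether the task's class loader can actually see it, which is the problem being discussed here.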
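
On the question of how to "flatten the JOB with all classes in the same JAR": the thread does not spell out the exact procedure, so the following is only a sketch of one possible reading, namely unpacking every lib/*.jar inside the job file into the top level of a new archive. It is written in Python with only the standard zipfile module. flatten_job is a hypothetical helper and is not part of Mahout or Hadoop. Skipping each nested jar's META-INF/ directory is an assumption that suits plain class jars but may drop service registrations that some libraries rely on.

```python
import io
import zipfile


def flatten_job(src_path, dst_path):
    """Write dst_path as a copy of src_path with nested lib/*.jar unpacked.

    Entries already at the top level win over entries coming from nested
    jars. Returns the list of nested jars that were merged.
    """
    seen = set()
    nested = []
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        # First pass: copy regular entries, remember the nested jars.
        for name in src.namelist():
            if name.endswith("/"):
                continue  # directory entries are not needed in the copy
            if name.startswith("lib/") and name.endswith(".jar"):
                nested.append(name)
                continue
            if name not in seen:
                seen.add(name)
                dst.writestr(name, src.read(name))
        # Second pass: unpack each nested jar into the top level.
        for jar_name in nested:
            with zipfile.ZipFile(io.BytesIO(src.read(jar_name))) as inner:
                for name in inner.namelist():
                    # Assumption: per-jar metadata is dropped, not merged.
                    if name.endswith("/") or name.startswith("META-INF/"):
                        continue
                    if name not in seen:
                        seen.add(name)
                        dst.writestr(name, inner.read(name))
    return nested
```

A call such as flatten_job("examples/target/mahout-examples-0.2-SNAPSHOT.job", "mahout-examples-flat.job") (the output name is made up) would produce an archive that could then be tried with hadoop jar in place of the original. Whether this resolves the NoClassDefFoundError on a given setup is untested here.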