Re: ClassNotFoundException: -libjars not working?
Hi, -libjars doesn't always work. A better way is to create a runnable jar with all dependencies (if the number of dependencies is small); otherwise you have to put the jars into Hadoop's lib folder on all machines.

On Wed, Feb 22, 2012 at 8:13 PM, Ioan Eugen Stan stan.ieu...@gmail.com wrote:

Hello, I'm trying to run a map-reduce job and I get ClassNotFoundException, but I have the class submitted with -libjars. What's wrong with how I do things? Please help. I'm running hadoop-0.20.2-cdh3u1, and I have everything on the -libjars line. The job is submitted via a java app like:

exec /usr/lib/jvm/java-6-sun/bin/java -Dproc_jar -Xmx200m -server -Dhadoop.log.dir=/opt/ui/var/log/mailsearch -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop -Dhadoop.id.str=hbase -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml -classpath '/usr/lib/hadoop/conf:/usr/lib/jvm/java-6-sun/lib/tools.jar:/usr/lib/hadoop:/usr/lib/hadoop/hadoop-core-0.20.2-cdh3u1.jar:/usr/lib/hadoop/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop/lib/apache-log4j-extras-1.1.jar:/usr/lib/hadoop/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop/lib/commons-cli-1.2.jar:/usr/lib/hadoop/lib/commons-codec-1.4.jar:/usr/lib/hadoop/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop/lib/commons-el-1.0.jar:/usr/lib/hadoop/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop/lib/commons-net-1.4.1.jar:/usr/lib/hadoop/lib/core-3.1.1.jar:/usr/lib/hadoop/lib/hadoop-fairscheduler-0.20.2-cdh3u1.jar:/usr/lib/hadoop/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop/lib/jcl-over-slf4j-1.6.1.jar:/usr/lib/hadoop/lib/jets3t-0.6.1.jar:/usr/lib/hadoop/lib/jetty-6.1.26.jar:/usr/lib/hadoop/lib/jetty-servlet-tester-6.1.26.jar:/usr/lib/hadoop/lib/jetty-util-6.1.26.jar:/usr/lib/hadoop/lib/jsch-0.1.42.jar:/usr/lib/hadoop/lib/junit-4.5.jar:/usr/lib/hadoop/lib/kfs-0.2.2.jar:/usr/lib/hadoop/lib/log4j-1.2.15.jar:/usr/lib/hadoop/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop/lib/oro-2.0.8.jar:/usr/lib/hadoop/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop/lib/slf4j-api-1.6.1.jar:/usr/lib/hadoop/lib/slf4j-log4j12-1.6.1.jar:/usr/lib/hadoop/lib/xmlenc-0.52.jar:/usr/lib/hadoop/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop/lib/jsp-2.1/jsp-api-2.1.jar:/usr/share/mailbox-convertor/lib/*:/usr/lib/hadoop/contrib/capacity-scheduler/hadoop-capacity-scheduler-0.20.2-cdh3u1.jar:/usr/lib/hbase/lib/hadoop-lzo-0.4.13.jar:/usr/lib/hbase/hbase.jar:/etc/hbase/conf:/usr/lib/hbase/lib:/usr/lib/zookeeper/zookeeper.jar:/usr/lib/hadoop/contrib/capacity-scheduler/hadoop-capacity-scheduler-0.20.2-cdh3u1.jar:/usr/lib/hbase/lib/hadoop-lzo-0.4.13.jar:/usr/lib/hbase/hbase.jar:/etc/hbase/conf:/usr/lib/hbase/lib:/usr/lib/zookeeper/zookeeper.jar' org.apache.hadoop.util.RunJar /usr/share/mailbox-convertor/mailbox-convertor-0.1-SNAPSHOT.jar -libjars=/usr/share/mailbox-convertor/lib/antlr-2.7.7.jar,/usr/share/mailbox-convertor/lib/aopalliance-1.0.jar,/usr/share/mailbox-convertor/lib/asm-3.1.jar,/usr/share/mailbox-convertor/lib/backport-util-concurrent-3.1.jar,/usr/share/mailbox-convertor/lib/cglib-2.2.jar,/usr/share/mailbox-convertor/lib/hadoop-ant-3.0-u1.pom,/usr/share/mailbox-convertor/lib/speed4j-0.9.jar,/usr/share/mailbox-convertor/lib/jamm-0.2.2.jar,/usr/share/mailbox-convertor/lib/uuid-3.2.0.jar,/usr/share/mailbox-convertor/lib/high-scale-lib-1.1.1.jar,/usr/share/mailbox-convertor/lib/jsr305-1.3.9.jar,/usr/share/mailbox-convertor/lib/guava-11.0.1.jar,/usr/share/mailbox-convertor/lib/protobuf-java-2.4.0a.jar,/usr/share/mailbox-convertor/lib/concurrentlinkedhashmap-lru-1.1.jar,/usr/share/mailbox-convertor/lib/json-simple-1.1.jar,/usr/share/mailbox-convertor/lib/itext-2.1.5.jar,/usr/share/mailbox-convertor/lib/jmxtools-1.2.1.jar,/usr/share/mailbox-convertor/lib/jersey-client-1.4.jar,/usr/share/mailbox-convertor/lib/jersey-core-1.4.jar,/usr/share/mailbox-convertor/lib/jersey-json-1.4.jar,/usr/share/mailbox-convertor/lib/jersey-server-1.4.jar,/usr/share/mailbox-convertor/lib/jmxri-1.2.1.jar,/usr/share/mailbox-convertor/lib/jaxb-impl-2.1.12.jar,/usr/share/mailbox-convertor/lib/xstream-1.2.2.jar,/usr/share/mailbox-convertor/lib/commons-metrics-1.3.jar,/usr/share/mailbox-convertor/lib/commons-monitoring-2.9.1.jar,/usr/share/mailbox-convertor/lib/
Re: ClassNotFoundException: -libjars not working?
On 28.02.2012 10:58, madhu phatak wrote: Hi, -libjars doesn't always work. A better way is to create a runnable jar with all dependencies (if the number of dependencies is small); otherwise you have to put the jars into Hadoop's lib folder on all machines.

Thanks for the reply Madhu, I adopted the second solution as explained in [1]. From what I found browsing the net, it seems that -libjars is broken in hadoop version 0.18. I haven't had time to check the code yet. The Hadoop sources released by Cloudera are packaged a bit oddly and NetBeans doesn't seem to play well with them, which really affects my will to try to fix the problem.

-libjars is a nice feature that permits the use of skinny jars and would help system admins do better packaging. It also allows better control over the classpath. Too bad it didn't work.

[1] http://www.cloudera.com/blog/2011/01/how-to-include-third-party-libraries-in-your-map-reduce-job/

Cheers,
-- Ioan Eugen Stan http://ieugen.blogspot.com
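For anyone who still wants to try the skinny-jar route, here is a minimal sketch of building the -libjars value from a lib directory instead of typing it by hand. It uses only the JDK; the class name LibJarsArg and the default directory "lib" are made up for illustration. It keeps only *.jar files, since the list in the quoted command also contains a .pom file. Note that -libjars is one of Hadoop's generic options, so it is only honoured when the driver passes its arguments through GenericOptionsParser (for example via ToolRunner), and the usual documented form is "-libjars <comma-separated jars>" with a space. Whether either point explains the failure in this thread is not something the thread establishes.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Hadoop.
final class LibJarsArg {

    /** Returns the *.jar entries directly inside libDir as one comma-separated string. */
    static String build(Path libDir) throws IOException {
        List<String> jars = new ArrayList<>();
        // The glob filters out non-jar files such as .pom descriptors.
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(libDir, "*.jar")) {
            for (Path entry : entries) {
                jars.add(entry.toAbsolutePath().toString());
            }
        }
        Collections.sort(jars); // stable order makes the command line reproducible
        return String.join(",", jars);
    }

    public static void main(String[] args) throws IOException {
        // "lib" is only a default for this sketch; pass the real directory as the first argument.
        Path libDir = Paths.get(args.length > 0 ? args[0] : "lib");
        System.out.println("-libjars " + build(libDir));
    }
}
```

The printed value would go after the main class and before the job's own arguments, assuming the driver parses generic options at all.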
RE: ClassNotFoundException while running quick start guide on Windows.
Hi Drew,

I don't know if this is actually the issue or not, but the output below makes me think you might be passing Cygwin paths into the java.exe launcher. If that's the case, it won't work: java.exe is a pure Windows program and doesn't know about '/cygdrive/c', for example (it also expects the path separator to be a semicolon rather than a colon). Every once in a while when I try to use java.exe from the Cygwin CLI on my Windows box, I get bitten by this.

Sandy

-----Original Message-----
From: Drew Gross [mailto:drew.a.gr...@gmail.com]
Sent: Tuesday, June 21, 2011 21:26
To: common-user@hadoop.apache.org
Subject: Re: ClassNotFoundException while running quick start guide on Windows.

Thanks Jeff, it was a problem with JAVA_HOME. I have another problem now though, I have this:

$JAVA: /cygdrive/c/Program Files/Java/jdk1.6.0_26/bin/java
$JAVA_HEAP_MAX: -Xmx1000m
$HADOOP_OPTS: -Dhadoop.log.dir=C:\Users\Drew Gross\Documents\Projects\discom\hadoop-0.21.0\logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=C:\Users\Drew Gross\Documents\Projects\discom\hadoop-0.21.0\ -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console -Djava.library.path=/cygdrive/c/Users/Drew Gross/Documents/Projects/discom/hadoop-0.21.0/lib/native/ -Dhadoop.policy.file=hadoop-policy.xml
$CLASS: org.apache.hadoop.util.RunJar

Exception in thread "main" java.lang.NoClassDefFoundError: Gross\Documents\Projects\discom\hadoop-0/21/0\logs
Caused by: java.lang.ClassNotFoundException: Gross\Documents\Projects\discom\hadoop-0.21.0\logs
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: Gross\Documents\Projects\discom\hadoop-0.21.0\logs. Program will exit.

(This is with some extra debugging info added by me in bin/hadoop.) It looks like the Windows-style file names are causing problems, especially the spaces. Has anyone encountered this before, and does anyone know how to fix it? I tried escaping the spaces and surrounding the file paths with quotes (not at the same time), but that didn't help.

Drew
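To make Sandy's point about Cygwin paths concrete, here is a rough sketch of the translation java.exe would need: '/cygdrive/<drive>/...' becomes '<DRIVE>:\...', and a colon-separated classpath becomes a semicolon-separated one. The class CygwinPaths is hypothetical and handles only the /cygdrive prefix; in a real script the Cygwin tool cygpath (for example cygpath -w, or cygpath -wp for path lists) is the usual way to do this.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; it only understands the /cygdrive/<letter> form.
final class CygwinPaths {

    /** Converts /cygdrive/c/foo/bar to C:\foo\bar; other paths are returned unchanged. */
    static String toWindows(String path) {
        String prefix = "/cygdrive/";
        if (path.startsWith(prefix) && path.length() > prefix.length()) {
            char drive = Character.toUpperCase(path.charAt(prefix.length()));
            String rest = path.substring(prefix.length() + 1);
            // Only treat the character as a drive letter if it is a whole path segment.
            if (rest.isEmpty() || rest.startsWith("/")) {
                return drive + ":" + (rest.isEmpty() ? "\\" : rest.replace('/', '\\'));
            }
        }
        return path;
    }

    /** Converts a colon-separated Cygwin classpath into a semicolon-separated Windows one. */
    static String classpathToWindows(String cygwinClasspath) {
        List<String> parts = new ArrayList<>();
        for (String entry : cygwinClasspath.split(":")) {
            if (!entry.isEmpty()) {
                parts.add(toWindows(entry));
            }
        }
        return String.join(";", parts);
    }

    public static void main(String[] args) {
        System.out.println(toWindows("/cygdrive/c/Program Files/Java/jdk1.6.0_26/bin/java"));
        System.out.println(classpathToWindows("/cygdrive/c/hadoop/conf:/cygdrive/c/hadoop/lib/a.jar"));
    }
}
```

Path translation alone would not cure the error quoted above: the main class the JVM complains about starts with "Gross\Documents", which is what remains after the unquoted space in "C:\Users\Drew Gross\..." splits the -Dhadoop.log.dir value into two arguments. The value has to reach java as a single quoted argument.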
Re: ClassNotFoundException while running quick start guide on Windows.
I think the jar has an issue where the main class cannot be read from the manifest. Try unpacking the jar, check in META-INF/MANIFEST.MF what the main class is, and then run as follows:

bin/hadoop jar hadoop-*-examples.jar <fully qualified main class> grep input output 'dfs[a-z.]+'

On Thu, Jun 16, 2011 at 10:23 AM, Drew Gross drew.a.gr...@gmail.com wrote:

Hello, I'm trying to run the example from the quick start guide on Windows and I get this error:

$ bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
Exception in thread "main" java.lang.NoClassDefFoundError:
Caused by: java.lang.ClassNotFoundException:
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: . Program will exit.
Exception in thread "main" java.lang.NoClassDefFoundError: Gross\Documents\Projects\discom\hadoop-0/21/0\logs
Caused by: java.lang.ClassNotFoundException: Gross\Documents\Projects\discom\hadoop-0.21.0\logs
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: Gross\Documents\Projects\discom\hadoop-0.21.0\logs. Program will exit.

Does anyone know what I need to change? Thank you.

From, Drew
-- Forget the environment. Print this e-mail immediately. Then burn it.
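The manifest check suggested here can also be done without unpacking the jar. The following is a small sketch using the JDK's java.util.jar API; the class name MainClassReader is invented for this example, and the path to the examples jar has to be supplied by the caller.

```java
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical helper; it only reads the jar, nothing is unpacked or modified.
final class MainClassReader {

    /** Returns the Main-Class attribute of the jar's manifest, or null if there is none. */
    static String mainClassOf(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            Manifest manifest = jar.getManifest(); // null when the jar has no META-INF/MANIFEST.MF
            if (manifest == null) {
                return null;
            }
            return manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java MainClassReader <path to jar>");
            return;
        }
        System.out.println("Main-Class: " + mainClassOf(args[0]));
    }
}
```

If this prints a class name, the manifest is readable and the empty class name in the first error above most likely has another cause.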
Re: ClassNotFoundException while running quick start guide on Windows.
Thanks Jeff, it was a problem with JAVA_HOME. I have another problem now though, I have this:

$JAVA: /cygdrive/c/Program Files/Java/jdk1.6.0_26/bin/java
$JAVA_HEAP_MAX: -Xmx1000m
$HADOOP_OPTS: -Dhadoop.log.dir=C:\Users\Drew Gross\Documents\Projects\discom\hadoop-0.21.0\logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=C:\Users\Drew Gross\Documents\Projects\discom\hadoop-0.21.0\ -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console -Djava.library.path=/cygdrive/c/Users/Drew Gross/Documents/Projects/discom/hadoop-0.21.0/lib/native/ -Dhadoop.policy.file=hadoop-policy.xml
$CLASS: org.apache.hadoop.util.RunJar

Exception in thread "main" java.lang.NoClassDefFoundError: Gross\Documents\Projects\discom\hadoop-0/21/0\logs
Caused by: java.lang.ClassNotFoundException: Gross\Documents\Projects\discom\hadoop-0.21.0\logs
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: Gross\Documents\Projects\discom\hadoop-0.21.0\logs. Program will exit.

(This is with some extra debugging info added by me in bin/hadoop.) It looks like the Windows-style file names are causing problems, especially the spaces. Has anyone encountered this before, and does anyone know how to fix it? I tried escaping the spaces and surrounding the file paths with quotes (not at the same time), but that didn't help.

Drew

On Tue, Jun 21, 2011 at 6:24 AM, madhu phatak phatak@gmail.com wrote:

I think the jar has an issue where the main class cannot be read from the manifest. Try unpacking the jar, check in META-INF/MANIFEST.MF what the main class is, and then run as follows:

bin/hadoop jar hadoop-*-examples.jar <fully qualified main class> grep input output 'dfs[a-z.]+'

-- Forget the environment. Print this e-mail immediately. Then burn it.
Re: ClassNotFoundException
The answer is in your log output:

10/12/31 10:26:54 WARN mapreduce.JobSubmitter: No job jar file set. User classes may not be found. See Job or Job#setJar(String).

Alternatively, use Job.setJarByClass(Class<?> cls).

On Fri, Dec 31, 2010 at 3:02 PM, Cavus,M.,Fa. Post Direkt m.ca...@postdirekt.de wrote: I looked in my jar file and the class is there, but I still get a ClassNotFoundException. Why?:

-- Harsh J www.harshj.com
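In a driver, the call named in that warning is typically a single line such as job.setJarByClass(WordCount.class), placed before the job is submitted. To illustrate the idea behind "jar by class", the sketch below uses only the JDK to map a class back to the jar or directory it was loaded from. JarLocator is a made-up name, and Hadoop's own lookup may well work differently; this shows the concept, not Hadoop's implementation.

```java
import java.net.URL;
import java.security.CodeSource;

// Illustrative only: shows how a class can be traced back to its jar or directory.
final class JarLocator {

    /** Returns where the class was loaded from, or null if the JVM does not record it. */
    static URL locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        // Classes loaded by the bootstrap loader (e.g. java.lang.String) have no code source.
        return source == null ? null : source.getLocation();
    }

    public static void main(String[] args) {
        System.out.println("JarLocator loaded from: " + locationOf(JarLocator.class));
        System.out.println("String loaded from:     " + locationOf(String.class));
    }
}
```

The point of naming a class from the job jar is that the framework can then work out which jar to ship to the task nodes; without it, the tasks start with no user jar and cannot find the mapper class.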
RE: ClassNotFoundException
Could you give me the command, Harsh?

-----Original Message-----
From: Harsh J [mailto:qwertyman...@gmail.com]
Sent: Tuesday, December 28, 2010 5:15 PM
To: common-user@hadoop.apache.org
Subject: Re: ClassNotFoundException

In your job driving class (WordCount as per that command), have you specified the jar by calling the Job.setJarByClass() [or on Stable API, JobConf.setJarByClass()]? I'm not sure if hadoop.util.RunJar automatically sends the jar across for distribution to TaskTrackers.

-- Harsh J www.harshj.com
Re: ClassNotFoundException
Run jar -tvf on the jar file and double-check that the class is listed. It can't be inside an included (nested) jar file.

Sent from my mobile. Please excuse the typos.

On 2010-12-28, at 7:58 AM, Cavus,M.,Fa. Post Direkt m.ca...@postdirekt.de wrote:

Hi, I run this command:

./hadoop jar /home/userme/hd.jar org.postdirekt.hadoop.WordCount gutenberg gutenberberg-output

and get the output below. Why? I do have org.postdirekt.hadoop.Map in the jar file.

10/12/28 15:28:30 INFO mapreduce.Job: Task Id : attempt_201012281524_0002_m_00_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.postdirekt.hadoop.Map
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1128)
    at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:167)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:612)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:328)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
    at org.apache.hadoop.mapred.Child.main(Child.java:211)
Caused by: java.lang.ClassNotFoundException: org.postdirekt.hadoop.Map
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at sun.m
10/12/28 15:28:41 INFO mapreduce.Job: Task Id : attempt_201012281524_0002_m_00_1, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.postdirekt.hadoop.Map
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1128)
    at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:167)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:612)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:328)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
    at org.apache.hadoop.mapred.Child.main(Child.java:211)
Caused by: java.lang.ClassNotFoundException: org.postdirekt.hadoop.Map
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at sun.m
10/12/28 15:28:53 INFO mapreduce.Job: Task Id : attempt_201012281524_0002_m_00_2, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.postdirekt.hadoop.Map
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1128)
    at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:167)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:612)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:328)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
    at org.apache.hadoop.mapred.Child.main(Child.java:211)
Caused by: java.lang.ClassNotFoundException: org.postdirekt.hadoop.Map
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at sun.m
10/12/28 15:29:09 INFO mapreduce.Job: Job complete: job_201012281524_0002
10/12/28 15:29:09 INFO mapreduce.Job: Counters: 7
    Job Counters
        Data-local map tasks=4
        Total time spent by all maps waiting after reserving slots (ms)=0
        Total time spent by all reduces waiting after reserving slots (ms)=0
        Failed map tasks=1
        SLOTS_MILLIS_MAPS=45636
        SLOTS_MILLIS_REDUCES=0
        Launched map tasks=4
RE: ClassNotFoundException
What must I do, James?

-----Original Message-----
From: James Seigel [mailto:ja...@tynt.com]
Sent: Tuesday, December 28, 2010 4:03 PM
To: common-user@hadoop.apache.org
Subject: Re: ClassNotFoundException

Run jar -tvf on the jar file and double-check that the class is listed. It can't be inside an included (nested) jar file.

Sent from my mobile. Please excuse the typos.
Re: ClassNotFoundException
Just run this and make sure you really have the class file in the jar:

jar -tvf <your jar file> | grep org.postdirekt.hadoop.Map

If you don't get any output, then you don't have the class file in your jar.

+ Praveen

On Dec 28, 2010, at 9:12 AM, Cavus,M.,Fa. Post Direkt wrote:

What must I do, James?
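Praveen's jar -tvf check can also be written as a few lines of Java, which makes James's caveat explicit: only top-level entries of the jar are consulted, so a class that lives inside a nested jar will not be found. This is a sketch using the JDK's JarFile API; the class name JarContains is hypothetical.

```java
import java.io.IOException;
import java.util.jar.JarFile;

// Hypothetical helper mirroring "jar -tvf <jar> | grep <class>".
final class JarContains {

    /** Maps org.postdirekt.hadoop.Map to org/postdirekt/hadoop/Map.class. */
    static String entryNameFor(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** True if the jar has a top-level entry for the class; nested jars are not searched. */
    static boolean containsClass(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryNameFor(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java JarContains <path to jar> <fully qualified class name>");
            return;
        }
        System.out.println(containsClass(args[0], args[1]) ? "found" : "not found");
    }
}
```

A "found" result only proves the class is packaged; it does not prove the jar was shipped to the tasks, which is a separate question in this thread.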
RE: ClassNotFoundException
Hi Praveen, I get this 2398 Mon Dec 27 16:19:16 CET org/postdirekt/hadoop/Map.class -Original Message- From: Praveen Bathala [mailto:pbatha...@gmail.com] Sent: Tuesday, December 28, 2010 4:17 PM To: common-user@hadoop.apache.org Subject: Re: ClassNotFoundException Just run this and make sure you really have the class file in jar jar -tvf | grep org.postdirekt.hadoop.Map if you don't get any output, the you don't have the class file in your jar + Praveen On Dec 28, 2010, at 9:12 AM, Cavus,M.,Fa. Post Direkt wrote: What must I do James? -Original Message- From: James Seigel [mailto:ja...@tynt.com] Sent: Tuesday, December 28, 2010 4:03 PM To: common-user@hadoop.apache.org Subject: Re: ClassNotFoundException jar -tvf the jar file and double check that it is a class that is listed. Can't be in an included jar file. Sent from my mobile. Please excuse the typos. On 2010-12-28, at 7:58 AM, Cavus,M.,Fa. Post Direkt m.ca...@postdirekt.de wrote: Hi, I process this command: ./hadoop jar /home/userme/hd.jar org.postdirekt.hadoop.WordCount gutenberg gutenberberg-output and get this why? Because I have org.postdirekt.hadoop.Map in the jar File. 
10/12/28 15:28:30 INFO mapreduce.Job: Task Id : attempt_201012281524_0002_m_00_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.postdirekt.hadoop.Map
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1128)
    at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:167)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:612)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:328)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
    at org.apache.hadoop.mapred.Child.main(Child.java:211)
Caused by: java.lang.ClassNotFoundException: org.postdirekt.hadoop.Map
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at sun.m
10/12/28 15:28:41 INFO mapreduce.Job: Task Id : attempt_201012281524_0002_m_00_1, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.postdirekt.hadoop.Map
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1128)
    at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:167)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:612)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:328)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
    at org.apache.hadoop.mapred.Child.main(Child.java:211)
Caused by: java.lang.ClassNotFoundException: org.postdirekt.hadoop.Map
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at sun.m
10/12/28 15:28:53 INFO mapreduce.Job: Task Id : attempt_201012281524_0002_m_00_2, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.postdirekt.hadoop.Map
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1128)
    at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:167)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:612)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:328)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
    at org.apache.hadoop.mapred.Child.main(Child.java:211)
Caused by: java.lang.ClassNotFoundException: org.postdirekt.hadoop.Map
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native
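For reference, the check suggested in this thread (list the jar and look for the class) can also be done programmatically. The following is only a minimal sketch using the JDK's java.util.jar API; JarClassCheck and its two methods are hypothetical helpers made up for illustration and are not part of Hadoop.

```java
import java.io.IOException;
import java.util.jar.JarFile;

// Hypothetical helper, not part of Hadoop: checks whether a jar directly
// contains the .class file for a given fully qualified class name.
final class JarClassCheck {

    // Converts a class name to its jar entry path, e.g.
    // "org.postdirekt.hadoop.Map" -> "org/postdirekt/hadoop/Map.class"
    static String classEntryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    // True only if the class file is an entry of this jar itself. Classes that
    // sit inside another jar packed into this one are not seen by this check.
    static boolean containsClass(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(classEntryName(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java JarClassCheck /home/userme/hd.jar org.postdirekt.hadoop.Map
        System.out.println(containsClass(args[0], args[1]) ? "found" : "missing");
    }
}
```

Note that "found" only shows the class is in the jar on the submitting machine. The tasks can still fail with ClassNotFoundException if that jar is never shipped to them.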
RE: ClassNotFoundException
I'm using hadoop-0.20.2 and I see this for my map/reduce class:

com/ngc/asoc/recommend/Predict$Counter.class
com/ngc/asoc/recommend/Predict$R.class
com/ngc/asoc/recommend/Predict$M.class
com/ngc/asoc/recommend/Predict.class

I'm a java idiot so I don't know why they appear but perhaps you have similar?

Michael D. Black
Senior Scientist
Advanced Analytics Directorate
Northrop Grumman Information Systems

From: Cavus,M.,Fa. Post Direkt [mailto:m.ca...@postdirekt.de]
Sent: Tue 12/28/2010 9:21 AM
To: common-user@hadoop.apache.org
Subject: EXTERNAL:RE: ClassNotFoundException

Hi Praveen, I get this 2398 Mon Dec 27 16:19:16 CET org/postdirekt/hadoop/Map.class
Re: ClassNotFoundException
In your job driving class (WordCount as per that command), have you specified the jar by calling the Job.setJarByClass() [or on Stable API, JobConf.setJarByClass()]? I'm not sure if hadoop.util.RunJar automatically sends the jar across for distribution to TaskTrackers.

On Tue, Dec 28, 2010 at 8:27 PM, Cavus,M.,Fa. Post Direkt m.ca...@postdirekt.de wrote:

Hi, I process this command:

./hadoop jar /home/userme/hd.jar org.postdirekt.hadoop.WordCount gutenberg gutenberberg-output

--
Harsh J
www.harshj.com
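As far as the Hadoop javadoc describes it, setJarByClass sets the job jar by finding where the given class came from. That idea can be sketched with plain JDK calls. ContainingJar and both method names below are made up for illustration; this is not Hadoop's code, and it skips details such as URL decoding.

```java
import java.net.URL;

// Hypothetical sketch, not Hadoop source: traces a class back to the jar
// file it was loaded from, which is the jar a job would need to ship.
final class ContainingJar {

    // Extracts the jar file path from the path part of a "jar:" URL, e.g.
    // "file:/home/userme/hd.jar!/org/postdirekt/hadoop/Map.class" -> "/home/userme/hd.jar"
    // Simplification: percent-encoded characters (such as %20) are not decoded.
    static String jarPathFromUrlPath(String urlPath) {
        String path = urlPath.startsWith("file:") ? urlPath.substring("file:".length()) : urlPath;
        int bang = path.indexOf('!');
        return bang >= 0 ? path.substring(0, bang) : path;
    }

    // Returns the jar holding the class, or null if it was not loaded from a jar
    // (for example when it was loaded from a plain classes directory).
    static String findContainingJar(Class<?> cls) {
        String resource = cls.getName().replace('.', '/') + ".class";
        ClassLoader loader = cls.getClassLoader();
        URL url = (loader != null)
                ? loader.getResource(resource)
                : ClassLoader.getSystemResource(resource);
        if (url == null || !"jar".equals(url.getProtocol())) {
            return null;
        }
        return jarPathFromUrlPath(url.getPath());
    }

    public static void main(String[] args) {
        System.out.println(findContainingJar(ContainingJar.class));
    }
}
```

If the driver class is loaded from a directory rather than a jar, a lookup like this finds nothing to ship, which is one way a job can start on the client and still fail in the tasks.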
Re: ClassNotFoundException with contrib/join example
Sorry, I should have mentioned that I tried that as well and it also gives an error:

$ p...@hadoop01:~/hadoop_tests$ hadoop jar -libjars ./samplejoin.jar /opt/hadoop-0.20.2/contrib/datajoin/hadoop-0.20.2-datajoin.jar org.apache.hadoop.contrib.utils.join.DataJoinJob datajoin/input datajoin/output Text 1 org.apache.hadoop.contrib.utils.join.SampleDataJoinMapper org.apache.hadoop.contrib.utils.join.SampleDataJoinReducer org.apache.hadoop.contrib.utils.join.SampleTaggedMapOutput Text
Exception in thread "main" java.io.IOException: Error opening job jar: -libjars
    at org.apache.hadoop.util.RunJar.main(RunJar.java:90)
Caused by: java.util.zip.ZipException: error in opening zip file
    at java.util.zip.ZipFile.open(Native Method)
    at java.util.zip.ZipFile.<init>(ZipFile.java:114)
    at java.util.jar.JarFile.<init>(JarFile.java:133)
    at java.util.jar.JarFile.<init>(JarFile.java:70)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:88)

Has something changed or is my environment not set up correctly? Appreciate any help.

On Fri, Mar 26, 2010 at 8:23 PM, Ted Yu yuzhih...@gmail.com wrote:

Then use the syntax given by http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/util/GenericOptionsParser.html :

$ bin/hadoop jar -libjars ./samplejoin.jar /opt/hadoop-0.20.2/contrib/datajoin/hadoop-0.20.2-datajoin.jar org.apache.hadoop.contrib.utils.join.DataJoinJob datajoin/input ...

On Fri, Mar 26, 2010 at 5:10 PM, M B machac...@gmail.com wrote:

Sorry, but where exactly do I include the libjars option? I tried to put it where you stated (after the DataJoinJob class), but it just comes back with usage information (as if the option is not valid):

$ p...@hadoop01:~/hadoop_tests$ hadoop jar /opt/hadoop-0.20.2/contrib/datajoin/hadoop-0.20.2-datajoin.jar org.apache.hadoop.contrib.utils.join.DataJoinJob -libjars ./samplejoin.jar datajoin/input datajoin/output Text 1 org.apache.hadoop.contrib.utils.join.SampleDataJoinMapper org.apache.hadoop.contrib.utils.join.SampleDataJoinReducer org.apache.hadoop.contrib.utils.join.SampleTaggedMapOutput Text
*usage: DataJoinJob inputdirs outputdir map_input_file_format numofParts mapper_class reducer_class map_output_value_class output_value_class [maxNumOfValuesPerGroup [descriptionOfJob]]]*

It seems like it's not taking the option for some reason, like it's failing an argument check in DataJoinJob - does that not use the standard args or something?

On Fri, Mar 26, 2010 at 4:38 PM, Ted Yu yuzhih...@gmail.com wrote:

DataJoinJob is contained in hadoop-0.20.2-datajoin.jar which is in your HADOOP_CLASSPATH.

I think you should specify samplejoin.jar using -libjars instead of putting it directly after jar command:

hadoop jar hadoop-0.20.2-datajoin.jar org.apache.hadoop.contrib.utils.join.DataJoinJob -libjars ./samplejoin.jar ... (same as your example)

Cheers

On Fri, Mar 26, 2010 at 3:24 PM, M B machac...@gmail.com wrote:

I may be having a setup issue with classpaths, would appreciate some help.

I created a jar with all the Sample* classes in contrib/DataJoin. Here is the listing of my samplejoin.jar file:

zip.vim version v22
Browsing zipfile /home/hadoop/hadoop_tests/samplejoin.jar
Select a file with cursor and press ENTER

META-INF/
META-INF/MANIFEST.MF
org/
org/apache/
org/apache/hadoop/
org/apache/hadoop/contrib/
org/apache/hadoop/contrib/utils/
org/apache/hadoop/contrib/utils/join/
org/apache/hadoop/contrib/utils/join/SampleDataJoinReducer.class
org/apache/hadoop/contrib/utils/join/SampleTaggedMapOutput.class
org/apache/hadoop/contrib/utils/join/SampleDataJoinMapper.class

When I go to run this, things start to run, but every Map try errors out with:

java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.contrib.utils.join.SampleTaggedMapOutput

Here is the command:

hadoop jar ./samplejoin.jar org.apache.hadoop.contrib.utils.join.DataJoinJob datajoin/input datajoin/output Text 1 org.apache.hadoop.contrib.utils.join.SampleDataJoinMapper org.apache.hadoop.contrib.utils.join.SampleDataJoinReducer org.apache.hadoop.contrib.utils.join.SampleTaggedMapOutput Text

This is a new install of 0.20.2. HADOOP_CLASSPATH is set to: /opt/hadoop-0.20.2/contrib/datajoin/hadoop-0.20.2-datajoin.jar

Any help would be appreciated.
RE: ClassNotFoundException with contrib/join example
M B,

I'm not sure about the -libjars argument but 'hadoop jar' is expecting the jarfile immediately afterwards:

hadoop jar jarFile [mainClass] args...

Nick Jones
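The positional layout "hadoop jar jarFile [mainClass] args..." explains the "Error opening job jar: -libjars" message seen in this thread: whatever comes first is taken as the jar file name. Below is a minimal sketch of that behaviour. RunJarArgs is a hypothetical class written for illustration, not the real org.apache.hadoop.util.RunJar, and it assumes the jar manifest names no Main-Class, so the second argument is always read as the main class.

```java
import java.util.Arrays;

// Hypothetical sketch of the positional layout "hadoop jar jarFile [mainClass] args...".
final class RunJarArgs {
    final String jarFile;
    final String mainClass;
    final String[] rest;

    RunJarArgs(String jarFile, String mainClass, String[] rest) {
        this.jarFile = jarFile;
        this.mainClass = mainClass;
        this.rest = rest;
    }

    // Splits the arguments purely by position: first the jar, then the main
    // class, then everything else, which is handed to the main class untouched.
    static RunJarArgs parse(String[] args) {
        if (args.length < 2) {
            throw new IllegalArgumentException("usage: jarFile mainClass args...");
        }
        return new RunJarArgs(args[0], args[1], Arrays.copyOfRange(args, 2, args.length));
    }

    public static void main(String[] unused) {
        RunJarArgs parsed = parse(new String[] {
            "-libjars", "./samplejoin.jar", "hadoop-0.20.2-datajoin.jar"
        });
        // prints "job jar: -libjars"
        System.out.println("job jar: " + parsed.jarFile);
    }
}
```

Under this layout, an option such as -libjars can only go after the main class, and it then has an effect only if that class passes its arguments through GenericOptionsParser, for example by implementing Tool.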
Re: ClassNotFoundException with contrib/join example
Right, that was the first option I tried and it fails there as well.

Maybe I need to step back and ask a higher-level question - does anyone have a full, step-by-step example of using a reduce-side join in an M/R job? Preferably using the contrib/DataJoin classes, but I'll be happy with whatever example I could get. I'd love to see the actual code and then how it's kicked off on the command line so I can try it on my end as a prototype. I must be doing something wrong, but don't know what it is.

Thanks.
Re: ClassNotFoundException with contrib/join example
I can run the sample (I created the input files according to contrib/data_join/src/examples/org/apache/hadoop/contrib/utils/join/README.txt):

[r...@tyu-linux datajoin]# pwd
/opt/ks/hadoop-0.20.2/build/contrib/datajoin
[r...@tyu-linux datajoin]# /opt/ks/hadoop-0.20.2/bin/hadoop jar hadoop-0.20.2-datajoin-examples.jar org.apache.hadoop.contrib.utils.join.DataJoinJob input output Text 1 org.apache.hadoop.contrib.utils.join.SampleDataJoinMapper org.apache.hadoop.contrib.utils.join.SampleDataJoinReducer org.apache.hadoop.contrib.utils.join.SampleTaggedMapOutput Text
Using TextInputFormat: Text
Using TextOutputFormat: Text
10/03/29 09:01:30 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
10/03/29 09:01:30 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
10/03/29 09:01:30 INFO mapred.FileInputFormat: Total input paths to process : 2
Job job_local_0001 is submitted
Job job_local_0001 is still running.
10/03/29 09:01:30 INFO mapred.FileInputFormat: Total input paths to process : 2
10/03/29 09:01:31 INFO mapred.MapTask: numReduceTasks: 1
10/03/29 09:01:31 INFO mapred.MapTask: io.sort.mb = 100
10/03/29 09:01:31 INFO mapred.MapTask: data buffer = 79691776/99614720
10/03/29 09:01:31 INFO mapred.MapTask: record buffer = 262144/327680
10/03/29 09:01:31 INFO mapred.MapTask: Starting flush of map output
10/03/29 09:01:31 INFO mapred.MapTask: Finished spill 0
10/03/29 09:01:32 INFO mapred.TaskRunner: Task:attempt_local_0001_m_00_0 is done. And is in the process of commiting
10/03/29 09:01:32 INFO mapred.LocalJobRunner: collectedCount6 totalCount 6
10/03/29 09:01:32 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_00_0' done.
10/03/29 09:01:32 INFO mapred.MapTask: numReduceTasks: 1
10/03/29 09:01:32 INFO mapred.MapTask: io.sort.mb = 100
10/03/29 09:01:32 INFO mapred.MapTask: data buffer = 79691776/99614720
10/03/29 09:01:32 INFO mapred.MapTask: record buffer = 262144/327680
10/03/29 09:01:32 INFO mapred.MapTask: Starting flush of map output
10/03/29 09:01:32 INFO mapred.MapTask: Finished spill 0
10/03/29 09:01:32 INFO mapred.TaskRunner: Task:attempt_local_0001_m_01_0 is done. And is in the process of commiting
10/03/29 09:01:32 INFO mapred.LocalJobRunner: collectedCount5 totalCount 5
10/03/29 09:01:32 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_01_0' done.
10/03/29 09:01:32 INFO mapred.LocalJobRunner:
10/03/29 09:01:32 INFO mapred.Merger: Merging 2 sorted segments
10/03/29 09:01:32 INFO mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 939 bytes
10/03/29 09:01:32 INFO mapred.LocalJobRunner:
10/03/29 09:01:32 INFO util.NativeCodeLoader: Loaded the native-hadoop library
10/03/29 09:01:32 INFO zlib.ZlibFactory: Successfully loaded initialized native-zlib library
10/03/29 09:01:32 INFO datajoin.job: key: A.a11 this.largestNumOfValues: 3
10/03/29 09:01:32 INFO mapred.TaskRunner: Task:attempt_local_0001_r_00_0 is done. And is in the process of commiting
10/03/29 09:01:32 INFO mapred.LocalJobRunner:
10/03/29 09:01:32 INFO mapred.TaskRunner: Task attempt_local_0001_r_00_0 is allowed to commit now
10/03/29 09:01:32 INFO mapred.FileOutputCommitter: Saved output of task 'attempt_local_0001_r_00_0' to file:/opt/kindsight/hadoop-0.20.2/build/contrib/datajoin/output
10/03/29 09:01:32 INFO mapred.LocalJobRunner: actuallyCollectedCount5 collectedCount 7 groupCount 6 reduce
10/03/29 09:01:32 INFO mapred.TaskRunner: Task 'attempt_local_0001_r_00_0' done.
[r...@tyu-linux datajoin]# date
Mon Mar 29 09:02:37 PDT 2010

It took a minute between the last INFO log and exit of DataJoinJob.
Cheers
Re: ClassNotFoundException with contrib/join example
I don't see hadoop-0.20.2-datajoin-examples.jar in the build/contrib/datajoin directory. Is that a jar you created separately? I tried creating one, but it still doesn't run (the mappers show the same error of missing the classes).

had...@hadoop01:/opt/hadoop-0.20.2/build/contrib/datajoin$ ls
classes examples test
Re: ClassNotFoundException with contrib/join example
Under hadoop-0.20.2/src/contrib/data_join, run

ant jar-examples

You may need to rename the jars (hadoop-${version}-datajoin-examples.jar):

[r...@tyu-linux datajoin]# ls
classes examples hadoop-0.20.2-datajoin-examples.jar hadoop-0.20.2-datajoin.jar input output test
Re: ClassNotFoundException with contrib/join example
ah, thanks, that got it. now I'm at the same point you are - part-0.deflate is there and is not readable. Seems like I should see text output, right?

On Mon, Mar 29, 2010 at 2:04 PM, Ted Yu yuzhih...@gmail.com wrote:

Under hadoop-0.20.2/src/contrib/data_join, run
ant jar-examples

You may need to rename the jars (hadoop-${version}-datajoin-examples.jar):

[r...@tyu-linux datajoin]# ls
classes examples hadoop-0.20.2-datajoin-examples.jar hadoop-0.20.2-datajoin.jar input output test

On Mon, Mar 29, 2010 at 1:59 PM, M B machac...@gmail.com wrote:

I don't see hadoop-0.20.2-datajoin-examples.jar in the build/contrib/datajoin directory. Is that a jar you created separately? I tried creating one, but it still doesn't run (the mappers show the same error of missing the classes).

had...@hadoop01:/opt/hadoop-0.20.2/build/contrib/datajoin$ ls
classes examples test

On Mon, Mar 29, 2010 at 9:26 AM, Ted Yu yuzhih...@gmail.com wrote:

I can run the sample (I created the input files according to contrib/data_join/src/examples/org/apache/hadoop/contrib/utils/join/README.txt):

[r...@tyu-linux datajoin]# pwd
/opt/ks/hadoop-0.20.2/build/contrib/datajoin
[r...@tyu-linux datajoin]# /opt/ks/hadoop-0.20.2/bin/hadoop jar hadoop-0.20.2-datajoin-examples.jar org.apache.hadoop.contrib.utils.join.DataJoinJob input output Text 1 org.apache.hadoop.contrib.utils.join.SampleDataJoinMapper org.apache.hadoop.contrib.utils.join.SampleDataJoinReducer org.apache.hadoop.contrib.utils.join.SampleTaggedMapOutput Text
Using TextInputFormat: Text
Using TextOutputFormat: Text
10/03/29 09:01:30 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
10/03/29 09:01:30 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
10/03/29 09:01:30 INFO mapred.FileInputFormat: Total input paths to process : 2
Job job_local_0001 is submitted
Job job_local_0001 is still running.
10/03/29 09:01:30 INFO mapred.FileInputFormat: Total input paths to process : 2
10/03/29 09:01:31 INFO mapred.MapTask: numReduceTasks: 1
10/03/29 09:01:31 INFO mapred.MapTask: io.sort.mb = 100
10/03/29 09:01:31 INFO mapred.MapTask: data buffer = 79691776/99614720
10/03/29 09:01:31 INFO mapred.MapTask: record buffer = 262144/327680
10/03/29 09:01:31 INFO mapred.MapTask: Starting flush of map output
10/03/29 09:01:31 INFO mapred.MapTask: Finished spill 0
10/03/29 09:01:32 INFO mapred.TaskRunner: Task:attempt_local_0001_m_00_0 is done. And is in the process of commiting
10/03/29 09:01:32 INFO mapred.LocalJobRunner: collectedCount6 totalCount 6
10/03/29 09:01:32 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_00_0' done.
10/03/29 09:01:32 INFO mapred.MapTask: numReduceTasks: 1
10/03/29 09:01:32 INFO mapred.MapTask: io.sort.mb = 100
10/03/29 09:01:32 INFO mapred.MapTask: data buffer = 79691776/99614720
10/03/29 09:01:32 INFO mapred.MapTask: record buffer = 262144/327680
10/03/29 09:01:32 INFO mapred.MapTask: Starting flush of map output
10/03/29 09:01:32 INFO mapred.MapTask: Finished spill 0
10/03/29 09:01:32 INFO mapred.TaskRunner: Task:attempt_local_0001_m_01_0 is done. And is in the process of commiting
10/03/29 09:01:32 INFO mapred.LocalJobRunner: collectedCount5 totalCount 5
10/03/29 09:01:32 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_01_0' done.
10/03/29 09:01:32 INFO mapred.LocalJobRunner:
10/03/29 09:01:32 INFO mapred.Merger: Merging 2 sorted segments
10/03/29 09:01:32 INFO mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 939 bytes
10/03/29 09:01:32 INFO mapred.LocalJobRunner:
10/03/29 09:01:32 INFO util.NativeCodeLoader: Loaded the native-hadoop library
10/03/29 09:01:32 INFO zlib.ZlibFactory: Successfully loaded initialized native-zlib library
10/03/29 09:01:32 INFO datajoin.job: key: A.a11 this.largestNumOfValues: 3
10/03/29 09:01:32 INFO mapred.TaskRunner: Task:attempt_local_0001_r_00_0 is done. And is in the process of commiting
10/03/29 09:01:32 INFO mapred.LocalJobRunner:
10/03/29 09:01:32 INFO mapred.TaskRunner: Task attempt_local_0001_r_00_0 is allowed to commit now
10/03/29 09:01:32 INFO mapred.FileOutputCommitter: Saved output of task 'attempt_local_0001_r_00_0' to file:/opt/kindsight/hadoop-0.20.2/build/contrib/datajoin/output
10/03/29 09:01:32 INFO mapred.LocalJobRunner: actuallyCollectedCount 5 collectedCount 7 groupCount 6 reduce
10/03/29 09:01:32 INFO mapred.TaskRunner: Task 'attempt_local_0001_r_00_0' done.
[r...@tyu-linux datajoin]# date
Mon Mar 29 09:02:37 PDT 2010

It took a minute between the last INFO log and exit of DataJoinJob.

Cheers

On Mon, Mar
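On the unreadable part file: a .deflate suffix suggests the job output was written compressed, which would explain why it does not look like text. A minimal sketch for peeking at such a file, assuming it is a zlib-format stream (an assumption about the codec, not something confirmed in this thread) and that it has been copied to the local filesystem. The class name DeflatePeek and its methods are made up for illustration and are not part of Hadoop:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.InflaterInputStream;

// Hypothetical helper (not part of Hadoop): inflates a zlib-format stream to text.
final class DeflatePeek {

    // Reads the whole compressed stream and returns the decompressed bytes as UTF-8 text.
    static String inflateToString(InputStream compressed) throws IOException {
        try (InflaterInputStream in = new InflaterInputStream(compressed)) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0]: local path to a copy of the compressed part file
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.print(inflateToString(in));
        }
    }
}
```

If the stream turns out not to be zlib-format, InflaterInputStream will throw a ZipException, which at least tells you the assumption was wrong.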
Re: ClassNotFoundException with contrib/join example
DataJoinJob is contained in hadoop-0.20.2-datajoin.jar which is in your HADOOP_CLASSPATH

I think you should specify samplejoin.jar using -libjars instead of putting it directly after jar command:

hadoop jar hadoop-0.20.2-datajoin.jar org.apache.hadoop.contrib.utils.join.DataJoinJob -libjars ./samplejoin.jar ... (same as your example)

Cheers

On Fri, Mar 26, 2010 at 3:24 PM, M B machac...@gmail.com wrote:

I may be having a setup issue with classpaths, would appreciate some help.

I created a jar with all the Sample* classes in contrib/DataJoin. Here is the listing of my samplejoin.jar file:

zip.vim version v22
Browsing zipfile /home/hadoop/hadoop_tests/samplejoin.jar
Select a file with cursor and press ENTER

META-INF/
META-INF/MANIFEST.MF
org/
org/apache/
org/apache/hadoop/
org/apache/hadoop/contrib/
org/apache/hadoop/contrib/utils/
org/apache/hadoop/contrib/utils/join/
org/apache/hadoop/contrib/utils/join/SampleDataJoinReducer.class
org/apache/hadoop/contrib/utils/join/SampleTaggedMapOutput.class
org/apache/hadoop/contrib/utils/join/SampleDataJoinMapper.class

When I go to run this, things start to run, but every Map try errors out with:

java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.contrib.utils.join.SampleTaggedMapOutput

Here is the command:

hadoop jar ./samplejoin.jar org.apache.hadoop.contrib.utils.join.DataJoinJob datajoin/input datajoin/output Text 1 org.apache.hadoop.contrib.utils.join.SampleDataJoinMapper org.apache.hadoop.contrib.utils.join.SampleDataJoinReducer org.apache.hadoop.contrib.utils.join.SampleTaggedMapOutput Text

This is a new install of 0.20.2. HADOOP_CLASSPATH is set to: /opt/hadoop-0.20.2/contrib/datajoin/hadoop-0.20.2-datajoin.jar

Any help would be appreciated.
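The jar listing quoted above already shows SampleTaggedMapOutput.class inside samplejoin.jar, so the contents look fine and the open question is whether the jar reaches the map tasks. The contents check itself can be scripted. A minimal sketch, using a hypothetical JarClassCheck class (illustrative only, not part of Hadoop) built on java.util.jar.JarFile:

```java
import java.io.IOException;
import java.util.jar.JarFile;

// Hypothetical diagnostic (not part of Hadoop): checks whether a jar holds a given class.
final class JarClassCheck {

    // Converts a fully qualified class name to its entry path inside a jar.
    static String entryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Returns true if the jar at jarPath contains an entry for className.
    static boolean containsClass(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryName(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to a jar, args[1]: fully qualified class name
        System.out.println(containsClass(args[0], args[1]));
    }
}
```

Run against ./samplejoin.jar and org.apache.hadoop.contrib.utils.join.SampleTaggedMapOutput, it only answers whether the class is packaged; it says nothing about whether the task JVMs ever see that jar.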
Re: ClassNotFoundException with contrib/join example
Then use the syntax given by http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/util/GenericOptionsParser.html :

$ bin/hadoop jar -libjars ./samplejoin.jar /opt/hadoop-0.20.2/contrib/datajoin/hadoop-0.20.2-datajoin.jar org.apache.hadoop.contrib.utils.join.DataJoinJob datajoin/input ...

On Fri, Mar 26, 2010 at 5:10 PM, M B machac...@gmail.com wrote:

Sorry, but where exactly do I include the libjars option? I tried to put it where you stated (after the DataJoinJob class), but it just comes back with usage information (as if the option is not valid):

$ p...@hadoop01:~/hadoop_tests$ hadoop jar /opt/hadoop-0.20.2/contrib/datajoin/hadoop-0.20.2-datajoin.jar org.apache.hadoop.contrib.utils.join.DataJoinJob -libjars ./samplejoin.jar datajoin/input datajoin/output Text 1 org.apache.hadoop.contrib.utils.join.SampleDataJoinMapper org.apache.hadoop.contrib.utils.join.SampleDataJoinReducer org.apache.hadoop.contrib.utils.join.SampleTaggedMapOutput Text
*usage: DataJoinJob inputdirs outputdir map_input_file_format numofParts mapper_class reducer_class map_output_value_class output_value_class [maxNumOfValuesPerGroup [descriptionOfJob]]]*

It seems like it's not taking the option for some reason, like it's failing an argument check in DataJoinJob - does that not use the standard args or something?

On Fri, Mar 26, 2010 at 4:38 PM, Ted Yu yuzhih...@gmail.com wrote:

DataJoinJob is contained in hadoop-0.20.2-datajoin.jar which is in your HADOOP_CLASSPATH

I think you should specify samplejoin.jar using -libjars instead of putting it directly after jar command:

hadoop jar hadoop-0.20.2-datajoin.jar org.apache.hadoop.contrib.utils.join.DataJoinJob -libjars ./samplejoin.jar ... (same as your example)

Cheers

On Fri, Mar 26, 2010 at 3:24 PM, M B machac...@gmail.com wrote:

I may be having a setup issue with classpaths, would appreciate some help.

I created a jar with all the Sample* classes in contrib/DataJoin. Here is the listing of my samplejoin.jar file:

zip.vim version v22
Browsing zipfile /home/hadoop/hadoop_tests/samplejoin.jar
Select a file with cursor and press ENTER

META-INF/
META-INF/MANIFEST.MF
org/
org/apache/
org/apache/hadoop/
org/apache/hadoop/contrib/
org/apache/hadoop/contrib/utils/
org/apache/hadoop/contrib/utils/join/
org/apache/hadoop/contrib/utils/join/SampleDataJoinReducer.class
org/apache/hadoop/contrib/utils/join/SampleTaggedMapOutput.class
org/apache/hadoop/contrib/utils/join/SampleDataJoinMapper.class

When I go to run this, things start to run, but every Map try errors out with:

java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.contrib.utils.join.SampleTaggedMapOutput

Here is the command:

hadoop jar ./samplejoin.jar org.apache.hadoop.contrib.utils.join.DataJoinJob datajoin/input datajoin/output Text 1 org.apache.hadoop.contrib.utils.join.SampleDataJoinMapper org.apache.hadoop.contrib.utils.join.SampleDataJoinReducer org.apache.hadoop.contrib.utils.join.SampleTaggedMapOutput Text

This is a new install of 0.20.2. HADOOP_CLASSPATH is set to: /opt/hadoop-0.20.2/contrib/datajoin/hadoop-0.20.2-datajoin.jar

Any help would be appreciated.
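The usage message quoted above suggests that DataJoinJob treated -libjars as one of its own positional arguments. Generic options such as -libjars only take effect when the driver runs an option-parsing step before it reads its positional arguments. A minimal sketch of that separation step, using a hypothetical LibJarsArgs class that stands in for the idea and is not Hadoop's GenericOptionsParser:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: a hypothetical stand-in, not Hadoop's GenericOptionsParser.
final class LibJarsArgs {
    final List<String> libJars = new ArrayList<>();
    final List<String> remaining = new ArrayList<>();

    // Consumes leading "-libjars <comma-separated list>" pairs; everything else stays positional.
    static LibJarsArgs parse(String[] args) {
        LibJarsArgs parsed = new LibJarsArgs();
        int i = 0;
        while (i + 1 < args.length && args[i].equals("-libjars")) {
            parsed.libJars.addAll(Arrays.asList(args[i + 1].split(",")));
            i += 2;
        }
        parsed.remaining.addAll(Arrays.asList(args).subList(i, args.length));
        return parsed;
    }

    public static void main(String[] args) {
        LibJarsArgs parsed = parse(args);
        System.out.println("libjars:    " + parsed.libJars);
        System.out.println("positional: " + parsed.remaining);
    }
}
```

A driver that skips a step like this hands "-libjars" and "./samplejoin.jar" straight to its positional-argument check, which would be consistent with the usage output shown in the quoted message.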