ClassNotFoundException
Hi, I run this command:

    ./hadoop jar /home/userme/hd.jar org.postdirekt.hadoop.WordCount gutenberg gutenberberg-output

and get the failure below. Why? I do have org.postdirekt.hadoop.Map in the jar file.

    10/12/28 15:28:30 INFO mapreduce.Job: Task Id : attempt_201012281524_0002_m_00_0, Status : FAILED
    java.lang.RuntimeException: java.lang.ClassNotFoundException: org.postdirekt.hadoop.Map
            at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1128)
            at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:167)
            at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:612)
            at org.apache.hadoop.mapred.MapTask.run(MapTask.java:328)
            at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
            at java.security.AccessController.doPrivileged(Native Method)
            at javax.security.auth.Subject.doAs(Subject.java:396)
            at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
            at org.apache.hadoop.mapred.Child.main(Child.java:211)
    Caused by: java.lang.ClassNotFoundException: org.postdirekt.hadoop.Map
            at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
            at java.security.AccessController.doPrivileged(Native Method)
            at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
            at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
            at sun.m

Attempts _1 (15:28:41) and _2 (15:28:53) failed with the identical stack trace. The job then finished:

    10/12/28 15:29:09 INFO mapreduce.Job: Job complete: job_201012281524_0002
    10/12/28 15:29:09 INFO mapreduce.Job: Counters: 7
            Job Counters
                    Data-local map tasks=4
                    Total time spent by all maps waiting after reserving slots (ms)=0
                    Total time spent by all reduces waiting after reserving slots (ms)=0
                    Failed map tasks=1
                    SLOTS_MILLIS_MAPS=45636
                    SLOTS_MILLIS_REDUCES=0
                    Launched map tasks=4
Re: ClassNotFoundException
Run jar -tvf on the jar file and double-check that the class is listed there. It can't be inside an included (nested) jar file.

Sent from my mobile. Please excuse the typos.

On 2010-12-28, at 7:58 AM, Cavus,M.,Fa. Post Direkt m.ca...@postdirekt.de wrote:

Hi, I process this command: ./hadoop jar /home/userme/hd.jar org.postdirekt.hadoop.WordCount gutenberg gutenberberg-output and get this, why? Because I have org.postdirekt.hadoop.Map in the jar file. [stack traces snipped; identical to the message above]
RE: ClassNotFoundException
What must I do, James?

-----Original Message-----
From: James Seigel [mailto:ja...@tynt.com]
Sent: Tuesday, December 28, 2010 4:03 PM
To: common-user@hadoop.apache.org
Subject: Re: ClassNotFoundException

Run jar -tvf on the jar file and double-check that the class is listed there. It can't be inside an included jar file.

On 2010-12-28, at 7:58 AM, Cavus,M.,Fa. Post Direkt m.ca...@postdirekt.de wrote: [original message and stack traces snipped]
Re: ClassNotFoundException
Just run this and make sure you really have the class file in the jar:

    jar -tvf /home/userme/hd.jar | grep org.postdirekt.hadoop.Map

If you don't get any output, then you don't have the class file in your jar.

+ Praveen

On Dec 28, 2010, at 9:12 AM, Cavus,M.,Fa. Post Direkt wrote: [earlier messages snipped]
RE: ClassNotFoundException
Hi Praveen, I get this:

    2398 Mon Dec 27 16:19:16 CET org/postdirekt/hadoop/Map.class

-----Original Message-----
From: Praveen Bathala [mailto:pbatha...@gmail.com]
Sent: Tuesday, December 28, 2010 4:17 PM
To: common-user@hadoop.apache.org
Subject: Re: ClassNotFoundException

Just run this and make sure you really have the class file in the jar: jar -tvf | grep org.postdirekt.hadoop.Map [earlier messages snipped]
RE: ClassNotFoundException
I'm using hadoop-0.20.2 and I see this for my map/reduce classes:

    com/ngc/asoc/recommend/Predict$Counter.class
    com/ngc/asoc/recommend/Predict$R.class
    com/ngc/asoc/recommend/Predict$M.class
    com/ngc/asoc/recommend/Predict.class

I'm a java idiot so I don't know why they appear, but perhaps you have similar?

Michael D. Black
Senior Scientist
Advanced Analytics Directorate
Northrop Grumman Information Systems

From: Cavus,M.,Fa. Post Direkt [mailto:m.ca...@postdirekt.de]
Sent: Tue 12/28/2010 9:21 AM
To: common-user@hadoop.apache.org
Subject: EXTERNAL:RE: ClassNotFoundException

Hi Praveen, I get this: 2398 Mon Dec 27 16:19:16 CET org/postdirekt/hadoop/Map.class
Re: ClassNotFoundException
In your job driving class (WordCount, as per that command), have you specified the jar by calling Job.setJarByClass() [or, on the stable API, JobConf.setJarByClass()]? I'm not sure if hadoop.util.RunJar automatically sends the jar across for distribution to the TaskTrackers.

On Tue, Dec 28, 2010 at 8:27 PM, Cavus,M.,Fa. Post Direkt m.ca...@postdirekt.de wrote:

Hi, I process this command: ./hadoop jar /home/userme/hd.jar org.postdirekt.hadoop.WordCount gutenberg gutenberberg-output [rest snipped]

--
Harsh J
www.harshj.com
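For reference, a minimal new-API driver along the lines Harsh suggests might look like the sketch below. The package and the Map class name come from the thread; the Reduce class and the key/value types are assumptions for a word count:

    package org.postdirekt.hadoop;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {
      public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "wordcount");
        // Ship the jar that contains this class to the TaskTrackers.
        // Without this call the child JVMs may have no way to locate
        // org.postdirekt.hadoop.Map and fail with ClassNotFoundException.
        job.setJarByClass(WordCount.class);
        job.setMapperClass(Map.class);           // org.postdirekt.hadoop.Map, per the thread
        job.setReducerClass(Reduce.class);       // reducer class name is an assumption
        job.setOutputKeyClass(Text.class);       // key/value types assumed for word count
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }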
Re: where are the cloudera hbase rpm's?
Great, thank you.

Mark

On Tue, Dec 28, 2010 at 12:18 PM, Eric Sammer esam...@cloudera.com wrote:

Mark: For Cloudera / CDH specific questions, please use the cdh-user list at https://groups.google.com/a/cloudera.org/group/cdh-user/topics

Thanks.

On Tue, Dec 28, 2010 at 1:13 PM, Mark Kerzner markkerz...@gmail.com wrote:

1) On each server, install the core HBase RPMs: hbase, hbase-native, hbase-master, hbase-regionserver, hbase-zookeeper, hbase-conf-pseudo, hbase-docs.

I do this:

    yum list | grep cloudera | grep hbase

and nothing happens. But I do have other packages from Cloudera:

    yum list | grep cloudera
    cloudera-desktop.i386                    0.3.0-1            cloudera-cdh2
    cloudera-desktop.x86_64                  0.3.0-1            cloudera-cdh2
    cloudera-desktop-plugins.noarch          0.3.0-1            cloudera-cdh2
    hadoop-0.18.noarch                       0.18.3+76.2-1      cloudera-cdh2
    hadoop-0.18-conf-pseudo.noarch           0.18.3+76.2-1      cloudera-cdh2
    hadoop-0.18-datanode.noarch              0.18.3+76.2-1      cloudera-cdh2
    hadoop-0.18-debuginfo.i386               0.18.3+71-1        cloudera-cdh2
    hadoop-0.18-docs.noarch                  0.18.3+76.2-1      cloudera-cdh2
    hadoop-0.18-jobtracker.noarch            0.18.3+76.2-1      cloudera-cdh2
    hadoop-0.18-libhdfs.i386                 0.18.3+76.2-1      cloudera-cdh2
    hadoop-0.18-libhdfs.x86_64               0.18.3+76.2-1      cloudera-cdh2
    hadoop-0.18-namenode.noarch              0.18.3+76.2-1      cloudera-cdh2
    hadoop-0.18-native.i386                  0.18.3+76.2-1      cloudera-cdh2
    hadoop-0.18-native.x86_64                0.18.3+76.2-1      cloudera-cdh2
    hadoop-0.18-pipes.i386                   0.18.3+76.2-1      cloudera-cdh2
    hadoop-0.18-pipes.x86_64                 0.18.3+76.2-1      cloudera-cdh2
    hadoop-0.18-secondarynamenode.noarch     0.18.3+76.2-1      cloudera-cdh2
    hadoop-0.18-source.noarch                0.18.3+76.2-1      cloudera-cdh2
    hadoop-0.18-tasktracker.noarch           0.18.3+76.2-1      cloudera-cdh2
    hadoop-0.20-conf-pseudo.noarch           0.20.1+169.113-1   cloudera-cdh2
    hadoop-0.20-conf-pseudo-desktop.noarch   0.3.0-1            cloudera-cdh2
    hadoop-0.20-datanode.noarch              0.20.1+169.113-1   cloudera-cdh2
    hadoop-0.20-debuginfo.i386               0.20.1+169.113-1   cloudera-cdh2
    hadoop-0.20-debuginfo.x86_64             0.20.1+169.113-1   cloudera-cdh2

Thank you

--
Eric Sammer
twitter: esammer
data: www.cloudera.com
Re: Hadoop RPC call response post processing
Hi Ted,

I don't think the problem is allocation but garbage collection. When the gc kicks in, everything freezes. Of course, changing the gc algorithm helps a little.

Stefan

On Dec 27, 2010, at 11:21 PM, Ted Dunning wrote:

I would be very surprised if allocation itself is the problem, as opposed to good old fashioned excess copying. It is very hard to write an allocator faster than the java generational gc, especially if you are talking about objects that are ephemeral. Have you looked at the tenuring distribution?

On Mon, Dec 27, 2010 at 8:07 PM, Stefan Groschupf s...@101tec.com wrote:

Hi All,

I've been browsing the RPC code for quite a while now, trying to find any entry point / interceptor slot that allows me to handle an RPC call response writable after it has been sent over the wire. Does anybody have an idea how to break into the RPC code from outside? All the interesting methods are private. :(

Background: Heavy use of the RPC allocates a huge number of Writable objects. We saw in multiple systems that the garbage collector can get so busy that the jvm almost freezes for seconds. Things like zookeeper sessions time out in those cases. My idea is to create an object pool for writables. Borrowing an object from the pool is simple, since that happens in our custom code, but we also need to know when the writable has been sent over the wire so it can be returned to the pool. A dirty hack would be to override the write(out) method in the writable, assuming that is the last thing done with the writable, but it turns out that this method is called in other cases too, e.g. to measure throughput.

Any ideas?

Thanks,
Stefan
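As an illustration of the pooling idea Stefan describes, a hypothetical helper class (not Hadoop code) might look like this; where release() could safely be called from inside the RPC server is exactly the open problem in the thread:

    import java.util.concurrent.ConcurrentLinkedQueue;
    import org.apache.hadoop.io.Writable;

    /** Hypothetical pool for reusing Writable instances instead of
     *  allocating a fresh one per RPC response. */
    public class WritablePool<T extends Writable> {
      private final ConcurrentLinkedQueue<T> pool = new ConcurrentLinkedQueue<T>();
      private final Class<T> clazz;

      public WritablePool(Class<T> clazz) { this.clazz = clazz; }

      /** Reuse a pooled instance if one is available, else allocate. */
      public T borrow() throws Exception {
        T obj = pool.poll();
        return obj != null ? obj : clazz.newInstance();
      }

      /** Return an instance once it is guaranteed to be fully written out.
       *  Finding a safe call site for this inside the RPC server is the
       *  unsolved part of the discussion above. */
      public void release(T obj) {
        pool.offer(obj);
      }
    }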
RE: UI doesn't work
James said: "Is the job tracker running on that machine?" YES. "Is there a firewall in the way?" I don't think so, because it used to work for me. How can I check that?

Harsh said: "Did you do any ant operation on your release copy of Hadoop prior to starting it, by the way?" NO, I get the following error:

    BUILD FAILED
    /cs/sandbox/student/maha/hadoop-0.20.2/build.xml:316: Unable to find a javac compiler;
    com.sun.tools.javac.Main is not on the classpath.
    Perhaps JAVA_HOME does not point to the JDK.
    It is currently set to /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0/jre

I had to change JAVA_HOME to point to /usr/lib/jvm/jre-1.6.0-openjdk because I used to get an error when trying to run a jar file. The error was:

    bin/hadoop: line 258: /etc/alternatives/java/bin/java: Not a directory
    bin/hadoop: line 289: /etc/alternatives/java/bin/java: Not a directory
    bin/hadoop: line 289: exec: /etc/alternatives/java/bin/java: cannot execute: Not a directory

Adarsh said: "logs of namenode + jobtracker"

namenode log:

    [m...@speed logs]$ cat hadoop-maha-namenode-speed.cs.ucsb.edu.log
    2010-12-28 12:23:25,006 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
    /************************************************************
    STARTUP_MSG: Starting NameNode
    STARTUP_MSG:   host = speed.cs.ucsb.edu/128.111.43.50
    STARTUP_MSG:   args = []
    STARTUP_MSG:   version = 0.20.2
    STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
    ************************************************************/
    2010-12-28 12:23:25,126 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=NameNode, port=9000
    2010-12-28 12:23:25,130 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: speed.cs.ucsb.edu/128.111.43.50:9000
    2010-12-28 12:23:25,133 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null
    2010-12-28 12:23:25,134 INFO org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics: Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
    2010-12-28 12:23:25,258 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=maha,grad
    2010-12-28 12:23:25,258 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
    2010-12-28 12:23:25,258 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
    2010-12-28 12:23:25,269 INFO org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics: Initializing FSNamesystemMetrics using context object:org.apache.hadoop.metrics.spi.NullContext
    2010-12-28 12:23:25,270 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStatusMBean
    2010-12-28 12:23:25,316 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files = 6
    2010-12-28 12:23:25,323 INFO org.apache.hadoop.hdfs.server.common.Storage: Number of files under construction = 0
    2010-12-28 12:23:25,323 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 551 loaded in 0 seconds.
    2010-12-28 12:23:25,323 INFO org.apache.hadoop.hdfs.server.common.Storage: Edits file /tmp/hadoop-maha/dfs/name/current/edits of size 4 edits # 0 loaded in 0 seconds.
    2010-12-28 12:23:25,358 INFO org.apache.hadoop.hdfs.server.common.Storage: Image file of size 551 saved in 0 seconds.
    2010-12-28 12:23:25,711 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Finished loading FSImage in 542 msecs
    2010-12-28 12:23:25,715 INFO org.apache.hadoop.hdfs.StateChange: STATE* Safe mode ON. The ratio of reported blocks 0. has not reached the threshold 0.9990. Safe mode will be turned off automatically.
    2010-12-28 12:23:25,834 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
    2010-12-28 12:23:25,901 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50070
    2010-12-28 12:23:25,902 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50070 webServer.getConnectors()[0].getLocalPort() returned 50070
    2010-12-28 12:23:25,902 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50070
    2010-12-28 12:23:25,902 INFO org.mortbay.log: jetty-6.1.14
    2010-12-28 12:23:26,360 INFO org.mortbay.log: Started selectchannelconnec...@0.0.0.0:50070
    2010-12-28 12:23:26,360 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Web-server up at: 0.0.0.0:50070
    2010-12-28 12:23:26,360 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
Re: Hadoop RPC call response post processing
Hi Todd,

Right, that is the code I'm looking into. Though Responder is a private inner class and is created as:

    responder = new Responder();

It would be great if the Responder implementation could be configured. Do you have any idea how to override the Responder?

Thanks,
Stefan

On Dec 27, 2010, at 8:21 PM, Todd Lipcon wrote:

Hi Stefan, Sounds interesting. Maybe you're looking for o.a.h.ipc.Server$Responder?

-Todd

On Mon, Dec 27, 2010 at 8:07 PM, Stefan Groschupf s...@101tec.com wrote: [original message snipped; quoted in full earlier in this thread]

--
Todd Lipcon
Software Engineer, Cloudera
Re: UI doesn't work
For the job tracker, go to port 50030 and see if that helps.

James

Sent from my mobile. Please excuse the typos.

On 2010-12-28, at 1:36 PM, maha m...@umail.ucsb.edu wrote: [message and namenode log snipped; see above]
Re: UI doesn't work
Hi James,

I'm accessing http://speed.cs.ucsb.edu:50030/ for the job tracker and port 50070 for the name node, just like the Hadoop quick start. Did you mean to change the port in my mapred-site.xml file?

    <property>
      <name>mapred.job.tracker</name>
      <value>speed.cs.ucsb.edu:9001</value>
    </property>

Maha

On Dec 28, 2010, at 1:01 PM, James Seigel wrote:

For the job tracker, go to port 50030 and see if that helps. [rest snipped]
Re: UI doesn't work
Nope, just on my iPhone I thought you'd tried a different port (bad memory :)).

Try accessing it with an IP address you get from doing an ipconfig on the machine. Then look at the logs and see if there are any errors or indications that it is being hit properly. Does your browser follow redirects properly? As well, try clearing the cache on your browser. Sorry for checking out the obvious stuff, but sometimes it is :).

Cheers
James

Sent from my mobile. Please excuse the typos.

On 2010-12-28, at 2:30 PM, maha m...@umail.ucsb.edu wrote: [message snipped; see above]
Re: UI doesn't work
Thanks James, you think those are obvious things, but they are not to me! Here is the update:

1- I cleared the browser cache.
2- I used the IP address in masters/slaves/mapred-site.xml/core-site.xml, which still identifies it as (( speed.cs.ucsb.edu/128.111.43.50 )) in the logs.
3- The namenode page (( http://128.111.43.50:50030/ )) redirected to (( http://128.111.43.50:50070/dfshealth.jsp )), which shows the 404 Error. Is that a correct redirection?
4- The log for the JobTracker shows something new:

    2010-12-28 14:15:11,870 INFO org.apache.hadoop.mapred.JobTracker: STARTUP_MSG:
    /************************************************************
    STARTUP_MSG: Starting JobTracker
    STARTUP_MSG:   host = speed.cs.ucsb.edu/128.111.43.50
    STARTUP_MSG:   args = []
    STARTUP_MSG:   version = 0.20.2
    STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
    ************************************************************/
    2010-12-28 14:15:11,983 INFO org.apache.hadoop.mapred.JobTracker: Scheduler configured with (memSizeForMapSlotOnJT, memSizeForReduceSlotOnJT, limitMaxMemForMapTasks, limitMaxMemForReduceTasks) (-1, -1, -1, -1)
    2010-12-28 14:15:12,033 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=JobTracker, port=9001
    2010-12-28 14:15:12,096 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
    2010-12-28 14:15:12,290 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50030
    2010-12-28 14:15:12,291 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50030 webServer.getConnectors()[0].getLocalPort() returned 50030
    2010-12-28 14:15:12,291 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50030
    2010-12-28 14:15:12,291 INFO org.mortbay.log: jetty-6.1.14
    2010-12-28 14:18:28,261 INFO org.mortbay.log: Started selectchannelconnec...@0.0.0.0:50030
    2010-12-28 14:18:28,265 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
    2010-12-28 14:18:28,266 INFO org.apache.hadoop.mapred.JobTracker: JobTracker up at: 9001
    2010-12-28 14:18:28,266 INFO org.apache.hadoop.mapred.JobTracker: JobTracker webserver: 50030
    2010-12-28 14:18:28,513 INFO org.apache.hadoop.mapred.JobTracker: Cleaning up the system directory
    2010-12-28 14:18:28,577 INFO org.apache.hadoop.mapred.CompletedJobStatusStore: Completed job store is inactive
    2010-12-28 14:18:28,667 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
    2010-12-28 14:18:28,668 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 9001: starting
    2010-12-28 14:18:28,668 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9001: starting
    2010-12-28 14:18:28,668 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 9001: starting
    2010-12-28 14:18:28,672 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 9001: starting
    2010-12-28 14:18:28,672 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 9001: starting
    2010-12-28 14:18:28,672 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 9001: starting
    2010-12-28 14:18:28,672 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 9001: starting
    2010-12-28 14:18:28,672 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 9001: starting
    2010-12-28 14:18:28,672 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 9001: starting
    2010-12-28 14:18:28,673 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 9001: starting
    2010-12-28 14:18:28,673 INFO org.apache.hadoop.mapred.JobTracker: Starting RUNNING
    2010-12-28 14:18:28,673 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 9001: starting
    2010-12-28 14:18:28,684 WARN org.apache.hadoop.mapred.JobTracker: Serious problem, cannot find record of 'previous' heartbeat for 'tracker_pinky.cs.ucsb.edu:localhost/127.0.0.1:56875'; reinitializing the tasktracker
    2010-12-28 14:18:28,684 WARN org.apache.hadoop.ipc.Server: IPC Server Responder, call getProtocolVersion(org.apache.hadoop.mapred.JobSubmissionProtocol, 20) from 128.111.43.50:59775: output error

@ This might be because I forced it to leave SAFEMODE? @

    2010-12-28 14:18:28,696 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9001 caught: java.nio.channels.ClosedChannelException
            at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:144)
            at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:342)
            at org.apache.hadoop.ipc.Server.channelWrite(Server.java:1195)
            at org.apache.hadoop.ipc.Server.access$1900(Server.java:77)
            at org.apache.hadoop.ipc.Server$Responder.processResponse(Server.java:613)
            at org.apache.hadoop.ipc.Server$Responder.doRespond(Server.java:677)
            at
help for using mapreduce to run different code?
Hi all,

Does Hadoop support the map function running different code? If yes, how do I realize this? Thanks in advance!

--
Regards,
Jander
Re: how to run jobs every 30 minutes?
I've been using Cascading to act as "make" for my Hadoop processes for quite some time. Unfortunately, even the most recent distribution of Cascading was written against the deprecated Hadoop APIs (JobConf) that I'm looking to replace. Does anyone have an alternative?

On Tue, Dec 14, 2010 at 18:02, Chris K Wensel ch...@wensel.net wrote:

Cascading also has the ability to only run 'stale' processes. Think 'make' file. When re-running a job where only one file of many has changed, this is a big win.
Re: help for using mapreduce to run different code?
Not sure what you mean. Can you write custom code for your map functions? Yes.

Cheers
James

Sent from my mobile. Please excuse the typos.

On 2010-12-28, at 3:54 PM, Jander g jande...@gmail.com wrote: [question snipped; see above]
Re: Hadoop RPC call response post processing
Knowing the tenuring distribution will tell a lot about that exact issue. Ephemeral collections take on average less than one instruction per allocation, and the allocation itself is generally only a single instruction. For ephemeral garbage, it is extremely unlikely that you can beat that.

So the real question is whether you are actually creating so much garbage that you are overwhelming the collector, or whether the data is much longer lived than it should be. *That* can cause lots of collection costs. To tell how long data lives, you need to get the tenuring distribution:

    -XX:+PrintTenuringDistribution

This prints details about the tenuring distribution to standard out. It can be used to show this threshold and the ages of objects in the new generation. It is also useful for observing the lifetime distribution of an application.

On Tue, Dec 28, 2010 at 11:59 AM, Stefan Groschupf s...@101tec.com wrote:

I don't think the problem is allocation but garbage collection.
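One way to turn the flag on for the JVMs Hadoop starts (an assumed setup detail, not something from the thread) is through HADOOP_OPTS in conf/hadoop-env.sh:

    # conf/hadoop-env.sh -- enable GC diagnostics for the Hadoop daemons.
    # PrintTenuringDistribution reports object ages at each survivor-space collection.
    export HADOOP_OPTS="$HADOOP_OPTS -verbose:gc -XX:+PrintTenuringDistribution"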
Re: help for using mapreduce to run different code?
If you mean running different code in different mappers, I recommend using an if statement.

On Tue, Dec 28, 2010 at 2:53 PM, Jander g jande...@gmail.com wrote:

Does Hadoop support the map function running different code? If yes, how do I realize this?
Re: help for using mapreduce to run different code?
Hi James,

Thanks for your attention. Suppose there are only 2 maps running in the Hadoop cluster; I want to use one map to sort and another to do wordcount at the same time, in the same Hadoop cluster.

On Wed, Dec 29, 2010 at 6:58 AM, James Seigel ja...@tynt.com wrote:

Not sure what you mean. Can you write custom code for your map functions? Yes. [rest snipped]

--
Thanks,
Jander
Re: help for using mapreduce to run different code?
Yes, that is what I mean. But what would the condition be if I want to use one map to sort and another to do wordcount at the same time, in the same Hadoop cluster? I have no idea.

Thanks,
Jander

On Wed, Dec 29, 2010 at 7:08 AM, Ted Dunning tdunn...@maprtech.com wrote:

If you mean running different code in different mappers, I recommend using an if statement. [rest snipped]
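One possible condition, purely as an illustration (the directory names and emitted types below are assumptions, not from the thread): give the job both input directories and branch inside the mapper on the path of the split being processed.

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileSplit;

    /** Hypothetical mapper that picks its behavior per input file. */
    public class BranchingMapper extends Mapper<LongWritable, Text, Text, Text> {
      @Override
      protected void map(LongWritable key, Text value, Context context)
          throws IOException, InterruptedException {
        // Ted's "if statement": which input directory did this split come from?
        String path = ((FileSplit) context.getInputSplit()).getPath().toString();
        if (path.contains("/sort-input/")) {
          // sort branch: emit the record itself as the key so the
          // framework's shuffle sorts it
          context.write(value, new Text(""));
        } else {
          // wordcount branch: emit one count per token
          for (String w : value.toString().split("\\s+")) {
            context.write(new Text(w), new Text("1"));
          }
        }
      }
    }

The old-API MultipleInputs class offers a cleaner per-path mapper mapping, if your version provides it.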
Re: how to run jobs every 30 minutes?
Good quote.

On Tue, Dec 28, 2010 at 3:46 PM, Chris K Wensel ch...@wensel.net wrote:

deprecated is the new stable.

https://issues.apache.org/jira/browse/MAPREDUCE-1734

ckw

On Dec 28, 2010, at 2:56 PM, Jimmy Wan wrote:

I've been using Cascading to act as "make" for my Hadoop processes for quite some time. [rest snipped]

--
Chris K Wensel
ch...@concurrentinc.com
http://www.concurrentinc.com

--
Concurrent, Inc. offers mentoring, support, and licensing for Cascading
Re: help for using mapreduce to run different code?
Hi Jander,

You mean writing Map in another language, like Python or C? Then yes. Check http://hadoop.apache.org/common/docs/r0.18.0/streaming.html for Hadoop Streaming.

Maha

On Dec 28, 2010, at 2:53 PM, Jander g wrote: [question snipped; see above]
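For example, a streaming job is launched along these lines; the script names are placeholders and the exact streaming jar path depends on the release:

    bin/hadoop jar contrib/streaming/hadoop-streaming.jar \
        -input gutenberg \
        -output gutenberg-output \
        -mapper mymapper.py \
        -reducer myreducer.py \
        -file mymapper.py -file myreducer.py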
HDFS disk consumption.
Is setting dfs.replication to 1 sufficient to stop replication? How do I verify that? I have a pseudo cluster running 0.21.0. It seems that HDFS disk consumption triples the amount of data stored.

Thanks,
Jane
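For context beyond the thread: HDFS replicates each block 3 times by default, which matches the tripling described above. A sketch of the hdfs-site.xml override and two ways to check, assuming the stock CLI:

    <property>
      <name>dfs.replication</name>
      <value>1</value>
    </property>

Note that dfs.replication only applies to files written after the change; existing files keep their old factor. To inspect and fix them:

    hadoop fsck / -files -blocks    # shows the replication factor per file
    hadoop fs -setrep -R -w 1 /     # re-replicates existing files to factor 1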
Re: Hadoop/Elastic MR on AWS
Unfortunately I can't publish the exact numbers however here are the various things we considered First off our data trends. We gathered our current data size and plotted a future growth trend for the next few years. We then finalized on a archival strategy to understand how much data needs to be on the cluster on a rotating basis. We crunch our data often (meaning as we get them) so computing power is not an issue and the cluster size was mainly driven by our data size that needs to be readily available and replication strategy. We factored in compression use on older rotating data. Once we had the above numbers we could decide on our cluster infrastructure size and type of hardware needed. For local cluster we factored in hardware, warranty, regular networking stuff for cluster that size, data center costs, support manpower. We also factored in a NAS and bandwidth costs to replicate cluster data to another data center for active replication. For EMR costs we compared a reserved instance cluster (nodes reserved for 3years with similar hardware config as above) with above cluster size vs nodes on the fly. We factored in S3 costs to store the above calculated rotating data and bandwidth costs for data coming in and coming out. One thing to note is Amazon EMR costs are above normal EC2 instance costs. For example if you run a job in EMR with 4 nodes and the job overall takes 1hr then total EMR cost (excluding any data transfer costs) = 4*1*{EMR /hour} + 4*1*EC2 /hour cost. Hopefully that makes sense. I am sure missing a few things above but that's the jist of it. - Sudhir On 12/27/10 9:22 PM, common-user-digest-h...@hadoop.apache.org common-user-digest-h...@hadoop.apache.org wrote: From: Dave Viner davevi...@gmail.com Date: Mon, 27 Dec 2010 10:23:37 -0800 To: common-user@hadoop.apache.org Subject: Re: Hadoop/Elastic MR on AWS Hi Sudhir, Can you publish your findings around pricing, and how you calculated the various aspects? This is great information. Thanks Dave Viner On Mon, Dec 27, 2010 at 10:17 AM, Sudhir Vallamkondu sudhir.vallamko...@icrossing.com wrote: We recently crossed this bridge and here are some insights. We did an extensive study comparing costs and benchmarking local vs EMR for our current needs and future trend. - Scalability you get with EMR is unmatched although you need to look at your requirement and decide this is something you need. - When using EMR its cheaper to use reserved instances vs nodes on the fly. You can always add more nodes when required. I suggest looking at your current computing needs and reserve instances for a year or two and use these to run EMR and add nodes at peak needs. In your cost estimation you will need to factor in the data transfer time/costs unless you are dealing with public datasets on S3 - EMR fared similar to local cluster on CPU benchmarks (we used MRBench to benchmark map/reduce) however IO benchmarks were slow on EMR (used DFSIO benchmark). For IO intensive jobs you will need to add more nodes to compensate this. - When compared to local cluster, you will need to factor the time it takes for the EMR cluster to setup when starting a job. This like data transfer time, cluster replication time etc - EMR API is very flexible however you will need to build a custom interface on top of it to suit your job management and monitoring needs - EMR bootstrap actions can satisfy most of your native lib needs so no drawbacks there. 
-- Sudhir

On 12/26/10 5:26 AM, common-user-digest-h...@hadoop.apache.org wrote:

From: Otis Gospodnetic otis_gospodne...@yahoo.com
Date: Fri, 24 Dec 2010 04:41:46 -0800 (PST)
To: common-user@hadoop.apache.org
Subject: Re: Hadoop/Elastic MR on AWS

Hello Amandeep,

- Original Message
From: Amandeep Khurana ama...@gmail.com
To: common-user@hadoop.apache.org
Sent: Fri, December 10, 2010 1:14:45 AM
Subject: Re: Hadoop/Elastic MR on AWS

Mark,

Using EMR makes it very easy to start a cluster and add/reduce capacity as and when required. There are certain optimizations that make EMR an attractive choice as compared to building out your own cluster. Using EMR

Could you please point out what optimizations you are referring to?

Thanks,
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop - HBase
Hadoop ecosystem search :: http://search-hadoop.com/

[Amandeep's message continues:] also ensures you are using a production-quality, stable system backed by the EMR engineers. You can always use bootstrap actions to put your own tweaked version of Hadoop in there if you want to do that. Also, you don't have to tear down your cluster after every job. You can set the alive option when you start your cluster and it will stay there even after your Hadoop job completes. If you face any issues with EMR, send me a mail offline and I'll be happy to help.
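To make Sudhir's pricing formula concrete with made-up numbers (these are not actual AWS rates): suppose EC2 costs $0.40 per node-hour and the EMR surcharge is $0.10 per node-hour. A 4-node job that runs for 1 hour then costs 4 * 1 * $0.10 + 4 * 1 * $0.40 = $2.00 total, i.e. a $0.40 EMR premium on top of $1.60 of raw EC2 time.

And as a rough sketch of the 'alive' option Amandeep mentions, here is how a keep-alive cluster might be started with the AWS SDK for Java; the credentials, bucket name, instance types, and counts are placeholders, and exact method names may vary by SDK version:

    import com.amazonaws.auth.BasicAWSCredentials;
    import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClient;
    import com.amazonaws.services.elasticmapreduce.model.JobFlowInstancesConfig;
    import com.amazonaws.services.elasticmapreduce.model.RunJobFlowRequest;
    import com.amazonaws.services.elasticmapreduce.model.RunJobFlowResult;

    public class KeepAliveCluster {
        public static void main(String[] args) {
            // Placeholder credentials -- substitute your own AWS keys.
            AmazonElasticMapReduceClient emr = new AmazonElasticMapReduceClient(
                    new BasicAWSCredentials("ACCESS_KEY", "SECRET_KEY"));
            RunJobFlowRequest req = new RunJobFlowRequest()
                    .withName("keep-alive-example")
                    .withLogUri("s3://my-bucket/emr-logs/") // hypothetical bucket
                    .withInstances(new JobFlowInstancesConfig()
                            .withMasterInstanceType("m1.large")
                            .withSlaveInstanceType("m1.large")
                            .withInstanceCount(4)
                            // the 'alive' option: cluster survives after steps finish
                            .withKeepJobFlowAliveWhenNoSteps(true));
            RunJobFlowResult res = emr.runJobFlow(req);
            System.out.println("Started job flow " + res.getJobFlowId());
        }
    }

The elastic-mapreduce command-line client exposes the same behavior as its --alive flag.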
Re: Hadoop RPC call response post processing
On Tue, Dec 28, 2010 at 1:00 PM, Stefan Groschupf s...@101tec.com wrote:

Hi Todd,
Right, that is the code I'm looking into. Though Responder is an inner private class and is created as responder = new Responder(); It would be great if the Responder implementation could be configured. Do you have any idea how to override the Responder?

Nope, it's not currently pluggable, nor do I think there's any compelling reason to make it pluggable. It's coupled quite tightly to the implementation right now. Perhaps you can hack something in a git branch, and if it has good results on something like NNBench it could be a general contribution?

-Todd

On Dec 27, 2010, at 8:21 PM, Todd Lipcon wrote:

Hi Stefan,
Sounds interesting. Maybe you're looking for o.a.h.ipc.Server$Responder?
-Todd

On Mon, Dec 27, 2010 at 8:07 PM, Stefan Groschupf s...@101tec.com wrote:

Hi All,
I've been browsing the RPC code for quite a while now, trying to find any entry point / interceptor slot that allows me to handle an RPC call response Writable after it has been sent over the wire. Does anybody have an idea how to break into the RPC code from outside? All the interesting methods are private. :(

Background: heavy use of the RPC layer allocates a huge number of Writable objects. We saw on multiple systems that garbage collection can get so busy that the JVM almost freezes for seconds; things like ZooKeeper sessions time out in those cases. My idea is to create an object pool for Writables. Borrowing an object from the pool is simple, since that happens in our custom code, but we need to know when the Writable has been sent over the wire so it can be returned to the pool. A dirty hack would be to override the write(out) method in the Writable, assuming that is the last thing done with it, but it turns out that this method is called in other cases too, e.g. to measure throughput.

Any ideas?
Thanks,
Stefan

--
Todd Lipcon
Software Engineer, Cloudera
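A pool along the lines Stefan describes might look like the sketch below (a hand-rolled illustration, not anything in Hadoop itself; the Factory interface and class name are made up). The missing piece is exactly the hook discussed above: something has to call release() only after the Responder has finished writing the object to the socket.

    import java.util.concurrent.ConcurrentLinkedQueue;

    // Minimal object pool for reusable response objects.
    public class WritablePool<T> {
        public interface Factory<T> { T create(); }

        private final ConcurrentLinkedQueue<T> free = new ConcurrentLinkedQueue<T>();
        private final Factory<T> factory;

        public WritablePool(Factory<T> factory) { this.factory = factory; }

        public T borrow() {
            T obj = free.poll();                  // reuse a pooled object if available
            return obj != null ? obj : factory.create();
        }

        public void release(T obj) {
            free.offer(obj);                      // caller must not touch obj afterwards
        }
    }

Callers would borrow() when building a response; releasing too early is the dangerous case, since a reused Writable's fields would leak into another call's response.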
Re: UI doesn't work
I recently had this issue. UI links were working for some nodes, meaning when I went to the dfsHealth.jsp page and followed some cluster data node links, some would work and some would show a 404 error. I started tracing them all the way from the listening ports. The data node's port is 50010, so do a netstat on that port (e.g. netstat -nlp | grep 50010) to find which process is listening, then check that process to see if it is the data node. The issue I had was that somehow, when I did the Hadoop upgrade, I had an older instance and a new instance of the data node running and it was all messed up, so I had to kill all Hadoop processes and do a clean start.

On 12/27/10 9:22 PM, common-user-digest-h...@hadoop.apache.org wrote:

From: Harsh J qwertyman...@gmail.com
Date: Tue, 28 Dec 2010 09:51:11 +0530
To: common-user@hadoop.apache.org
Subject: Re: UI doesn't work

I remember facing such an issue with the JT (50030) once. None of the jsp pages would load, 'cept the index. It was some odd issue with the webapps not getting loaded right during startup. Don't quite remember how it got solved. Did you do any ant operation on your release copy of Hadoop prior to starting it, by the way?

On Tue, Dec 28, 2010 at 5:15 AM, maha m...@umail.ucsb.edu wrote:

Hi, I get Error 404 when I try to use the Hadoop UI to monitor my job execution. I'm using Hadoop-0.20.2 and the following are parts of my configuration files.

In core-site.xml:

  <property>
    <name>fs.default.name</name>
    <value>hdfs://speed.cs.ucsb.edu:9000</value>
  </property>

In mapred-site.xml:

  <property>
    <name>mapred.job.tracker</name>
    <value>speed.cs.ucsb.edu:9001</value>
  </property>

When I try to open http://speed.cs.ucsb.edu:50070/ I get the 404 error. Any ideas?

Thank you,
Maha

--
Harsh J
www.harshj.com
Re: Hadoop RPC call response post processing
Are you connecting to this JVM with RMI? RMI does a very nasty thing with garbage collection: it forces a blocking collection every 60 seconds. Really. You have to change this with a system property.

On Tue, Dec 28, 2010 at 5:56 PM, Todd Lipcon t...@cloudera.com wrote: [snip]

--
Lance Norskog
goks...@gmail.com
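The system property Lance is referring to is the RMI distributed-GC interval, sun.rmi.dgc.client.gcInterval and sun.rmi.dgc.server.gcInterval (values in milliseconds; the 60-second behavior he describes corresponds to the old default of 60000). A sketch of raising both to one hour on the JVM command line:

    java -Dsun.rmi.dgc.client.gcInterval=3600000 \
         -Dsun.rmi.dgc.server.gcInterval=3600000 \
         ...

With a longer interval, RMI's periodic forced full collections stop competing with the Writable-allocation pressure described above.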
Re: help for using mapreduce to run different code?
Hi Jander,

If I understand what you want, you would like to run the map instances of two different MapReduce jobs (so, obviously, different mapper code) simultaneously on the same machine. If I am correct, it has more to do with the setting for the number of simultaneous mapper instances (I guess its default is 2 or 4), and there should be a way to divide the map instances between the two MR jobs you want to run together (to fill up the slots). Please correct me if I am wrong; just wanted to try clearing the air regarding the query. :)

Matthew

On Wed, Dec 29, 2010 at 5:47 AM, maha m...@umail.ucsb.edu wrote:

Hi Jander,
You mean writing the Map in another language, like Python or C? Then yes. Check http://hadoop.apache.org/common/docs/r0.18.0/streaming.html for Hadoop Streaming.

Maha

On Dec 28, 2010, at 2:53 PM, Jander g wrote:

Hi, all
Does Hadoop support map functions running different code? If yes, how can this be done? Thanks in advance!

--
Regards,
Jander
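For reference on Matthew's guess: the per-node concurrency he describes is controlled by mapred.tasktracker.map.tasks.maximum (default 2 map slots per TaskTracker), set in mapred-site.xml. Raising it to 4, for example, lets four map tasks, from any mix of jobs, run at once on that node:

  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>
  </property>

A matching mapred.tasktracker.reduce.tasks.maximum exists for reduce slots.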
Re: Hadoop RPC call response post processing
Hi Todd,

Thanks for the feedback.

Nope, it's not currently pluggable, nor do I think there's any compelling reason to make it pluggable.

Well, one could argue that with an interceptor / filter it would be very easy to add compression or encryption to RPC. But since the Nutch days the code base was never architected in an extendable or modular way.

Perhaps you can hack something in a git branch, and if it has good results on something like NNBench it could be a general contribution?

Thanks - I'll pass on that offer. The days of waiting half a year to get a patch into the codebase are behind me. :) I think I will just replace Hadoop RPC with Netty.

Cheers,
Stefan
Re: HDFS Structure
FileInputFormat takes care of line boundaries in splits; you don't need to worry about that. Each mapper works on a FileSplit, which contains a starting offset and a length. The record reader handles line boundaries at read time, pulling the extra bytes of a line that crosses the split boundary from the DataNode that has them. Similarly, in SequenceFiles it is done using a special sync marker embedded between logical blocks of data.

On Wed, Dec 29, 2010 at 10:27 AM, shanmukhan battinapati shanmukha...@gmail.com wrote:

Hi,
I have a small doubt about how HDFS manages files internally. Assume I have a NameNode and 2 DataNodes, and I have inserted a CSV file of size 80 MB into HDFS using the 'hadoop copyFromLocal' command. How will this file be stored in HDFS? Will it be split into two parts, one of 64 MB (the default block size) and one of the remaining 16 MB, copied to the 2 DataNodes? If that is the case, when I run a map-reduce job over the two DataNodes, since the block boundary is not line-oriented I may get unexpected results. How do I solve this type of issue?

Please help me.

Thanks & Regards,
Shanmukhan.B

--
Harsh J
www.harshj.com
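To make the boundary handling concrete, here is a simplified sketch of the rule (an illustration over a local file only; the real logic lives in LineRecordReader): a reader whose split starts at a non-zero offset throws away its partial first line, because the previous split's reader finishes that line by reading past its own end. Every line is therefore processed exactly once.

    import java.io.IOException;
    import java.io.RandomAccessFile;

    public class SplitLineSketch {
        public static void readSplit(String file, long start, long length) throws IOException {
            RandomAccessFile in = new RandomAccessFile(file, "r");
            try {
                long end = start + length;
                in.seek(start);
                if (start != 0) {
                    in.readLine(); // skip to just past the next newline; prior split owns that line
                }
                while (in.getFilePointer() < end) {
                    String line = in.readLine(); // may read past 'end'; that is intended
                    if (line == null) break;     // end of file
                    System.out.println(line);
                }
            } finally {
                in.close();
            }
        }
    }

So for the 80 MB example, the mapper on the 16 MB block simply fetches the tail of the line that straddles the 64 MB boundary from the other block's DataNode.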
Re: help for using mapreduce to run different code?
Have a look at MultipleInputs.

On Wed, Dec 29, 2010 at 4:39 AM, Jander g jande...@gmail.com wrote:

Hi James,
Thanks for your attention. Suppose there are only 2 map slots in the Hadoop cluster: I want to use one map to sort and another to do a wordcount, at the same time, on the same Hadoop cluster.

On Wed, Dec 29, 2010 at 6:58 AM, James Seigel ja...@tynt.com wrote:

Not sure what you mean. Can you write custom code for your map functions? Yes.

Cheers,
James
Sent from my mobile. Please excuse the typos.

On 2010-12-28, at 3:54 PM, Jander g jande...@gmail.com wrote:

Hi, all
Does Hadoop support map functions running different code? If yes, how can this be done? Thanks in advance!

--
Regards,
Jander

--
Thanks,
Jander

--
Harsh J
www.harshj.com
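As a rough sketch of Harsh's pointer: MultipleInputs lets one job assign a different Mapper class to each input path, so two kinds of map code run side by side. The mapper bodies, paths, and class names below are made-up examples, not anything from the thread:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class TwoMapperJob {
        private static final IntWritable ONE = new IntWritable(1);

        // Emits each line as a key; the shuffle's sort on keys does the sorting.
        public static class SortMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            protected void map(LongWritable k, Text v, Context ctx)
                    throws IOException, InterruptedException {
                ctx.write(new Text(v), ONE);
            }
        }

        // Classic word-count mapper.
        public static class CountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            protected void map(LongWritable k, Text v, Context ctx)
                    throws IOException, InterruptedException {
                for (String w : v.toString().split("\\s+")) {
                    if (!w.isEmpty()) ctx.write(new Text(w), ONE);
                }
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = new Job(new Configuration(), "two different mappers");
            job.setJarByClass(TwoMapperJob.class);
            // Each input directory gets its own mapper; both kinds of
            // map task run concurrently within the same job.
            MultipleInputs.addInputPath(job, new Path("in/sort"), TextInputFormat.class, SortMapper.class);
            MultipleInputs.addInputPath(job, new Path("in/count"), TextInputFormat.class, CountMapper.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileOutputFormat.setOutputPath(job, new Path("out"));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

Note this merges both outputs through one reduce stage; if Jander really needs two independent jobs running at once, simply submitting both jobs to a cluster with enough map slots (see the mapred.tasktracker.map.tasks.maximum note above) also works.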
Re: UI doesn't work
Thanks for the tip, I'll try it right away. But a quick clarification: what I did is remotely connect to one node and mark it as both master and slave. Then, before starting Hadoop, 'jps' shows only 'Jps', but after starting Hadoop, 'jps' shows all the Hadoop daemons. Isn't this a clean start??

Maha

On Dec 28, 2010, at 6:02 PM, Sudhir Vallamkondu wrote: [snip]
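For reference, a sanity check along the lines of Sudhir's advice, assuming a pseudo-distributed 0.20-style setup (the PIDs below are made up): a clean start shows exactly one of each daemon in jps, e.g.

    $ jps
    4672 NameNode
    4781 DataNode
    4892 SecondaryNameNode
    4975 JobTracker
    5086 TaskTracker
    5120 Jps

Duplicate NameNode or DataNode entries, or a daemon that survives stop-all.sh, would indicate the stale-instance problem Sudhir hit.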