Re: MR2 Job over LZO data
Can you run MR jobs (not Pig jobs) that take LZO files as input?

If you cannot run MR jobs either, check the LZO compression configuration in core-site.xml, and make sure the dynamic library is in HADOOP_HOME/lib/native/.

Here is an FAQ on how to configure LZO:
https://code.google.com/a/apache-extras.org/p/hadoop-gpl-compression/wiki/FAQ?redir=1

On Sat, Mar 8, 2014 at 12:04 AM, Viswanathan J jayamviswanat...@gmail.com wrote:

> Hi,
>
> I am getting the error below while running a Pig job on hadoop-2.x:
>
> Caused by: java.io.IOException: No codec for file found
>     at com.twitter.elephantbird.mapreduce.input.MultiInputFormat.determineFileFormat(MultiInputFormat.java:176)
>     at com.twitter.elephantbird.mapreduce.input.MultiInputFormat.createRecordReader(MultiInputFormat.java:88)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.initNextRecordReader(PigRecordReader.java:256)
>
> I have copied the respective LZO jars to the lib folders but still face this issue. Please help.
>
> --
> Regards,
> Viswa.J

--
Regards
Gordon Wang
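The core-site.xml check suggested in the reply above can be roughed out in code. This is only a sketch, assuming the codec class names from the hadoop-lzo README (`com.hadoop.compression.lzo.LzoCodec` and `com.hadoop.compression.lzo.LzopCodec`) and that they are expected in the `io.compression.codecs` property; the class `LzoCodecConfigCheck` and its method are hypothetical helpers, not part of Hadoop or hadoop-lzo.

```java
// Hypothetical helper: checks whether a comma-separated codec list
// (the value of io.compression.codecs in core-site.xml) names the LZO codecs.
final class LzoCodecConfigCheck {
    // Class names as given in the hadoop-lzo README (assumed unchanged in your build).
    static final String LZO_CODEC = "com.hadoop.compression.lzo.LzoCodec";
    static final String LZOP_CODEC = "com.hadoop.compression.lzo.LzopCodec";

    /** Returns true if the comma-separated list contains exactly this class name. */
    static boolean listsCodec(String codecsValue, String codecClass) {
        if (codecsValue == null) {
            return false;
        }
        for (String entry : codecsValue.split(",")) {
            if (entry.trim().equals(codecClass)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Pass the property value copied from core-site.xml as the first argument.
        String value = args.length > 0 ? args[0] : "";
        System.out.println("LzoCodec listed:  " + listsCodec(value, LZO_CODEC));
        System.out.println("LzopCodec listed: " + listsCodec(value, LZOP_CODEC));
    }
}
```

If either line prints false, the configuration FAQ linked above is probably the place to start.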
RE: MR2 Job over LZO data
King

Here is my raw log of installing Hadoop LZO. This works on 2.2.0 and 2.3.0.

I hope this helps

./g

Where to get Hadoop LZO

https://github.com/twitter/hadoop-lzo
http://asmarterplanet.com/studentsfor/blog/2013/11/hadoop-cluster-module-lzo-compression.html

Requirements

On CentOS: sudo yum install lzo* -- /usr/lib64/liblzo2.so.2
On Ubuntu: sudo apt-get install liblzo -- on x86: /usr/lib64/liblzo2.so.2

Clone:

    git clone https://github.com/twitter/hadoop-lzo.git

Follow the instructions in README.md on that GitHub site, basically:

    cd hadoop-lzo
    mvn clean package test

To enable this at run time do:

a. Copy the library to hadoop/share/hadoop/common (if you don't want to modify classpaths by putting the library somewhere else):

    cp lzo././target/hadoop-lzo-0.4.20-SNAPSHOT.jar .. hadoop/share/hadoop/common/

b. Copy /usr/lib64/liblzo2.so.2 to .. Hadoop/lib/native/

From: Gordon Wang [mailto:gw...@gopivotal.com]
Sent: Thursday, March 06, 2014 11:50 PM
To: user@hadoop.apache.org
Subject: Re: MR2 Job over LZO data

You can try to get the source code https://github.com/twitter/hadoop-lzo and then compile it against Hadoop 2.2.0.

As I recall, as long as you rebuild it, LZO should work with Hadoop 2.2.0.

--
Regards
Gordon Wang
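The two copy steps in the install log above can be verified after the fact. The sketch below assumes the file names and relative directories quoted in the log (`share/hadoop/common/hadoop-lzo-0.4.20-SNAPSHOT.jar` and `lib/native/liblzo2.so.2`); the class `LzoInstallCheck` is a hypothetical helper, and the jar version will differ if you build another revision.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reports which of the two copied files are absent
// under a given Hadoop installation directory.
final class LzoInstallCheck {
    /** Returns the expected files that do not exist under hadoopHome. */
    static List<Path> missingFiles(Path hadoopHome) {
        Path[] expected = {
            // Jar name taken from the install log; adjust to the version you built.
            hadoopHome.resolve("share/hadoop/common/hadoop-lzo-0.4.20-SNAPSHOT.jar"),
            hadoopHome.resolve("lib/native/liblzo2.so.2")
        };
        List<Path> missing = new ArrayList<>();
        for (Path path : expected) {
            if (!Files.isRegularFile(path)) {
                missing.add(path);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String home = args.length > 0 ? args[0] : System.getenv("HADOOP_HOME");
        if (home == null) {
            System.err.println("Pass the Hadoop directory as an argument or set HADOOP_HOME");
            return;
        }
        List<Path> missing = missingFiles(Paths.get(home));
        if (missing.isEmpty()) {
            System.out.println("Both files are in place");
        } else {
            for (Path path : missing) {
                System.out.println("Missing: " + path);
            }
        }
    }
}
```

On a cluster the same check would have to pass on every node that runs tasks, not only on the machine that submits the job.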
Re: MR2 Job over LZO data
Hi,

I am getting the error below while running a Pig job on hadoop-2.x:

Caused by: java.io.IOException: No codec for file found
    at com.twitter.elephantbird.mapreduce.input.MultiInputFormat.determineFileFormat(MultiInputFormat.java:176)
    at com.twitter.elephantbird.mapreduce.input.MultiInputFormat.createRecordReader(MultiInputFormat.java:88)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.initNextRecordReader(PigRecordReader.java:256)

I have copied the respective LZO jars to the lib folders but still face this issue. Please help.

On Fri, Mar 7, 2014 at 7:53 PM, German Florez-Larrahondo german...@samsung.com wrote:

> King
>
> Here is my raw log of installing Hadoop LZO. This works on 2.2.0 and 2.3.0.
>
> I hope this helps
>
> ./g
>
> *Where to get Hadoop LZO*
>
> https://github.com/twitter/hadoop-lzo
> http://asmarterplanet.com/studentsfor/blog/2013/11/hadoop-cluster-module-lzo-compression.html
>
> *Requirements*
>
> On CentOS: sudo yum install lzo* -- /usr/lib64/liblzo2.so.2
> On Ubuntu: sudo apt-get install liblzo -- on x86: /usr/lib64/liblzo2.so.2
>
> *Clone:*
>
>     git clone https://github.com/twitter/hadoop-lzo.git
>
> Follow the instructions in README.md on that GitHub site, basically:
>
>     cd hadoop-lzo
>     mvn clean package test
>
> *To enable this at run time do:*
>
> a. Copy the library to hadoop/share/hadoop/common (if you don't want to modify classpaths by putting the library somewhere else):
>
>     cp lzo..././target/hadoop-lzo-0.4.20-SNAPSHOT.jar .. hadoop/share/hadoop/common/
>
> b. Copy /usr/lib64/liblzo2.so.2 to .. Hadoop/lib/native/

--
Regards,
Viswa.J
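One plausible reading of the "No codec for file found" error above is that no codec is registered for the input file's suffix. Hadoop's CompressionCodecFactory selects a codec by file-name suffix from the configured codec classes, and `LzopCodec` handles `.lzo`. The sketch below is a simplified stand-in for that lookup, not Hadoop's or Elephant Bird's actual code; the class `SuffixCodecLookup` and the file name `part-00000.lzo` are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified, hypothetical model of suffix-based codec lookup.
final class SuffixCodecLookup {
    private final Map<String, String> codecBySuffix = new LinkedHashMap<>();

    /** Registers a codec class name for a file-name suffix such as ".gz". */
    void register(String suffix, String codecClass) {
        codecBySuffix.put(suffix, codecClass);
    }

    /** Returns the codec class registered for the file's suffix, or null if none matches. */
    String codecFor(String fileName) {
        for (Map.Entry<String, String> entry : codecBySuffix.entrySet()) {
            if (fileName.endsWith(entry.getKey())) {
                return entry.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        SuffixCodecLookup lookup = new SuffixCodecLookup();
        lookup.register(".gz", "org.apache.hadoop.io.compress.GzipCodec");
        // No LZO codec configured yet: the lookup finds nothing for a .lzo file.
        System.out.println(lookup.codecFor("part-00000.lzo"));

        lookup.register(".lzo", "com.hadoop.compression.lzo.LzopCodec");
        // With the codec registered, the same file resolves.
        System.out.println(lookup.codecFor("part-00000.lzo"));
    }
}
```

Under that reading, copying the jars is not enough on its own; the codec also has to be listed in the configuration that the Pig job actually loads.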
Re: MR2 Job over LZO data
Maybe you can try downloading the LZO code and rebuilding it against Hadoop 2.2.0. If the build succeeds, you should be good to go; if it fails, you may need to wait for the LZO maintainers to update their code.

Regards,
*Stanley Shi,*

On Thu, Mar 6, 2014 at 6:29 PM, KingDavies kingdav...@gmail.com wrote:

> Running on Hadoop 2.2.0.
>
> The Java MR2 job works as expected on an uncompressed data source using TextInputFormat.class, but when using the LZO format the job fails:
>
>     import com.hadoop.mapreduce.LzoTextInputFormat;
>     job.setInputFormatClass(LzoTextInputFormat.class);
>
> Dependencies from the Maven repository:
> http://maven.twttr.com/com/hadoop/gplcompression/hadoop-lzo/0.4.19/
> Also tried with elephant-bird-core 4.4.
>
> The same data can be queried fine from within Hive (0.12) on the same cluster.
>
> The exception:
>
> Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
>     at com.hadoop.mapreduce.LzoTextInputFormat.listStatus(LzoTextInputFormat.java:62)
>     at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:340)
>     at com.hadoop.mapreduce.LzoTextInputFormat.getSplits(LzoTextInputFormat.java:101)
>     at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:491)
>     at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:508)
>     at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:392)
>     at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
>     at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>     at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
>     at com.cloudreach.DataQuality.Main.main(Main.java:42)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>
> I believe the issue is related to the changes in Hadoop 2, but where can I find an H2-compatible version?
>
> Thanks
Re: MR2 Job over LZO data
You can try to get the source code https://github.com/twitter/hadoop-lzo and then compile it against Hadoop 2.2.0.

As I recall, as long as you rebuild it, LZO should work with Hadoop 2.2.0.

On Thu, Mar 6, 2014 at 6:29 PM, KingDavies kingdav...@gmail.com wrote:

> Running on Hadoop 2.2.0.
>
> The Java MR2 job works as expected on an uncompressed data source using TextInputFormat.class, but when using the LZO format the job fails:
>
>     import com.hadoop.mapreduce.LzoTextInputFormat;
>     job.setInputFormatClass(LzoTextInputFormat.class);
>
> Dependencies from the Maven repository:
> http://maven.twttr.com/com/hadoop/gplcompression/hadoop-lzo/0.4.19/
> Also tried with elephant-bird-core 4.4.
>
> The same data can be queried fine from within Hive (0.12) on the same cluster.
>
> The exception:
>
> Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
>     at com.hadoop.mapreduce.LzoTextInputFormat.listStatus(LzoTextInputFormat.java:62)
>     at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:340)
>     at com.hadoop.mapreduce.LzoTextInputFormat.getSplits(LzoTextInputFormat.java:101)
>     at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:491)
>     at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:508)
>     at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:392)
>     at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
>     at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>     at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
>     at com.cloudreach.DataQuality.Main.main(Main.java:42)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>
> I believe the issue is related to the changes in Hadoop 2, but where can I find an H2-compatible version?
>
> Thanks

--
Regards
Gordon Wang
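For context on the IncompatibleClassChangeError in the trace above: `org.apache.hadoop.mapreduce.JobContext` was a class in Hadoop 1.x and is an interface in Hadoop 2.x, so a jar compiled against the old API fails at link time, which is why rebuilding against 2.2.0 is the suggested fix. A small reflection check can show which variant is on a given classpath. The class `JobContextKind` is a hypothetical helper; without the Hadoop jars on the classpath it simply reports the type as missing.

```java
// Hypothetical diagnostic: reports whether a named type on the current
// classpath is an interface, a class, or not loadable at all.
final class JobContextKind {
    /** Returns "interface", "class", or "missing" for the given type name. */
    static String kindOf(String className) {
        try {
            // initialize=false: inspect the type without running its static initializers.
            Class<?> type = Class.forName(className, false, JobContextKind.class.getClassLoader());
            return type.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException | LinkageError e) {
            return "missing";
        }
    }

    public static void main(String[] args) {
        // Run with the cluster's Hadoop jars on the classpath.
        System.out.println("JobContext is: " + kindOf("org.apache.hadoop.mapreduce.JobContext"));
    }
}
```

If this reports an interface while the hadoop-lzo jar in use was built against Hadoop 1.x, the mismatch matches the error in the trace.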