Hi,

The hadoop-gpl-packaging jars have been tested with the standard Apache Pig 
0.8.0 release; I've not tested them with the Cloudera CDH Pig.
It's important that you verify you've got LZO set up on all of the datanodes.

The configuration we use is:
 For each data node:
   Install lzo (yum install lzo.x86_64) 
   Configure:
     $HADOOP_HOME/conf/mapred-site.xml  (it's important to set -Djava.library.path; 
change the path to where you've got the native libs)
                <property>
                 <name>mapred.child.java.opts</name>
                 <value>-Xmx2048m -Xms1024m -Djava.library.path=/opt/hadoopgpl/native/Linux-amd64-64</value>
                 <final>true</final>
                </property>
      To set up intermediate (map output) compression, add:
                <property>
                 <name>mapred.compress.map.output</name>
                 <value>true</value>
                 <final>true</final>
                </property>
                <property>
                 <name>mapred.output.compression.codec</name>
                 <value>com.hadoop.compression.lzo.LzoCodec</value>
                </property>
                <property>
                 <name>mapred.map.output.compression.codec</name>
                 <value>com.hadoop.compression.lzo.LzoCodec</value>
                </property>

     $HADOOP_HOME/conf/core-site.xml
                <property>
                 <name>io.compression.codecs</name>
                 <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
                 <description>A list of the compression codec classes that can be used for compression/decompression.</description>
                </property>
                <property>
                 <name>io.compression.codec.lzo.class</name>
                 <value>com.hadoop.compression.lzo.LzoCodec</value>
                </property>
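
One way to check the core-site.xml on each node is to parse it and confirm that both LZO codec classes appear in io.compression.codecs. What follows is only a minimal Python sketch, not something shipped with hadoop-gpl-packaging; the function names (read_properties, missing_lzo_codecs) are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Codec classes that the core-site.xml above registers for LZO.
LZO_CODECS = (
    "com.hadoop.compression.lzo.LzoCodec",
    "com.hadoop.compression.lzo.LzopCodec",
)


def read_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop *-site.xml."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_lzo_codecs(xml_text):
    """List the LZO codec classes absent from io.compression.codecs."""
    codecs = read_properties(xml_text).get("io.compression.codecs", "")
    listed = {c.strip() for c in codecs.split(",") if c.strip()}
    return [c for c in LZO_CODECS if c not in listed]
```

Feed it the text of the file on each node (for example the contents of $HADOOP_HOME/conf/core-site.xml); an empty list means both classes are listed. It says nothing about whether the native libraries actually load, so the java.library.path setting still needs checking separately.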


You can also set the library property in $PIG_HOME/conf/pig.properties if the 
property mapred.child.java.opts is not declared final on your cluster nodes.
 e.g. 
   mapred.child.java.opts=-Xmx2048m -Xms1024m -Djava.library.path=/opt/hadoopgpl/native/Linux-amd64-64
   (again change the path to where you've got the native libraries from the 
hadoop-gpl-packaging)
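
Whether the pig.properties setting can take effect depends on the final flag in mapred-site.xml, so it may help to check that first. Below is a rough Python sketch under that assumption; is_final and library_path are hypothetical helper names, not part of Pig or Hadoop.

```python
import xml.etree.ElementTree as ET


def is_final(site_xml_text, prop_name):
    """True if prop_name is declared <final>true</final> in a *-site.xml."""
    root = ET.fromstring(site_xml_text)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == prop_name:
            return (prop.findtext("final") or "").strip().lower() == "true"
    return False


def library_path(java_opts):
    """Return the -Djava.library.path value from a JVM options string, or None."""
    prefix = "-Djava.library.path="
    for token in java_opts.split():
        if token.startswith(prefix):
            return token[len(prefix):]
    return None
```

If is_final(...) returns True for mapred.child.java.opts, the cluster value wins and the pig.properties entry is ignored; library_path(...) just pulls out the directory so you can confirm it exists on the node.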


Cheers,
 Gerrit


-----Original Message-----
From: Dmitriy Ryaboy [mailto:dvrya...@gmail.com] 
Sent: Wednesday, February 16, 2011 7:21 PM
To: user@pig.apache.org
Cc: Kris Coward
Subject: Re: LzoTokenizedStorage working, but now I can't get the data back out 
with LzoTokenizedLoader

"no codec" means you didn't get LZO set up right on the cluster.
There are instructions on the wiki of the googlecode project for hadoop lzo.
D

On Wed, Feb 16, 2011 at 11:05 AM, Kris Coward <k...@melon.org> wrote:

>
> After a bunch of fiddling around (including some pretty heavy use of the
> secretDebugCmd--thanks), I finally got the LzoTokenizedStorage working,
> but now I'm having problems with the LzoTokenizedLoader.
>
> I'm still using pig 0.8.0-CDH3B4-SNAPSHOT, and for storage, have only
> seemed to have luck with the hirohanin jarfile in hadoop-gpl; registered
> all the other jarfiles from that package (i.e. google-collect-1.0.jar,
> hadoop-lzo-0.4.8.jar, protobuf-java-2.3.0.jar, slf4j-api-1.5.8.jar,
> slf4j-log4j12-1.5.10.jar, and yamlbeans-0.9.3.jar) and am now getting
> the error and stack trace:
>
> ERROR 2997: Unable to recreate exception from backed error:
> java.io.IOException: No codec for file
>
> hdfs://localhost/rawfiles/cf938c112909470baac12a1655e4d705/1295791200/16173945/pst/part-m-00000.lzo
> not found, cannot run
>
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable
> to open iterator for alias test. Backend error : Unable to recreate
> exception from backed error: java.io.IOException: No codec for file
>
> hdfs://localhost/rawfiles/cf938c112909470baac12a1655e4d705/1295791200/16173945/pst/part-m-00000.lzo
> not found, cannot run
>        at org.apache.pig.PigServer.openIterator(PigServer.java:742)
>        at
> org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:612)
>        at
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
>        at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
>        at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
>        at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:76)
>        at org.apache.pig.Main.run(Main.java:465)
>        at org.apache.pig.Main.main(Main.java:107)
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR
> 2997: Unable to recreate exception from backed error:
> java.io.IOException: No codec for file
>
> hdfs://localhost/rawfiles/cf938c112909470baac12a1655e4d705/1295791200/16173945/pst/part-m-00000.lzo
> not found, cannot run
>        at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getErrorMessages(Launcher.java:221)
>        at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:151)
>        at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:337)
>        at
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:378)
>        at
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1198)
>        at org.apache.pig.PigServer.storeEx(PigServer.java:874)
>        at org.apache.pig.PigServer.store(PigServer.java:816)
>        at org.apache.pig.PigServer.openIterator(PigServer.java:728)
>        ... 7 more
>
> I'd think that with all the jarfiles I've loaded in (and the fact that
> LzoTokenizedStorage worked fine), that there should be a codec kicking
> around, so I'm kinda confused here..
>
> Any help would be appreciated.
>
> Thanks,
> Kris
>
> On Sun, Feb 13, 2011 at 03:00:13PM -0800, Dmitriy Ryaboy wrote:
> > You are still getting an IncompatibleClassChangeError?
> > That usually indicates there's something wrong with the classpath,
> > like perhaps your version of Pig is not the one your UDFs (in this
> > case, EB) were compiled against.
> >
> > Can you check your classpath? You can print out exactly what Pig is
> > running by passing the "-secretDebugCmd" flag to bin/pig.
> >
> > D
> >
> > On Sun, Feb 13, 2011 at 11:00 AM, Kris Coward <k...@melon.org> wrote:
> > >
> > > Upgrading to pig 0.8.0-CDH3B4-SNAPSHOT hasn't solved the problem, even
> > > trying with each of the 0.8.0 jar files (the hadoopgpl package used
> > > provides 3 of them, one tagged as hirohanin's, another as gerritjvv's
> > > and another as yours). I'm going to keep banging away at this, but would
> > > be happy to hear any other ideas that might help.
> > >
> > > Thanks,
> > > Kris
> > >
> > > On Fri, Feb 11, 2011 at 11:15:56AM -0800, Dmitriy Ryaboy wrote:
> > >> The 0.7 branch of EB is not a real thing, it's hirohanin's first
> > >> attempt at porting that has a billion bugs.
> > >> Most things in it do not, in fact, work.
> > >> Use the 0.8 branch and Pig 0.8...
> > >>
> > >> D
> > >>
> > >> On Fri, Feb 11, 2011 at 10:47 AM, Kris Coward <k...@melon.org> wrote:
> > >> >
> > >> > So in the interest of being a little less i/o bound, and saving a whole
> > >> > mess of disk, I've started using
> > >> > com.twitter.elephantbird.pig.store.LzoTokenizedStorage for storage... or
> > >> > more accurately, will be using it as soon as I stop getting the
> > >> > following error (with stack trace):
> > >> >
> > >> > ERROR 2998: Unhandled internal error. Implementing class
> > >> >
> > >> > java.lang.IncompatibleClassChangeError: Implementing class
> > >> >        at java.lang.ClassLoader.defineClass1(Native Method)
> > >> >        at java.lang.ClassLoader.defineClassCond(ClassLoader.java:632)
> > >> >        at java.lang.ClassLoader.defineClass(ClassLoader.java:616)
> > >> >        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
> > >> >        at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
> > >> >        at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
> > >> >        at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
> > >> >        at java.security.AccessController.doPrivileged(Native Method)
> > >> >        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
> > >> >        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
> > >> >        at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
> > >> >        at java.lang.ClassLoader.defineClass1(Native Method)
> > >> >        at java.lang.ClassLoader.defineClassCond(ClassLoader.java:632)
> > >> >        at java.lang.ClassLoader.defineClass(ClassLoader.java:616)
> > >> >        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
> > >> >        at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
> > >> >        at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
> > >> >        at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
> > >> >        at java.security.AccessController.doPrivileged(Native Method)
> > >> >        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
> > >> >        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
> > >> >        at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
> > >> >        at java.lang.Class.forName0(Native Method)
> > >> >        at java.lang.Class.forName(Class.java:247)
> > >> >        at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:422)
> > >> >        at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:452)
> > >> >        at org.apache.pig.impl.logicalLayer.parser.QueryParser.NonEvalFuncSpec(QueryParser.java:5087)
> > >> >        at org.apache.pig.impl.logicalLayer.parser.QueryParser.StoreClause(QueryParser.java:3568)
> > >> >        at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1369)
> > >> >        at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:911)
> > >> >        at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:724)
> > >> >        at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
> > >> >        at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1336)
> > >> >        at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1286)
> > >> >        at org.apache.pig.PigServer.registerQuery(PigServer.java:460)
> > >> >        at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:738)
> > >> >        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324)
> > >> >        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:163)
> > >> >        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:139)
> > >> >        at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89)
> > >> >        at org.apache.pig.Main.main(Main.java:414)
> > >> >
> > >> > Anyhow, the host machine is running CentOS, with the Cloudera
> > >> > distribution of hadoop and pig (pig version: Apache Pig version 0.7.0+16),
> > >> > and elephant-bird from the hadoop-gpl-packing RPM. The tweaks described
> > >> > at http://code.google.com/p/hadoop-gpl-packing/#Using_in_Pig have been
> > >> > applied, and the class still seems to be failing to load.
> > >> >
> > >> > Anyone have any idea what the problem might be (or how to solve it)?
> > >> >
> > >> > Thanks,
> > >> > Kris
> > >> >
> > >> >
> > >> > --
> > >> > Kris Coward                                     http://unripe.melon.org/
> > >> > GPG Fingerprint: 2BF3 957D 310A FEEC 4733  830E 21A4 05C7 1FEB 12B3
> > >> >
> > >
> > > --
> > > Kris Coward                                     http://unripe.melon.org/
> > > GPG Fingerprint: 2BF3 957D 310A FEEC 4733  830E 21A4 05C7 1FEB 12B3
> > >
>
> --
> Kris Coward                                     http://unripe.melon.org/
> GPG Fingerprint: 2BF3 957D 310A FEEC 4733  830E 21A4 05C7 1FEB 12B3
>
