io.compression.codecs was the clue in this case: I had set mapred.compress.map.output, but not that one. Now that I have set it as well, the error is gone.
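For the archives, this is the shape of what I added to mapred-site.xml (the codec list is the one Edward posted; adjust it to the codecs actually installed on your cluster, and if you do not have Snappy at all, leaving SnappyCodec off the list avoids the failing class lookup entirely):

<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.SnappyCodec</value>
</property>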
Thanks!

Regards,

Bas

On Sun, Apr 15, 2012 at 8:19 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
> You need three things. 1. Install snappy in a place the system can pick
> it up automatically, or add it to your java.library.path.
>
> Then add the full name of the codec to io.compression.codecs.
>
> hive> set io.compression.codecs;
> io.compression.codecs=org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.SnappyCodec
>
> Edward
>
> On Sun, Apr 15, 2012 at 8:36 AM, Bas Hickendorff
> <hickendorff...@gmail.com> wrote:
>> Hello Jay,
>>
>> My input is just a csv file (I created it myself), so I am sure it is
>> not compressed in any way. Also, the same input works when I use the
>> standalone example (using the hadoop executable in the bin folder).
>> When I try to integrate it into a larger Java program it fails... :(
>>
>> Regards,
>>
>> Bas
>>
>> On Sun, Apr 15, 2012 at 2:30 PM, JAX <jayunit...@gmail.com> wrote:
>>> That is odd---- why would it crash when your m/r job did not rely on
>>> snappy?
>>>
>>> One possibility: maybe because your input is snappy compressed, Hadoop
>>> is detecting that compression and trying to use the snappy codec to
>>> decompress it?
>>>
>>> Jay Vyas
>>> MMSB
>>> UCHC
>>>
>>> On Apr 15, 2012, at 5:08 AM, Bas Hickendorff <hickendorff...@gmail.com>
>>> wrote:
>>>
>>>> Hello John,
>>>>
>>>> I did restart them (in fact, I did a full reboot of the machine). The
>>>> error is still there.
>>>>
>>>> I guess my question is: is it expected that Hadoop needs to do
>>>> something with the SnappyCodec when mapred.compress.map.output is set
>>>> to false?
>>>>
>>>> Regards,
>>>>
>>>> Bas
>>>>
>>>> On Sun, Apr 15, 2012 at 12:04 PM, john smith <js1987.sm...@gmail.com>
>>>> wrote:
>>>>> Can you restart the tasktrackers once and run the job again? That
>>>>> refreshes the classpath.
>>>>>
>>>>> On Sun, Apr 15, 2012 at 11:58 AM, Bas Hickendorff
>>>>> <hickendorff...@gmail.com> wrote:
>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> I have installed the native snappy libraries. However, I use the
>>>>>> normal jars that you get when downloading Hadoop; I am not compiling
>>>>>> Hadoop myself.
>>>>>>
>>>>>> I do not want to use the snappy codec (I don't care about compression
>>>>>> at the moment), but it seems it is needed anyway? I added this to
>>>>>> mapred-site.xml:
>>>>>>
>>>>>> <property>
>>>>>>   <name>mapred.compress.map.output</name>
>>>>>>   <value>false</value>
>>>>>> </property>
>>>>>>
>>>>>> But it still fails with the error from my previous email (SnappyCodec
>>>>>> not found).
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Bas
>>>>>>
>>>>>> On Sat, Apr 14, 2012 at 6:30 PM, Vinod Kumar Vavilapalli
>>>>>> <vino...@hortonworks.com> wrote:
>>>>>>>
>>>>>>> Hadoop has integrated snappy via installed native libraries instead
>>>>>>> of snappy-java.jar (ref
>>>>>>> https://issues.apache.org/jira/browse/HADOOP-7206):
>>>>>>> - You need to have the snappy system libraries (snappy and
>>>>>>> snappy-devel) installed before you compile hadoop. (RPMs are
>>>>>>> available on the web, http://pkgs.org/centos-5-rhel-5/epel-i386/21/
>>>>>>> for example.)
>>>>>>> - When you build hadoop, you will need to compile the native
>>>>>>> libraries (by passing -Dcompile.native=true to ant) to enable snappy
>>>>>>> support.
>>>>>>> - You also need to make sure that the snappy system library is
>>>>>>> available on the library path for all mapreduce tasks at runtime.
>>>>>>> Usually, if you install them in /usr/lib or /usr/local/lib, it
>>>>>>> should work.
>>>>>>>
>>>>>>> HTH,
>>>>>>> +Vinod
>>>>>>>
>>>>>>> On Apr 14, 2012, at 4:36 AM, Bas Hickendorff wrote:
>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> When I start a map-reduce job, it starts, and after a short while
>>>>>>>> it fails with the error below (SnappyCodec not found).
>>>>>>>>
>>>>>>>> I am currently starting the job from other Java code (so the Hadoop
>>>>>>>> executable in the bin directory is not used anymore), but in
>>>>>>>> principle this seems to work (in the admin interface of the
>>>>>>>> Jobtracker the job shows up when it starts). However, after a short
>>>>>>>> while the map task fails with:
>>>>>>>>
>>>>>>>> java.lang.IllegalArgumentException: Compression codec
>>>>>>>> org.apache.hadoop.io.compress.SnappyCodec not found.
>>>>>>>>     at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:96)
>>>>>>>>     at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:134)
>>>>>>>>     at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:62)
>>>>>>>>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:522)
>>>>>>>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
>>>>>>>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>>>>>>>     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>>>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>>     at javax.security.auth.Subject.doAs(Subject.java:416)
>>>>>>>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
>>>>>>>>     at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>>>> org.apache.hadoop.io.compress.SnappyCodec
>>>>>>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
>>>>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
>>>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
>>>>>>>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
>>>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
>>>>>>>>     at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:334)
>>>>>>>>     at java.lang.Class.forName0(Native Method)
>>>>>>>>     at java.lang.Class.forName(Class.java:264)
>>>>>>>>     at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820)
>>>>>>>>     at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:89)
>>>>>>>>     ... 10 more
>>>>>>>>
>>>>>>>> I confirmed that the SnappyCodec class is present in
>>>>>>>> hadoop-core-1.0.2.jar, and snappy-java-1.0.4.1.jar is present as
>>>>>>>> well. The directory containing those jars is on the
>>>>>>>> HADOOP_CLASSPATH, but it seems the class still cannot be found. I
>>>>>>>> also checked that the config files of Hadoop are read. I run all
>>>>>>>> nodes on localhost.
>>>>>>>>
>>>>>>>> Any suggestions on what could be the cause of the issue?
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Bas
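P.S. For anyone finding this thread in the archives: since I submit the job from other Java code rather than through the bin/hadoop script, the same settings can also be put on the job's Configuration before submission. A minimal sketch, not my exact code; the class name, job name, and codec list are placeholders to adapt:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class SubmitCsvJob {                       // hypothetical driver class
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        // Every class named here is loaded by CompressionCodecFactory when
        // a task opens its input, so listing a codec that is missing from
        // the task classpath fails with the ClassNotFoundException above.
        conf.set("io.compression.codecs",
                  "org.apache.hadoop.io.compress.GzipCodec,"
                + "org.apache.hadoop.io.compress.DefaultCodec,"
                + "org.apache.hadoop.io.compress.BZip2Codec");
        conf.setBoolean("mapred.compress.map.output", false);
        Job job = new Job(conf, "csv-job");       // placeholder job name
        // ... set mapper, reducer, input/output paths, then job.submit()
    }
}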