Re: Accessing HDFS files from a servlet
I am getting this error when running that JSP as a servlet:

Apr 15, 2012 2:34:07 PM org.apache.catalina.core.AprLifecycleListener init
INFO: Loaded APR based Apache Tomcat Native library 1.1.22.
Apr 15, 2012 2:34:07 PM org.apache.catalina.core.AprLifecycleListener init
INFO: APR capabilities: IPv6 [false], sendfile [true], accept filters [false], random [true].
Apr 15, 2012 2:34:07 PM org.apache.tomcat.util.digester.SetPropertiesRule begin
WARNING: [SetPropertiesRule]{Server/Service/Engine/Host/Context} Setting property 'source' to 'org.eclipse.jst.j2ee.server:UserInterface' did not find a matching property.
Apr 15, 2012 2:34:08 PM org.apache.coyote.AbstractProtocol init
INFO: Initializing ProtocolHandler [http-apr-8080]
Apr 15, 2012 2:34:08 PM org.apache.coyote.AbstractProtocol init
INFO: Initializing ProtocolHandler [ajp-apr-8009]
Apr 15, 2012 2:34:08 PM org.apache.catalina.startup.Catalina load
INFO: Initialization processed in 1985 ms
Apr 15, 2012 2:34:08 PM org.apache.catalina.core.StandardService startInternal
INFO: Starting service Catalina
Apr 15, 2012 2:34:08 PM org.apache.catalina.core.StandardEngine startInternal
INFO: Starting Servlet Engine: Apache Tomcat/7.0.26
Apr 15, 2012 2:34:09 PM org.apache.coyote.AbstractProtocol start
INFO: Starting ProtocolHandler [http-apr-8080]
Apr 15, 2012 2:34:09 PM org.apache.coyote.AbstractProtocol start
INFO: Starting ProtocolHandler [ajp-apr-8009]
Apr 15, 2012 2:34:09 PM org.apache.catalina.startup.Catalina start
INFO: Server startup in 846 ms
Apr 15, 2012 2:34:10 PM org.apache.catalina.core.ApplicationContext log
INFO: Marking servlet DisplayFile as unavailable
Apr 15, 2012 2:34:10 PM org.apache.catalina.core.StandardWrapperValve invoke
SEVERE: Allocate exception for servlet DisplayFile
java.lang.ClassNotFoundException: org.apache.hadoop.fs.FSDataInputStream
    at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1701)
    at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1546)
    at java.lang.Class.getDeclaredConstructors0(Native Method)
    at java.lang.Class.privateGetDeclaredConstructors(Unknown Source)
    at java.lang.Class.getConstructor0(Unknown Source)
    at java.lang.Class.newInstance0(Unknown Source)
    at java.lang.Class.newInstance(Unknown Source)
    at org.apache.catalina.core.DefaultInstanceManager.newInstance(DefaultInstanceManager.java:125)
    at org.apache.catalina.core.StandardWrapper.loadServlet(StandardWrapper.java:1136)
    at org.apache.catalina.core.StandardWrapper.allocate(StandardWrapper.java:857)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:135)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169)
    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:927)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:987)
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:579)
    at org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.run(AprEndpoint.java:1805)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
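[For reference: the ClassNotFoundException above comes from Tomcat's webapp class loader, i.e. the Hadoop classes are not on the webapp's classpath. A minimal sketch of a servlet that streams an HDFS file is shown below; it assumes the Hadoop 1.x API with hadoop-core and its dependencies placed in WEB-INF/lib, and the NameNode URI and file path are placeholders, not values from this thread.]

    import java.io.IOException;
    import java.io.OutputStream;

    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;

    public class DisplayFile extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws ServletException, IOException {
            Configuration conf = new Configuration();
            // Placeholder NameNode URI; adjust to your cluster.
            conf.set("fs.default.name", "hdfs://localhost:54310");
            FileSystem fs = FileSystem.get(conf);

            resp.setContentType("text/plain");
            OutputStream out = resp.getOutputStream();
            FSDataInputStream in = null;
            try {
                // Placeholder HDFS path; stream its contents to the client.
                in = fs.open(new Path("/user/hadoop/sample.txt"));
                IOUtils.copyBytes(in, out, 4096, false);
            } finally {
                IOUtils.closeStream(in);
            }
        }
    }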
Re: Issue with loading the Snappy Codec
Thanks. I have the native snappy libraries installed. However, I use the normal jars that you get when downloading Hadoop; I am not compiling Hadoop myself. I do not want to use the Snappy codec (I don't care about compression at the moment), but it seems it is needed anyway? I added this to mapred-site.xml:

<property>
  <name>mapred.compress.map.output</name>
  <value>false</value>
</property>

But it still fails with the error of my previous email (SnappyCodec not found).

Regards, Bas

On Sat, Apr 14, 2012 at 6:30 PM, Vinod Kumar Vavilapalli vino...@hortonworks.com wrote:

Hadoop has integrated snappy via installed native libraries instead of snappy-java.jar (ref https://issues.apache.org/jira/browse/HADOOP-7206).
- You need to have the snappy system libraries (snappy and snappy-devel) installed before you compile Hadoop. (RPMs are available on the web, http://pkgs.org/centos-5-rhel-5/epel-i386/21/ for example.)
- When you build Hadoop, you will need to compile the native libraries (by passing -Dcompile.native=true to ant) to get snappy support.
- You also need to make sure that the snappy system library is available on the library path for all MapReduce tasks at runtime. Usually if you install them in /usr/lib or /usr/local/lib, it should work.

HTH, +Vinod

On Apr 14, 2012, at 4:36 AM, Bas Hickendorff wrote:

Hello,

When I start a map-reduce job, it starts, and after a short while it fails with the error below (SnappyCodec not found). I am currently starting the job from other Java code (so the Hadoop executable in the bin directory is not used anymore), and in principle this seems to work (the job shows up in the JobTracker admin when it starts). However, after a short while the map task fails with:

java.lang.IllegalArgumentException: Compression codec org.apache.hadoop.io.compress.SnappyCodec not found.
    at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:96)
    at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:134)
    at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:62)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:522)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:416)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.io.compress.SnappyCodec
    at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
    at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:334)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:264)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820)
    at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:89)
    ... 10 more

I confirmed that the SnappyCodec class is present in hadoop-core-1.0.2.jar, and snappy-java-1.0.4.1.jar is present as well. The directory of those jars is on the HADOOP_CLASSPATH, but it seems it still cannot find it. I also checked that the config files of Hadoop are read. I run all nodes on localhost.

Any suggestions on what could be the cause of the issue?

Regards, Bas
Re: Issue with loading the Snappy Codec
Can you restart the tasktrackers once and run the job again? It refreshes the classpath.

On Sun, Apr 15, 2012 at 11:58 AM, Bas Hickendorff hickendorff...@gmail.com wrote:

Thanks. I have the native snappy libraries installed. However, I use the normal jars that you get when downloading Hadoop; I am not compiling Hadoop myself. I do not want to use the Snappy codec (I don't care about compression at the moment), but it seems it is needed anyway? I added this to mapred-site.xml:

<property>
  <name>mapred.compress.map.output</name>
  <value>false</value>
</property>

But it still fails with the error of my previous email (SnappyCodec not found).

Regards, Bas
Re: Issue with loading the Snappy Codec
Hello John,

I did restart them (in fact, I did a full reboot of the machine). The error is still there. I guess my question is: is it expected that Hadoop needs to do something with the SnappyCodec when mapred.compress.map.output is set to false?

Regards, Bas

On Sun, Apr 15, 2012 at 12:04 PM, john smith js1987.sm...@gmail.com wrote:

Can you restart the tasktrackers once and run the job again? It refreshes the classpath.
hadoop namenode -format : at boot time?
Dear all,

I was wondering if it is possible to format the HDFS at boot time. I have some VMs that are pre-set and pre-configured with Hadoop (datanodes [slaves] and a namenode [master]), and I'm looking for a way to obtain a cluster from them out of the box, as they're launched (including the namenode). I currently have the following boot-time init script on my namenode:

    #!/bin/bash
    #
    # Starts a Hadoop Master
    #
    # chkconfig: 2345 90 10
    # description: Hadoop master

    . /etc/rc.status
    . /home/oneadmin/mountpoint/hadoop-0.20.2/conf/hadoop-env.sh

    export HPATH=/home/oneadmin/mountpoint/hadoop-0.20.2
    export HLOCK=/tmp

    RETVAL=0
    PIDFILE=$HLOCK/hadoop-hdfs-master.pid
    desc="Hadoop Master daemon"

    start() {
        echo -n $"Starting $desc (hadoop): "
        /sbin/startproc -u 1001 $HPATH/bin/hadoop namenode -format
        /sbin/startproc -u 1001 $HPATH/bin/start-dfs.sh $1
        /sbin/startproc -u 1001 $HPATH/bin/start-mapred.sh $1
        RETVAL=$?
        echo
        [ $RETVAL -eq 0 ] && touch $HLOCK/hadoop-master
        return $RETVAL
    }

    stop() {
        echo -n $"Stopping $desc (hadoop): "
        /sbin/startproc -u 1001 $HPATH/bin/stop-all.sh
        RETVAL=$?
        sleep 5
        echo
        [ $RETVAL -eq 0 ] && rm -f $HLOCK/hadoop-master $PIDFILE
    }

    checkstatus() {
        jps | grep NameNode
    }

    restart() {
        stop
        start
    }

    format() {
        /sbin/startproc -u 1001 $HPATH/bin/hadoop namenode -format
    }

    case "$1" in
        start)   start ;;
        upgrade) upgrade ;;
        format)  format ;;
        stop)    stop ;;
        status)  checkstatus ;;
        restart) restart ;;
        *)       echo $"Usage: $0 {start|stop|status|restart|try-restart}"
                 exit 1
    esac

    exit $RETVAL

As you can see, I included the format command in the start function too; however, it is not working. All the Hadoop processes except the NameNode start, and HDFS isn't being formatted. Is it possible to obtain the functionality I'm looking for? Any suggestions would be highly appreciated.

Thank you, Lehel Biro.
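[One possible direction, a sketch only and not a confirmed fix from this thread: run the format synchronously, and only when the name directory has not been formatted yet, so start-dfs.sh never races against it. NAME_DIR is a placeholder for whatever dfs.name.dir points to in hdfs-site.xml, and the oneadmin user (UID 1001 above) is assumed to own it.]

    # Guarded, synchronous format inside start(); rest of the script unchanged.
    NAME_DIR=/home/oneadmin/mountpoint/hdfs/name   # placeholder: match dfs.name.dir

    start() {
        echo -n $"Starting $desc (hadoop): "
        if [ ! -d "$NAME_DIR/current" ]; then
            # Run the format in the foreground so start-dfs.sh cannot start
            # before it finishes; piping "Y" answers the re-format confirmation
            # prompt if the directory already exists.
            su - oneadmin -c "echo Y | $HPATH/bin/hadoop namenode -format"
        fi
        /sbin/startproc -u 1001 $HPATH/bin/start-dfs.sh
        /sbin/startproc -u 1001 $HPATH/bin/start-mapred.sh
        RETVAL=$?
        echo
        [ $RETVAL -eq 0 ] && touch $HLOCK/hadoop-master
        return $RETVAL
    }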
Re: Issue with loading the Snappy Codec
That is odd... why would it crash when your M/R job did not rely on snappy?

One possibility: maybe because your input is snappy compressed, Hadoop is detecting that compression and trying to use the snappy codec to decompress it?

Jay Vyas
MMSB UCHC

On Apr 15, 2012, at 5:08 AM, Bas Hickendorff hickendorff...@gmail.com wrote:

Hello John,

I did restart them (in fact, I did a full reboot of the machine). The error is still there. I guess my question is: is it expected that Hadoop needs to do something with the SnappyCodec when mapred.compress.map.output is set to false?

Regards, Bas
Re: getting UnknownHostException
Hi Madhu,

After making the modification in /etc/hosts it's working fine. Thanks a lot :)

Kind Regards,
Sujit Dhamale (+91 9970086652)

On Fri, Apr 13, 2012 at 10:49 AM, madhu phatak phatak@gmail.com wrote:

Please check the contents of /etc/hosts for the hostname and IP address mapping.

On Thu, Apr 12, 2012 at 11:11 PM, Sujit Dhamale sujitdhamal...@gmail.com wrote:

Hi Friends,

I am getting an UnknownHostException while executing the Hadoop word count program. I am getting the details below from the JobTracker web page:

*User:* sujit
*Job Name:* word count
*Job File:* hdfs://localhost:54310/app/hadoop/tmp/mapred/staging/sujit/.staging/job_201204112234_0002/job.xml (http://localhost:50030/jobconf.jsp?jobid=job_201204112234_0002)
*Submit Host:* sujit.(null)
*Submit Host Address:* 127.0.1.1
*Job-ACLs:* All users are allowed
*Job Setup:* None
*Status:* Failed
*Failure Info:* Job initialization failed:
java.net.UnknownHostException: sujit.(null) is not a valid Inet address
    at org.apache.hadoop.net.NetUtils.verifyHostnames(NetUtils.java:569)
    at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:711)
    at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:4207)
    at org.apache.hadoop.mapred.EagerTaskInitializationListener$InitJob.run(EagerTaskInitializationListener.java:79)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
*Started at:* Wed Apr 11 22:36:46 IST 2012
*Failed at:* Wed Apr 11 22:36:47 IST 2012
*Failed in:* 0sec
*Job Cleanup:* None

Can someone help me resolve this issue? I tried http://wiki.apache.org/hadoop/UnknownHost but am still not able to resolve it; please help me out.

Hadoop version: hadoop-1.0.1.tar.gz
Java version: 1.6.0_30
Operating system: Ubuntu 11.10

*Note*: All nodes were up before starting execution of the program.

Kind Regards,
Sujit Dhamale

--
https://github.com/zinnia-phatak-dev/Nectar
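[For reference, the kind of /etc/hosts mapping that satisfies this hostname check looks like the lines below. The hostname and address are illustrative only; on Ubuntu the default "127.0.1.1 <hostname>" loopback entry is typically what gets replaced with a real address on multi-node setups.]

    127.0.0.1     localhost
    192.168.1.10  sujit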
Re: Issue with loading the Snappy Codec
You need three things. 1) Install snappy in a place the system can pick it up automatically, or add it to your java.library.path. Then add the full name of the codec to io.compression.codecs:

    hive> set io.compression.codecs;
    io.compression.codecs=org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.SnappyCodec

Edward

On Sun, Apr 15, 2012 at 8:36 AM, Bas Hickendorff hickendorff...@gmail.com wrote:

Hello Jay,

My input is just a csv file (I created it myself), so I am sure it is not compressed in any way. Also, the same input works when I use the standalone example (using the hadoop executable in the bin folder). When I try to integrate it in a larger Java program it fails :(

Regards, Bas
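[A minimal sketch of the client side of this, assuming Hadoop 1.0.x and a job launched from embedded Java code as in this thread. The codec list shown is just the example list above with SnappyCodec left out for the case where snappy is not wanted or not on the launcher's classpath; class name, job name, and the workaround itself are illustrative assumptions, not a fix confirmed by the thread.]

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class JobLauncher {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Explicitly disable map output compression (same property as the
            // mapred-site.xml snippet discussed earlier in the thread).
            conf.setBoolean("mapred.compress.map.output", false);
            // Only list codecs whose classes are actually available to this
            // launching JVM; SnappyCodec is omitted here.
            conf.set("io.compression.codecs",
                    "org.apache.hadoop.io.compress.GzipCodec,"
                  + "org.apache.hadoop.io.compress.DefaultCodec,"
                  + "org.apache.hadoop.io.compress.BZip2Codec");

            Job job = new Job(conf, "example-job");
            // ... set mapper, reducer, and input/output paths as usual, then:
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }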
Basic setup questions on Ubuntu
I am a newbie to Unix/Hadoop and have basic questions about CDH3 setup. I installed CDH3 on an Ubuntu 11.0 box. I want to set up a pseudo cluster where I can run my Pig jobs in mapreduce mode. How do I achieve that?

1. I could not find the core-site.xml, hdfs-site.xml and mapred-site.xml files with all default parameters set. Where are these located? (I see the files under the example-conf. dir, but I guess they are example files.)

2. I see several config files under /usr/lib/hadoop/conf. But all of them are empty files, with comments that they can be used to override the configuration, yet they are read-only files. What is the intention of these files being read-only?

Many Thanks, Prashant
upload hang at DFSClient$DFSOutputStream.close(3488)
Hi,

I use Hadoop Cloudera 0.20.2-cdh3u0. I have a program which uploads local files to HDFS every hour. Basically, I open a gzip input stream with in = new GZIPInputStream(fin); and write to an HDFS file. After less than two days, it hangs. It hangs at FSDataOutputStream.close(86). Here is the stack:

State: WAITING
Running 16660 ms (user 13770 ms) blocked 11276 times for ms waiting 11209 times for ms
LockName: java.util.LinkedList@f1ca0de
LockOwnerId: -1
java.lang.Object.wait(-2)
java.lang.Object.wait(485)
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.waitForAckedSeqno(3468)
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.flushInternal(3457)
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(3549)
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(3488)
org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(61)
org.apache.hadoop.fs.FSDataOutputStream.close(86)
org.apache.hadoop.io.IOUtils.copyBytes(59)
org.apache.hadoop.io.IOUtils.copyBytes(74)

Any suggestion to avoid this issue? It seems this is a bug in Hadoop. I found the issue is less severe when my upload server does one upload at a time, instead of using multiple concurrent uploads.

Thanks, Mingxi
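[For context, a minimal sketch of the kind of upload loop described above; the local and HDFS paths and the buffer size are placeholders, and this is an illustration of the pattern, not the poster's actual code.]

    import java.io.FileInputStream;
    import java.io.InputStream;
    import java.util.zip.GZIPInputStream;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;

    public class HourlyUpload {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            InputStream in = null;
            FSDataOutputStream out = null;
            try {
                // Decompress the local gzip file while streaming it into HDFS.
                in = new GZIPInputStream(new FileInputStream("/data/local/input.gz"));
                out = fs.create(new Path("/uploads/input.txt"));
                // close=false so the streams are closed in finally even if the
                // copy throws; the close() on the HDFS stream is where the hang
                // described above shows up.
                IOUtils.copyBytes(in, out, 4096, false);
            } finally {
                IOUtils.closeStream(in);
                IOUtils.closeStream(out);
            }
        }
    }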
Re: Basic setup questions on Ubuntu
Prashant,

Post your questions to cdh-u...@cloudera.org. Follow the CDH3 installation guide. After installing the package and individual components you need to configure all the configuration files like core-site.xml, hdfs-site.xml, etc.

Thanks,
Manish

Sent from my BlackBerry, pls excuse typo

-----Original Message-----
From: shan s mysub...@gmail.com
Date: Mon, 16 Apr 2012 02:49:51
To: common-user@hadoop.apache.org
Reply-To: common-user@hadoop.apache.org
Subject: Basic setup questions on Ubuntu

I am a newbie to Unix/Hadoop and have basic questions about CDH3 setup. I installed CDH3 on an Ubuntu 11.0 box. I want to set up a pseudo cluster where I can run my Pig jobs in mapreduce mode. How do I achieve that?

1. I could not find the core-site.xml, hdfs-site.xml and mapred-site.xml files with all default parameters set. Where are these located? (I see the files under the example-conf. dir, but I guess they are example files.)

2. I see several config files under /usr/lib/hadoop/conf. But all of them are empty files, with comments that they can be used to override the configuration, yet they are read-only files. What is the intention of these files being read-only?

Many Thanks, Prashant
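[As an illustration of what that configuration might contain for a single-machine (pseudo-distributed) setup: the properties below are the common Apache Hadoop pseudo-distributed values, not CDH3-specific defaults, so treat the ports and the replication factor as placeholders to adapt.]

    core-site.xml:
      <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
      </property>

    hdfs-site.xml:
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>

    mapred-site.xml:
      <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>
      </property>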
RE: upload hang at DFSClient$DFSOutputStream.close(3488)
Hi Mingxi,

In your thread dump, did you check the DataStreamer thread? Is it running? If the DataStreamer thread is not running, then this issue would be mostly the same as HDFS-2850. Did you find any OOME in your clients?

Regards,
Uma

From: Mingxi Wu [mingxi...@turn.com]
Sent: Monday, April 16, 2012 7:25 AM
To: common-user@hadoop.apache.org
Subject: upload hang at DFSClient$DFSOutputStream.close(3488)

Hi,

I use Hadoop Cloudera 0.20.2-cdh3u0. I have a program which uploads local files to HDFS every hour. Basically, I open a gzip input stream with in = new GZIPInputStream(fin); and write to an HDFS file. After less than two days, it hangs. It hangs at FSDataOutputStream.close(86). Here is the stack:

State: WAITING
Running 16660 ms (user 13770 ms) blocked 11276 times for ms waiting 11209 times for ms
LockName: java.util.LinkedList@f1ca0de
LockOwnerId: -1
java.lang.Object.wait(-2)
java.lang.Object.wait(485)
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.waitForAckedSeqno(3468)
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.flushInternal(3457)
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(3549)
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(3488)
org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(61)
org.apache.hadoop.fs.FSDataOutputStream.close(86)
org.apache.hadoop.io.IOUtils.copyBytes(59)
org.apache.hadoop.io.IOUtils.copyBytes(74)

Any suggestion to avoid this issue? It seems this is a bug in Hadoop. I found the issue is less severe when my upload server does one upload at a time, instead of using multiple concurrent uploads.

Thanks, Mingxi