Help running Hadoop 2.0.5 with Snappy compression
Hi,

I'm trying to run some MapReduce jobs on the Hadoop 2.0.5 framework using Snappy compression. I built Hadoop with -Pnative, installed it and Snappy on all 3 machines (master + 2 slaves), and copied the .so files as required to $HADOOP_HOME/lib/native.

I added the following to $HADOOP_CONF_DIR/mapred-site.xml:

<property>
  <name>mapreduce.map.output.compress</name>
  <value>true</value>
</property>
<property>
  <name>mapred.map.output.compress.codec</name>
  <value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>

And this I added to core-site.xml:

<property>
  <name>io.compression.codecs</name>
  <value>
    org.apache.hadoop.io.compress.GzipCodec,
    org.apache.hadoop.io.compress.DefaultCodec,
    org.apache.hadoop.io.compress.BZip2Codec,
    org.apache.hadoop.io.compress.SnappyCodec
  </value>
</property>

I then ran the following jobs:

Pi:
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.5-alpha.jar pi 8 2000

Teragen + Terasort:
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.5-alpha.jar teragen 10 /in
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.5-alpha.jar terasort /in /out

But when grepping the logs I can't find any sign of Snappy:

[eladi@r-zorro003 hadoop-2.0.5-alpha]$ grep -r Snappy /data1/elad/logs/*
[eladi@r-zorro003 hadoop-2.0.5-alpha]$ grep -r compress /data1/elad/logs/*
/data1/elad/logs/application_1379926544427_0001/container_1379926544427_0001_01_10/syslog:2013-09-23 11:56:24,182 INFO [fetcher#5] org.apache.hadoop.io.compress.zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
/data1/elad/logs/application_1379926544427_0001/container_1379926544427_0001_01_10/syslog:2013-09-23 11:56:24,183 INFO [fetcher#5] org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor [.deflate]
/data1/elad/logs/application_1379926544427_0001/container_1379926544427_0001_01_10/syslog:2013-09-23 11:56:24,331 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor [.deflate]

It seems as if zlib is being loaded instead of Snappy. What am I missing?

Thanks,
Elad Itzhakian
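P.S. Here is a small helper I use to sanity-check that a libsnappy shared object is actually present in a node's native-lib directory (a sketch only; the directory layout is an assumption from my install):

```shell
# Sketch: check whether a libsnappy shared object exists in a given
# Hadoop native-lib directory. The layout is an assumption -- adjust
# the path to your install.
has_snappy_lib() {
  dir="$1"
  # Any libsnappy.so variant (libsnappy.so, libsnappy.so.1, ...) counts.
  if ls "$dir"/libsnappy.so* >/dev/null 2>&1; then
    echo yes
  else
    echo no
  fi
}

# Typical use on each node:
# has_snappy_lib "$HADOOP_HOME/lib/native"
```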
RE: Error while configuring HDFS federation
Ports in use may result from actual processes using them, or from ghost processes. The second error may be caused by inconsistent permissions on different nodes, and/or by DFS needing a format. I suggest the following:

1. sbin/stop-dfs.sh
   sbin/stop-yarn.sh
2. sudo killall java (on all nodes)
3. sudo chmod -R 755 /home/lab/hadoop-2.1.0-beta/tmp/dfs (on all nodes)
4. sudo rm -rf /home/lab/hadoop-2.1.0-beta/tmp/dfs/* (on all nodes)
5. bin/hdfs namenode -format -force
6. sbin/start-dfs.sh
   sbin/start-yarn.sh

Then see if you get that error again.

From: Manickam P [mailto:manicka...@outlook.com]
Sent: Monday, September 23, 2013 4:44 PM
To: user@hadoop.apache.org
Subject: Error while configuring HDFS federation

Guys,

I'm trying to configure HDFS federation with the 2.1.0-beta version. I have 3 machines; of those I want two name nodes and one data node. I have done the other things, like passwordless ssh and host entries, properly. When I start the cluster I get the errors below.

On node one I get this error:

java.net.BindException: Port in use: lab-hadoop.eng.com:50070

On another node I get this error:

org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /home/lab/hadoop-2.1.0-beta/tmp/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.

My core-site.xml has the below:

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://10.101.89.68:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/lab/hadoop-2.1.0-beta/tmp</value>
  </property>
</configuration>

My hdfs-site.xml has the below:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.federation.nameservices</name>
    <value>ns1,ns2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns1</name>
    <value>10.101.89.68:9001</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns1</name>
    <value>10.101.89.68:50070</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address.ns1</name>
    <value>10.101.89.68:50090</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns2</name>
    <value>10.101.89.69:9001</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns2</name>
    <value>10.101.89.69:50070</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address.ns2</name>
    <value>10.101.89.69:50090</value>
  </property>
</configuration>

Please help me to fix this error.

Thanks,
Manickam P
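For convenience, the numbered reset steps in the reply above can be sketched as one shell function. This is a sketch only: the install path is taken from this thread, and note it is destructive (it wipes the DFS state and reformats the NameNode):

```shell
# Sketch of steps 1-6 from the reply as a single function. DESTRUCTIVE:
# deletes DFS data and reformats the NameNode. The install path is an
# assumption taken from this thread.
reset_dfs() {
  hadoop_home="${1:-/home/lab/hadoop-2.1.0-beta}"
  cd "$hadoop_home" || return 1     # bail out if the install dir is missing
  sbin/stop-dfs.sh                  # 1. stop HDFS...
  sbin/stop-yarn.sh                 #    ...and YARN
  sudo killall java                 # 2. kill ghost JVMs (run on all nodes)
  sudo chmod -R 755 tmp/dfs         # 3. fix permissions (on all nodes)
  sudo rm -rf tmp/dfs/*             # 4. clear DFS state (on all nodes)
  bin/hdfs namenode -format -force  # 5. reformat the NameNode
  sbin/start-dfs.sh                 # 6. restart HDFS...
  sbin/start-yarn.sh                #    ...and YARN
}

# Usage on the NameNode host:
# reset_dfs /home/lab/hadoop-2.1.0-beta
```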