RE: Unable to use hadoop cluster on the cloud
Yes, all versions are the same. I installed Hadoop on the master and copied the folder to the other machines, so the installs should be identical. By the way, I resolved the issue below by making all slaves visible to the client machine; previously only the master was visible to the client. However, I have another problem when running the job: for some reason the slave host names are resolved to different names than the ones I gave in the Hadoop config. Each VM has two NICs. I configured eth1 as hslave1, hslave2, etc.; eth0 is not used by Hadoop at all, but when running the job the slave name is somehow resolved to the machine name instead of hslave1, and I think that is causing communication issues between the slaves. I even set the mapred.tasktracker.dns.interface property to eth1, but that didn't help. I saw that one can also specify the slave.host.name property, but I am trying to avoid that since it would mean changing the Hadoop install on each slave. Any thoughts on resolving this would be appreciated...

Praveen

From: ext icebergs [hkm...@gmail.com]
Sent: Saturday, March 05, 2011 9:24 PM
To: common-user@hadoop.apache.org
Subject: Re: Unable to use hadoop cluster on the cloud

Are you sure that all the versions of Hadoop are the same?

2011/3/4 praveen.pe...@nokia.com

Thanks, Adarsh, for the reply. Just to clarify the issue a bit: I am able to do all operations (-copyFromLocal, -get, -rmr, etc.) from the master node, so I am confident that the communication between all the Hadoop machines is fine. But when I do the same operations from another machine that has the same Hadoop config, I get the errors below. However, I can do -lsr and it lists the files correctly.

Praveen

-----Original Message-----
From: ext Adarsh Sharma [mailto:adarsh.sha...@orkash.com]
Sent: Friday, March 04, 2011 12:12 AM
To: common-user@hadoop.apache.org
Subject: Re: Unable to use hadoop cluster on the cloud

Hi Praveen,

Check through ssh/ping whether your datanodes are communicating with each other or not.
Cheers,
Adarsh

praveen.pe...@nokia.com wrote:

Hello all, I installed hadoop 0.20.2 on physical machines and everything works like a charm. Now I installed Hadoop using the same hadoop-install gz file on the cloud, and the installation seems fine. I can even copy files to HDFS from the master machine, but when I try to do it from another, non-Hadoop machine, I get the following error. I did some googling and a lot of people have hit this error, but I could not find any solution. Also, I didn't see any exceptions in the Hadoop logs. Any thoughts?

$ /usr/local/hadoop-0.20.2/bin/hadoop fs -copyFromLocal Merchandising-ear.tar.gz /tmp/hadoop-test/Merchandising-ear.tar.gz
11/03/03 21:58:50 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.ConnectException: Connection timed out
11/03/03 21:58:50 INFO hdfs.DFSClient: Abandoning block blk_-8243207628973732008_1005
11/03/03 21:58:50 INFO hdfs.DFSClient: Waiting to find target node: xx.xx.12:50010
11/03/03 21:59:17 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.ConnectException: Connection timed out
11/03/03 21:59:17 INFO hdfs.DFSClient: Abandoning block blk_2852127666568026830_1005
11/03/03 21:59:17 INFO hdfs.DFSClient: Waiting to find target node: xx.xx.16.12:50010
11/03/03 21:59:44 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.ConnectException: Connection timed out
11/03/03 21:59:44 INFO hdfs.DFSClient: Abandoning block blk_2284836193463265901_1005
11/03/03 21:59:44 INFO hdfs.DFSClient: Waiting to find target node: xx.xx.16.12:50010
11/03/03 22:00:11 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.ConnectException: Connection timed out
11/03/03 22:00:11 INFO hdfs.DFSClient: Abandoning block blk_-5600915414055250488_1005
11/03/03 22:00:11 INFO hdfs.DFSClient: Waiting to find target node: xx.xx.16.11:50010
11/03/03 22:00:17 WARN hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2845)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)
11/03/03 22:00:17 WARN hdfs.DFSClient: Error Recovery for block blk_-5600915414055250488_1005 bad datanode[0] nodes == null
11/03/03 22:00:17 WARN hdfs.DFSClient: Could not get block locations. Source file /tmp/hadoop-test/Merchandising-ear.tar.gz - Aborting...
copyFromLocal: Connection timed out
11/03/03 22:00:17 ERROR hdfs.DFSClient: Exception closing file /tmp/hadoop-test/Merchandising-ear.tar.gz : java.net.ConnectException: Connection timed out
java.net.ConnectException: Connection timed out
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:404)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2870)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2826)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)
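[Editor's sketch, relating to the eth1/hostname question at the top of this message: 0.20-era Hadoop has dns.interface/dns.nameserver properties for both the datanode and the tasktracker, and both usually need to be set, not just the mapred one. The property names below exist in Hadoop 0.20.x; the values (eth1, "default") are assumptions for this particular cluster, and the fragment is written to a scratch file purely for illustration.]

```shell
# Sketch: make Hadoop derive its advertised hostname from eth1 on both the
# HDFS and MapReduce side. Written to a scratch file here; in practice the
# properties would go into hdfs-site.xml / mapred-site.xml on each slave.
cat > /tmp/hadoop-dns-fragment.xml <<'EOF'
<property>
  <name>dfs.datanode.dns.interface</name>
  <value>eth1</value>
</property>
<property>
  <name>mapred.tasktracker.dns.interface</name>
  <value>eth1</value>
</property>
<!-- reverse lookups for the eth1 address go through this server; it (or
     /etc/hosts) must map the eth1 IPs back to hslave1, hslave2, ... -->
<property>
  <name>mapred.tasktracker.dns.nameserver</name>
  <value>default</value>
</property>
EOF
echo "wrote /tmp/hadoop-dns-fragment.xml"
```

Note that these properties only control which address Hadoop resolves; if reverse DNS (or /etc/hosts) for the eth1 addresses still returns the machine name, the advertised hostname will too, which matches the symptom described above.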
File access pattern on HDFS?
Hi,

Is there a mechanism to get the list of files accessed on HDFS at the NameNode?

Thanks!
--- Gautam
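[Editor's sketch, not an answer given in the thread: one mechanism is the NameNode audit log. In 0.20-era Hadoop, FSNamesystem can log every file-level operation (open, create, delete, listStatus, ...) with user and path once the audit logger is enabled in log4j.properties. The appender name and log path below are illustrative choices; the logger names are the standard 0.20 ones. Written to a scratch file here for illustration.]

```shell
# Enable the HDFS audit log (Hadoop 0.20-era log4j logger names). Each
# NameNode operation is then logged with ugi=, cmd=, and src= fields,
# which gives a per-file access trail at the NameNode.
cat > /tmp/log4j-audit-fragment.properties <<'EOF'
# route audit events to their own appender at INFO
log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=INFO,DRFAAUDIT
log4j.additivity.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=false
log4j.appender.DRFAAUDIT=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DRFAAUDIT.File=/var/log/hadoop/hdfs-audit.log
log4j.appender.DRFAAUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.DRFAAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
EOF
echo "wrote /tmp/log4j-audit-fragment.properties"
```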
Re: LZO Packager on Centos 5.5 fails
Are you doing a 64-bit build, or doesn't it matter?

Thanks
-Pete

On Sat, 05 Mar 2011 21:35:30 -0800, Saurabh Dutta saurabh.du...@impetus.co.in wrote:

Hi, try opening the file toddlipcon-hadoop-lzo.spec in your SPECS directory and replacing the '-' with '_' on line number 3. See if this works for you.

Thanks,
Saurabh

From: phil young [phil.wills.yo...@gmail.com]
Sent: Sunday, March 06, 2011 4:27 AM
To: hadoop-common-user
Subject: LZO Packager on Centos 5.5 fails

I'm trying to install LZO on CDH3B3, using the LZO Packager tool here (https://github.com/toddlipcon/hadoop-lzo-packager). I'm running the following on CentOS 5.5. I'm not an expert on RPM building. Does anyone have any suggestions about how to debug this?

hadoop-lzo-packager]# ./run.sh
+ ANT_VERSION=1.8.1
+ ANT_TARBALL=apache-ant-1.8.1-bin.tar.gz
+ ANT_TARBALL_URL=http://www.gtlib.gatech.edu/pub/apache/ant/binaries/apache-ant-1.8.1-bin.tar.gz
+ '[' -n '' ']'
+ '[' -n '' ']'
+ SRC_PROJECT=github
+ RELEASE=1
++ getent passwd root
++ cut -d: -f5
++ cut -d, -f1
+ PACKAGER=root
++ hostname -f
+ HOST=pyoung-dev.tripadvisor.com
+ PACKAGER_EMAIL=r...@pyoung-dev.tripadvisor.com
+ HADOOP_HOME=/usr/lib/hadoop
+++ dirname ./run.sh
++ readlink -f .
+ BINDIR=/root/pwy/lzo_research/toddlipcon/hadoop-lzo-packager
+ mkdir -p build
+ setup_github
+ GITHUB_ACCOUNT=toddlipcon
+ GITHUB_BRANCH=master
+ PACKAGE_HOMEPAGE=http://github.com/toddlipcon/hadoop-lzo
+ TARURL=http://github.com/toddlipcon/hadoop-lzo/tarball/master
++ ls /root/pwy/lzo_research/toddlipcon/hadoop-lzo-packager/build/toddlipcon-hadoop-lzo-0.4.9-3-g2bd0d5b.tar.gz /root/pwy/lzo_research/toddlipcon/hadoop-lzo-packager/build/toddlipcon-hadoop-lzo-20110305174723.0.4.9-3-g2bd0d5b.tar.gz
+ '[' -z '/root/pwy/lzo_research/toddlipcon/hadoop-lzo-packager/build/toddlipcon-hadoop-lzo-0.4.9-3-g2bd0d5b.tar.gz /root/pwy/lzo_research/toddlipcon/hadoop-lzo-packager/build/toddlipcon-hadoop-lzo-20110305174723.0.4.9-3-g2bd0d5b.tar.gz' ']'
++ ls -1 /root/pwy/lzo_research/toddlipcon/hadoop-lzo-packager/build/toddlipcon-hadoop-lzo-0.4.9-3-g2bd0d5b.tar.gz /root/pwy/lzo_research/toddlipcon/hadoop-lzo-packager/build/toddlipcon-hadoop-lzo-20110305174723.0.4.9-3-g2bd0d5b.tar.gz
++ head -1
+ ORIG_TAR=/root/pwy/lzo_research/toddlipcon/hadoop-lzo-packager/build/toddlipcon-hadoop-lzo-0.4.9-3-g2bd0d5b.tar.gz
++ expr match /root/pwy/lzo_research/toddlipcon/hadoop-lzo-packager/build/toddlipcon-hadoop-lzo-0.4.9-3-g2bd0d5b.tar.gz '.*hadoop-lzo-\(.*\).tar.gz'
+ GIT_HASH=0.4.9-3-g2bd0d5b
+ echo 'Git hash: 0.4.9-3-g2bd0d5b'
Git hash: 0.4.9-3-g2bd0d5b
+ NAME=toddlipcon-hadoop-lzo
++ date +%Y%m%d%H%M%S
+ VERSION=20110305174749.0.4.9-3-g2bd0d5b
+ pushd /root/pwy/lzo_research/toddlipcon/hadoop-lzo-packager/build/
+ mkdir toddlipcon-hadoop-lzo-20110305174749.0.4.9-3-g2bd0d5b/
+ tar -C toddlipcon-hadoop-lzo-20110305174749.0.4.9-3-g2bd0d5b/ --strip-components=1 -xzf /root/pwy/lzo_research/toddlipcon/hadoop-lzo-packager/build/toddlipcon-hadoop-lzo-0.4.9-3-g2bd0d5b.tar.gz
+ tar czf toddlipcon-hadoop-lzo-20110305174749.0.4.9-3-g2bd0d5b.tar.gz toddlipcon-hadoop-lzo-20110305174749.0.4.9-3-g2bd0d5b/
+ popd
+ CHECKOUT_TAR=/root/pwy/lzo_research/toddlipcon/hadoop-lzo-packager/build/toddlipcon-hadoop-lzo-20110305174749.0.4.9-3-g2bd0d5b.tar.gz
+ TOPDIR=/root/pwy/lzo_research/toddlipcon/hadoop-lzo-packager/build/topdir
+ CHECKOUT=/root/pwy/lzo_research/toddlipcon/hadoop-lzo-packager/toddlipcon-hadoop-lzo-20110305174749.0.4.9-3-g2bd0d5b
+ checkout_github
+ echo -n
+ '[' '!' -e /root/pwy/lzo_research/toddlipcon/hadoop-lzo-packager/build/toddlipcon-hadoop-lzo-20110305174749.0.4.9-3-g2bd0d5b.tar.gz ']'
+ '[' -n '' ']'
+ '[' -z '' ']'
+ rm -Rf /root/pwy/lzo_research/toddlipcon/hadoop-lzo-packager/build/topdir
+ mkdir -p /root/pwy/lzo_research/toddlipcon/hadoop-lzo-packager/build/topdir
+ cd /root/pwy/lzo_research/toddlipcon/hadoop-lzo-packager/build/topdir/
+ mkdir SOURCES BUILD SPECS SRPMS RPMS BUILDROOT
+ cat /root/pwy/lzo_research/toddlipcon/hadoop-lzo-packager/template.spec
+ do_substs
+ sed ' s,@PACKAGE_NAME@,toddlipcon-hadoop-lzo,g; s,@PACKAGE_HOMEPAGE@,http://github.com/toddlipcon/hadoop-lzo,g; s,@VERSION@,20110305174749.0.4.9-3-g2bd0d5b,g; s,@RELEASE@,1,g; s,@PACKAGER@,root,g; s,@PACKAGER_EMAIL@,r...@pyoung-dev.tripadvisor.com,g; s,@HADOOP_HOME@,/usr/lib/hadoop,g; '
+ cp /root/pwy/lzo_research/toddlipcon/hadoop-lzo-packager/build/toddlipcon-hadoop-lzo-20110305174749.0.4.9-3-g2bd0d5b.tar.gz /root/pwy/lzo_research/toddlipcon/hadoop-lzo-packager/build/topdir/SOURCES
+ pushd /root/pwy/lzo_research/toddlipcon/hadoop-lzo-packager/build/topdir/SPECS
++ pwd
++ pwd
+ rpmbuild --buildroot /root/pwy/lzo_research/toddlipcon/hadoop-lzo-packager/build/topdir/SPECS/../BUILDROOT --define '_topdir /root/pwy/lzo_research/toddlipcon/hadoop-lzo-packager/build/topdir/SPECS/..' -ba toddlipcon-hadoop-lzo.spec
error: line 3: Illegal char '-' in version: Version: 20110305174749.0.4.9-3-g2bd0d5b
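[Editor's note: the failure is rpmbuild rejecting the hyphens that "git describe" put into the Version: tag; RPM reserves '-' as the separator between version and release, so a Version value may not contain it. Saurabh's suggested edit can be sketched on a throwaway file like this (the file name and contents are a toy reproduction, not the real spec):]

```shell
# Toy reproduction: an RPM Version: tag may not contain '-', so rewrite
# the hyphens on that line to '_', as the reply in this thread suggests.
printf 'Name: toddlipcon-hadoop-lzo\nSummary: demo\nVersion: 20110305174749.0.4.9-3-g2bd0d5b\n' > /tmp/demo.spec
sed -i '/^Version:/s/-/_/g' /tmp/demo.spec
grep '^Version:' /tmp/demo.spec   # -> Version: 20110305174749.0.4.9_3_g2bd0d5b
```

Since run.sh wipes and regenerates build/topdir on every run (the rm -Rf / mkdir -p lines in the trace above), editing the generated spec by hand only survives until the next run.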
Re: Unable to use hadoop cluster on the cloud
praveen.pe...@nokia.com wrote:

Thanks, Adarsh, for the reply. Just to clarify the issue a bit: I am able to do all operations (-copyFromLocal, -get, -rmr, etc.) from the master node, so I am confident that the communication between all the Hadoop machines is fine. But when I do the same operations from another machine that has the same Hadoop config, I get the errors below. However, I can do -lsr and it lists the files correctly.

Praveen,

Your error is due to a communication problem between your datanodes, i.e. datanode1 is not able to place the replica of a block onto the corresponding datanode2. You mention the HDFS commands; simply check from datanode1 with "ssh datanode2_ip" or "ping datanode2_ip".

Best Regards,
Adarsh

Praveen

-----Original Message-----
From: ext Adarsh Sharma [mailto:adarsh.sha...@orkash.com]
Sent: Friday, March 04, 2011 12:12 AM
To: common-user@hadoop.apache.org
Subject: Re: Unable to use hadoop cluster on the cloud

Hi Praveen,

Check through ssh/ping whether your datanodes are communicating with each other or not.

Cheers,
Adarsh

praveen.pe...@nokia.com wrote:

Hello all, I installed hadoop 0.20.2 on physical machines and everything works like a charm. Now I installed Hadoop using the same hadoop-install gz file on the cloud, and the installation seems fine. I can even copy files to HDFS from the master machine, but when I try to do it from another, non-Hadoop machine, I get the following error. I did some googling and a lot of people have hit this error, but I could not find any solution. Also, I didn't see any exceptions in the Hadoop logs. Any thoughts?
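[Editor's sketch: the "Connection timed out" lines in the quoted log all point at port 50010, the datanode data-transfer port, so a quick way to act on the advice above is to probe that port from the client machine and from each datanode. Host names below are placeholders for this cluster; the check uses bash's /dev/tcp redirection, so it needs bash and the coreutils "timeout".]

```shell
# Probe the datanode data-transfer port (50010) from wherever the copy
# fails. A timeout here (rather than "connection refused") usually means
# a firewall or network-visibility problem, matching the symptoms above.
probe() {
  # returns 0 if a TCP connection to $1:$2 succeeds within 3 seconds
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}
for h in hslave1 hslave2; do   # placeholder datanode names
  if probe "$h" 50010; then
    echo "$h:50010 reachable"
  else
    echo "$h:50010 NOT reachable (firewall/visibility problem?)"
  fi
done
```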
Re: LZO Packager on Centos 5.5 fails
Thanks for the responses.

Saurabh, thanks for the response. I got LZO installed without the packager, but I do appreciate the help, as the RPM method seems like a better way to go.

Pete, the clusters are all 64-bit machines. That doesn't matter at this point, because I was able to get this to work without the packager by reading many, many web pages. I had wanted to use the packager because I assumed it would resolve open questions like: given that the code evolved from Google Code through Kevin Weil's to Todd Lipcon's fork, and the Google Code page says to use branch 0.1 because the trunk is for the new Hadoop APIs (which Hive doesn't use yet), what is the right thing to install?

I do have LZO working now with Hive. Tables can be compressed as LZO, with indexes, with the data validated via lzop directly, and I can use intermediate LZO compression. That took more work than I anticipated, so I'm still interested in the packager. To sum up what I've done:

I replaced wget with "wget --no-check-certificate" in run.sh. I did replace the string in the file you mentioned, but the only file with that name is in the build directory:

[root@adhocmaster01an hadoop-lzo-packager]# find ./ -name *toddlipcon-hadoop-lzo.spec*
./build/topdir/SPECS/toddlipcon-hadoop-lzo.spec

Unsurprisingly, executing run.sh again yielded the original error message. It seems I should be able to change the version tag upstream somewhere, but I don't have enough time now to research SPEC files and how deep this rabbit hole might go.

I do appreciate the body of work that this is all built on, and your responses, but I am surprised that there isn't a more comprehensive set of how-tos. I'll assist in writing up something like that when I get a chance. Are other people using CDH3B3 with 64-bit CentOS/RedHat, or is there something I'm missing?

Thanks for the assistance.

On Sun, Mar 6, 2011 at 9:25 PM, Pete Haidinyak javam...@cox.net wrote:

Are you doing a 64bit build or doesn't it matter?
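[Editor's sketch of the "change the version tag upstream" idea: since the generated spec is rebuilt from template.spec on every run, the durable fix is to sanitize the version string before it is substituted for @VERSION@. Exactly where VERSION is computed in run.sh is an assumption here, not verified against that script, so this is shown as a standalone demonstration of the sanitization step only.]

```shell
# Sketch: make a git-describe style version legal for an RPM Version: tag
# by replacing '-' with '_' before it ever reaches the spec template.
RAW='20110305174749.0.4.9-3-g2bd0d5b'
SAFE=$(printf '%s' "$RAW" | tr - _)
echo "$SAFE"   # -> 20110305174749.0.4.9_3_g2bd0d5b
```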
Re: hadoop installation problem(single-node)
Sorry for responding late, and thanks for your help. I tried your command but the result is the same. Can you tell me what I should do?
kmeans
Hi everyone. I am trying to run the k-means example using Mahout 0.4. Can anyone please guide me through this? Any help is really appreciated.

Thanks,
Manish