Re: Any daemon?
You can look at the BlockPoolSliceScanner#scan method; this is in the trunk code. You can find the same logic in DataBlockScanner#run in earlier versions.

Regards, Uma

- Original Message - From: kartheek muthyala kartheek0...@gmail.com Date: Monday, November 7, 2011 7:31 pm Subject: Any daemon? To: common-user@hadoop.apache.org

Hi all, I am interested in knowing if there is any background daemon in Hadoop which runs periodically, checking that all the data copies (blocks as listed in the block map) exist and are not corrupted. Can you please point me to that piece of code in Hadoop? Thanks, Kartheek.
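For orientation, here is a stripped-down sketch of what such a scanner daemon does: periodically walk the block files and re-verify their checksums. This is illustrative only; the class and method names below are invented, not Hadoop's actual DataBlockScanner implementation.

import java.io.File;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a block-verification daemon, in the spirit of
// DataBlockScanner#run / BlockPoolSliceScanner#scan.
public class BlockVerifierDaemon implements Runnable {
    private final File blockDir;          // directory holding blk_* files
    private final long scanPeriodMillis;  // real HDFS spreads scans over weeks

    public BlockVerifierDaemon(File blockDir, long scanPeriodMillis) {
        this.blockDir = blockDir;
        this.scanPeriodMillis = scanPeriodMillis;
    }

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            File[] blocks = blockDir.listFiles((dir, name) -> name.startsWith("blk_"));
            if (blocks != null) {
                for (File blk : blocks) {
                    verifyChecksum(blk);  // re-read the block, compare against its checksum metadata
                }
            }
            try {
                TimeUnit.MILLISECONDS.sleep(scanPeriodMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    private void verifyChecksum(File blk) {
        // the real scanner reports corrupt replicas to the NameNode,
        // which then re-replicates them from a good copy; omitted here
    }
}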
Re: under cygwin JUST tasktracker run by cyg_server user, Permission denied .....
In 205 the code is different from your trace. Which version are you using? I just verified the code in older versions (http://mail-archives.apache.org/mod_mbox/hadoop-common-commits/201109.mbox/%3c20110902221116.d0b192388...@eris.apache.org%3E); below is the code snippet:

+boolean rv = true;
+
+// read perms
+rv = f.setReadable(group.implies(FsAction.READ), false);
+checkReturnValue(rv, f, permission);

If rv is false, it throws the error below. Can you please create a simple program with the path below and call setReadable as the user the TaskTracker starts with? Then we can find out what error it gives (a sketch of such a test follows this message). Look at the javadoc: http://download.oracle.com/javase/6/docs/api/java/io/File.html#setReadable(boolean,%20boolean)

public boolean setReadable(boolean readable, boolean ownerOnly) sets the owner's or everybody's read permission for this abstract pathname. Parameters: readable: if true, sets the access permission to allow read operations; if false, to disallow them. ownerOnly: if true, the read permission applies only to the owner; otherwise it applies to everybody. If the underlying file system cannot distinguish the owner's read permission from that of others, the permission applies to everybody, regardless of this value. Returns: true if and only if the operation succeeded. The operation will fail if the user does not have permission to change the access permissions of this abstract pathname, or if readable is false and the underlying file system does not implement a read permission.

I am not sure how to provide the authentication in Cygwin. Please make sure you have rights to change the permissions as that user. If I get more info, I will update you. (I sent this to mapreduce-user and CCed common.)

Regards, Uma

- Original Message - From: Masoud mas...@agape.hanyang.ac.kr Date: Friday, November 4, 2011 7:01 am Subject: Re: under cygwin JUST tasktracker run by cyg_server user, Permission denied . To: common-user@hadoop.apache.org

Dear Uma, as you know, when we use the start-all.sh command all the output is saved in log files. When I check the TaskTracker log file, I see the error message below and it shuts down. I'm really confused; I have been working on this issue for more than 4 days and tried different approaches with no result. ^^ BS. Masoud

On 11/03/2011 08:34 PM, Uma Maheswara Rao G 72686 wrote: It won't display anything on the console; only if you get an error while executing the command will it show on the console. In your case it might have executed successfully. Are you still facing the same problem with TT startup? Regards, Uma

[intermediate messages in this thread are quoted in their own posts below; trimmed here]

- Original Message - From: Masoud mas...@agape.hanyang.ac.kr Date: Tuesday, November 1, 2011 1:19 pm Subject: Re: under cygwin JUST tasktracker run by cyg_server user, Permission denied . To: common-user@hadoop.apache.org

Sure ^^ When I run {namenode -format} it makes the dfs in c:/tmp/administrator_hadoop/. After that, by running start-all.sh everything is OK: all daemons run except the TaskTracker. My current user is administrator, but the TaskTracker runs as the cyg_server user that Cygwin created at installation time. This is a part of the log file:

2011-11-01 14:26:54,463 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as cyg_server
2011-11-01 14:26:54,463 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /tmp/hadoop-cyg_server/mapred/local
2011-11-01 14:26:54,479 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.io.IOException: Failed to set permissions of path: \tmp\hadoop-cyg_server\mapred\local\ttprivate to 0700
at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:680)
at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:653)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:483)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java
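A minimal test along the lines Uma suggests: run it as the user that starts the TaskTracker and check whether setReadable succeeds on the problem path. The path is taken from Masoud's log; this is a diagnostic sketch, not part of Hadoop.

import java.io.File;

public class SetReadableTest {
    public static void main(String[] args) {
        // path from the TaskTracker log above
        File f = new File("/tmp/hadoop-cyg_server/mapred/local/ttprivate");
        f.mkdirs();
        // same call FileUtil makes: read permission for everybody
        boolean rv = f.setReadable(true, false);
        System.out.println("setReadable returned: " + rv);
    }
}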
Re: Packets-Block
- Original Message - From: kartheek muthyala kartheek0...@gmail.com Date: Thursday, November 3, 2011 11:23 am Subject: Packets-Block To: common-user@hadoop.apache.org

> Hi all, I need some info related to the code section which handles the following operations. Basically, DataXceiver on the client side transmits the block in packets, and...

Actually, DataXceiver runs only in the DN. Whenever you create a file, a DataStreamer thread starts in DFSClient. Whenever the application writes bytes, they are enqueued into the dataQueue. The streamer thread picks packets from the dataQueue and writes them onto the pipeline sockets. It also writes the opcodes to tell the DN what kind of operation it is.

> On the data node side we have DataXceiver.java and BlockReceiver.java, which take care of writing these packets in order to a block file until the last packet for the block is received. I want some info around this area.

DataXceiverServer runs and listens for requests. For every request it receives, it creates a DataXceiver thread and passes the info to it. Based on the opcode, that thread creates BlockReceiver or BlockSender objects and gives them control.

> In BlockReceiver.java I have seen a PacketResponder class and a BlockReceiver class, and in two places you finalize the block (what I understood by finalizing is that when the last packet for the block is received, you close the block file). The PacketResponder class calls finalizeBlock() in two places, once in lastDataNodeRun() and once in run(); BlockReceiver calls finalizeBlock() in receiveBlock(). I understood from the comments that the finalizeBlock() call from run() is for the datanode the client interacts with directly, and the call from receiveBlock() is for all the datanodes the block is sent to for replication.

As part of replication, when a DN receives a block, the block length is known up front. So the receivePacket() invocation in the while loop can read the complete block; after reading, it needs to finalize the block to add it into the volumes map.

> But I didn't understand why there is a finalizeBlock() call from lastDataNodeRun().

That call is for current writes from a client/DN: the DN will not know the actual size until the client says a packet is the last one in the current block. finalizeBlock will be called when the packet is the last packet for that block; it adds the replica into the volumes map. Also, if the packet is the last one, the DN needs to close all the block files that were opened for writes.

> Can someone explain this to me? I may be wrong in much of my understanding of the workflow; correct me if I am wrong. Thanks, Kartheek

Regards, Uma
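Below is a stripped-down sketch of the producer/consumer pattern Uma describes. This is not Hadoop source: the Packet class and queue layout are simplified stand-ins. Application threads enqueue packets; a streamer thread drains them onto the pipeline socket.

import java.io.OutputStream;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class StreamerSketch {
    static class Packet {
        final byte[] data;
        final boolean lastPacketInBlock;
        Packet(byte[] data, boolean last) {
            this.data = data;
            this.lastPacketInBlock = last;
        }
    }

    private final BlockingQueue<Packet> dataQueue = new LinkedBlockingQueue<>();

    // called by the writing application (the DFSClient.write path)
    public void enqueue(Packet p) throws InterruptedException {
        dataQueue.put(p);
    }

    // the streamer thread body: drain packets onto the pipeline stream
    public void streamTo(OutputStream pipeline) throws Exception {
        Packet p;
        do {
            p = dataQueue.take();
            pipeline.write(p.data);   // the DN's BlockReceiver reads this
        } while (!p.lastPacketInBlock);
        pipeline.flush();             // last packet: the DN finalizes the block
    }
}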
Re: Packets-Block
Hello Kartheek, see inline.

- Original Message - From: kartheek muthyala kartheek0...@gmail.com Date: Thursday, November 3, 2011 4:02 pm Subject: Re: Packets-Block To: common-user@hadoop.apache.org

> Thanks Uma for the prompt reply. I have one more doubt. As I can see, the Block class contains only metadata information like timestamp and length, but the actual data is in the streams. What I cannot understand is where the data gets written from the streams to the block file (which function takes care of this?).

Yes, a Block contains all the metadata: block ID, generation timestamp, number of bytes, and so on. Block is Writable, so we can transfer it through the network (for example, when a DN sends block reports). The actual data is on disk in a file named blk_<block id>, so using the block ID we can identify the block file directly. When the block is created on the DN side, the volumes map maintains ReplicaBeingWritten objects with this block ID information. You can see the code in the BlockReceiver constructor: once it gets the replicaInfo, it calls createStreams on that replicaInfo, and that creates the FileOutputStreams.

Regards, Uma

> ~Kartheek.

On Thu, Nov 3, 2011 at 12:55 PM, Uma Maheswara Rao G 72686 mahesw...@huawei.com wrote: [previous message in this thread, quoted in full in the post above; trimmed here]
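To make the blk_<block id> naming concrete, here is a tiny illustrative helper. This is not Hadoop code, and real HDFS also keeps a blk_<id>_<genstamp>.meta checksum file beside each block file.

import java.io.File;

public class BlockFileName {
    // map a block ID to its on-disk data file, as described above
    public static File blockFile(File volumeDir, long blockId) {
        return new File(volumeDir, "blk_" + blockId);
    }

    public static void main(String[] args) {
        // prints /data/dfs/current/blk_123456789 (example values)
        System.out.println(blockFile(new File("/data/dfs/current"), 123456789L));
    }
}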
Re: under cygwin JUST tasktracker run by cyg_server user, Permission denied .....
It won't display anything on the console; only if you get an error while executing the command will it show on the console. In your case it might have executed successfully. Are you still facing the same problem with TT startup?

Regards, Uma

- Original Message - From: Masoud mas...@agape.hanyang.ac.kr Date: Thursday, November 3, 2011 7:02 am Subject: Re: under cygwin JUST tasktracker run by cyg_server user, Permission denied . To: common-user@hadoop.apache.org

Hi, thanks for the info. I checked that report; it seems the same as mine, but no specific solution is mentioned. Yes, I changed this folder's permissions via Cygwin, with NO RESULT. I'm really confused... any idea, please? Thanks, B.S

On 11/01/2011 05:38 PM, Uma Maheswara Rao G 72686 wrote: [earlier messages in this thread are quoted in full in the posts below; trimmed here]
Re: under cygwin JUST tasktracker run by cyg_server user, Permission denied .....
Can you please give some trace?

- Original Message - From: Masoud mas...@agape.hanyang.ac.kr Date: Tuesday, November 1, 2011 11:08 am Subject: under cygwin JUST tasktracker run by cyg_server user, Permission denied . To: common-user@hadoop.apache.org

Hi, I have a problem running Hadoop under Cygwin 1.7: only the TaskTracker runs as the cyg_server user, and that causes some problems. Any ideas, please? BS. Masoud.
Re: under cygwin JUST tasktracker run by cyg_server user, Permission denied .....
Looks like a permissions-related issue on the local dirs. There is an issue filed in mapred related to this problem: https://issues.apache.org/jira/browse/MAPREDUCE-2921. Can you please provide the permissions explicitly and try?

Regards, Uma

- Original Message - From: Masoud mas...@agape.hanyang.ac.kr Date: Tuesday, November 1, 2011 1:19 pm Subject: Re: under cygwin JUST tasktracker run by cyg_server user, Permission denied . To: common-user@hadoop.apache.org

Sure ^^ When I run {namenode -format} it makes the dfs in c:/tmp/administrator_hadoop/. After that, by running start-all.sh everything is OK: all daemons run except the TaskTracker. My current user is administrator, but the TaskTracker runs as the cyg_server user that Cygwin created at installation time. This is a part of the log file:

2011-11-01 14:26:54,463 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as cyg_server
2011-11-01 14:26:54,463 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /tmp/hadoop-cyg_server/mapred/local
2011-11-01 14:26:54,479 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.io.IOException: Failed to set permissions of path: \tmp\hadoop-cyg_server\mapred\local\ttprivate to 0700
at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:680)
at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:653)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:483)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:318)
at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:183)
at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:741)
at org.apache.hadoop.mapred.TaskTracker.init(TaskTracker.java:1463)
at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3611)
2011-11-01 14:26:54,479 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:

Thanks, BR.

On 11/01/2011 04:33 PM, Uma Maheswara Rao G 72686 wrote: Can you please give some trace?

- Original Message - From: Masoud mas...@agape.hanyang.ac.kr Date: Tuesday, November 1, 2011 11:08 am Subject: under cygwin JUST tasktracker run by cyg_server user, Permission denied . To: common-user@hadoop.apache.org

Hi, I have a problem running Hadoop under Cygwin 1.7: only the TaskTracker runs as the cyg_server user, and that causes some problems. Any ideas, please? BS. Masoud.
Re: Server log files, order of importance ?
If you want to trace one particular block associated with a file, you can first check the file name and find the NameSystem.allocateBlock: entry in your NN logs; there you can find the allocated block ID. After this, just grep for this block ID in your (large) logs and take the timestamps for each operation from the grep output. That way you can easily trace what happened to that block.

Regards, Uma

- Original Message - From: Jay Vyas jayunit...@gmail.com Date: Tuesday, November 1, 2011 3:37 am Subject: Server log files, order of importance ? To: common-user@hadoop.apache.org

Hi guys: I wanted to go through each of the server logs on my Hadoop (single pseudo-node) VM. In particular, I want to know where to look when things go wrong (i.e. so I can more effectively debug Hadoop namenode issues in the future). Can someone suggest the most important ones to start looking at? -- Jay Vyas MMSB/UCHC
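In that spirit, here is a small helper that does the second step. The log path and block ID are placeholder arguments, and any grep tool does the same job; this is just a self-contained sketch.

import java.io.BufferedReader;
import java.io.FileReader;

public class GrepBlock {
    public static void main(String[] args) throws Exception {
        String logFile = args[0];   // e.g. the NameNode or DataNode log
        String blockId = args[1];   // e.g. "blk_123456789" from allocateBlock
        try (BufferedReader r = new BufferedReader(new FileReader(logFile))) {
            String line;
            while ((line = r.readLine()) != null) {
                if (line.contains(blockId)) {
                    // timestamped history of what happened to the block
                    System.out.println(line);
                }
            }
        }
    }
}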
Re: can't format namenode....
- Original Message - From: Jay Vyas jayunit...@gmail.com Date: Saturday, October 29, 2011 8:27 pm Subject: can't format namenode To: common-user@hadoop.apache.org

> Hi guys: In order to fix some issues I'm having (recently posted), I've decided to try to make sure my namenode is formatted. But the formatting fails (see 1 below). To trace the failure, I figured I would grep through all log files for exceptions. I've curated the results here... does this look familiar to anyone? Clearly, something is very wrong with my CDH Hadoop installation.
>
> 1) To attempt to solve this, I figured I would format my namenode. Oddly, when I run hadoop namenode -format I get the following stack trace:

11/10/29 14:39:37 INFO namenode.NameNode: STARTUP_MSG:
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = localhost.localdomain/127.0.0.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 0.20.2-cdh3u1
STARTUP_MSG: build = file:///tmp/topdir/BUILD/hadoop-0.20.2-cdh3u1 -r bdafb1dbffd0d5f2fbc6ee022e1c8df6500fd638; compiled by 'root' on Mon Jul 18 09:40:22 PDT 2011
Re-format filesystem in /var/lib/hadoop-0.20/cache/hadoop/dfs/name ? (Y or N) Y
11/10/29 14:39:40 INFO util.GSet: VM type = 64-bit
11/10/29 14:39:40 INFO util.GSet: 2% max memory = 19.33375 MB
11/10/29 14:39:40 INFO util.GSet: capacity = 2^21 = 2097152 entries
11/10/29 14:39:40 INFO util.GSet: recommended=2097152, actual=2097152
11/10/29 14:39:40 INFO namenode.FSNamesystem: fsOwner=cloudera
11/10/29 14:39:40 INFO namenode.FSNamesystem: supergroup=supergroup
11/10/29 14:39:40 INFO namenode.FSNamesystem: isPermissionEnabled=false
11/10/29 14:39:40 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=1000
11/10/29 14:39:40 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
11/10/29 14:39:41 ERROR namenode.NameNode: java.io.IOException: Cannot remove current directory: /var/lib/hadoop-0.20/cache/hadoop/dfs/name/current
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.clearDirectory(Storage.java:303)
at org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:1244)
at org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:1263)
at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1100)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1217)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1233)
11/10/29 14:39:41 INFO namenode.NameNode: SHUTDOWN_MSG: Shutting down NameNode at localhost.localdomain/127.0.0.1

Are you able to remove this directory explicitly: /var/lib/hadoop-0.20/cache/hadoop/dfs/name/current?

> 2) Here are the exceptions (abridged; I removed repetitive parts regarding "replicated to 0 nodes instead of 1"):

This is because the file could not be replicated to the minimum replication (1).
2011-10-28 22:36:52,669 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 8020, call addBlock(/var/lib/hadoop-0.20/cache/mapred/mapred/system/jobtracker.info, DFSClient_-134960056, null) from 127.0.0.1:35163: error: java.io.IOException: File /var/lib/hadoop-0.20/cache/mapred/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
java.io.IOException: File /var/lib/hadoop-0.20/cache/mapred/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
STARTUP_MSG: host = java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net
java.net.UnknownHostException: gcrc15.uchc.net: gcrc15.uchc.net (this line repeats several times)
2011-10-28 22:30:03,413 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /var/lib/hadoop-0.20/cache/mapred/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
... REPEATED SEVERAL TIMES ...
2011-10-28 22:36:52,716 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /var/lib/hadoop-0.20/cache/mapred/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of
Re: Need help understanding Hadoop Architecture
Hi, first of all, welcome to Hadoop.

- Original Message - From: panamamike panamam...@hotmail.com Date: Sunday, October 23, 2011 8:29 pm Subject: Need help understanding Hadoop Architecture To: core-u...@hadoop.apache.org

> I'm new to Hadoop. I've read a few articles and presentations directed at explaining what Hadoop is and how it works. Currently my understanding is that Hadoop is an MPP system which leverages a large block size to quickly find data. In theory, I understand how a large block size, along with an MPP architecture and what I understand to be a massive index scheme via mapreduce, can be used to find data.
>
> What I don't understand is: after you identify the appropriate 64 MB block, how do you find the data you're specifically after? Does this mean the CPU has to search the entire 64 MB block for the data of interest? If so, how does Hadoop know what data from that block to retrieve? I'm assuming the block is probably composed of one or more files. If not, I'm assuming the user isn't looking for the entire 64 MB block but rather a portion of it.

I am just giving a brief overview of the file system here. The distributed file system contains the NameNode, DataNodes, checkpointing nodes, and the DFSClient. The NameNode maintains the metadata about the files and blocks. DataNodes hold the actual data and send heartbeats to the NN, so the NameNode knows the DN status. DFSClient is the client-side logic: it first asks the NameNode for a set of DNs to write a file to; the NN adds their entries to the metadata and gives the DN list to the client; the client then writes the data directly to the DataNodes. While reading a file, the client likewise asks the NN for the block locations and then connects directly to the DNs to read the data. There are many other concepts (replication, lease monitoring, etc.). I hope this gives you an initial understanding of HDFS. Please go through the document below, which explains it very clearly with architecture diagrams.

> Any help indicating documentation, books, or articles on the subject would be much appreciated. Regards, Mike

Here is a doc for Hadoop: http://db.trimtabs.com:2080/mindterm/ebooks/Hadoop_The_Definitive_Guide_Cr.pdf

Regards, Uma
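To ground the read/write flow just described, here is a minimal client sketch using the public FileSystem API. The path is an example; the NameNode/DataNode negotiation Uma describes happens inside the library.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReadWrite {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration(); // picks up core-site.xml from the classpath
        FileSystem fs = FileSystem.get(conf);

        Path p = new Path("/user/demo/hello.txt");

        // write: the NN allocates blocks and DN targets, the client streams to DNs
        FSDataOutputStream out = fs.create(p);
        out.writeUTF("hello hdfs");
        out.close();

        // read: the NN returns block locations, the client reads from the DNs
        FSDataInputStream in = fs.open(p);
        System.out.println(in.readUTF());
        in.close();
    }
}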
Re: lost data with 1 failed datanode and replication factor 3 in 6 node cluster
- Original Message - From: Ossi los...@gmail.com Date: Friday, October 21, 2011 2:57 pm Subject: lost data with 1 failed datanode and replication factor 3 in 6 node cluster To: common-user@hadoop.apache.org

> hi, We managed to lose data when 1 datanode broke down in a cluster of 6 datanodes with replication factor 3. As far as I know, that shouldn't happen, since each block should have a copy on 3 different hosts, so losing even 2 nodes should be fine. Earlier we did some tests with replication factor 2, but reverted from that:
>
> 88 2011-10-12 06:46:49 hadoop dfs -setrep -w 2 -R /
> 148 2011-10-12 10:22:09 hadoop dfs -setrep -w 3 -R /
>
> The lost data was generated after the replication factor was set back to 3.

First of all, the question is how you are measuring the data loss. Are there any read failures with block-missing exceptions? My guess is that you are measuring the data loss by DFS used space. If so, note that DFS used space is calculated from the DNs that are currently available, so when one datanode goes down, DFS used and remaining will shrink correspondingly. That cannot be taken as data loss. Please correct me if my understanding of the question is wrong.

> And even if the replication factor had been 2, data shouldn't have been lost, right? We wonder how that is possible and in what situations it could happen? br, Ossi

Regards, Uma
Re: Remote Blocked Transfer count
- Original Message - From: Mark question markq2...@gmail.com Date: Saturday, October 22, 2011 5:57 am Subject: Remote Blocked Transfer count To: common-user common-user@hadoop.apache.org

> Hello, I wonder if there is a way to measure how many of the data blocks have been transferred over the network. Or, more generally, how many times was there a connection/contact between different machines?

There are metrics available in Hadoop; did you check them? The simplest way to configure Hadoop metrics is to funnel them into a user-configurable file on the machine running the daemon. Metrics are organized into "contexts" (Hadoop currently uses "jvm", "dfs", "mapred", and "rpc"), and each context is configured independently: http://www.cloudera.com/blog/2009/03/hadoop-metrics/. You can also view them via JMX.

> I thought of checking the Namenode log file, which usually shows "blk_ from src= to dst ...", but I'm not sure it's correct to count those lines.

I would not recommend depending on the logs: if someone changes the log format, it will affect your application.

> Any ideas are helpful. Mark

Regards, Uma
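For reference, here is a hadoop-metrics.properties fragment in the style the Cloudera post describes. The file names and the 10-second period are illustrative; FileContext is the file-based metrics sink of that era's metrics framework.

dfs.class=org.apache.hadoop.metrics.file.FileContext
dfs.period=10
dfs.fileName=/tmp/dfsmetrics.log

mapred.class=org.apache.hadoop.metrics.file.FileContext
mapred.period=10
mapred.fileName=/tmp/mrmetrics.log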
Re: Does hadoop support append option?
- Original Message - From: kartheek muthyala kartheek0...@gmail.com Date: Tuesday, October 18, 2011 11:54 am Subject: Re: Does hadoop support append option? To: common-user@hadoop.apache.org

> I am just concerned about the use case of appends in Hadoop. I know that they have provided support for appends in Hadoop, but how frequently are files getting appended?

In the normal case, a file's block details are not persisted to the edit log before the file is closed; that happens only as part of close. If a NN restart happens before the file is closed, we lose this data. Consider the case of a very big file whose data is also very important: here we should have an option to persist the block details to the edit log frequently, to avoid data loss on NN restarts. For this, DFS exposes an API called sync, which persists the edit log entries to disk. To reopen the stream afterwards, we use the append API. In trunk this support has been refactored cleanly, many corner cases are handled, and the API is provided as hflush.

> There is also this version concept maintained in the block report. My guess is that this version number is maintained to make sure that if a datanode gets disconnected and comes back with an old copy of the data, read requests to this datanode are discarded. But if the files are not appended frequently, does the version number remain the same? Any typical use case you can point to?

I am not sure what your exact question is here. Can you please clarify?

> ~Kartheek
>
> On Mon, Oct 17, 2011 at 12:53 PM, Uma Maheswara Rao G 72686 mahesw...@huawei.com wrote: [message quoted in full in its own post below; trimmed here]

Regards, Uma
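A short usage sketch of the sync/append cycle described above, using the 0.20-era API (later releases replace sync with hflush). The path is an example, and append may require the cluster to have append support enabled.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AppendSyncExample {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path p = new Path("/user/demo/journal.log");

        FSDataOutputStream out = fs.create(p);
        out.write("first record\n".getBytes());
        out.sync();   // persist block details so an NN restart cannot lose them
        out.close();

        out = fs.append(p);   // reopen the stream to continue writing
        out.write("second record\n".getBytes());
        out.close();
    }
}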
Re: could not complete file...
- Original Message - From: bourne1900 bourne1...@yahoo.cn Date: Tuesday, October 18, 2011 3:21 pm Subject: could not complete file... To: common-user common-user@hadoop.apache.org

> Hi, There are 20 threads putting files into HDFS ceaselessly; every file is 2 KB. After 1 million files have been written, the client begins ceaselessly throwing "could not complete file" exceptions.

"Could not complete file" is actually an INFO log. It is logged by the client when closing the file; the client retries for some time (100 times, if I remember correctly) to ensure the write succeeded. Did you observe any write failures here?

> At that time, a datanode was hung. I think maybe its heartbeat was lost, so the namenode does not know the state of the datanode. But I do not know why the heartbeat was lost. Is there any info in the logs when a datanode cannot send heartbeats?

Can you check the NN UI to verify the number of live nodes? From that we can decide whether the DN stopped sending heartbeats or not.

> Thanks and regards! bourne

Regards, Uma
Re: Does hadoop support append option?
- Original Message - From: kartheek muthyala kartheek0...@gmail.com Date: Tuesday, October 18, 2011 1:31 pm Subject: Re: Does hadoop support append option? To: common-user@hadoop.apache.org

> Thanks Uma for the clarification of the append functionality. My second question is about the version number concept used in the block map. Why does it maintain this version number?

Sorry Kartheek, as far as I know there is no version number in the blocks map. Are you talking about the generation timestamp or something else? Can you paste the snippet where you have seen that version number, so that I can understand your question clearly?

> ~Kartheek
>
> On Tue, Oct 18, 2011 at 12:14 PM, Uma Maheswara Rao G 72686 mahesw...@huawei.com wrote: [previous message in this thread, quoted in full in the post above; trimmed here]
Re: execute hadoop job from remote web application
- Original Message - From: Oleg Ruchovets oruchov...@gmail.com Date: Tuesday, October 18, 2011 4:11 pm Subject: execute hadoop job from remote web application To: common-user@hadoop.apache.org

> Hi, what is the way to execute a Hadoop job on a remote cluster? I want to execute my Hadoop job from a remote web application, but I didn't find any Hadoop client (remote API) to do it. Please advise. Oleg

You can put the Hadoop jars on your web application's classpath, then use the JobClient class to submit jobs.

Regards, Uma
Re: execute hadoop job from remote web application
- Original Message - From: Bejoy KS bejoy.had...@gmail.com Date: Tuesday, October 18, 2011 5:25 pm Subject: Re: execute hadoop job from remote web application To: common-user@hadoop.apache.org

> Oleg, if you are looking at how to submit your jobs using JobClient, the sample below can give you a start.
>
> //get the configuration parameters and assign a job name
> JobConf conf = new JobConf(getConf(), MyClass.class);
> conf.setJobName("SMS Reports");
>
> //set key/value types for the mapper and reducer outputs
> conf.setOutputKeyClass(Text.class);
> conf.setOutputValueClass(Text.class);
>
> //specify the custom reducer class
> conf.setReducerClass(SmsReducer.class);
>
> //specify the input directories (at runtime) and mappers independently, for inputs from multiple sources
> FileInputFormat.addInputPath(conf, new Path(args[0]));
>
> //specify the output directory at runtime
> FileOutputFormat.setOutputPath(conf, new Path(args[1]));
>
> JobClient.runJob(conf);
>
> Along with the Hadoop jars, you may need to have the config files on your client as well. The sample uses the old MapReduce API; you can use the new one as well, in which the Job class replaces JobClient. Hope it helps! Regards, Bejoy.K.S
>
> On Tue, Oct 18, 2011 at 5:00 PM, Oleg Ruchovets oruchov...@gmail.com wrote: Excellent. Can you give a small example of code?

Good sample by Bejoy; I hope you have access to that site. Also please go through this doc, which contains the WordCount example: http://hadoop.apache.org/common/docs/r0.20.2/mapred_tutorial.html#Example%3A+WordCount+v2.0

> On Tue, Oct 18, 2011 at 1:13 PM, Uma Maheswara Rao G 72686 mahesw...@huawei.com wrote: You can put the Hadoop jars on your web application's classpath, then use the JobClient class to submit jobs.

Regards, Uma
Re: Does hadoop support append option?
AFAIK, the append option is there in the 20-append branch; it mainly supports sync, but there are some issues with it. The same has been merged into the 20.205 branch, which will be released soon (RC2 is available) and also fixes many bugs in this area. As per our basic testing it is pretty good as of now; we need to wait for the official release.

Regards, Uma

- Original Message - From: bourne1900 bourne1...@yahoo.cn Date: Monday, October 17, 2011 12:37 pm Subject: Does hadoop support append option? To: common-user common-user@hadoop.apache.org

I know that hadoop 0.19.0 supports the append option, but it is not stable. Does the latest version support the append option? Is it stable? Thanks for help. bourne
Re: Is there a good way to see how full hdfs is
We can write a simple program that calls this API; make sure the Hadoop jars are present in your classpath. For more clarification: DNs send their stats as part of heartbeats, so the NN maintains all the statistics about disk-space usage for the complete filesystem, and this API gives you those stats.

Regards, Uma

- Original Message - From: ivan.nov...@emc.com Date: Monday, October 17, 2011 9:07 pm Subject: Re: Is there a good way to see how full hdfs is To: common-user@hadoop.apache.org, mapreduce-u...@hadoop.apache.org Cc: common-...@hadoop.apache.org

So is there a client program to call this? Can one write their own simple client to call this method from all disks on the cluster? How about a map reduce job to collect from all disks on the cluster?

On 10/15/11 4:51 AM, Uma Maheswara Rao G 72686 mahesw...@huawei.com wrote:

/** Return the disk usage of the filesystem, including total capacity,
 * used space, and remaining space */
public DiskStatus getDiskStatus() throws IOException {
    return dfs.getDiskStatus();
}

DistributedFileSystem has the above API on the Java API side.

Regards, Uma

- Original Message - From: wd w...@wdicc.com Date: Saturday, October 15, 2011 4:16 pm Subject: Re: Is there a good way to see how full hdfs is To: mapreduce-u...@hadoop.apache.org

hadoop dfsadmin -report

On Sat, Oct 15, 2011 at 8:16 AM, Steve Lewis lordjoe2...@gmail.com wrote: We have a small cluster with HDFS running on only 8 nodes. I believe that the partition assigned to hdfs might be getting full, and I wonder if the web tools or Java API have a way to look at free space on hdfs. -- Steven M. Lewis PhD 4221 105th Ave NE Kirkland, WA 98033 206-384-1340 (cell) Skype lordjoe_com
Re: Is there a good way to see how full hdfs is
Yes, that was deprecated in trunk. If you want to do this programmatically, this is the better option:

/** {@inheritDoc} */
@Override
public FsStatus getStatus(Path p) throws IOException {
    statistics.incrementReadOps(1);
    return dfs.getDiskStatus();
}

This should work for you. It gives you an FsStatus object, which has the getCapacity, getUsed, and getRemaining APIs (a usage sketch follows this message). I would suggest you look through the available FileSystem APIs once; I think you will then understand clearly how to use them.

Regards, Uma

- Original Message - From: ivan.nov...@emc.com Date: Monday, October 17, 2011 9:48 pm Subject: Re: Is there a good way to see how full hdfs is To: common-user@hadoop.apache.org

Hi Harsh, I need access to the data programmatically for system automation, and hence I do not want a monitoring tool but access to the raw data. I am more than happy to use an exposed function or client program and not an internal API. So I am still a bit confused... What is the simplest way to get at this raw disk usage data programmatically? Is there an HDFS equivalent of du and df, or are you suggesting just running those on the Linux OS (which is perfectly doable)? Cheers, Ivan

On 10/17/11 9:05 AM, Harsh J ha...@cloudera.com wrote:

Uma/Ivan, the DistributedFileSystem class explicitly is _not_ meant for public consumption; it is an internal one. Additionally, that method has been deprecated. What you need is FileSystem#getStatus() if you want the summarized report via code. A job that possibly runs du or df is a good idea if you guarantee perfect homogeneity of path names in your cluster. But I wonder: why won't a general monitoring tool (such as Nagios) cut it for this purpose? What's the end goal here?

P.S. I'd moved this conversation to hdfs-user@ earlier on, but now I see it being cross-posted into mr-user, common-user, and common-dev; why?

On Mon, Oct 17, 2011 at 9:25 PM, Uma Maheswara Rao G 72686 mahesw...@huawei.com wrote: [earlier messages in this thread, quoted in full in the post above; trimmed here]
-- Harsh J
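A usage sketch for the FileSystem#getStatus() route that Harsh and Uma point to. The FsStatus getters are the real API; the driver around them is minimal.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FsStatus;

public class DiskUsage {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        FsStatus status = fs.getStatus();   // summarized report for the filesystem
        System.out.println("capacity  = " + status.getCapacity());
        System.out.println("used      = " + status.getUsed());
        System.out.println("remaining = " + status.getRemaining());
    }
}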
Re: Hadoop node disk failure - reinstall question
- Original Message - From: Mayuran Yogarajah mayuran.yogara...@casalemedia.com Date: Tuesday, October 18, 2011 4:24 am Subject: Hadoop node disk failure - reinstall question To: common-user@hadoop.apache.org

> One of our nodes died today; it looks like the disk containing the OS expired. I will need to reinstall the machine. Are there any known issues with using the same hostname/IP again, or is it better to give it a new IP/hostname? The second disk on the machine is still operational and contains HDFS data, so I plan on mounting it. Is this ill-advised? Should I just wipe that disk too?

Copying that data to the new machine would be a good option. It again depends on the replication: if you have enough replicas in your cluster, the blocks will automatically be re-replicated to other nodes, and in that case you need not even worry about the old data.

> thanks, M
Re: Unrecognized option: -jvm
Which version of Hadoop are you using? Please check this recent discussion, which should help with your problem: http://search-hadoop.com/m/PPgvNPUoL2subj=Re+Starting+Datanode

Regards, Uma

- Original Message - From: Majid Azimi majid.merk...@gmail.com Date: Sunday, October 16, 2011 2:22 am Subject: Unrecognized option: -jvm To: common-user@hadoop.apache.org

Hi guys, I'm really new to Hadoop. I have configured a single-node Hadoop cluster, but it seems that my datanode is not working. The jobtracker log file shows this message (a lot of them, one every 10 seconds):

2011-10-16 00:01:15,558 WARN org.apache.hadoop.mapred.JobTracker: Retrying...
2011-10-16 00:01:15,589 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /tmp/hadoop-root/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1417)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:596)
at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1383)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1379)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:416)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1377)
at org.apache.hadoop.ipc.Client.call(Client.java:1030)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
at $Proxy5.addBlock(Unknown Source)
at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
at $Proxy5.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3104)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2975)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2255)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2446)
2011-10-16 00:01:15,589 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block null bad datanode[0] nodes == null
2011-10-16 00:01:15,589 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file /tmp/hadoop-root/mapred/system/jobtracker.info - Aborting...
2011-10-16 00:01:15,590 WARN org.apache.hadoop.mapred.JobTracker: Writing to file hdfs://localhost/tmp/hadoop-root/mapred/system/jobtracker.info failed!
2011-10-16 00:01:15,593 WARN org.apache.hadoop.mapred.JobTracker: FileSystem is not ready yet!
2011-10-16 00:01:15,603 WARN org.apache.hadoop.mapred.JobTracker: Failed to initialize recovery manager.
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /tmp/hadoop-root/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1417)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:596)
at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1383)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1379)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:416)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1377)
at org.apache.hadoop.ipc.Client.call(Client.java:1030)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
at $Proxy5.addBlock(Unknown Source)
at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at
Re: Too much fetch failure
Are you able to ping the other node using the configured hostnames? Make sure you can ping the other machine with the hostname configured in the /etc/hosts files.

Regards, Uma

- Original Message - From: praveenesh kumar praveen...@gmail.com Date: Sunday, October 16, 2011 6:46 pm Subject: Re: Too much fetch failure To: common-user@hadoop.apache.org

Try commenting out the "127.0.0.1 localhost" line in your /etc/hosts, then restart the cluster and try again. Thanks, Praveenesh

On Sun, Oct 16, 2011 at 2:00 PM, Humayun gmail humayun0...@gmail.com wrote: We are using Hadoop on VirtualBox. As a single node it works fine for datasets larger than the default block size, but in the case of a multinode cluster (2 nodes) we are facing problems. When the input dataset is smaller than the default block size (64 MB) it works fine, but when the input dataset is larger than the default block size it shows "too much fetch failure" in the reduce phase. Here is the output link: http://paste.ubuntu.com/707517/

From the comments above, many users have faced this problem. Different users suggested modifying the /etc/hosts file in different ways to fix it, but there is no definitive solution; we need the actual solution, which is why we are writing here. This is our /etc/hosts file:

192.168.60.147 humayun # Added by NetworkManager
127.0.0.1 localhost.localdomain localhost
::1 humayun localhost6.localdomain6 localhost6
127.0.1.1 humayun
# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
192.168.60.1 master
192.168.60.2 slave
Re: Too much fetch failure
To be clear: by "two nodes" here, I mean the two tasktrackers.

- Original Message - From: Humayun gmail humayun0...@gmail.com Date: Sunday, October 16, 2011 7:38 pm Subject: Re: Too much fetch failure To: common-user@hadoop.apache.org

Yes, we can ping every node (both master and slave).

On 16 October 2011 19:52, Uma Maheswara Rao G 72686 mahesw...@huawei.com wrote: [earlier messages in this thread are quoted in full in the post above; trimmed here]
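For reference, one commonly suggested /etc/hosts layout for this two-node setup, assembled from the IPs posted in the thread. This is an assumption to adapt, not a verified fix; the usual advice is to drop the 127.0.1.1 mapping for the real hostname and have every node resolve each hostname to its LAN IP.

127.0.0.1       localhost
192.168.60.147  humayun
192.168.60.1    master
192.168.60.2    slave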
Re: hadoop input buffer size
I think the post below can give you more info about it: http://developer.yahoo.com/blogs/hadoop/posts/2009/08/the_anatomy_of_hadoop_io_pipel/ . A nice explanation by Owen there.

Regards, Uma

- Original Message - From: Yang Xiaoliang yangxiaoliang2...@gmail.com Date: Wednesday, October 5, 2011 4:27 pm Subject: Re: hadoop input buffer size To: common-user@hadoop.apache.org

Hi, Hadoop neither reads one line at a time nor fetches dfs.block.size worth of lines into a buffer. Actually, for TextInputFormat, it reads io.file.buffer.size bytes of text into a buffer each time; this can be seen in the Hadoop source file LineReader.java.

2011/10/5 Mark question markq2...@gmail.com: Hello, correct me if I'm wrong, but when a program opens n files at the same time to read from, and starts reading from each file one line at a time, isn't Hadoop actually fetching dfs.block.size worth of lines into a buffer, and not just one line? If this is correct: I set my dfs.block.size = 3 MB and each line takes only about 650 bytes, so I would assume the performance for reading 1-4000 lines would be the same, but it isn't! Do you know a way to find the number of lines read at once? Thank you, Mark
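A small sketch of where that buffer size comes in, using the LineReader class mentioned above. The input path is an argument, and setting io.file.buffer.size explicitly here is just for illustration; normally it comes from the site configuration.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.util.LineReader;

public class BufferedLineRead {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.setInt("io.file.buffer.size", 64 * 1024); // bytes fetched per underlying read
        FileSystem fs = FileSystem.get(conf);

        LineReader reader = new LineReader(fs.open(new Path(args[0])), conf);
        Text line = new Text();
        while (reader.readLine(line) > 0) {
            // process one logical line; the I/O happened in buffer-sized chunks
        }
        reader.close();
    }
}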
Re: How to iterate over a hdfs folder with hadoop
Yes, the FileStatus class would be the equivalent of list. FileStatus has the isDir and getPath APIs; those two should cover your further usage. :-) I think one small difference is that listStatus returns the entries in sorted order.

Regards, Uma

- Original Message - From: John Conwell j...@iamjohn.me Date: Monday, October 10, 2011 8:40 pm Subject: Re: How to iterate over a hdfs folder with hadoop To: common-user@hadoop.apache.org

FileStatus[] files = fs.listStatus(new Path(path));
for (FileStatus fileStatus : files) {
    //...do stuff here
}

On Mon, Oct 10, 2011 at 8:03 AM, Raimon Bosch raimon.bo...@gmail.com wrote: Hi, I'm wondering how I can browse an hdfs folder using the classes in the org.apache.hadoop.fs package. The operation I'm looking for is 'hadoop dfs -ls'. The standard file system equivalent would be:

File f = new File(outputPath);
if (f.isDirectory()) {
    String[] files = f.list();
    for (String file : files) {
        //Do your logic
    }
}

Thanks in advance, Raimon Bosch.

-- Thanks, John C
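Building on John's snippet and the isDir/getPath calls Uma mentions, here is a recursive variant for folders that contain subdirectories. This is illustrative; error handling is omitted.

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWalk {
    // print every file under dir, descending into subdirectories
    public static void walk(FileSystem fs, Path dir) throws Exception {
        for (FileStatus status : fs.listStatus(dir)) {
            if (status.isDir()) {
                walk(fs, status.getPath());
            } else {
                System.out.println(status.getPath());
            }
        }
    }
}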
Re: Secondary namenode fsimage concept
Hi, it looks to me like the problem is with your NFS: it is not supporting locks. Which version of NFS are you using? Please check your NFS locking support by writing a simple program that locks a file (a sketch follows this thread). I think NFSv4 supports locking (I have not tried it). From http://nfs.sourceforge.net/:

A6. What are the main new features in version 4 of the NFS protocol? NFS Versions 2 and 3 are stateless protocols, but NFS Version 4 introduces state. An NFS Version 4 client uses state to notify an NFS Version 4 server of its intentions on a file: locking, reading, writing, and so on. An NFS Version 4 server can return information to a client about what other clients have intentions on a file, to allow a client to cache file data more aggressively via delegation. To help keep state consistent, more sophisticated client and server reboot recovery mechanisms are built in to the NFS Version 4 protocol. NFS Version 4 introduces support for byte-range locking and share reservation. Locking in NFS Version 4 is lease-based, so an NFS Version 4 client must maintain contact with an NFS Version 4 server to continue extending its open and lock leases.

Regards, Uma

- Original Message - From: Shouguo Li the1plum...@gmail.com Date: Tuesday, October 11, 2011 2:31 am Subject: Re: Secondary namenode fsimage concept To: common-user@hadoop.apache.org

Hey Patrick, I wanted to configure my cluster to write namenode metadata to multiple directories as well:

<property>
  <name>dfs.name.dir</name>
  <value>/hadoop/var/name,/mnt/hadoop/var/name</value>
</property>

In my case, /hadoop/var/name is a local directory and /mnt/hadoop/var/name is an NFS volume. I took down the cluster first, then copied over the files from /hadoop/var/name to /mnt/hadoop/var/name, and then tried to start up the cluster. But the cluster won't start up properly; here's the namenode log: http://pastebin.com/gmu0B7yd. Any ideas why it wouldn't start up? Thx

On Thu, Oct 6, 2011 at 6:58 PM, patrick sang silvianhad...@gmail.com wrote: I would say your namenode writes metadata to the local fs (where your secondary namenode will pull files) and an NFS mount:

<property>
  <name>dfs.name.dir</name>
  <value>/hadoop/name,/hadoop/nfs_server_name</value>
</property>

My 0.02$. P

On Thu, Oct 6, 2011 at 12:04 AM, shanmuganathan.r shanmuganatha...@zohocorp.com wrote: Hi Kai, there is no data stored in the secondary namenode related to the Hadoop cluster; am I correct? If so, then if we run the secondary namenode on a separate machine, the fetching, merging, and transferring time increases when the cluster has a lot of data in the namenode fsimage file. If a failover occurs at that time, how can we recover up to one hour of changes to the HDFS files? (The default checkpoint interval is one hour.) Thanks, R. Shanmuganathan

On Thu, 06 Oct 2011 12:20:28 +0530, Kai Voigt k...@123.org wrote: Hi, the secondary namenode only fetches the two files when a checkpoint is needed. Kai

On 06.10.2011 at 08:45, shanmuganathan.r wrote: Hi Kai, in the second part I meant: does the secondary namenode also contain the fsimage file, or are the two files (fsimage and edit log) transferred from the namenode at checkpoint time? Thanks, Shanmuganathan

On Thu, 06 Oct 2011 11:37:50 +0530, Kai Voigt k...@123.org wrote: Hi, you're correct when saying the namenode hosts the fsimage file and the edits log file. The fsimage file contains a snapshot of the HDFS metadata (a filename-to-blocks-list mapping).
> Whenever there is a change to HDFS, it will be appended to the edits file. Think of it as a database transaction log, where changes will not be applied to the datafile, but appended to a log. > To prevent the edits file growing infinitely, the secondary namenode periodically pulls these two files, and the namenode starts writing changes to a new edits file. Then, the secondary namenode merges the changes from the edits file with the old snapshot from the fsimage file and creates an updated fsimage file. This updated fsimage file is then copied to the namenode. > Then, the entire cycle starts again. To answer your question: The namenode has both files, even if the secondary namenode is running on a different machine. > Kai Am 06.10.2011 um 07:57 schrieb shanmuganathan.r: >> Hi All, >> I have a doubt in the hadoop secondary namenode concept. Please correct if the following statements are wrong. >> The namenode hosts the fsimage and edit log files.
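A minimal sketch of the kind of file-locking check suggested above, assuming a path on the NFS mount is passed as the first argument; on an NFS mount without lock support, tryLock typically fails with an IOException:

import java.io.File;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;

public class NfsLockCheck {
  public static void main(String[] args) throws Exception {
    // e.g. /mnt/hadoop/var/name/locktest (a file on the NFS mount)
    RandomAccessFile raf = new RandomAccessFile(new File(args[0]), "rw");
    try {
      FileLock lock = raf.getChannel().tryLock();
      if (lock != null) {
        System.out.println("lock acquired: this mount supports locking");
        lock.release();
      } else {
        System.out.println("lock is held by another process");
      }
    } finally {
      raf.close();
    }
  }
}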
Re: Error using hadoop distcp
Distcp will run as a MapReduce job. Here the tasktrackers require the hostname mappings to contact the other nodes. Please configure the mapping correctly on both machines (an example /etc/hosts follows this thread) and try. Regards, Uma - Original Message - From: trang van anh anh...@vtc.vn Date: Wednesday, October 5, 2011 1:41 pm Subject: Re: Error using hadoop distcp To: common-user@hadoop.apache.org which host runs the task that throws the exception? ensure that each data node knows the other data nodes in the hadoop cluster - add a ub16 entry in /etc/hosts on the host where the task is running. On 10/5/2011 12:15 PM, praveenesh kumar wrote: I am trying to use distcp to copy a file from one HDFS to another. But while copying I am getting the following exception: hadoop distcp hdfs://ub13:54310/user/hadoop/weblog hdfs://ub16:54310/user/hadoop/weblog 11/10/05 10:41:01 INFO mapred.JobClient: Task Id : attempt_201110031447_0005_m_07_0, Status : FAILED java.net.UnknownHostException: unknown host: ub16 at org.apache.hadoop.ipc.Client$Connection.init(Client.java:195) at org.apache.hadoop.ipc.Client.getConnection(Client.java:850) at org.apache.hadoop.ipc.Client.call(Client.java:720) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220) at $Proxy1.getProtocolVersion(Unknown Source) at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359) at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:113) at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:215) at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:177) at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:82) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378) at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196) at org.apache.hadoop.fs.Path.getFileSystem(Path.java:175) at org.apache.hadoop.mapred.FileOutputCommitter.setupJob(FileOutputCommitter.java:48) at org.apache.hadoop.mapred.OutputCommitter.setupJob(OutputCommitter.java:124) at org.apache.hadoop.mapred.Task.runJobSetupTask(Task.java:835) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:296) at org.apache.hadoop.mapred.Child.main(Child.java:170) It's saying it's not finding ub16, but the entry is there in the /etc/hosts files. I am able to ssh to both machines. Do I need passwordless ssh between these two NNs? What can be the issue? Anything I am missing before using distcp? Thanks, Praveenesh
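For illustration, the kind of /etc/hosts mapping being suggested, present on every node of both clusters (the IP addresses here are placeholders; use the real addresses of ub13 and ub16):

192.168.1.13   ub13
192.168.1.16   ub16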
Re: FileSystem closed
FileSystem objects will be cached in the JVM. When you try to get an FS object using FileSystem.get(..) (SequenceFile internally uses it), it will return the same fs object if the scheme and authority of the URI are the same. The fs cache key's equals implementation is below: static boolean isEqual(Object a, Object b) { return a == b || (a != null && a.equals(b)); } /** {@inheritDoc} */ public boolean equals(Object obj) { if (obj == this) { return true; } if (obj != null && obj instanceof Key) { Key that = (Key)obj; return isEqual(this.scheme, that.scheme) && isEqual(this.authority, that.authority) && isEqual(this.ugi, that.ugi) && (this.unique == that.unique); } return false; } I think here some of your files' URIs have the same scheme and authority, so they got the same fs object. When the first one closes it, the others will definitely get this exception. Regards, Uma - Original Message - From: Joey Echeverria j...@cloudera.com Date: Thursday, September 29, 2011 10:34 pm Subject: Re: FileSystem closed To: common-user@hadoop.apache.org Do you close your FileSystem instances at all? IIRC, the FileSystem instance you use is a singleton and if you close it once, it's closed for everybody. My guess is you close it in your cleanup method and you have JVM reuse turned on. -Joey On Thu, Sep 29, 2011 at 12:49 PM, Mark question markq2...@gmail.com wrote: Hello, I'm running 100 mappers sequentially on a single machine, where each mapper opens 100 files at the beginning, then reads them one by one sequentially and closes each after it is done. After executing 6 mappers, the 7th gives this error: java.io.IOException: Filesystem closed at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:297) at org.apache.hadoop.hdfs.DFSInputStream.close(DFSInputStream.java:426) at java.io.FilterInputStream.close(FilterInputStream.java:155) at org.apache.hadoop.io.SequenceFile$Reader.close(SequenceFile.java:1653) at Mapper_Reader20HM4.CleanUp(Mapper_Reader20HM4.java:124) at BFMapper20HM9.close(BFMapper20HM9.java:264) at BFMapRunner20HM9.run(BFMapRunner20HM9.java:95) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:397) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330) at org.apache.hadoop.mapred.Child$4.run(Child.java:217) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742) at org.apache.hadoop.mapred.Child.main(Child.java:211) (the same java.io.IOException: Filesystem closed stack trace is then repeated three more times, once for each further close)
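A minimal sketch of the caching behaviour described above (the namenode address is a placeholder): two FileSystem.get calls with the same scheme, authority and ugi return the same cached object, so closing it once closes it for every user.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class FsCacheDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    URI uri = new URI("hdfs://namenode:9000/");  // placeholder address
    FileSystem fs1 = FileSystem.get(uri, conf);
    FileSystem fs2 = FileSystem.get(uri, conf);
    System.out.println(fs1 == fs2);  // true: same cache key -> same object
    fs1.close();
    // any further use of fs2 (or of a SequenceFile.Reader built on it)
    // now throws java.io.IOException: Filesystem closed
  }
}

Depending on the Hadoop version, a per-caller instance can be obtained with FileSystem.newInstance(uri, conf), which bypasses the cache.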
Re: Block Size
Hi, here is some useful info: A small file is one which is significantly smaller than the HDFS block size (default 64MB). If you're storing small files, then you probably have lots of them (otherwise you wouldn't turn to Hadoop), and the problem is that HDFS can't handle lots of files. Every file, directory and block in HDFS is represented as an object in the namenode's memory, each of which occupies 150 bytes, as a rule of thumb. So 10 million files, each using a block, would use about 3 gigabytes of memory. Scaling up much beyond this level is a problem with current hardware. Certainly a billion files is not feasible. Furthermore, HDFS is not geared up to efficiently accessing small files: it is primarily designed for streaming access of large files. Reading through small files normally causes lots of seeks and lots of hopping from datanode to datanode to retrieve each small file, all of which is an inefficient data access pattern. Problems with small files and MapReduce: Map tasks usually process a block of input at a time (using the default FileInputFormat). If the files are very small and there are a lot of them, then each map task processes very little input, and there are a lot more map tasks, each of which imposes extra bookkeeping overhead. Compare a 1GB file broken into 16 64MB blocks, and 10,000 or so 100KB files. The 10,000 files use one map each, and the job time can be tens or hundreds of times slower than the equivalent one with a single input file. There are a couple of features to help alleviate the bookkeeping overhead: task JVM reuse for running multiple map tasks in one JVM, thereby avoiding some JVM startup overhead (see the mapred.job.reuse.jvm.num.tasks property), and MultiFileInputSplit which can run more than one split per map. Just copied from Cloudera's blog: http://www.cloudera.com/blog/2009/02/the-small-files-problem/#comments Regards, Uma - Original Message - From: lessonz less...@q.com Date: Thursday, September 29, 2011 11:10 pm Subject: Block Size To: common-user common-user@hadoop.apache.org I'm new to Hadoop, and I'm trying to understand the implications of a 64M block size in the HDFS. Is there a good reference that enumerates the implications of this decision and its effects on files stored in the system as well as map-reduce jobs? Thanks.
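To make the memory estimate above concrete: 10 million small files, each occupying its own block, is roughly 10 million file objects plus 10 million block objects in the namenode's memory, i.e. 20,000,000 objects x 150 bytes ≈ 3 GB of heap, which is where the 3-gigabyte figure quoted above comes from.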
Re: How to run Hadoop in standalone mode in Windows
Java 6 and Cygwin should be enough for running standalone mode on Windows (Maven + TortoiseSVN are only needed for building Hadoop). Regards, Uma - Original Message - From: Mark Kerzner markkerz...@gmail.com Date: Saturday, September 24, 2011 4:58 am Subject: How to run Hadoop in standalone mode in Windows To: common-user@hadoop.apache.org Hi, I have cygwin, and I have NetBeans, and I have a maven Hadoop project that works on Linux. How do I combine them to work in Windows? Thank you, Mark
Re: HDFS file into Blocks
@Kartheek, Great :-) - Original Message - From: kartheek muthyala kartheek0...@gmail.com Date: Monday, September 26, 2011 12:06 pm Subject: Re: HDFS file into Blocks To: common-user@hadoop.apache.org @Uma, Thanks a lot!!. I have found the flow... Thanks, Kartheek. On Mon, Sep 26, 2011 at 10:03 AM, He Chen airb...@gmail.com wrote: Hi It is interesting that a guy from Huawei is also working on the Hadoop project. :) Chen On Sun, Sep 25, 2011 at 11:29 PM, Uma Maheswara Rao G 72686 mahesw...@huawei.com wrote: Hi, You can find the code in DFSOutputStream.java. There is one thread there, the DataStreamer thread. This thread picks packets from the dataQueue and writes them onto the sockets. Before this, when actually writing the chunks, it sets the last-packet flag in the Packet based on the block size parameter passed from the client. If the streamer thread finds that a packet is the last one of the block, it ends the block; that means it closes the sockets which were used for writing that block. The streamer thread then repeats the loop: when it finds there are no sockets open, it will again create the pipeline for the next block. Go through the flow from writeChunk in DFSOutputStream.java, which is exactly where packets are enqueued into the dataQueue. Regards, Uma - Original Message - From: kartheek muthyala kartheek0...@gmail.com Date: Sunday, September 25, 2011 11:06 am Subject: HDFS file into Blocks To: common-user@hadoop.apache.org Hi all, I am working around the code to understand where HDFS divides a file into blocks. Can anyone point me to this section of the code? Thanks, Kartheek
Re: Too many fetch failures. Help!
Hello Abdelrahman, Are you able to ping from one machine to the other with the configured hostname? Configure both hostnames properly in the /etc/hosts file and try. Regards, Uma - Original Message - From: Abdelrahman Kamel abdouka...@gmail.com Date: Monday, September 26, 2011 8:47 pm Subject: Too many fetch failures. Help! To: common-user@hadoop.apache.org Hi, This is my first post here. I'm new to Hadoop. I've already installed Hadoop on 2 Ubuntu boxes (one is both master and slave and the other is only slave). When I run a Wordcount example on 5 small txt files, the process never completes and I get a Too many fetch failures error on my terminal. If you can help me, I can post my terminal's output and any log files needed. Great thanks. -- Abdelrahman Kamel
Re: Can we replace namenode machine with some other machine ?
Many daemons run in the NN: replicating blocks from one DN to another when there are not enough replicas, SafeMode monitoring, the LeaseManager, HeartbeatMonitoring, IPC handlers, etc. It also maintains the block-to-machine-list mappings in memory. In the JT there are likewise many daemons. If you are not dealing with a very large number of files, a normal configuration is enough, but you should configure enough memory for running the NN and JT. This always comes down to your usage. For better understanding, I would suggest you go through the Hadoop Definitive Guide; all these details are documented very well. Regards, Uma - Original Message - From: praveenesh kumar praveen...@gmail.com Date: Thursday, September 22, 2011 11:45 am Subject: Re: Can we replace namenode machine with some other machine ? To: common-user@hadoop.apache.org But apart from storing metadata info, is there anything more the NN/JT machines are doing? So can I say I can survive with a poor NN if I am not dealing with lots of files in HDFS? On Thu, Sep 22, 2011 at 11:08 AM, Uma Maheswara Rao G 72686 mahesw...@huawei.com wrote: Just changing the configs will not affect your data. You need to restart your DNs to connect to the new NN. For the second question: it again depends on your usage. If there are more files in DFS, the NN will consume more memory, as it needs to store all the metadata info of the files in the namespace. If your files grow more and more, it is recommended not to put the NN and JT on the same machine. Coming to the DN case: the configured space will be used for storing the block files. Once the space is filled, the NN will not select this DN for further writes. So one DN having less space is fine, compared to less space for the NN in big clusters. If you configure DNs with a very good amount of space but the NN has too little space to store your files' metadata info, then the extra space in the DNs is of no use, right :-) Regards, Uma - Original Message - From: praveenesh kumar praveen...@gmail.com Date: Thursday, September 22, 2011 10:42 am Subject: Re: Can we replace namenode machine with some other machine ? To: common-user@hadoop.apache.org If I just change configuration settings in the slave machines, will it affect any of the data that is currently residing in the cluster? And my second question was... do we need the master node (NN/JT hosting machine) to have a better configuration than our slave machines (DN/TT hosting machines)? Actually my master node is a weaker machine than my slave machines, because I am assuming that the master machine does not do much additional work and it's okay to have a weak machine as master. Now I have a new big server machine just added to my cluster. So I am thinking: shall I make this new machine my new master (NN/JT), or just add it as a slave? Thanks, Praveenesh On Thu, Sep 22, 2011 at 10:20 AM, Uma Maheswara Rao G 72686 mahesw...@huawei.com wrote: Copy the same installation to the new machine and change the IP address. After that, configure the new NN address in your clients and DNs. Also Does Namenode/JobTracker machine's configuration needs to be better than datanodes/tasktracker's ?? I did not get this question. Regards, Uma - Original Message - From: praveenesh kumar praveen...@gmail.com Date: Thursday, September 22, 2011 10:13 am Subject: Can we replace namenode machine with some other machine ? To: common-user@hadoop.apache.org Hi all, Can we replace our namenode machine later with some other machine? Actually I got a new server machine in my cluster and now I want to make this machine my new namenode and jobtracker node. Also, does the Namenode/JobTracker machine's configuration need to be better than the datanodes/tasktrackers'? How can I achieve this with the least overhead? Thanks, Praveenesh
Re: RE: Making Mumak work with capacity scheduler
Yes Devaraj, from the logs it looks like it failed to create /jobtracker/jobsInfo. Code snippet: if (!fs.exists(path)) { if (!fs.mkdirs(path, new FsPermission(JOB_STATUS_STORE_DIR_PERMISSION))) { throw new IOException("CompletedJobStatusStore mkdirs failed to create " + path.toString()); } } @Arun, can you check that you have the correct permissions, as Devaraj said? 2011-09-22 15:53:57.598::INFO: Started SelectChannelConnector@0.0.0.0:50030 11/09/22 15:53:57 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId= 11/09/22 15:53:57 WARN conf.Configuration: mapred.task.cache.levels is deprecated. Instead, use mapreduce.jobtracker.taskcache.levels 11/09/22 15:53:57 WARN mapred.SimulatorJobTracker: Error starting tracker: java.io.IOException: CompletedJobStatusStore mkdirs failed to create /jobtracker/jobsInfo at org.apache.hadoop.mapred.CompletedJobStatusStore.init(CompletedJobStatusStore.java:83) at org.apache.hadoop.mapred.JobTracker.init(JobTracker.java:4684) at org.apache.hadoop.mapred.SimulatorJobTracker.init(SimulatorJobTracker.java:81) at org.apache.hadoop.mapred.SimulatorJobTracker.startTracker(SimulatorJobTracker.java:100) at org.apache.hadoop.mapred.SimulatorEngine.init(SimulatorEngine.java:210) at org.apache.hadoop.mapred.SimulatorEngine.init(SimulatorEngine.java:184) at org.apache.hadoop.mapred.SimulatorEngine.run(SimulatorEngine.java:292) at org.apache.hadoop.mapred.SimulatorEngine.run(SimulatorEngine.java:323) I cc'ed the MapReduce user mailing list as well. Regards, Uma - Original Message - From: Devaraj K devara...@huawei.com Date: Thursday, September 22, 2011 6:01 pm Subject: RE: Making Mumak work with capacity scheduler To: common-user@hadoop.apache.org Hi Arun, I have gone through the logs. The Mumak simulator is trying to start the job tracker, and the job tracker is failing to start because it is not able to create the /jobtracker/jobsinfo directory. I think the directory doesn't have enough permissions. Please check the permissions, or any other reason why it is failing to create the dir. Devaraj K -Original Message- From: arun k [mailto:arunk...@gmail.com] Sent: Thursday, September 22, 2011 3:57 PM To: common-user@hadoop.apache.org Subject: Re: Making Mumak work with capacity scheduler Hi Uma ! u got me right ! Actually without any patch, when I modified the appropriate mapred-site.xml and capacity-scheduler.xml and copied the capacity jar accordingly, I am able to see queues in the JobTracker GUI, but both queues show the same set of jobs executing. I ran with the trace and topology files from test/data: $bin/mumak.sh trace_file topology_file Is it because I am not submitting jobs to a particular queue? If so, how can I do it? Got hadoop-0.22 from http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.22/ and built all three components, but when I give arun@arun-Presario-C500-RU914PA-ACJ:~/hadoop22/branch-0.22/mapreduce/src/contrib/mumak$ bin/mumak.sh src/test/data/19-jobs.trace.json.gz src/test/data/19-jobs.topology.json.gz it gets stuck at some point. Log is here: http://pastebin.com/9SNUHLFy Thanks, Arun On Wed, Sep 21, 2011 at 2:03 PM, Uma Maheswara Rao G 72686 mahesw...@huawei.com wrote: Hello Arun, If you want to apply MAPREDUCE-1253 on the 21 version, applying the patch directly using commands may not work because of codebase changes. So take the patch and apply the lines to your code base manually. I am not sure of any other way for this. Did I misunderstand your intention?
Regards, Uma - Original Message - From: ArunKumar arunk...@gmail.com Date: Wednesday, September 21, 2011 1:52 pm Subject: Re: Making Mumak work with capacity scheduler To: hadoop-u...@lucene.apache.org Hi Uma ! Mumak is not part of the stable versions yet. It comes from Hadoop-0.21 onwards. Can you describe in detail what "You may need to merge them logically (back port them)" means? I don't get it. Arun On Wed, Sep 21, 2011 at 12:07 PM, Uma Maheswara Rao G [via Lucene] ml-node+s472066n3354668...@n3.nabble.com wrote: Looks like those patches are based on the 0.22 version, so you cannot apply them directly. You may need to merge them logically (back port them). One more point to note here: the 0.21 version of hadoop is not a stable version. Presently the 0.20xx versions are stable. Regards, Uma - Original Message - From: ArunKumar [hidden email] Date: Wednesday, September 21, 2011 12:01 pm Subject: Re: Making Mumak work with capacity scheduler To: [hidden email] Hi Uma ! I am applying the patch to mumak in hadoop-0.21
Re: RE: RE: java.io.IOException: Incorrect data format
I would suggest you free some space and try again (a quick check follows this thread). Regards, Uma - Original Message - From: Peng, Wei wei.p...@xerox.com Date: Wednesday, September 21, 2011 10:03 am Subject: RE: RE: java.io.IOException: Incorrect data format To: common-user@hadoop.apache.org Yes, I can. The datanode is not able to start after crashing without enough HD space. Wei -Original Message- From: Uma Maheswara Rao G 72686 [mailto:mahesw...@huawei.com] Sent: Tuesday, September 20, 2011 9:30 PM To: common-user@hadoop.apache.org Subject: Re: RE: java.io.IOException: Incorrect data format Are you able to create the directory manually on the DataNode machine? # mkdir -p /state/partition2/hadoop/dfs/tmp Regards, Uma - Original Message - From: Peng, Wei wei.p...@xerox.com Date: Wednesday, September 21, 2011 9:44 am Subject: RE: java.io.IOException: Incorrect data format To: common-user@hadoop.apache.org I modified edits so that the hadoop namenode restarted; however, I could not start my datanode. The datanode log shows 2011-09-20 21:07:10,068 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Mkdirs failed to create /state/partition2/hadoop/dfs/tmp at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.init(FSDataset.java:394) at org.apache.hadoop.hdfs.server.datanode.FSDataset.init(FSDataset.java:894) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:318) at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:232) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1363) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1318) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1326) at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1448) Wei -Original Message- From: Uma Maheswara Rao G 72686 [mailto:mahesw...@huawei.com] Sent: Tuesday, September 20, 2011 9:10 PM To: common-user@hadoop.apache.org Subject: Re: java.io.IOException: Incorrect data format Can you check the output of the command 'df -h' on the NN machine? I think one more possibility could be that the image got corrupted while it was being saved. Such cases are already handled in trunk; for more details see https://issues.apache.org/jira/browse/HDFS-1594 Regards, Uma - Original Message - From: Peng, Wei wei.p...@xerox.com Date: Wednesday, September 21, 2011 9:01 am Subject: java.io.IOException: Incorrect data format To: common-user@hadoop.apache.org I was not able to restart my name server because the name server ran out of space. Then I adjusted dfs.datanode.du.reserved to 0, and used tune2fs -m to get some space, but I still could not restart the name node. I got the following error: java.io.IOException: Incorrect data format. logVersion is -18 but writables.length is 0. Anyone knows how to resolve this issue? Best, Wei
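A quick way to check the suggestion above by hand, run as the user that starts the datanode (the directory path comes from the log in this thread):

df -h /state/partition2
mkdir -p /state/partition2/hadoop/dfs/tmp
touch /state/partition2/hadoop/dfs/tmp/probe && echo "directory is writable"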
Re: Making Mumak work with capacity scheduler
Hello Arun, If you want to apply MAPREDUCE-1253 on the 21 version, applying the patch directly using commands may not work because of codebase changes. So take the patch and apply the lines to your code base manually. I am not sure of any other way for this. Did I misunderstand your intention? Regards, Uma - Original Message - From: ArunKumar arunk...@gmail.com Date: Wednesday, September 21, 2011 1:52 pm Subject: Re: Making Mumak work with capacity scheduler To: hadoop-u...@lucene.apache.org Hi Uma ! Mumak is not part of the stable versions yet. It comes from Hadoop-0.21 onwards. Can you describe in detail what "You may need to merge them logically (back port them)" means? I don't get it. Arun On Wed, Sep 21, 2011 at 12:07 PM, Uma Maheswara Rao G [via Lucene] ml-node+s472066n3354668...@n3.nabble.com wrote: Looks like those patches are based on the 0.22 version, so you cannot apply them directly. You may need to merge them logically (back port them). One more point to note here: the 0.21 version of hadoop is not a stable version. Presently the 0.20xx versions are stable. Regards, Uma - Original Message - From: ArunKumar [hidden email] Date: Wednesday, September 21, 2011 12:01 pm Subject: Re: Making Mumak work with capacity scheduler To: [hidden email] Hi Uma ! I am applying the patch to mumak in the hadoop-0.21 version. Arun On Wed, Sep 21, 2011 at 11:55 AM, Uma Maheswara Rao G [via Lucene] [hidden email] wrote: Hello Arun, On which code base are you trying to apply the patch? The code should match for the patch to apply. Regards, Uma - Original Message - From: ArunKumar [hidden email] Date: Wednesday, September 21, 2011 11:33 am Subject: Making Mumak work with capacity scheduler To: [hidden email] Hi ! I have set up Mumak and am able to run it in the terminal and in Eclipse. I have modified the mapred-site.xml and capacity-scheduler.xml as necessary. I tried to apply the patch MAPREDUCE-1253-20100804.patch from https://issues.apache.org/jira/browse/MAPREDUCE-1253 as follows: {HADOOP_HOME}contrib/mumak$ patch -p0 patch_file_location but I get the error 3 out of 3 HUNKs failed. Thanks, Arun
Re: Any other way to copy to HDFS ?
Hi, You need not copy the files to the NameNode; Hadoop provides client code as well to copy files. To copy files from another node (non-DFS), you need to put the hadoop**.jar's into the classpath and use the below code snippet: FileSystem fs = new DistributedFileSystem(); fs.initialize(NAMENODE_URI, configuration); fs.copyFromLocalFile(srcPath, dstPath); Using this API, you can copy files from any machine. Regards, Uma - Original Message - From: praveenesh kumar praveen...@gmail.com Date: Wednesday, September 21, 2011 2:14 pm Subject: Any other way to copy to HDFS ? To: common-user@hadoop.apache.org Guys, As far as I know hadoop, I think that to copy files to HDFS, they first need to be copied to the NameNode's local filesystem. Is that right? So does it mean that even if I have a hadoop cluster of 10 nodes with an overall capacity of 6TB, but my NameNode's hard disk capacity is 500 GB, I cannot copy any file to HDFS greater than 500 GB? Is there any other way to copy to HDFS directly, without copying the file to the namenode's local filesystem? What can be other ways to copy files larger than the namenode's disk capacity? Thanks, Praveenesh.
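A self-contained sketch of the client-side copy described above, assuming the namenode listens on hdfs://namenode:9000 and with illustrative source/destination paths:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsCopy {
  public static void main(String[] args) throws Exception {
    // connect to the namenode; the file data itself streams to the datanodes
    FileSystem fs = FileSystem.get(new URI("hdfs://namenode:9000/"),
                                   new Configuration());
    fs.copyFromLocalFile(new Path("/data/big.log"),             // local file on the client box
                         new Path("/user/praveenesh/big.log")); // destination in HDFS
    fs.close();
  }
}

This can run from any machine with the hadoop jars on the classpath; nothing is staged on the namenode's local disk.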
Re: Any other way to copy to HDFS ?
For more understanding of the flows, I would recommend you to go through the below docs once: http://hadoop.apache.org/common/docs/r0.16.4/hdfs_design.html#The+File+System+Namespace Regards, Uma - Original Message - From: Uma Maheswara Rao G 72686 mahesw...@huawei.com Date: Wednesday, September 21, 2011 2:36 pm Subject: Re: Any other way to copy to HDFS ? To: common-user@hadoop.apache.org Hi, You need not copy the files to the NameNode; Hadoop provides client code as well to copy files. To copy files from another node (non-DFS), you need to put the hadoop**.jar's into the classpath and use the below code snippet: FileSystem fs = new DistributedFileSystem(); fs.initialize(NAMENODE_URI, configuration); fs.copyFromLocalFile(srcPath, dstPath); Using this API, you can copy files from any machine. Regards, Uma - Original Message - From: praveenesh kumar praveen...@gmail.com Date: Wednesday, September 21, 2011 2:14 pm Subject: Any other way to copy to HDFS ? To: common-user@hadoop.apache.org Guys, As far as I know hadoop, I think that to copy files to HDFS, they first need to be copied to the NameNode's local filesystem. Is that right? So does it mean that even if I have a hadoop cluster of 10 nodes with an overall capacity of 6TB, but my NameNode's hard disk capacity is 500 GB, I cannot copy any file to HDFS greater than 500 GB? Is there any other way to copy to HDFS directly, without copying the file to the namenode's local filesystem? What can be other ways to copy files larger than the namenode's disk capacity? Thanks, Praveenesh.
Re: Any other way to copy to HDFS ?
When you start the NameNode on a Linux machine, it will listen on one address. You can configure that address in the NameNode using fs.default.name. From the clients, you can give this address to connect to your NameNode. The initialize API takes a URI and a configuration. Assume your NameNode is running on hdfs://10.18.52.63:9000. Then you can connect to your NameNode like below: FileSystem fs = new DistributedFileSystem(); fs.initialize(new URI("hdfs://10.18.52.63:9000/"), new Configuration()); Please go through the docs mentioned below; you will understand more. "if I want to copy data from windows machine to namenode machine ?" In DFS, the namenode is responsible only for the namespace. In simple words, to understand the flow quickly: clients ask the NameNode to give some DNs to copy the data to. The NN creates the file entry in the namespace and also returns the block entries based on the client request. Then the clients connect directly to the DNs and copy the data. Reading data back works the same way. I hope you understand better now :-) Regards, Uma - Original Message - From: praveenesh kumar praveen...@gmail.com Date: Wednesday, September 21, 2011 3:11 pm Subject: Re: Any other way to copy to HDFS ? To: common-user@hadoop.apache.org So I want to copy a file from a Windows machine to the Linux namenode. How can I define NAMENODE_URI in the code you mentioned, if I want to copy data from the Windows machine to the namenode machine? Thanks, Praveenesh On Wed, Sep 21, 2011 at 2:37 PM, Uma Maheswara Rao G 72686 mahesw...@huawei.com wrote: For more understanding of the flows, I would recommend you to go through the below docs once: http://hadoop.apache.org/common/docs/r0.16.4/hdfs_design.html#The+File+System+Namespace Regards, Uma - Original Message - From: Uma Maheswara Rao G 72686 mahesw...@huawei.com Date: Wednesday, September 21, 2011 2:36 pm Subject: Re: Any other way to copy to HDFS ? To: common-user@hadoop.apache.org Hi, You need not copy the files to the NameNode; Hadoop provides client code as well to copy files. To copy files from another node (non-DFS), you need to put the hadoop**.jar's into the classpath and use the below code snippet: FileSystem fs = new DistributedFileSystem(); fs.initialize(NAMENODE_URI, configuration); fs.copyFromLocalFile(srcPath, dstPath); Using this API, you can copy files from any machine. Regards, Uma - Original Message - From: praveenesh kumar praveen...@gmail.com Date: Wednesday, September 21, 2011 2:14 pm Subject: Any other way to copy to HDFS ? To: common-user@hadoop.apache.org Guys, As far as I know hadoop, I think that to copy files to HDFS, they first need to be copied to the NameNode's local filesystem. Is that right? So does it mean that even if I have a hadoop cluster of 10 nodes with an overall capacity of 6TB, but my NameNode's hard disk capacity is 500 GB, I cannot copy any file to HDFS greater than 500 GB? Is there any other way to copy to HDFS directly, without copying the file to the namenode's local filesystem? What can be other ways to copy files larger than the namenode's disk capacity? Thanks, Praveenesh.
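For reference, the fs.default.name setting mentioned above, as it might appear in core-site.xml (the address is taken from the example in this thread; substitute your namenode's host and port):

<property>
  <name>fs.default.name</name>
  <value>hdfs://10.18.52.63:9000</value>
</property>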
Re: Fwd: Any other way to copy to HDFS ?
) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59) at $Proxy0.create(Unknown Source) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.init(DFSClient.java:2833) ... 10 more As far as I know, the exception is coming because some user other than my hadoop user is trying to access HDFS. Does it mean I have to change permissions? Or is there any other way to do it from Java code? Thanks, Praveenesh -- Forwarded message -- From: Uma Maheswara Rao G 72686 mahesw...@huawei.com Date: Wed, Sep 21, 2011 at 3:27 PM Subject: Re: Any other way to copy to HDFS ? To: common-user@hadoop.apache.org When you start the NameNode on a Linux machine, it will listen on one address. You can configure that address in the NameNode using fs.default.name. From the clients, you can give this address to connect to your NameNode. The initialize API takes a URI and a configuration. Assume your NameNode is running on hdfs://10.18.52.63:9000. Then you can connect to your NameNode like below: FileSystem fs = new DistributedFileSystem(); fs.initialize(new URI("hdfs://10.18.52.63:9000/"), new Configuration()); Please go through the docs mentioned below; you will understand more. "if I want to copy data from windows machine to namenode machine ?" In DFS, the namenode is responsible only for the namespace. In simple words, to understand the flow quickly: clients ask the NameNode to give some DNs to copy the data to. The NN creates the file entry in the namespace and also returns the block entries based on the client request. Then the clients connect directly to the DNs and copy the data. Reading data back works the same way. I hope you understand better now :-) Regards, Uma - Original Message - From: praveenesh kumar praveen...@gmail.com Date: Wednesday, September 21, 2011 3:11 pm Subject: Re: Any other way to copy to HDFS ? To: common-user@hadoop.apache.org So I want to copy a file from a Windows machine to the Linux namenode. How can I define NAMENODE_URI in the code you mentioned, if I want to copy data from the Windows machine to the namenode machine? Thanks, Praveenesh On Wed, Sep 21, 2011 at 2:37 PM, Uma Maheswara Rao G 72686 mahesw...@huawei.com wrote: For more understanding of the flows, I would recommend you to go through the below docs once: http://hadoop.apache.org/common/docs/r0.16.4/hdfs_design.html#The+File+System+Namespace Regards, Uma - Original Message - From: Uma Maheswara Rao G 72686 mahesw...@huawei.com Date: Wednesday, September 21, 2011 2:36 pm Subject: Re: Any other way to copy to HDFS ? To: common-user@hadoop.apache.org Hi, You need not copy the files to the NameNode; Hadoop provides client code as well to copy files. To copy files from another node (non-DFS), you need to put the hadoop**.jar's into the classpath and use the below code snippet: FileSystem fs = new DistributedFileSystem(); fs.initialize(NAMENODE_URI, configuration); fs.copyFromLocalFile(srcPath, dstPath); Using this API, you can copy files from any machine. Regards, Uma - Original Message - From: praveenesh kumar praveen...@gmail.com Date: Wednesday, September 21, 2011 2:14 pm Subject: Any other way to copy to HDFS ? To: common-user@hadoop.apache.org Guys, As far as I know hadoop, I think that to copy files to HDFS, they first need to be copied to the NameNode's local filesystem. Is that right? So does it mean that even if I have a hadoop cluster of 10 nodes with an overall capacity of 6TB, but my NameNode's hard disk capacity is 500 GB, I cannot copy any file to HDFS greater than 500 GB? Is there any other way to copy to HDFS directly, without copying the file to the namenode's local filesystem? What can be other ways to copy files larger than the namenode's disk capacity? Thanks, Praveenesh.
Re: Problem with MR job
Hi, Did any cluster restart happen? Is your NameNode detecting the DataNodes as live? It looks like the DNs have not reported any blocks to the NN yet. You have 13 blocks persisted in the NameNode namespace; at least 12 blocks should be reported from your DNs, otherwise it will not come out of safemode automatically. Regards, Uma - Original Message - From: George Kousiouris gkous...@mail.ntua.gr Date: Wednesday, September 21, 2011 7:29 pm Subject: Problem with MR job To: common-user@hadoop.apache.org common-user@hadoop.apache.org Hi all, We are trying to run a mahout job in a hadoop cluster, but we keep getting the same status. The job passes the initial mahout stages, and when it comes to be executed as an MR job, it seems to be stuck at 0% progress. Through the UI we see that it is submitted but not running. After a while it gets killed. In the logs the error shown is this one: 2011-09-21 07:47:50,507 INFO org.apache.hadoop.mapred.JobTracker: problem cleaning system directory: hdfs://master/var/lib/hadoop-0.20/cache/hdfs/mapred/system org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /var/lib/hadoop-0.20/cache/hdfs/mapred/system. Name nod$ The reported blocks 0 needs additional 12 blocks to reach the threshold 0.9990 of total blocks 13. Safe mode will be turned off automatically. at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:1966) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:1940) at org.apache.hadoop.hdfs.server.namenode.NameNode.mkdirs(NameNode.java:770) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) Some staging files seem to have been created, however. I was thinking of sending this to the mahout mailing list, but it seems a more core hadoop issue. We are using the following command to launch the mahout example: ./mahout org.apache.mahout.clustering.syntheticcontrol.kmeans.Job --input hdfs://master/user/hdfs/testdata/synthetic_control.data --output hdfs://master/user/hdfs/testdata/output --t1 0.5 --t2 1 --maxIter 50 Any clues? George -- --- George Kousiouris Electrical and Computer Engineer Division of Communications, Electronics and Information Engineering School of Electrical and Computer Engineering Tel: +30 210 772 2546 Mobile: +30 6939354121 Fax: +30 210 772 2569 Email: gkous...@mail.ntua.gr Site: http://users.ntua.gr/gkousiou/ National Technical University of Athens 9 Heroon Polytechniou str., 157 73 Zografou, Athens, Greece
Re: Problem with MR job
Can you check your DN data directories once, to see whether the blocks are present or not? Can you share the DN and NN logs? Please put them on some site and share the link here. Regards, Uma - Original Message - From: George Kousiouris gkous...@mail.ntua.gr Date: Wednesday, September 21, 2011 8:06 pm Subject: Re: Problem with MR job To: common-user@hadoop.apache.org Cc: Uma Maheswara Rao G 72686 mahesw...@huawei.com Hi, Some more logs, specifically from the JobTracker: 2011-09-21 10:22:43,482 INFO org.apache.hadoop.mapred.JobInProgress: Initializing job_201109211018_0001 2011-09-21 10:22:43,538 ERROR org.apache.hadoop.mapred.JobHistory: Failed creating job history log file for job job_201109211018_0001 java.io.FileNotFoundException: /usr/lib/hadoop-0.20/logs/history/master_1316614721548_job_201109211018_0001_hdfs_Input+Driver+running+over+input%3A+hdfs%3A%2F%2Fmaster%2Fuse (P$ at java.io.FileOutputStream.open(Native Method) at java.io.FileOutputStream.init(FileOutputStream.java:179) at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.init(RawLocalFileSystem.java:189) at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.init(RawLocalFileSystem.java:185) at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:243) at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.init(ChecksumFileSystem.java:336) at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:369) at org.apache.hadoop.mapred.JobHistory$JobInfo.logSubmitted(JobHistory.java:1223) at org.apache.hadoop.mapred.JobInProgress$3.run(JobInProgress.java:681) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115) at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:678) at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:4013) at org.apache.hadoop.mapred.EagerTaskInitializationListener$InitJob.run(EagerTaskInitializationListener.java:79) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) 2011-09-21 10:22:43,666 ERROR org.apache.hadoop.mapred.JobHistory: Failed to store job conf in the log dir java.io.FileNotFoundException: /usr/lib/hadoop-0.20/logs/history/master_1316614721548_job_201109211018_0001_conf.xml (Permission denied) at java.io.FileOutputStream.open(Native Method) at java.io.FileOutputStream.init(FileOutputStream.java:179) at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.init(RawLocalFileSystem.java:189) at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.init(RawLocalFileSystem.java:185) at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:243) at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.init(ChecksumFileSystem.java:336) at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:369) On 9/21/2011 5:15 PM, George Kousiouris wrote: Hi, The status seems healthy and the datanodes live: Status: HEALTHY Total size: 118805326 B Total dirs: 31 Total files: 38 Total blocks (validated): 38 (avg. block size 3126455 B) Minimally replicated blocks: 38 (100.0 %) Over-replicated blocks: 0 (0.0 %) Under-replicated blocks: 9 (23.68421 %) Mis-replicated blocks: 0 (0.0 %) Default replication factor: 1 Average block replication: 1.2368422 Corrupt blocks: 0 Missing replicas: 72 (153.19148 %) Number of data-nodes: 2 Number of racks: 1 FSCK ended at Wed Sep 21 10:06:17 EDT 2011 in 9 milliseconds The filesystem under path '/' is HEALTHY The jps command has the following output: hdfs@master:~$ jps 24292 SecondaryNameNode 30010 Jps 24109 DataNode 23962 NameNode Shouldn't this have two datanode listings? In our system, one of the datanodes and the namenode are the same machine, but I seem to remember that in the past, even with this setup, two datanode listings appeared in the jps output. Thanks, George On 9/21/2011 5:08 PM, Uma Maheswara Rao G 72686 wrote: Hi, Did any cluster restart happen? Is your NameNode detecting the DataNodes as live? It looks like the DNs have not reported any blocks to the NN yet. You have 13 blocks persisted in the NameNode namespace; at least 12 blocks should be reported from your DNs, otherwise it will not come out of safemode automatically. Regards, Uma - Original
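Three dfsadmin commands that help diagnose this kind of safe-mode hang (force-leaving safe mode is only sensible if you are sure the blocks really exist on the datanodes):

hadoop dfsadmin -report          # live datanodes and the blocks they have reported
hadoop dfsadmin -safemode get    # shows whether the namenode is still in safe mode
hadoop dfsadmin -safemode leave  # forces the namenode out of safe mode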
Re: risks of using Hadoop
Jignesh, Please see my comments inline. - Original Message - From: Kobina Kwarko kobina.kwa...@gmail.com Date: Wednesday, September 21, 2011 9:33 pm Subject: Re: risks of using Hadoop To: common-user@hadoop.apache.org Jignesh, Will your point 2 still be valid if we hire very experienced Java programmers? Kobina. On 20 September 2011 21:07, Jignesh Patel jign...@websoft.com wrote: @Kobina 1. Lack of skill set 2. Longer learning curve 3. Single point of failure @Uma I am curious to know about .20.2 - is that stable? Is it the same as the one you mention in your email (Federation changes)? If I need a scaled namenode and append support, which version should I choose? Regarding single point of failure, I believe Hortonworks (a.k.a Yahoo) is updating the Hadoop API. When will that be integrated with Hadoop? If I need Yes, 0.20 versions are stable. Federation changes will not be available in 0.20 versions. I think the Federation changes have been merged to the 0.23 branch, so from 0.23 onwards you can get the Federation implementation. But there has been no release of the 0.23 branch yet. Regarding NameNode high availability, there is one issue, HDFS-1623, to build it (in progress). This may take a couple of months to integrate. -Jignesh On Sep 17, 2011, at 12:08 AM, Uma Maheswara Rao G 72686 wrote: Hi Kobina, Some experiences which may be helpful for you with respect to DFS. 1. Selecting the correct version. I recommend using a 0.20X version. This is a pretty stable version and all other organizations prefer it. Well tested as well. Don't go for the 21 version; it is not a stable version, and that is a risk. 2. You should perform thorough tests with your customer operations (of course you will do this :-)). 3. 0.20x versions have the problem of a SPOF. If the NameNode goes down you will lose the data. One way of recovering is by using the SecondaryNameNode: you can recover the data up to the last checkpoint, but manual intervention is required. In the latest trunk, the SPOF will be addressed by HDFS-1623. 4. 0.20x NameNodes cannot scale. Federation changes are included in later versions (I think in 22). This may not be a problem for your cluster, but please consider this aspect as well. 5. Please select the hadoop version depending on your security requirements. There are versions available with security as well in 0.20X. 6. If you plan to use HBase, it requires append support. 20Append has the support for append. The 0.20.205 release will also have append support but is not yet released. Choose your version correctly to avoid sudden surprises. Regards, Uma - Original Message - From: Kobina Kwarko kobina.kwa...@gmail.com Date: Saturday, September 17, 2011 3:42 am Subject: Re: risks of using Hadoop To: common-user@hadoop.apache.org We are planning to use Hadoop in my organisation for quality of services analysis out of CDR records from mobile operators. We are thinking of having a small cluster of maybe 10 - 15 nodes, and I'm preparing the proposal. My office requires that I provide some risk analysis in the proposal. Thank you. On 16 September 2011 20:34, Uma Maheswara Rao G 72686 mahesw...@huawei.com wrote: Hello, First of all, where are you planning to use Hadoop? Regards, Uma - Original Message - From: Kobina Kwarko kobina.kwa...@gmail.com Date: Saturday, September 17, 2011 0:41 am Subject: risks of using Hadoop To: common-user common-user@hadoop.apache.org Hello, Please can someone point out some of the risks we may incur if we decide to implement Hadoop? BR, Isaac. Regards, Uma
Re: RE: risks of using Hadoop
Absolutely agree with you. Mainly we should consider SPOF and minimize the problem with our carefulness. (there are many ways to minimize this issue, we have seen in this thread) Regards, Uma - Original Message - From: Bill Habermaas bill.haberm...@oracle.com Date: Thursday, September 22, 2011 10:04 am Subject: RE: risks of using Hadoop To: common-user@hadoop.apache.org Amen to that. I haven't heard a good rant in a long time, I am definitely amused end entertained. As a veteran of 3 years with Hadoop I will say that the SPOF issue is whatever you want to make it. But it has not, nor will it ever defer me from using this great system. Every system has its risks and they can be minimized by careful architectural crafting and intelligent usage. Bill -Original Message- From: Michael Segel [mailto:michael_se...@hotmail.com] Sent: Wednesday, September 21, 2011 1:48 PM To: common-user@hadoop.apache.org Subject: RE: risks of using Hadoop Kobina The points 1 and 2 are definitely real risks. SPOF is not. As I pointed out in my mini-rant to Tom was that your end users / developers who use the cluster can do more harm to your cluster than a SPOF machine failure. I don't know what one would consider a 'long learning curve'. With the adoption of any new technology, you're talking at least 3-6 months based on the individual and the overall complexity of the environment. Take anyone who is a strong developer, put them through Cloudera's training, plus some play time, and you've shortened the learning curve.The better the java developer, the easier it is for them to pick up Hadoop. I would also suggest taking the approach of hiring a senior person who can cross train and mentor your staff. This too will shorten the runway. HTH -Mike Date: Wed, 21 Sep 2011 17:02:45 +0100 Subject: Re: risks of using Hadoop From: kobina.kwa...@gmail.com To: common-user@hadoop.apache.org Jignesh, Will your point 2 still be valid if we hire very experienced Java programmers? Kobina. On 20 September 2011 21:07, Jignesh Patel jign...@websoft.com wrote: @Kobina 1. Lack of skill set 2. Longer learning curve 3. Single point of failure @Uma I am curious to know about .20.2 is that stable? Is it same as the one you mention in your email(Federation changes), If I need scaled nameNode and append support, which version I should choose. Regarding Single point of failure, I believe Hortonworks(a.k.a Yahoo) is updating the Hadoop API. When that will be integrated with Hadoop. If I need -Jignesh On Sep 17, 2011, at 12:08 AM, Uma Maheswara Rao G 72686 wrote: Hi Kobina, Some experiences which may helpful for you with respective to DFS. 1. Selecting the correct version. I will recommend to use 0.20X version. This is pretty stable version and all other organizations prefers it. Well tested as well. Dont go for 21 version.This version is not a stable version.This is risk. 2. You should perform thorough test with your customer operations.(of-course you will do this :-)) 3. 0.20x version has the problem of SPOF. If NameNode goes down you will loose the data.One way of recovering is by using the secondaryNameNode.You can recover the data till last checkpoint.But here manual intervention is required. In latest trunk SPOF will be addressed bu HDFS-1623. 4. 0.20x NameNodes can not scale. Federation changes included in latest versions. ( i think in 22). this may not be the problem for your cluster. But please consider this aspect as well. 5. Please select the hadoop version depending on your security requirements. 
There are versions available with security as well in the 0.20.x line. 6. If you plan to use HBase, it requires append support. The 20Append branch has the support for append. The 0.20.205 release will also have append support, but it is not yet released. Choose the correct version to avoid sudden surprises. Regards, Uma
- Original Message - From: Kobina Kwarko kobina.kwa...@gmail.com Date: Saturday, September 17, 2011 3:42 am Subject: Re: risks of using Hadoop To: common-user@hadoop.apache.org
We are planning to use Hadoop in my organisation for quality of services analysis out of CDR records from mobile operators. We are thinking of having a small cluster of maybe 10 - 15 nodes, and I'm preparing the proposal. My office requires that I provide some risk analysis in the proposal. Thank you.
On 16 September 2011 20:34, Uma Maheswara Rao G 72686 mahesw...@huawei.com wrote: Hello, First of all, where are you planning to use Hadoop? Regards, Uma - Original Message - From: Kobina Kwarko kobina.kwa...@gmail.com Date: Saturday
Re: Can we replace namenode machine with some other machine ?
Just changing the configs will not affect your data. You need to restart your DNs so that they connect to the new NN. For the second question: it again depends on your usage. If you have more files in DFS, the NN will consume more memory, as it needs to store all the metadata info of the files in its namespace. If your file count keeps growing, it is recommended not to put the NN and JT on the same machine. Coming to the DN case: the configured space is used for storing the block files. Once that space fills up, the NN will not select this DN for further writes. So one DN with less space is a smaller problem than an undersized NN in a big cluster: if you configure DNs with a very good amount of space, but the NN has too little capacity to store the metadata info for your files, then the extra space in the DNs is of no use, right :-)
Regards, Uma
- Original Message - From: praveenesh kumar praveen...@gmail.com Date: Thursday, September 22, 2011 10:42 am Subject: Re: Can we replace namenode machine with some other machine ? To: common-user@hadoop.apache.org
If I just change configuration settings in the slave machines, will it affect any of the data that is currently residing in the cluster? And my second question was: do we need the master node (NN/JT hosting machine) to have a better configuration than our slave machines (DN/TT hosting machines)? Actually my master node is a weaker machine than my slave machines, because I am assuming that the master machine does not do much additional work, and it's okay to have a weak machine as master. Now I have a new big server machine just added to my cluster. So I am thinking: shall I make this new machine my new master (NN/JT), or just add it as a slave? Thanks, Praveenesh
On Thu, Sep 22, 2011 at 10:20 AM, Uma Maheswara Rao G 72686 mahesw...@huawei.com wrote: You can copy the same installation to the new machine and change the IP address. After that, configure the new NN address in your clients and DNs. Also Does Namenode/JobTracker machine's configuration needs to be better than datanodes/tasktracker's ?? I did not get this question. Regards, Uma - Original Message - From: praveenesh kumar praveen...@gmail.com Date: Thursday, September 22, 2011 10:13 am Subject: Can we replace namenode machine with some other machine ? To: common-user@hadoop.apache.org Hi all, Can we replace our namenode machine later with some other machine? Actually I got a new server machine in my cluster and now I want to make this machine my new namenode and jobtracker node. Also, does the Namenode/JobTracker machine's configuration need to be better than the datanodes'/tasktrackers'? How can I achieve this target with the least overhead? Thanks, Praveenesh
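For a rough sense of how NN memory relates to file count, here is a back-of-envelope sketch. The ~150 bytes per metadata object figure is a commonly quoted rule of thumb, not anything from this thread, and the counts are made-up assumptions:

    // Back-of-envelope NameNode heap estimate (order of magnitude only).
    public class NnHeapEstimate {
        public static void main(String[] args) {
            long files = 10000000L;       // assumed file count
            long blocksPerFile = 2L;      // assumed average blocks per file
            long bytesPerObject = 150L;   // rule-of-thumb metadata cost per file/block
            long objects = files + files * blocksPerFile;
            double heapGb = objects * bytesPerObject / (1024.0 * 1024 * 1024);
            // ~4.2 GB of heap for 10M files at 2 blocks each
            System.out.printf("~%.1f GB of NameNode heap for %d files%n", heapGb, files);
        }
    }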
Re: RE: java.io.IOException: Incorrect data format
Are you able to create the directory manually on the DataNode machine? # mkdir -p /state/partition2/hadoop/dfs/tmp
Regards, Uma
- Original Message - From: Peng, Wei wei.p...@xerox.com Date: Wednesday, September 21, 2011 9:44 am Subject: RE: java.io.IOException: Incorrect data format To: common-user@hadoop.apache.org
I modified the edits file so that the hadoop namenode restarted; however, I could not start my datanode. The datanode log shows
2011-09-20 21:07:10,068 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Mkdirs failed to create /state/partition2/hadoop/dfs/tmp
at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.init(FSDataset.java:394)
at org.apache.hadoop.hdfs.server.datanode.FSDataset.init(FSDataset.java:894)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:318)
at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:232)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1363)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1318)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1326)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1448)
Wei
-Original Message- From: Uma Maheswara Rao G 72686 [mailto:mahesw...@huawei.com] Sent: Tuesday, September 20, 2011 9:10 PM To: common-user@hadoop.apache.org Subject: Re: java.io.IOException: Incorrect data format
Can you check the output of 'df -h' on the NN machine? I think one more possibility could be that the image was corrupted while it was being saved. To avoid such cases, this has already been handled in trunk. For more details see https://issues.apache.org/jira/browse/HDFS-1594
Regards, Uma
- Original Message - From: Peng, Wei wei.p...@xerox.com Date: Wednesday, September 21, 2011 9:01 am Subject: java.io.IOException: Incorrect data format To: common-user@hadoop.apache.org
I was not able to restart my name server because the name server ran out of space. Then I adjusted dfs.datanode.du.reserved to 0 and used tune2fs -m to get some space, but I still could not restart the name node. I got the following error: java.io.IOException: Incorrect data format. logVersion is -18 but writables.length is 0. Does anyone know how to resolve this issue? Best, Wei
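To check whether that Mkdirs failure is an OS-level problem rather than a Hadoop one, a tiny hypothetical test program (run as the same user the DataNode runs as; the path is copied from the log above) could be:

    import java.io.File;

    // If mkdirs() fails here too, the cause is outside HDFS: permissions on a
    // parent directory, a read-only mount, or a full disk. Note that mkdirs()
    // also returns false when the directory already exists.
    public class MkdirsCheck {
        public static void main(String[] args) {
            File dir = new File("/state/partition2/hadoop/dfs/tmp");
            System.out.println("mkdirs returned: " + dir.mkdirs());
            System.out.println("exists now: " + dir.exists());
            System.out.println("parent writable: " + dir.getParentFile().canWrite());
        }
    }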
Re: Out of heap space errors on TTs
Hello, You need to configure the heap size for child tasks using the below property: mapred.child.java.opts in mapred-site.xml. By default it is 200 MB, but your io.sort.mb (300) is more than that. So configure more heap space for the child tasks, e.g. -Xmx512m. Regards, Uma
- Original Message - From: john smith js1987.sm...@gmail.com Date: Monday, September 19, 2011 6:14 pm Subject: Out of heap space errors on TTs To: common-user@hadoop.apache.org
Hey guys, I am running Hive and I am trying to join two tables (2.2GB and 136MB) on a cluster of 9 nodes (replication = 3). Hadoop version - 0.20.2. Each data node memory - 2GB. HADOOP_HEAPSIZE - 1000MB. Other heap settings are defaults. My Hive job launches 40 map tasks and every task failed with the same error:
2011-09-19 18:37:17,110 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 300
2011-09-19 18:37:17,223 FATAL org.apache.hadoop.mapred.TaskTracker: Error running child : java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:781)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:350)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
Looks like I need to tweak some of the heap settings for TTs to handle the memory efficiently. I am unable to understand which variables to modify (there are too many related to heap sizes). Any specific things I must look at? Thanks, jS
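As a sketch of how the two knobs relate inside a job configuration (property names are the 0.20-era ones discussed above; the 512 MB value is illustrative, not prescriptive):

    import org.apache.hadoop.mapred.JobConf;

    // io.sort.mb allocates the map-side sort buffer inside the child JVM,
    // so it must fit comfortably within the child heap (512 MB vs 300 MB here).
    public class ChildHeapConf {
        public static void main(String[] args) {
            JobConf conf = new JobConf();
            conf.set("mapred.child.java.opts", "-Xmx512m");
            conf.setInt("io.sort.mb", 300);
            System.out.println(conf.get("mapred.child.java.opts"));
        }
    }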
Re: Out of heap space errors on TTs
Hello John, You can use the below properties: mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum. By default each of these is 2. AFAIK you can reduce io.sort.mb, but disk usage will then be higher. Since this is related to mapred, I have moved this discussion to the mapreduce list and cc'ed common. Regards, Uma
- Original Message - From: john smith js1987.sm...@gmail.com Date: Monday, September 19, 2011 7:02 pm Subject: Re: Out of heap space errors on TTs To: common-user@hadoop.apache.org
Hi all, Thanks for the inputs. Can I reduce the io.sort.mb? (owing to the fact that I have a small RAM size, 2GB) My conf files don't have an entry for mapred.child.java.opts, so I guess it's taking the default value of 200MB. Also, how do I decide the number of tasks per TT? I have 4 cores per node and 2GB of total memory, so what should I set as the maximum tasks per node? Thanks
On Mon, Sep 19, 2011 at 6:28 PM, Uma Maheswara Rao G 72686 mahesw...@huawei.com wrote: Hello, You need to configure the heap size for child tasks using the below property: mapred.child.java.opts in mapred-site.xml. By default it is 200 MB, but your io.sort.mb (300) is more than that. So configure more heap space for the child tasks, e.g. -Xmx512m. Regards, Uma - Original Message - From: john smith js1987.sm...@gmail.com Date: Monday, September 19, 2011 6:14 pm Subject: Out of heap space errors on TTs To: common-user@hadoop.apache.org Hey guys, I am running Hive and I am trying to join two tables (2.2GB and 136MB) on a cluster of 9 nodes (replication = 3). Hadoop version - 0.20.2. Each data node memory - 2GB. HADOOP_HEAPSIZE - 1000MB. Other heap settings are defaults. My Hive job launches 40 map tasks and every task failed with the same error: 2011-09-19 18:37:17,110 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 300 2011-09-19 18:37:17,223 FATAL org.apache.hadoop.mapred.TaskTracker: Error running child : java.lang.OutOfMemoryError: Java heap space at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:781) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:350) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307) at org.apache.hadoop.mapred.Child.main(Child.java:170) Looks like I need to tweak some of the heap settings for TTs to handle the memory efficiently. I am unable to understand which variables to modify (there are too many related to heap sizes). Any specific things I must look at? Thanks, jS
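A rough slot-count sketch for a node like jS's (4 cores, 2 GB RAM); the daemon overhead figure is an assumption for illustration, not a measurement:

    // Budget: RAM left after the DN/TT daemons, divided by the per-child heap.
    public class SlotBudget {
        public static void main(String[] args) {
            int totalMb = 2048;      // node RAM
            int daemonsMb = 1024;    // assumed DN + TT + OS overhead
            int childHeapMb = 200;   // default mapred.child.java.opts
            int slots = (totalMb - daemonsMb) / childHeapMb; // ~5 slots
            System.out.println("map+reduce slots that fit: " + slots);
            // Split across mapred.tasktracker.map.tasks.maximum and
            // mapred.tasktracker.reduce.tasks.maximum (e.g. 3 maps + 2 reduces),
            // keeping the total near the 4 available cores.
        }
    }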
Re: Namenode server is not starting for lily.
One more point to check: did you copy the .sh files from a Windows box? If yes, please do a dos2unix conversion if your target OS is Linux. The other point is that it is clear the format was aborted: you need to give the capital 'Y' option instead of 'y' (Harsh mentioned it). Thanks, Uma
- Original Message - From: Harsh J ha...@cloudera.com Date: Monday, September 19, 2011 10:28 pm Subject: Re: Namenode server is not starting for lily. To: common-user@hadoop.apache.org
Hey Rahul, Few things: - For the format command, please enter a capital 'Y' when it prompts for a reformat. I don't see why you need a reformat, but if that's what you want to do… (Side note: on future versions, hadoop will do OK with regular answers like y/Y etc.). - Ensure you have bash installed on all your nodes. - Can you paste your hadoop-env.sh contents to a site like pastebin.com and post the link? Or paste it into a reply mail. I'm guessing you have some odd formatting in there, probably.
On Mon, Sep 19, 2011 at 12:45 PM, Rahul Mehta rahul23134...@gmail.com wrote: Hi, Actually we want to run lily separately from hadoop, hbase and all, but when I start the lily server it says the namenode server is not started. Please suggest what the problem is. When I run bin/hadoop namenode -format it gives me the following result:
11/09/19 05:07:49 INFO namenode.NameNode: STARTUP_MSG:
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = basf/184.106.83.141
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 0.20.2-cdh3u1
STARTUP_MSG: build = file:///tmp/nightly_2011-07-18_07-57-52_3/hadoop-0.20-0.20.2+923.97-1~maverick -r bdafb1dbffd0d5f2fbc6ee022e1c8df6500fd638; compiled by 'root' on Mon Jul 18 09:40:07 PDT 2011
Re-format filesystem in /home/reach121/lily-1.0.1/data/dfs_name_dir ? (Y or N) y
Format aborted in /home/reach121/lily-1.0.1/data/dfs_name_dir
11/09/19 05:07:59 INFO namenode.NameNode: SHUTDOWN_MSG:
SHUTDOWN_MSG: Shutting down NameNode at basf/184.106.83.141
and when I start bin/start-all.sh it gives me the following result.
[sudo] password for reach121:
/usr/lib/hadoop-0.20/bin/../conf/hadoop-env.sh: line 1: 0: command not found
/usr/lib/hadoop-0.20/bin/../conf/hadoop-env.sh: line 1: 136: command not found
/usr/lib/hadoop-0.20/bin/../conf/hadoop-env.sh: line 1: 0c#: command not found
/usr/lib/hadoop-0.20/bin/../conf/hadoop-env.sh: line 1: 0: command not found
/usr/lib/hadoop-0.20/bin/../conf/hadoop-env.sh: line 1: 136: command not found
/usr/lib/hadoop-0.20/bin/../conf/hadoop-env.sh: line 1: 0c#: command not found
/usr/lib/hadoop-0.20/bin/../conf/hadoop-env.sh: line 1: 0: command not found
/usr/lib/hadoop-0.20/bin/../conf/hadoop-env.sh: line 1: 136: command not found
/usr/lib/hadoop-0.20/bin/../conf/hadoop-env.sh: line 1: 0c#: command not found
/usr/lib/hadoop-0.20/bin/../conf/hadoop-env.sh: line 1: 0: command not found
/usr/lib/hadoop-0.20/bin/../conf/hadoop-env.sh: line 1: 136: command not found
/usr/lib/hadoop-0.20/bin/../conf/hadoop-env.sh: line 1: 0c#: command not found
starting namenode, logging to /usr/lib/hadoop-0.20/bin/../logs/hadoop-root-namenode-basf.out
/usr/lib/hadoop-0.20/bin/../conf/hadoop-env.sh: line 1: 0: command not found
/usr/lib/hadoop-0.20/bin/../conf/hadoop-env.sh: line 1: 136: command not found
/usr/lib/hadoop-0.20/bin/../conf/hadoop-env.sh: line 1: 0c#: command not found
/usr/lib/hadoop-0.20/bin/../conf/hadoop-env.sh: line 1: 0: command not found
/usr/lib/hadoop-0.20/bin/../conf/hadoop-env.sh: line 1: 136: command not found
/usr/lib/hadoop-0.20/bin/../conf/hadoop-env.sh: line 1: 0c#: command not found
May not run daemons as root. Please specify HADOOP_NAMENODE_USER
/usr/lib/hadoop-0.20/bin/../conf/hadoop-env.sh: line 1: 0: command not found
/usr/lib/hadoop-0.20/bin/../conf/hadoop-env.sh: line 1: 136: command not found
/usr/lib/hadoop-0.20/bin/../conf/hadoop-env.sh: line 1: 0c#: command not found
/usr/lib/hadoop-0.20/bin/../conf/hadoop-env.sh: line 1: 0: command not found
/usr/lib/hadoop-0.20/bin/../conf/hadoop-env.sh: line 1: 136: command not found
/usr/lib/hadoop-0.20/bin/../conf/hadoop-env.sh: line 1: 0c#: command not found
/usr/lib/hadoop-0.20/bin/../conf/hadoop-env.sh: line 1: 0: command not found
/usr/lib/hadoop-0.20/bin/../conf/hadoop-env.sh: line 1: 136: command not found
/usr/lib/hadoop-0.20/bin/../conf/hadoop-env.sh: line 1: 0c#: command not found
The authenticity of host 'basf (184.106.83.141)' can't be established. RSA key fingerprint is
Re: Submitting Jobs from different user to a queue in capacity scheduler
Did you give permissions recursively? $ sudo chown -R hduser:hadoop hadoop
Regards, Uma
- Original Message - From: ArunKumar arunk...@gmail.com Date: Sunday, September 18, 2011 12:00 pm Subject: Submitting Jobs from different user to a queue in capacity scheduler To: hadoop-u...@lucene.apache.org
Hi! I have set up hadoop on my machine as per http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/ I am able to run applications with the capacity scheduler by submitting jobs to a particular queue from hduser, the owner of the hadoop install. I tried this from another user: 1. Configured ssh 2. Changed the hadoop extract's permission to 777. 3. Updated $HOME/.bashrc as per the above link 4. Changed hadoop.tmp.dir permission to 777. 5. $bin/start-all.sh gives
chown: changing ownership of `/home/hduser/hadoop203/bin/../logs': Operation not permitted
starting namenode, logging to /home/hduser/hadoop203/bin/../logs/hadoop-arun-namenode-arun-Presario-C500-RU914PA-ACJ.out
localhost: chown: changing ownership of `/home/hduser/hadoop203/bin/../logs': Operation not permitted
localhost: starting datanode, logging to /home/hduser/hadoop203/bin/../logs/hadoop-arun-datanode-arun-Presario-C500-RU914PA-ACJ.out
localhost: chown: changing ownership of `/home/hduser/hadoop203/bin/../logs': Operation not permitted
localhost: starting secondarynamenode, logging to /home/hduser/hadoop203/bin/../logs/hadoop-arun-secondarynamenode-arun-Presario-C500-RU914PA-ACJ.out
chown: changing ownership of `/home/hduser/hadoop203/bin/../logs': Operation not permitted
starting jobtracker, logging to /home/hduser/hadoop203/bin/../logs/hadoop-arun-jobtracker-arun-Presario-C500-RU914PA-ACJ.out
localhost: chown: changing ownership of `/home/hduser/hadoop203/bin/../logs': Operation not permitted
localhost: starting tasktracker, logging to /home/hduser/hadoop203/bin/../logs/hadoop-arun-tasktracker-arun-Presario-C500-RU914PA-ACJ.out
How can I submit jobs from other users? Any help? Thanks, Arun
Re: Submitting Jobs from different user to a queue in capacity scheduler
Hello Arun, Now we have reached Hadoop permissions ;) If you really don't need to worry about permissions, you can disable them and proceed (dfs.permissions = false); otherwise you can grant the required permissions to the user. See the permissions guide: http://hadoop.apache.org/common/docs/current/hdfs_permissions_guide.html
Regards, Uma
- Original Message - From: ArunKumar arunk...@gmail.com Date: Sunday, September 18, 2011 1:38 pm Subject: Re: Submitting Jobs from different user to a queue in capacity scheduler To: hadoop-u...@lucene.apache.org
Hi! I had given permissions in the beginning: $ sudo chown -R hduser:hadoop hadoop. I also gave $ chmod -R 777 hadoop. When I try arun$ /home/hduser/hadoop203/bin/hadoop jar /home/hduser/hadoop203/hadoop-examples*.jar pi 1 1 I get
Number of Maps = 1 Samples per Map = 1
org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=arun, access=WRITE, inode=user:hduser:supergroup:rwxr-xr-x
Caused by: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.security.AccessControlException: Permission denied: user=arun, access=WRITE, inode=user:hduser:supergroup:rwxr-xr-x at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:199)
I have attached mapred-site.xml http://pastebin.com/scS6EevU here and capacity-scheduler.xml http://pastebin.com/ScGFAfv5 here. Arun
Re: Submitting Jobs from different user to a queue in capacity scheduler
Hi Arun, Setting the mapreduce.jobtracker.staging.root.dir property value to /user might fix this issue... or the other way could be to just execute the below command: hadoop fs -chmod 777 / Regards, Uma
- Original Message - From: ArunKumar arunk...@gmail.com Date: Sunday, September 18, 2011 8:38 pm Subject: Re: Submitting Jobs from different user to a queue in capacity scheduler To: hadoop-u...@lucene.apache.org
Hi Uma! I have deleted the data in /app/hadoop/tmp, formatted the namenode and restarted the cluster. I tried arun$ /home/hduser/hadoop203/bin/hadoop jar /home/hduser/hadoop203/hadoop-examples*.jar pi 1 1
Number of Maps = 1 Samples per Map = 1
org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=arun, access=WRITE, inode=:hduser:supergroup:rwxr-xr-x
Arun
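A narrower alternative, assuming the real problem is that user arun has no home directory in HDFS, is for the superuser (hduser) to create /user/arun and hand it over. A sketch; the NN URI and group name are placeholders:

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.permission.FsPermission;

    // Run as the HDFS superuser. Gives 'arun' a writable home directory
    // without opening up the whole filesystem with chmod 777 /.
    public class MakeUserHome {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(new URI("hdfs://localhost:9000"), conf);
            Path home = new Path("/user/arun");
            fs.mkdirs(home, new FsPermission((short) 0755));
            fs.setOwner(home, "arun", "arun"); // user, group (group is a guess)
            fs.close();
        }
    }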
Re: Submitting Jobs from different user to a queue in capacity scheduler
Agreed. I suggested the 'dfs.permissions' flag also, earlier in this thread. :-) Regards, Uma
- Original Message - From: Aaron T. Myers a...@cloudera.com Date: Monday, September 19, 2011 7:45 am Subject: Re: Submitting Jobs from different user to a queue in capacity scheduler To: common-user@hadoop.apache.org Cc: hadoop-u...@lucene.apache.org
On Sun, Sep 18, 2011 at 9:35 AM, Uma Maheswara Rao G 72686 mahesw...@huawei.com wrote: or other way could be, just execute below command hadoop fs -chmod 777 /
I wouldn't do this - it's overkill, and there's no way to go back. Instead, if you really want to disregard all permissions on HDFS, you can just set the config value dfs.permissions to false and restart your NN. This is still overkill, but at least you could roll back if you change your mind later. :) -- Aaron T. Myers Software Engineer, Cloudera
Re: risks of using Hadoop
Hi George, You can use it for normal writes as well; the append interfaces will be exposed. For HBase, append support is required very much. Regards, Uma
- Original Message - From: George Kousiouris gkous...@mail.ntua.gr Date: Saturday, September 17, 2011 12:29 pm Subject: Re: risks of using Hadoop To: common-user@hadoop.apache.org Cc: Uma Maheswara Rao G 72686 mahesw...@huawei.com
Hi, When you say that 0.20.205 will support appends, do you mean for general-purpose writes on HDFS, or only HBase? Thanks, George
On 9/17/2011 7:08 AM, Uma Maheswara Rao G 72686 wrote: 6. If you plan to use HBase, it requires append support. The 20Append branch has the support for append. The 0.20.205 release will also have append support, but it is not yet released. Choose the correct version to avoid sudden surprises. Regards, Uma - Original Message - From: Kobina Kwarko kobina.kwa...@gmail.com Date: Saturday, September 17, 2011 3:42 am Subject: Re: risks of using Hadoop To: common-user@hadoop.apache.org We are planning to use Hadoop in my organisation for quality of services analysis out of CDR records from mobile operators. We are thinking of having a small cluster of maybe 10 - 15 nodes, and I'm preparing the proposal. My office requires that I provide some risk analysis in the proposal. Thank you. On 16 September 2011 20:34, Uma Maheswara Rao G 72686 mahesw...@huawei.com wrote: Hello, First of all, where are you planning to use Hadoop? Regards, Uma - Original Message - From: Kobina Kwarko kobina.kwa...@gmail.com Date: Saturday, September 17, 2011 0:41 am Subject: risks of using Hadoop To: common-user common-user@hadoop.apache.org Hello, Please can someone point out some of the risks we may incur if we decide to implement Hadoop? BR, Isaac.
-- --- George Kousiouris Electrical and Computer Engineer Division of Communications, Electronics and Information Engineering School of Electrical and Computer Engineering Tel: +30 210 772 2546 Mobile: +30 6939354121 Fax: +30 210 772 2569 Email: gkous...@mail.ntua.gr Site: http://users.ntua.gr/gkousiou/ National Technical University of Athens 9 Heroon Polytechniou str., 157 73 Zografou, Athens, Greece
Re: risks of using Hadoop
Yes, I was mentioning 20Append before because the branch name itself is 20Append. sync is the main API name HBase uses to make its edit log durable. @George: mainly you need to consider the HBase usage; sync is supported, while the append API has some open issues, for example https://issues.apache.org/jira/browse/HDFS-1228. Apologies for any confusion. Thanks a lot for the clarification! Thanks, Uma
- Original Message - From: Todd Lipcon t...@cloudera.com Date: Sunday, September 18, 2011 1:35 am Subject: Re: risks of using Hadoop To: common-user@hadoop.apache.org
To clarify, *append* is not supported and is known to be buggy. *sync* support is what HBase needs and what 0.20.205 will support. Before 205 is released, you can also find these features in CDH3 or by building your own release from SVN. -Todd
On Sat, Sep 17, 2011 at 4:59 AM, Uma Maheswara Rao G 72686 mahesw...@huawei.com wrote: Hi George, You can use it for normal writes as well; the append interfaces will be exposed. For HBase, append support is required very much. Regards, Uma - Original Message - From: George Kousiouris gkous...@mail.ntua.gr Date: Saturday, September 17, 2011 12:29 pm Subject: Re: risks of using Hadoop To: common-user@hadoop.apache.org Cc: Uma Maheswara Rao G 72686 mahesw...@huawei.com Hi, When you say that 0.20.205 will support appends, do you mean for general-purpose writes on HDFS, or only HBase? Thanks, George On 9/17/2011 7:08 AM, Uma Maheswara Rao G 72686 wrote: 6. If you plan to use HBase, it requires append support. The 20Append branch has the support for append. The 0.20.205 release will also have append support, but it is not yet released. Choose the correct version to avoid sudden surprises. Regards, Uma - Original Message - From: Kobina Kwarko kobina.kwa...@gmail.com Date: Saturday, September 17, 2011 3:42 am Subject: Re: risks of using Hadoop To: common-user@hadoop.apache.org We are planning to use Hadoop in my organisation for quality of services analysis out of CDR records from mobile operators. We are thinking of having a small cluster of maybe 10 - 15 nodes, and I'm preparing the proposal. My office requires that I provide some risk analysis in the proposal. Thank you. On 16 September 2011 20:34, Uma Maheswara Rao G 72686 mahesw...@huawei.com wrote: Hello, First of all, where are you planning to use Hadoop? Regards, Uma - Original Message - From: Kobina Kwarko kobina.kwa...@gmail.com Date: Saturday, September 17, 2011 0:41 am Subject: risks of using Hadoop To: common-user common-user@hadoop.apache.org Hello, Please can someone point out some of the risks we may incur if we decide to implement Hadoop? BR, Isaac.
-- --- George Kousiouris Electrical and Computer Engineer Division of Communications, Electronics and Information Engineering School of Electrical and Computer Engineering Tel: +30 210 772 2546 Mobile: +30 6939354121 Fax: +30 210 772 2569 Email: gkous...@mail.ntua.gr Site: http://users.ntua.gr/gkousiou/ National Technical University of Athens 9 Heroon Polytechniou str., 157 73 Zografou, Athens, Greece
-- Todd Lipcon Software Engineer, Cloudera
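To make Todd's distinction concrete, a sketch of the sync path as an HBase-style write-ahead-log writer would use it (0.20-append-era API names; illustrative only):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // sync() makes already-written bytes durable and visible to new readers
    // on an open file; it does not reopen a closed file to add data (that is
    // the separate, then-buggy append() path).
    public class SyncDemo {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            FSDataOutputStream out = fs.create(new Path("/tmp/sync-demo"));
            out.writeBytes("write-ahead log record\n");
            out.sync(); // durability point that HBase's WAL depends on
            out.close();
            fs.close();
        }
    }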
Re: Tutorial about Security in Hadoop
Hi, please find the below links: https://media.blackhat.com/bh-us-10/whitepapers/Becherer/BlackHat-USA-2010-Becherer-Andrew-Hadoop-Security-wp.pdf and http://markmail.org/download.xqy?id=yjdqleg3zv5pr54tnumber=1 which will help you to understand more. Regards, Uma
- Original Message - From: Xianqing Yu x...@ncsu.edu Date: Friday, September 16, 2011 10:43 pm Subject: Tutorial about Security in Hadoop To: common-user@hadoop.apache.org
Hi Community, I am trying to set up the security mechanisms in Hadoop, for instance using Kerberos. However, I didn't find much information about it. Does anyone know of a link to a tutorial about installing Kerberos with Hadoop? Thanks, Xianqing Yu -- Graduate Research Assistant, Cyber Defense Lab Department of Computer Science North Carolina State University, Raleigh, NC E-mail: x...@ncsu.edu
Re: risks of using Hadoop
Hello, First of all, where are you planning to use Hadoop? Regards, Uma - Original Message - From: Kobina Kwarko kobina.kwa...@gmail.com Date: Saturday, September 17, 2011 0:41 am Subject: risks of using Hadoop To: common-user common-user@hadoop.apache.org Hello, Please can someone point out some of the risks we may incur if we decide to implement Hadoop? BR, Isaac.
Re: risks of using Hadoop
Hi Kobina, Some experiences which may be helpful for you with respect to DFS. 1. Selecting the correct version. I recommend using a 0.20.x version. This is a pretty stable line, well tested, and the one other organizations prefer. Don't go for the 0.21 version; it is not a stable release, so that is a risk. 2. You should perform thorough tests with your customer operations (of course you will do this :-)). 3. 0.20.x versions have the problem of SPOF. If the NameNode goes down you will lose the data. One way of recovering is by using the SecondaryNameNode: you can recover the data up to the last checkpoint, but manual intervention is required there. In the latest trunk, SPOF will be addressed by HDFS-1623. 4. 0.20.x NameNodes cannot scale. Federation changes are included in later versions (I think in 0.22); this may not be a problem for your cluster, but please consider this aspect as well. 5. Please select the Hadoop version depending on your security requirements. There are versions available with security as well in the 0.20.x line. 6. If you plan to use HBase, it requires append support. The 20Append branch has the support for append. The 0.20.205 release will also have append support, but it is not yet released. Choose the correct version to avoid sudden surprises. Regards, Uma
- Original Message - From: Kobina Kwarko kobina.kwa...@gmail.com Date: Saturday, September 17, 2011 3:42 am Subject: Re: risks of using Hadoop To: common-user@hadoop.apache.org
We are planning to use Hadoop in my organisation for quality of services analysis out of CDR records from mobile operators. We are thinking of having a small cluster of maybe 10 - 15 nodes, and I'm preparing the proposal. My office requires that I provide some risk analysis in the proposal. Thank you.
On 16 September 2011 20:34, Uma Maheswara Rao G 72686 mahesw...@huawei.com wrote: Hello, First of all, where are you planning to use Hadoop? Regards, Uma - Original Message - From: Kobina Kwarko kobina.kwa...@gmail.com Date: Saturday, September 17, 2011 0:41 am Subject: risks of using Hadoop To: common-user common-user@hadoop.apache.org Hello, Please can someone point out some of the risks we may incur if we decide to implement Hadoop? BR, Isaac.
Re: Is it possible to access the HDFS via Java OUTSIDE the Cluster?
Hi, It is very much possible. In fact, that is the main use case for Hadoop :-) You need to put the hadoop-hdfs*.jar and hadoop-common*.jar files in your classpath on the machine from which you want to run the client program. At the client side, use the below sample code:
Configuration conf = new Configuration(); // you can set the required configurations here
FileSystem fs = new DistributedFileSystem();
fs.initialize(new URI(Name_Node_URL), conf);
fs.copyToLocalFile(srcPath, destPath);
fs.copyFromLocalFile(srcPath, destPath);
...etc.
There are many APIs exposed in the FileSystem class, so you can make use of them. Regards, Uma
- Original Message - From: Ralf Heyde ralf.he...@gmx.de Date: Monday, September 5, 2011 7:59 pm Subject: Is it possible to access the HDFS via Java OUTSIDE the Cluster? To: common-user@hadoop.apache.org
Hello, I have found an HDFSClient which shows me how to access my HDFS from inside the cluster (i.e. running on a node). My idea is that different processes may write 64M chunks to HDFS from external sources/clients. Is that possible? How can that be done? Does anybody have some example code? Thanks, Ralf
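Filled out into a compilable sketch (the NN URI, paths, and class name are placeholders; FileSystem.get is the usual idiom rather than constructing DistributedFileSystem directly):

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Minimal external HDFS client: runs on any machine that can reach the
    // NameNode; only the Hadoop jars and the NN URI are needed on the client.
    public class ExternalHdfsClient {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(new URI("hdfs://namenode-host:9000"), conf);
            fs.copyFromLocalFile(new Path("/local/src/file"), new Path("/data/file"));
            fs.copyToLocalFile(new Path("/data/file"), new Path("/local/copy/file"));
            fs.close();
        }
    }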
Re: Out of Memory Exception while building hadoop
Hi John, Most likely the problem is with your Java installation. This problem can come up if your java link refers to java-gcj. Please check some related links: http://jeffchannell.com/Flex-3/gc-warning.html Regards, Uma
- Original Message - From: john smith js1987.sm...@gmail.com Date: Sunday, September 4, 2011 10:22 pm Subject: Out of Memory Exception while building hadoop To: common-user@hadoop.apache.org, common-...@hadoop.apache.org
Hey folks, Strangely I get an out of memory exception while building hadoop from source. I have 2 gigs of RAM and I've tried building it from both Eclipse and the command line. http://pastebin.com/9pcHg1P9 is the full stack trace. Can anyone help me out on this? Thanks, John Smith
Re: /tmp/hadoop-oracle/dfs/name is in an inconsistent state
Hi, Before starting, you need to format the namenode: ./hdfs namenode -format Then these directories will be created. The respective configuration is 'dfs.namenode.name.dir'. Default configurations exist in hdfs-default.xml; if you want to configure your own directory path, you can add the above property in the hdfs-site.xml file. Regards, Uma Mahesh
- Original Message - From: Daniel,Wu hadoop...@163.com Date: Thursday, July 28, 2011 6:51 pm Subject: /tmp/hadoop-oracle/dfs/name is in an inconsistent state To: common-user@hadoop.apache.org
When I started hadoop, the namenode failed to start because of the following error. The strange thing is that it says /tmp/hadoop-oracle/dfs/name is inconsistent, but I don't think I have configured it as /tmp/hadoop-oracle/dfs/name. Where should I check for the related configuration?
2011-07-28 21:07:35,383 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /tmp/hadoop-oracle/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
Re: cygwin not connecting to Hadoop server
Hi A Df, see inline at ::
- Original Message - From: A Df abbey_dragonfor...@yahoo.com Date: Wednesday, July 27, 2011 10:31 pm Subject: Re: cygwin not connecting to Hadoop server To: common-user@hadoop.apache.org
See inline at **. More questions and many thanks :D
From: Uma Maheswara Rao G 72686 mahesw...@huawei.com To: common-user@hadoop.apache.org; A Df abbey_dragonfor...@yahoo.com Cc: common-user@hadoop.apache.org Sent: Wednesday, 27 July 2011, 17:31 Subject: Re: cygwin not connecting to Hadoop server
Hi A Df, Did you format the NameNode first?
** I had formatted it already, but then I reinstalled Java and upgraded the plugins in cygwin, so I reformatted it again. :D Yes, it worked!! I am not sure of all the steps that got it to finally work
:: Great :-)
but I will have to document it to prevent this headache in the future. Although I typed ssh localhost too, so the question is: do I need to type ssh localhost each time I need to run hadoop? Also,
:: Actually ssh is an authentication service. To save the authentication keys, you can run the below commands, which will set up authentication so there is no need to give a password every time:
ssh-keygen -t rsa -P ""
cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys
then execute /etc/init.d/sshd restart
To connect to remote machines:
cat /root/.ssh/id_rsa.pub | ssh root@remoteIP 'cat >> /root/.ssh/authorized_keys'
then on the remote machine execute /etc/init.d/sshd restart
since I need to work with Eclipse maybe you can have a look at my post about the plugin, because I can't get the patch to work. The subject is Re: Cygwin not working with Hadoop and Eclipse Plugin. I plan to read up on how to write programs for Hadoop. I am using the tutorial at Yahoo, but if you know of any really good ones about coding with Hadoop or just about understanding Hadoop then please let me know.
:: Hadoop: The Definitive Guide will be a great book for understanding Hadoop. Some sample programs will also be available. To check the Hadoop internals: http://www.google.co.in/url?sa=tsource=webcd=8ved=0CEMQFjAHurl=http%3A%2F%2Findia.paxcel.net%3A6060%2FLargeDataMatters%2Fwp-content%2Fuploads%2F2010%2F09%2FHDFS1.pdfrct=jq=hadoop%20internals%20%2B%20part%201ei=CqAxTtD8C4fprQfYq6DMCwusg=AFQjCNGYMQbAeGP0cYGl4OJHseRsfEMGvQcad=rja
Can you check the NN logs whether NN is started or not?
** I checked and the previous runs had some logs missing, but now the last one has all 5 logs and I got two conf files in xml. I also copied out the other output files, which I plan to examine. Where do I specify the output extension format that I want for my output file? I was hoping for a txt file; it shows the output in a file with no extension, even though I can read it in Notepad++. I also got to view the web interfaces at: NameNode - http://localhost:50070/ JobTracker - http://localhost:50030/
** See below for the working version, finally!!
Thanks
CMD
Williams@TWilliams-LTPC ~/hadoop-0.20.2 $ bin/hadoop jar hadoop-0.20.2-examples.jar grep input
11/07/27 17:42:20 INFO mapred.FileInputFormat: Total in
11/07/27 17:42:20 INFO mapred.JobClient: Running job: j
11/07/27 17:42:21 INFO mapred.JobClient: map 0% reduce
11/07/27 17:42:33 INFO mapred.JobClient: map 15% reduc
11/07/27 17:42:36 INFO mapred.JobClient: map 23% reduc
11/07/27 17:42:39 INFO mapred.JobClient: map 38% reduc
11/07/27 17:42:42 INFO mapred.JobClient: map 38% reduc
11/07/27 17:42:45 INFO mapred.JobClient: map 53% reduc
11/07/27 17:42:48 INFO mapred.JobClient: map 69% reduc
11/07/27 17:42:51 INFO mapred.JobClient: map 76% reduc
11/07/27 17:42:54 INFO mapred.JobClient: map 92% reduc
11/07/27 17:42:57 INFO mapred.JobClient: map 100% redu
11/07/27 17:43:06 INFO mapred.JobClient: map 100% redu
11/07/27 17:43:09 INFO mapred.JobClient: Job complete:
11/07/27 17:43:09 INFO mapred.JobClient: Counters: 18
11/07/27 17:43:09 INFO mapred.JobClient: Job Counters
11/07/27 17:43:09 INFO mapred.JobClient: Launched r
11/07/27 17:43:09 INFO mapred.JobClient: Launched m
11/07/27 17:43:09 INFO mapred.JobClient: Data-local
11/07/27 17:43:09 INFO mapred.JobClient: FileSystemCo
11/07/27 17:43:09 INFO mapred.JobClient: FILE_BYTES
11/07/27 17:43:09 INFO mapred.JobClient: HDFS_BYTES
11/07/27 17:43:09 INFO mapred.JobClient: FILE_BYTES
11/07/27 17:43:09 INFO mapred.JobClient: HDFS_BYTES
11/07/27 17:43:09 INFO mapred.JobClient: Map-Reduce F
11/07/27 17:43:09 INFO mapred.JobClient: Reduce inp
11/07/27 17:43:09 INFO mapred.JobClient: Combine ou
11/07/27 17:43:09 INFO mapred.JobClient: Map input
11/07/27 17:43:09 INFO mapred.JobClient: Reduce shu
11/07/27 17:43:09 INFO mapred.JobClient: Reduce out
11/07/27 17:43:09 INFO mapred.JobClient: Spilled Re
11/07/27 17:43:09 INFO mapred.JobClient: Map output
Re: cygwin not connecting to Hadoop server
Hi A Df, Did you format the NameNode first? Can you check the NN logs whether NN is started or not? Regards, Uma
- Original Message - From: A Df abbey_dragonfor...@yahoo.com Date: Wednesday, July 27, 2011 9:55 pm Subject: cygwin not connecting to Hadoop server To: common-user@hadoop.apache.org
Hi All: I have Hadoop 0.20.2 and I am using cygwin on Windows 7. I modified the files as shown below for the Hadoop configuration.
conf/core-site.xml:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9100</value>
  </property>
</configuration>
conf/hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
conf/mapred-site.xml:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9101</value>
  </property>
</configuration>
Then I have the PATH variable with $PATH:/cygdrive/c/cygwin/bin:/cygdrive/c/cygwin/usr/bin
I added JAVA_HOME to the file in cygwin\home\Williams\hadoop-0.20.2\conf\hadoop-env.sh. My Java home is now at C:\Java\jdk1.6.0_26, so there is no space in the path. I also turned off my firewall. However, I get the error from the command line:
CODE
Williams@TWilliams-LTPC ~ $ pwd
/home/Williams
Williams@TWilliams-LTPC ~ $ cd hadoop-0.20.2
Williams@TWilliams-LTPC ~/hadoop-0.20.2 $ bin/start-all.sh
starting namenode, logging to /home/Williams/hadoop-0.20.2/bin/../logs/hadoop-Williams-namenode-TWilliams-LTPC.out
localhost: starting datanode, logging to /home/Williams/hadoop-0.20.2/bin/../logs/hadoop-Williams-datanode-TWilliams-LTPC.out
localhost: starting secondarynamenode, logging to /home/Williams/hadoop-0.20.2/bin/../logs/hadoop-Williams-secondarynamenode-TWilliams-LTPC.out
starting jobtracker, logging to /home/Williams/hadoop-0.20.2/bin/../logs/hadoop-Williams-jobtracker-TWilliams-LTPC.out
localhost: starting tasktracker, logging to /home/Williams/hadoop-0.20.2/bin/../logs/hadoop-Williams-tasktracker-TWilliams-LTPC.out
Williams@TWilliams-LTPC ~/hadoop-0.20.2 $ bin/hadoop fs -put conf input
11/07/27 17:11:28 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9100. Already tried 0 time(s).
11/07/27 17:11:30 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9100. Already tried 1 time(s).
11/07/27 17:11:32 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9100. Already tried 2 time(s).
11/07/27 17:11:34 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9100. Already tried 3 time(s).
11/07/27 17:11:36 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9100. Already tried 4 time(s).
11/07/27 17:11:38 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9100. Already tried 5 time(s).
11/07/27 17:11:40 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9100. Already tried 6 time(s).
11/07/27 17:11:43 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9100. Already tried 7 time(s).
11/07/27 17:11:45 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9100. Already tried 8 time(s).
11/07/27 17:11:47 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9100. Already tried 9 time(s).
Bad connection to FS. command aborted.
Williams@TWilliams-LTPC ~/hadoop-0.20.2 $ bin/hadoop fs -put conf input
11/07/27 17:17:29 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9100. Already tried 0 time(s).
11/07/27 17:17:31 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9100. Already tried 1 time(s).
11/07/27 17:17:33 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9100. Already tried 2 time(s).
11/07/27 17:17:35 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9100. Already tried 3 time(s).
11/07/27 17:17:37 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9100. Already tried 4 time(s).
11/07/27 17:17:39 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9100. Already tried 5 time(s).
11/07/27 17:17:41 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9100. Already tried 6 time(s).
11/07/27 17:17:44 INFO
Re: Build Hadoop 0.20.2 from source
Hi Vighnesh, Step 1) Download the code base from the Apache SVN repository. Step 2) In the root folder you can find the build.xml file. In that folder just execute a) ant and b) ant eclipse; this will generate the Eclipse project settings files. After this you can directly import the project into your Eclipse. Regards, Uma
- Original Message - From: Vighnesh Avadhani vighnesh.avadh...@gmail.com Date: Wednesday, July 27, 2011 11:08 am Subject: Build Hadoop 0.20.2 from source To: common-user@hadoop.apache.org
Hi, I want to build Hadoop 0.20.2 from source using the Eclipse IDE. Can anyone help me with this? Regards, Vighnesh
Re: FW: Question about property fs.default.name
Hi Mahesh, When starting the NN, it will throw an exception with your provided configuration. Please check the code snippet below for where exactly the validation happens, in NameNode:
public static InetSocketAddress getAddress(URI filesystemURI) {
  String authority = filesystemURI.getAuthority();
  if (authority == null) {
    throw new IllegalArgumentException(String.format(
        "Invalid URI for NameNode address (check %s): %s has no authority.",
        FileSystem.FS_DEFAULT_NAME_KEY, filesystemURI.toString()));
  }
  if (!FSConstants.HDFS_URI_SCHEME.equalsIgnoreCase(filesystemURI.getScheme())) {
    throw new IllegalArgumentException(String.format(
        "Invalid URI for NameNode address (check %s): %s is not of scheme '%s'.",
        FileSystem.FS_DEFAULT_NAME_KEY, filesystemURI.toString(),
        FSConstants.HDFS_URI_SCHEME));
  }
Since the NN is specific to HDFS, it expects the scheme to be hdfs; otherwise it throws the exception. If you develop your own file system, clients will have the intelligence to connect to that file system based on the configuration you provide for fs.default.name. Coming to the client side: if you pass file:/// as the fs URI, it will not try to connect to DFS, because the fs you passed refers to the local file system. So it will create a LocalFileSystem instead of a DistributedFileSystem.
Regards, Uma
- Original Message - From: Mahesh Shinde mahesh_shi...@persistent.co.in Date: Saturday, July 23, 2011 4:17 pm Subject: FW: Question about property fs.default.name To: common-user@hadoop.apache.org common-user@hadoop.apache.org
Hi, I have a basic question on the property fs.default.name: I am not able to open the NameNode URL when I set fs.default.name=file:/// . If we decide not to use HDFS as our file system, then how does hadoop deal with the local file system? Please reply.
Mahesh Shinde | Systems Engineer mahesh_shi...@persistent.co.in | Cell: +918308321501 | Tel: +91-20-30235194 Persistent Systems Ltd. | 20 Glorious Years | www.persistentsys.com
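A quick client-side demonstration of that last point (a sketch; nothing here contacts a NameNode):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    // With fs.default.name = file:///, FileSystem.get() hands back a
    // LocalFileSystem; with an hdfs:// URI it would return a
    // DistributedFileSystem and try to contact the NN.
    public class DefaultFsDemo {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            conf.set("fs.default.name", "file:///");
            FileSystem fs = FileSystem.get(conf);
            System.out.println(fs.getClass().getName());
            // prints org.apache.hadoop.fs.LocalFileSystem
        }
    }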
Re: replicate data in HDFS with smarter encoding
Hi, There have already been thoughts about this. It looks like you are talking about these features: https://issues.apache.org/jira/browse/HDFS-1640 https://issues.apache.org/jira/browse/HDFS-2115 but the implementation is not yet ready in trunk. Regards, Uma
- Original Message - From: Da Zheng zhengda1...@gmail.com Date: Tuesday, July 19, 2011 9:23 am Subject: Re: replicate data in HDFS with smarter encoding To: common-user@hadoop.apache.org Cc: Joey Echeverria j...@cloudera.com, hdfs-u...@hadoop.apache.org hdfs-u...@hadoop.apache.org
So this kind of feature is desired by the community? It seems this implementation can only reduce the data size on disk via the background RaidNode daemon, but it cannot reduce the disk bandwidth and network bandwidth when the client writes data to HDFS. It might be more interesting to reduce the disk and network bandwidth, although that might require modifying the implementation of the pipeline in HDFS. Thanks, Da
On 07/18/11 04:10, Joey Echeverria wrote: Facebook contributed some code to do something similar called HDFS RAID: http://wiki.apache.org/hadoop/HDFS-RAID -Joey
On Jul 18, 2011, at 3:41, Da Zheng zhengda1...@gmail.com wrote: Hello, It seems that data replication in HDFS is simply data copying among nodes. Has anyone considered using a better encoding to reduce the data size? Say, a block of data is split into N pieces, and as long as M pieces of the data survive in the network, we can regenerate the original data. There are many benefits to reducing the data size: it can save network and disk bandwidth, and thus reduce energy consumption. Computation power might be a concern, but we can use a GPU to encode and decode. But maybe the idea is stupid or it's hard to reduce the data size. I would like to hear your comments. Thanks, Da
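As a toy illustration of the M-of-N idea Da describes, here is single-parity XOR, the simplest such code (HDFS-RAID's Reed-Solomon generalizes this to tolerate more losses):

    import java.util.Arrays;

    // Three data blocks plus one XOR parity block: any single lost block can
    // be rebuilt from the survivors, giving loss tolerance at ~1.33x storage
    // instead of the 3x cost of plain replication.
    public class XorParityDemo {
        public static void main(String[] args) {
            byte[] b0 = {1, 2, 3}, b1 = {4, 5, 6}, b2 = {7, 8, 9};
            byte[] parity = xor(xor(b0, b1), b2);
            byte[] rebuiltB1 = xor(xor(b0, b2), parity); // pretend b1 was lost
            System.out.println(Arrays.equals(b1, rebuiltB1)); // true
        }
        static byte[] xor(byte[] x, byte[] y) {
            byte[] out = new byte[x.length];
            for (int i = 0; i < x.length; i++) out[i] = (byte) (x[i] ^ y[i]);
            return out;
        }
    }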