Determining input record directory using Streaming...
Hi All: Is there any way, using Hadoop Streaming, to determine the directory from which an input record is being read? This is straightforward in Hadoop using InputFormats, but I am curious if the same concept can be applied to streaming. The goal here is to read in data from 2 directories, say A/ and B/, and make decisions about what to do based on where the data is rooted. Thanks for any help...CG
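One common approach, assuming your release behaves like the documented Streaming contract: Streaming exports JobConf properties into each task's environment with dots replaced by underscores, so the current split's path arrives as $map_input_file. A minimal mapper sketch (the A/ and B/ directory names are from the question; everything else is illustrative):

```shell
#!/bin/sh
# Hypothetical Streaming mapper (mapper.sh): tag each record with the
# top-level input directory it came from. Streaming exports JobConf
# properties into the environment with '.' replaced by '_', so the
# current split's path should arrive as $map_input_file.

# tag_for_path PATH -> print "A", "B", or "other" depending on which
# input directory the path falls under.
tag_for_path() {
  case "$1" in
    */A/*) echo "A" ;;
    */B/*) echo "B" ;;
    *)     echo "other" ;;
  esac
}

# When run under Streaming, prefix every input record with its tag so
# downstream steps can branch on the record's origin.
if [ -n "${map_input_file:-}" ]; then
  tag=$(tag_for_path "$map_input_file")
  while IFS= read -r line; do
    printf '%s\t%s\n' "$tag" "$line"
  done
fi
```

The reducer (or a later job) can then dispatch on the leading A/B tag instead of needing to know the source directory itself.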
What happens in HDFS DataNode recovery?
Hi All: I elected to take a node out of one of our grids for service. Naturally HDFS recognized the loss of the DataNode and did the right stuff, fixing replication issues and ultimately delivering a clean file system. So now the node I removed is ready to go back in service. When I return it to service a bunch of files will suddenly have a replication of 4 instead of 3. My questions: 1. Will HDFS delete a copy of the data to bring replication back to 3? 2. If (1) above is yes, will it remove the copy by deleting from other nodes, from the returned node, or both? The motivation for asking these questions is that I have a file system which is extremely unbalanced - we recently doubled the size of the grid with a few dozen terabytes already stored on the existing nodes. I am wondering if an easy way to restore some sense of balance is to cycle through the old nodes, removing each one from service for several hours and then returning it to service. Thoughts? Thanks in Advance, C G
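For what it's worth, a sketch of the two operational tools usually suggested for this, assuming a release that ships them (file paths and host names below are illustrative): the decommission mechanism (dfs.hosts.exclude plus dfsadmin -refreshNodes) for taking nodes out cleanly, and the balancer (added around 0.16) for redistributing blocks without cycling nodes at all:

```shell
# Decommission a node cleanly instead of just stopping it. The
# dfs.hosts.exclude property in hadoop-site.xml must already point at
# the exclude file.
echo "node5.example.com" >> /path/to/conf/excludes
hadoop dfsadmin -refreshNodes   # NameNode re-replicates its blocks elsewhere

# In 0.16 and later, the balancer redistributes blocks in place;
# -threshold is the tolerated deviation from average disk utilization,
# in percent.
start-balancer.sh -threshold 10
```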
Copy data between HDFS instances...
Hi All: I am setting up 2 grids, each with its own HDFS. The grids are unaware of each other but exist on the same network. I'd like to copy data from one HDFS to the other. Is there a way to do this simply, or do I need to cobble together scripts to copy from HDFS on one side and pipe to a dfs -cp on the other side? I tried something like this: hadoop dfs -ls hdfs://grid1NameNode:portNo/ from grid2 trying to ls on grid1 but got a "wrong FS" error message. I also tried: hadoop dfs -ls hdfs://grid1NameNode:portNo/foo on grid2 where "/foo" exists on grid1 and got 0 files found. I assume there is some way to do this and I just don't have the right command line magic. This is Hadoop 0.15.0. Any help appreciated. Thanks, C G
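The usual answer here is distcp, which ships with Hadoop (including 0.15) and runs the copy as a parallel MapReduce job. A sketch, reusing the placeholder host/port names from the message:

```shell
# Copy /foo from grid1 to grid2. Run this on a node that can reach
# both NameNodes (commonly a node on the destination grid).
hadoop distcp hdfs://grid1NameNode:portNo/foo hdfs://grid2NameNode:portNo/foo
```

Note that if the two grids were running different Hadoop versions the RPC formats might not match (later releases work around that by reading the source over HTTP), but for two grids on the same release the plain hdfs:// form above should apply.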
NameNode memory usage and 32 vs. 64 bit JVMs
I've got a grid which has been up and running for some time. It's been using a 32-bit JVM. I am hitting the wall on memory within NameNode and need to specify a max heap size > 4G. Is it possible to switch seamlessly from a 32-bit JVM to a 64-bit one? I've tried this on a small test grid and had no issues, but I want to make sure it's OK to proceed. Speaking of NameNode, what does it keep in memory? Our memory usage ramped up rather suddenly recently. Also, does SecondaryNameNode require the same amount of memory as NameNode? Thanks for any help, C G
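For the heap itself, the knob is HADOOP_HEAPSIZE (or a per-daemon -Xmx) in conf/hadoop-env.sh; a sketch, with the values purely illustrative and the per-daemon variable assumed to exist in your release's hadoop-env.sh:

```shell
# conf/hadoop-env.sh (fragment) -- values are illustrative.
# Default heap for all Hadoop daemons, in MB:
export HADOOP_HEAPSIZE=2000
# Give only the NameNode a larger heap (> 4G requires a 64-bit JVM):
export HADOOP_NAMENODE_OPTS="-Xmx6g $HADOOP_NAMENODE_OPTS"
```

As I understand it, the NameNode holds the entire namespace (file, directory, and block metadata) in memory, and the SecondaryNameNode loads the same image to merge in edits, so the two generally need comparable headroom.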
Re: HDFS Vs KFS
I've built and deployed KFS outside of Hadoop and it seems to work. I'm planning to bring up a test environment shortly running Hadoop with KFS. With all due respect to HDFS developers and committers, I am strongly hesitant to call HDFS "stable." We've had several major issues with HDFS in post-0.15.x releases. I don't know if KFS will be any better or more reliable, but it seems worth investing time finding out. --- On Thu, 8/21/08, Pete Wyckoff <[EMAIL PROTECTED]> wrote: From: Pete Wyckoff <[EMAIL PROTECTED]> Subject: Re: HDFS Vs KFS To: core-user@hadoop.apache.org Date: Thursday, August 21, 2008, 5:51 PM

For hdfs: http://wiki.apache.org/hadoop/MountableHDFS Not sure how you can mount KFS??

On 8/21/08 10:47 AM, "Otis Gospodnetic" <[EMAIL PROTECTED]> wrote:
> Isn't there FUSE for HDFS, as well as the WebDAV option?
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> - Original Message
>> From: Tim Wintle <[EMAIL PROTECTED]>
>> To: core-user@hadoop.apache.org
>> Sent: Thursday, August 21, 2008 1:42:51 PM
>> Subject: Re: HDFS Vs KFS
>>
>> I haven't used KFS, but I believe a major difference is that you can
>> (apparently) mount KFS as a standard device under Linux, allowing you to
>> read and write directly to it without having to re-compile the
>> application (as far as I know that's not possible with HDFS, although
>> the last time I installed HDFS was 0.16)
>>
>> ... It is definitely much newer.
>>
>> On Fri, 2008-08-22 at 01:35 +0800, rae l wrote:
>>> On Fri, Aug 22, 2008 at 12:34 AM, Wasim Bari wrote:
KFS is also another Distributed file system implemented in C++.
Here you can get details: http://kosmosfs.sourceforge.net/
>>>
>>> Just from the basic information:
>>>
>>> http://sourceforge.net/projects/kosmosfs
>>>
>>> # Developers : 2
>>> # Development Status : 3 - Alpha
>>> # Intended Audience : Developers
>>> # Registered : 2007-08-30 21:05
>>>
>>> and from the history of subversion repository:
>>>
>>> http://kosmosfs.svn.sourceforge.net/viewvc/kosmosfs/trunk/
>>>
>>> I think it's just not as stable or as widely used as HDFS:
>>>
>>> * HDFS is stable and production level available.
>>>
>>> This may not be totally right and I'm waiting for someone more familiar with
>>> KFS to talk about this. >
0.18.0 DataNode refuses to start...
We can get the NameNode and SecondaryNameNode up and running, but DataNodes fail as shown below. Hadoop Jira 4019 tracks this problem (https://issues.apache.org/jira/browse/HADOOP-4019), but I'm curious if anybody has solved it yet...

2008-08-25 23:21:53,743 INFO org.apache.hadoop.dfs.DataNode: STARTUP_MSG:
/
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = node1/10.2.11.1
STARTUP_MSG: args = []
STARTUP_MSG: version = 0.18.0
STARTUP_MSG: build = http://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.18 -r 686010; compiled by 'hadoopqa' on Thu Aug 14 19:48:33 UTC 2008
/
2008-08-25 23:21:53,885 INFO org.apache.hadoop.dfs.DataNode: Registered FSDatasetStatusMBean
2008-08-25 23:21:53,887 INFO org.apache.hadoop.dfs.DataNode: Opened info server at 50010
2008-08-25 23:21:53,888 INFO org.apache.hadoop.dfs.DataNode: Balancing bandwith is 1048576 bytes/s
2008-08-25 23:21:53,941 INFO org.mortbay.util.Credential: Checking Resource aliases
2008-08-25 23:21:53,994 INFO org.mortbay.http.HttpServer: Version Jetty/5.1.4
2008-08-25 23:21:53,994 INFO org.mortbay.util.Container: Started HttpContext[/static,/static]
2008-08-25 23:21:53,995 INFO org.mortbay.util.Container: Started HttpContext[/logs,/logs]
2008-08-25 23:21:54,172 INFO org.mortbay.util.Container: Started [EMAIL PROTECTED]
2008-08-25 23:21:54,197 INFO org.mortbay.util.Container: Started WebApplicationContext[/,/]
2008-08-25 23:21:54,199 INFO org.mortbay.http.SocketListener: Started SocketListener on 0.0.0.0:50075
2008-08-25 23:21:54,199 INFO org.mortbay.util.Container: Started [EMAIL PROTECTED]
2008-08-25 23:21:54,202 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=DataNode, sessionId=null
2008-08-25 23:21:54,210 ERROR org.apache.hadoop.dfs.DataNode: java.lang.NullPointerException
    at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:130)
    at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:119)
    at org.apache.hadoop.dfs.DataNode.startDataNode(DataNode.java:359)
    at org.apache.hadoop.dfs.DataNode.(DataNode.java:190)
    at org.apache.hadoop.dfs.DataNode.makeInstance(DataNode.java:2987)
    at org.apache.hadoop.dfs.DataNode.instantiateDataNode(DataNode.java:2942)
    at org.apache.hadoop.dfs.DataNode.createDataNode(DataNode.java:2950)
    at org.apache.hadoop.dfs.DataNode.main(DataNode.java:3072)
2008-08-25 23:21:54,211 INFO org.apache.hadoop.dfs.DataNode: SHUTDOWN_MSG:
/
SHUTDOWN_MSG: Shutting down DataNode at node1/10.2.11.1
/
Re: dfs.DataNode connection issues
You should look at https://issues.apache.org/jira/browse/HADOOP-3678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12610003#action_12610003 as well. This eliminates spurious "connection reset by peer" messages that clutter up the DataNode logs and can be confusing. --- On Wed, 7/16/08, brainstorm <[EMAIL PROTECTED]> wrote: From: brainstorm <[EMAIL PROTECTED]> Subject: Re: dfs.DataNode connection issues To: core-user@hadoop.apache.org Date: Wednesday, July 16, 2008, 10:25 AM

Raghu, seems to be resolved by your patch: http://issues.apache.org/jira/browse/HADOOP-3007 Do you know of any other "complaints" on this issue (conn reset & related errors) after applying this patch? Thanks.

On Wed, Jul 16, 2008 at 4:04 PM, brainstorm <[EMAIL PROTECTED]> wrote:
> Just for the record, as I have seen on previous archives regarding
> this same problem, I've changed the (cheap) 10/100 switch with a
> (robust?) 100/1000 one and a couple of ethernet cables... and nope, in
> my case it's not hardware related (at least on switch/cable end).
>
> Any other hints ?
>
> Thanks in advance !
>
> On Wed, Jul 16, 2008 at 3:12 PM, brainstorm <[EMAIL PROTECTED]> wrote:
>> If you refer to the other nodes:
>>
>> 2008-07-16 14:41:00,124 ERROR dfs.DataNode - 192.168.0.252:50010:DataXceiver: java.io.IOException: Block blk_7443738244200783289 has already been started (though not completed), and thus cannot be created.
>>   at org.apache.hadoop.dfs.FSDataset.writeToBlock(FSDataset.java:638)
>>   at org.apache.hadoop.dfs.DataNode$BlockReceiver.(DataNode.java:1983)
>>   at org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:1074)
>>   at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:938)
>>   at java.lang.Thread.run(Thread.java:595)
>>
>> 2008-07-16 14:41:00,309 ERROR dfs.DataNode - 192.168.0.252:50010:DataXceiver: java.io.IOException: Block blk_7443738244200783289 is valid, and cannot be written to.
>>   at org.apache.hadoop.dfs.FSDataset.writeToBlock(FSDataset.java:608)
>>   at org.apache.hadoop.dfs.DataNode$BlockReceiver.(DataNode.java:1983)
>>   at org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:1074)
>>   at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:938)
>>   at java.lang.Thread.run(Thread.java:595)
>>
>> and:
>>
>> 2008-07-16 14:41:00,178 WARN dfs.DataNode - 192.168.0.253:50010:Failed to transfer blk_7443738244200783289 to 192.168.0.252:50010 got java.net.SocketException: Connection reset
>>   at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:96)
>>   at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>>   at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>>   at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
>>   at java.io.DataOutputStream.write(DataOutputStream.java:90)
>>   at org.apache.hadoop.dfs.DataNode$BlockSender.sendChunk(DataNode.java:1602)
>>   at org.apache.hadoop.dfs.DataNode$BlockSender.sendBlock(DataNode.java:1636)
>>   at org.apache.hadoop.dfs.DataNode$DataTransfer.run(DataNode.java:2391)
>>   at java.lang.Thread.run(Thread.java:595)
>>
>> (Seem inter-node DFS communication errors also :-/)
>>
>> On Tue, Jul 15, 2008 at 11:19 PM, Raghu Angadi <[EMAIL PROTECTED]> wrote:
>>>
>>> Are there any errors reported on the other side of the socket (for the first
>>> error below, its the datanode on 192.168.0.251)?.
>>>
>>> Raghu.
>>>
>>> brainstorm wrote:
I'm getting the following WARNINGs that seem to slow down my nutch processes on a 3 node and 1 frontend cluster:

2008-07-15 18:53:19,048 WARN dfs.DataNode - 192.168.0.100:50010:Failed to transfer blk_-8676066332392254756 to 192.168.0.251:50010 got java.net.SocketException: Connection reset
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:96)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
    at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
    at java.io.DataOutputStream.write(DataOutputStream.java:90)
    at org.apache.hadoop.dfs.DataNode$BlockSender.sendChunk(DataNode.java:1602)
    at org.apache.hadoop.dfs.DataNode$BlockSender.sendBlock(DataNode.java:1636)
    at org.apache.hadoop.dfs.DataNode$DataTransfer.run(DataNode.java:2391)
    at java.lang.Thread.run(Thread.java:595)

2008-07-15 18:53:52,162 WARN dfs.DataNode - 192.168.0.100:50010:Failed to transfer blk_5699662911845813103 to 192.168.0.253:50010 got java.net.SocketException: Broken pipe
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(S
Re: Re: Hadoop 0.17.0 - lots of I/O problems and can't run small datasets?
Yongqiang: Thanks for this information. I'll try your changes and see if the experiment runs better. Thanks, C G --- On Mon, 7/7/08, heyongqiang <[EMAIL PROTECTED]> wrote: From: heyongqiang <[EMAIL PROTECTED]> Subject: Re: Re: Hadoop 0.17.0 - lots of I/O problems and can't run small datasets? To: "core-user@hadoop.apache.org" Date: Monday, July 7, 2008, 9:03 PM

I suspect this error occurs because one datanode quit during the client write, and that datanode was chosen by the namenode for the client to contact for the write (this is what DFSClient.DFSOutputStream.nextBlockOutputStream does). By default, the client side retries 3 times and sleeps a total of 3*xxx seconds, but the NameNode needs more time to find the dead node. So every time the client wakes up, there is a chance the dead node is chosen again. Maybe you should change the NameNode's interval for finding dead nodes and make the client sleep longer? I have changed DFSClient.DFSOutputStream.nextBlockOutputStream's sleep code like below:

if (!success) {
  LOG.info("Abandoning block " + block + " and retry...");
  namenode.abandonBlock(block, src, clientName);
  // Connection failed. Let's wait a little bit and retry
  retry = true;
  try {
    if (System.currentTimeMillis() - startTime > 5000) {
      LOG.info("Waiting to find target node: " + nodes[0].getName());
    }
    long time = heartbeatRecheckInterval;
    Thread.sleep(time);
  } catch (InterruptedException iex) {
  }
}

heartbeatRecheckInterval is exactly the recheck interval of the NameNode's dead-node monitor. I also changed the NameNode's dead-node recheck interval to be double the heartbeat interval.

Best regards, Yongqiang He 2008-07-08 Email: [EMAIL PROTECTED] Tel: 86-10-62600966(O) Research Center for Grid and Service Computing, Institute of Computing Technology, Chinese Academy of Sciences P.O.Box 2704, 100080, Beijing, China

From: Raghu Angadi
Sent: 2008-07-08 01:45:19
To: core-user@hadoop.apache.org
Cc:
Subject: Re: Hadoop 0.17.0 - lots of I/O problems and can't run small datasets?
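The two intervals Yongqiang mentions map onto standard configuration properties, so the same effect can be approximated without patching DFSClient; a hadoop-site.xml sketch (property names as I recall them from the 0.17-era hadoop-default.xml, values purely illustrative):

```xml
<!-- hadoop-site.xml fragment; values are illustrative. -->
<property>
  <name>dfs.heartbeat.interval</name>
  <value>3</value>   <!-- seconds between DataNode heartbeats -->
</property>
<property>
  <name>heartbeat.recheck.interval</name>
  <value>60000</value>   <!-- ms; how often the NameNode scans for dead nodes -->
</property>
```

Lowering the recheck interval makes the NameNode notice a dead DataNode sooner, which reduces the window in which the client's retries can be handed the same dead node.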
The ConcurrentModificationException looks like a bug; we should file a jira. Regarding why the writes are failing, we need to look at more logs. Could you attach the complete log from one of the failed tasks? Also try to see if there is anything in the NameNode log around that time. Raghu.

C G wrote:
> Hi All:
>
> I've got 0.17.0 set up on a 7 node grid (6 slaves w/datanodes, 1 master running namenode). I'm trying to process a small (180G) dataset. I've done this successfully and painlessly running 0.15.0. When I run 0.17.0 with the same data and same code (w/API changes for 0.17.0 and recompiled, of course), I get a ton of failures. I've increased the number of namenode threads trying to resolve this, but that doesn't seem to help. The errors are of the following flavor:
>
> java.io.IOException: Could not get block locations. Aborting...
> java.io.IOException: All datanodes 10.2.11.2:50010 are bad. Aborting...
> Exception in thread "Thread-2" java.util.ConcurrentModificationException
> Exception closing file /blah/_temporary/_task_200807052311_0001_r_04_0/baz/part-x
>
> As things stand right now, I can't deploy to 0.17.0 (or 0.16.4 or 0.17.1). I am wondering if anybody can shed some light on this, or if others are having similar problems.
>
> Any thoughts, insights, etc. would be greatly appreciated.
>
> Thanks,
> C G
>
> Here's an ugly trace:
> 08/07/06 01:43:29 INFO mapred.JobClient: map 100% reduce 93%
> 08/07/06 01:43:29 INFO mapred.JobClient: Task Id : task_200807052311_0001_r_03_0, Status : FAILED
> java.io.IOException: Could not get block locations. Aborting...
> at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2080)
> at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1300(DFSClient.java:1702)
> at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1818)
> task_200807052311_0001_r_03_0: Exception closing file /output/_temporary/_task_200807052311_0001_r_03_0/a/b/part-3
> task_200807052311_0001_r_03_0: java.io.IOException: All datanodes 10.2.11.2:50010 are bad. Aborting...
> task_200807052311_0001_r_03_0: at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2095)
> task_200807052311_0001_r_03_0: at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1300(DFSClient.java:1702)
> task_200807052311_0001_r_03_0: at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1818)
> task_200807052311_0001_r_03_0: Exception in thread "Thread-2&
Hadoop 0.17.0 - lots of I/O problems and can't run small datasets?
Hi All: I've got 0.17.0 set up on a 7 node grid (6 slaves w/datanodes, 1 master running namenode). I'm trying to process a small (180G) dataset. I've done this successfully and painlessly running 0.15.0. When I run 0.17.0 with the same data and same code (w/API changes for 0.17.0 and recompiled, of course), I get a ton of failures. I've increased the number of namenode threads trying to resolve this, but that doesn't seem to help. The errors are of the following flavor:

java.io.IOException: Could not get block locations. Aborting...
java.io.IOException: All datanodes 10.2.11.2:50010 are bad. Aborting...
Exception in thread "Thread-2" java.util.ConcurrentModificationException
Exception closing file /blah/_temporary/_task_200807052311_0001_r_04_0/baz/part-x

As things stand right now, I can't deploy to 0.17.0 (or 0.16.4 or 0.17.1). I am wondering if anybody can shed some light on this, or if others are having similar problems. Any thoughts, insights, etc. would be greatly appreciated. Thanks, C G

Here's an ugly trace:
08/07/06 01:43:29 INFO mapred.JobClient: map 100% reduce 93%
08/07/06 01:43:29 INFO mapred.JobClient: Task Id : task_200807052311_0001_r_03_0, Status : FAILED
java.io.IOException: Could not get block locations. Aborting...
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2080)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1300(DFSClient.java:1702)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1818)
task_200807052311_0001_r_03_0: Exception closing file /output/_temporary/_task_200807052311_0001_r_03_0/a/b/part-3
task_200807052311_0001_r_03_0: java.io.IOException: All datanodes 10.2.11.2:50010 are bad. Aborting...
task_200807052311_0001_r_03_0:     at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2095)
task_200807052311_0001_r_03_0:     at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1300(DFSClient.java:1702)
task_200807052311_0001_r_03_0:     at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1818)
task_200807052311_0001_r_03_0: Exception in thread "Thread-2" java.util.ConcurrentModificationException
task_200807052311_0001_r_03_0:     at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1100)
task_200807052311_0001_r_03_0:     at java.util.TreeMap$KeyIterator.next(TreeMap.java:1154)
task_200807052311_0001_r_03_0:     at org.apache.hadoop.dfs.DFSClient.close(DFSClient.java:217)
task_200807052311_0001_r_03_0:     at org.apache.hadoop.dfs.DistributedFileSystem.close(DistributedFileSystem.java:214)
task_200807052311_0001_r_03_0:     at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1324)
task_200807052311_0001_r_03_0:     at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:224)
task_200807052311_0001_r_03_0:     at org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:209)
08/07/06 01:44:32 INFO mapred.JobClient: map 100% reduce 74%
08/07/06 01:44:32 INFO mapred.JobClient: Task Id : task_200807052311_0001_r_01_0, Status : FAILED
java.io.IOException: Could not get block locations. Aborting...
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2080)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1300(DFSClient.java:1702)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1818)
task_200807052311_0001_r_01_0: Exception in thread "Thread-2" java.util.ConcurrentModificationException
task_200807052311_0001_r_01_0:     at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1100)
task_200807052311_0001_r_01_0:     at java.util.TreeMap$KeyIterator.next(TreeMap.java:1154)
task_200807052311_0001_r_01_0:     at org.apache.hadoop.dfs.DFSClient.close(DFSClient.java:217)
task_200807052311_0001_r_01_0:     at org.apache.hadoop.dfs.DistributedFileSystem.close(DistributedFileSystem.java:214)
task_200807052311_0001_r_01_0:     at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1324)
task_200807052311_0001_r_01_0:     at org.apache.hadoop.fs.FileSystem.closeAll(FileSystem.java:224)
task_200807052311_0001_r_01_0:     at org.apache.hadoop.fs.FileSystem$ClientFinalizer.run(FileSystem.java:209)
08/07/06 01:44:45 INFO mapred.JobClient: map 100% reduce 54%
Too many Task Manager children...
Hi All: I have mapred.tasktracker.tasks.maximum set to 4 in our conf/hadoop-site.xml, yet I frequently see 5-6 instances of org.apache.hadoop.mapred.TaskTracker$Child running on the slave nodes. Is there another setting I need to tweak in order to dial back the number of children running? The effect of running this many children is that our boxes have extremely high load factors, and eventually mapred tasks start timing out and failing. Note that the number of instances is for a single job. I see far more if I run multiple jobs simultaneously (something we do not typically do). This is on Hadoop 0.15.0; upgrading is not an option at the moment. Any help appreciated... Thanks, C G
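For reference, the property in question looks like this in conf/hadoop-site.xml (the description is paraphrased; note it is a per-TaskTracker limit, and my understanding is the JVM count can briefly exceed it while finished children are still exiting):

```xml
<!-- conf/hadoop-site.xml fragment; value is illustrative. -->
<property>
  <name>mapred.tasktracker.tasks.maximum</name>
  <value>4</value>
  <description>Maximum number of map/reduce tasks run
  simultaneously by a single TaskTracker.</description>
</property>
```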
Re: 0.16.4 DataNode problem...
I've repeated the experiment under more controlled circumstances: by creating a new file system formatted by 0.16.4 and then populating it. In this scenario we see the same problem: during the reduce phase the DataNode instances consume more and more memory until the system fails. Further, our server monitoring shows us that at the time of failure, each node in the system has about 7,000 open socket connections. We're now upgrading to 0.17.0 to repeat the same experiment, but I am pessimistic about getting any resolution. Does anybody have any insight into what might be going on? It seems really strange to have code that works in an old version but won't run in the more modern releases. Thanks, C G

C G <[EMAIL PROTECTED]> wrote: Hi All: I'm seeing an inability to run one of our applications over a reasonably small dataset (~200G input) while running 0.16.4. Previously we were on 0.15.0 and the same application ran fine with the same dataset. A lengthy description follows, including log file output, etc. The failure mode smells like a bug in 0.16.4, but I'm not 100% positive about that. My questions are:

1. Any known issues upgrading from 0.15.0 to 0.16.4? Our code runs just fine over small datasets, but dies on these larger ones. We followed the upgrade instructions in the wiki, etc.
2. Would an upgrade to 0.17.0 help resolve these problems?
3. Would a re-format/re-load of HDFS help correct these issues? This is the option I hope for the least, in that I have 3T of data on-board HDFS and it will take days to dump it all and reload it.
4. Any other advice or help?

I've been looking at this for the past few days and have been unable to make progress on solving it. I would hate to have to fall back to 0.15.0 (see above regarding 3T data reloads, not to mention being stuck on an old release). Any help, thoughts, comments, etc., would be very helpful. Thanks!
Description: Following an upgrade from 0.15.0 to 0.16.4 (and after recompiling our apps, etc.), a job that used to run correctly on our grid now fails. The failure occurs after the map phase is complete, and about 2/3rds of the way through the reduce job. The error which gets kicked out from the application perspective is:

08/05/27 11:30:08 INFO mapred.JobClient: map 100% reduce 89%
08/05/27 11:30:41 INFO mapred.JobClient: map 100% reduce 90%
08/05/27 11:32:45 INFO mapred.JobClient: map 100% reduce 86%
08/05/27 11:32:45 INFO mapred.JobClient: Task Id : task_200805271056_0001_r_07_0, Status : FAILED
java.io.IOException: Could not get block locations. Aborting...
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:1832)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1100(DFSClient.java:1487)
    at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1579)

I then discovered that 1 or more DataNode instances on the slave nodes are down (we run 1 DataNode instance per machine). The cause for at least some of the DataNode failures is a JVM internal error that gets raised due to a complete out-of-memory scenario (on a 4G, 4-way machine). Watching the DataNodes run, I can see them consuming more and more memory. For those failures for which there is a JVM traceback, I see (in part):

#
# java.lang.OutOfMemoryError: requested 16 bytes for CHeapObj-new. Out of swap space?
#
# Internal Error (414C4C4F434154494F4E0E494E4C494E450E4850500017), pid=4246, tid=2283883408
#
# Java VM: Java HotSpot(TM) Server VM (1.6.0_02-b05 mixed mode)
# If you would like to submit a bug report, please visit:
#   http://java.sun.com/webapps/bugreport/crash.jsp
#
--- T H R E A D ---
Current thread (0x8a942000): JavaThread "[EMAIL PROTECTED]" daemon [_thread_in_Java, id=15064]
Stack: [0x881c4000,0x88215000), sp=0x882139e0, free space=318k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x53b707]
V [libjvm.so+0x225fe1]
V [libjvm.so+0x16fdc5]
V [libjvm.so+0x22aef3]
Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
v blob 0xf4f235a7
J java.io.DataInputStream.readInt()I
j org.apache.hadoop.dfs.DataNode$BlockReceiver.receiveBlock(Ljava/io/DataOutputStream;Ljava/io/DataInputStream;Ljava/io/DataOutputStream;Ljava/lang/String;Lorg/apache/hadoop/dfs/DataNode$Throttler;I)V+126
j org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(Ljava/io/DataInputStream;)V+746
j org.apache.hadoop.dfs.DataNode$DataXceiver.run()V+174
j java.lang.Thread.run()V+11
v ~StubRoutines::call_stub
--- P R O C E S S ---
Java Threads: ( => current thread )
0x0ae3f400 JavaThread "process reaper" daemon [_thread_blocked, id=26870]
0x852e6000 JavaThread "[EMAIL PROTECTED]" daemon [_thread_in_vm, id=26869]
0x08a1cc00 JavaThread "PacketResponder 0 for Block blk_-6186975972786687394" daem
Re: 0.16.4 DataNode problem...
After stopping/restarting DFS an fsck shows that the file system is healthy with no under/over replicated data, and no missing files/blocks. Ted Dunning <[EMAIL PROTECTED]> wrote: What does hadoop dfs -fsck show you? On 5/27/08 8:58 AM, "C G" wrote: > Hi All: > > I'm seeing an inability to run one of our applications over a reasonably > small dataset (~200G input) while running 0.16.4. Previously we were on > 0.15.0 and the same application ran fine with the same dataset. > > A lengthy description follows, including log file output, etc. The failure > mode smells like a bug in 0.16.4, but I'm not 100% positive about that. > > My questions are: > > 1. Any known issues upgrading from 0.15.0 to 0.16.4? Our code runs just > fine over small datasets, but dies on these larger ones. We followed the > upgrade instructions in the wiki, etc. > > 2. Would an upgrade to 0.17.0 help resolve these problems? > > 3. Would a re-format/re-load of HDFS help correct these issues? This is > the thing I hope for the least in that I have 3T of data on-board HDFS and it > will take days to dump it all and reload it. > > 4. Any other advice or help? > > I've been looking at this for the past few days and have been unable to make > progress of solving it. I would hate to have to fall back to 0.15.0 (see > above regarding 3T data reloads, not to mention being stuck on an old > release). Any help, thoughts, comments, etc., would be very helpful. > Thanks! > > Description: > Following an upgrade from 0.15.0 to 0.16.4 (and after recompiling our apps, > etc.), a job that used to run correctly on our grid now fails. The failure > occurs after the map phase is complete, and about 2/3rds of the way through > the reduce job. 
The error which gets kicked out from the application > perspective is: > > 08/05/27 11:30:08 INFO mapred.JobClient: map 100% reduce 89% > 08/05/27 11:30:41 INFO mapred.JobClient: map 100% reduce 90% > 08/05/27 11:32:45 INFO mapred.JobClient: map 100% reduce 86% > 08/05/27 11:32:45 INFO mapred.JobClient: Task Id : > task_200805271056_0001_r_07_0, Status : FAILED > java.io.IOException: Could not get block locations. Aborting... > at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanode > Error(DFSClient.java:1832) > at > org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1100(DFSClient.java:148 > 7) > at > org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.jav > a:1579) > > > I then discovered that 1 or more DataNode instances on the slave nodes are > down (we run 1 DataNode instance per machine). The cause for at least some of > the DataNode failures is a JVM internal error that gets raised due to a > complete out-of-memory scenario (on a 4G, 4-way machine). > > Watching the DataNodes run, I can see them consuming more and more memory. > For those failures for which there is a JVM traceback, I see (in part): > # > # java.lang.OutOfMemoryError: requested 16 bytes for CHeapObj-new. Out of swap > space? 
> # > # Internal Error (414C4C4F434154494F4E0E494E4C494E450E4850500017), pid=4246, > tid=2283883408 > # > # Java VM: Java HotSpot(TM) Server VM (1.6.0_02-b05 mixed mode) > # If you would like to submit a bug report, please visit: > # http://java.sun.com/webapps/bugreport/crash.jsp > # > --- T H R E A D --- > Current thread (0x8a942000): JavaThread > "[EMAIL PROTECTED]" daemon [_thread_in_Java, > id=15064] > Stack: [0x881c4000,0x88215000), sp=0x882139e0, free space=318k > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native > code) > V [libjvm.so+0x53b707] > V [libjvm.so+0x225fe1] > V [libjvm.so+0x16fdc5] > V [libjvm.so+0x22aef3] > Java frames: (J=compiled Java code, j=interpreted, Vv=VM code) > v blob 0xf4f235a7 > J java.io.DataInputStream.readInt()I > j > org.apache.hadoop.dfs.DataNode$BlockReceiver.receiveBlock(Ljava/io/DataOutputS > tream;Ljava/io/DataInputStream;Ljava/io/DataOutputStream;Ljava/lang/String;Lor > g/a > pache/hadoop/dfs/DataNode$Throttler;I)V+126 > j > org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(Ljava/io/DataInputStream > ;)V+746 > j org.apache.hadoop.dfs.DataNode$DataXceiver.run()V+174 > j java.lang.Thread.run()V+11 > v ~StubRoutines::call_stub > --- P R O C E S S --- > Java Threads: ( => current thread ) > 0x0ae3f400 JavaThread "process reaper" daemon [_thread_blocked, id=26870] > 0x852e6000 JavaThread "[EMAIL PROTECTED]" > daemon [_thread_in_vm, id=26869] > 0x08a1cc00 JavaThread "PacketResponder 0 for Block blk_-6186975972786687394" > daemon [_thread_blocked, id=26769] > 0x852e5000
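For completeness, the fsck check being discussed above can be run like this (sketch; note the tool is invoked as `hadoop fsck`, and the flags shown existed in the 0.16-era release):

```shell
# Report overall HDFS health, plus per-file block and replica
# placement detail for anything suspicious.
hadoop fsck / -files -blocks -locations
```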
Which version of Java for Hadoop 0.16.x?
Folks: What version of the JVM/JDK is everyone running in order to run Hadoop 0.16.x (specifically .4 in my case)? We're running:

java version "1.6.0_02"
Java(TM) SE Runtime Environment (build 1.6.0_02-b05)
Java HotSpot(TM) Server VM (build 1.6.0_02-b05, mixed mode)

and I'm seeing DataNodes crash under moderately heavy load (processing 250G of input data) with errors like:

# An unexpected error has been detected by Java Runtime Environment:
#
# Internal Error (4E4D4554484F440E435050071F), pid=31202, tid=413023120
#
# Java VM: Java HotSpot(TM) Server VM (1.6.0_02-b05 mixed mode)
# If you would like to submit a bug report, please visit:
#   http://java.sun.com/webapps/bugreport/crash.jsp
#
--- T H R E A D ---
Current thread (0x86b7fc00): JavaThread "[EMAIL PROTECTED]" daemon [_thread_in_Java, id=3795]
Stack: [0x18993000,0x189e4000), sp=0x189e28f0, free space=318k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x53b707]
V [libjvm.so+0x225f2f]
V [libjvm.so+0x4413ec]
V [libjvm.so+0x4ae0fa]
V [libjvm.so+0x454290]
V [libjvm.so+0x4518f8]

So I'm curious who is running 0.16.4 and what version of Java is being used. Thanks, C G
"firstbadlink is/as" messages in 0.16.4
Hi All: So far, running 0.16.4 has been a bit of a nightmare. The latest problem concerns a series of odd messages about "firstbadlink". I've attached one of these message sequences and am curious what it's trying to tell me. This is from the master node, but the same pattern occurs on all the slaves as well. Thanks for any insight... C G

2008-05-24 22:35:00,170 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_-5166040941538436352 src: /10.2.13.1:41942 dest: /10.2.13.1:50010
2008-05-24 22:35:00,283 INFO org.apache.hadoop.dfs.DataNode: Datanode 2 got response for connect ack from downstream datanode with firstbadlink as
2008-05-24 22:35:00,283 INFO org.apache.hadoop.dfs.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is
2008-05-24 22:35:00,313 INFO org.apache.hadoop.dfs.DataNode: Received block blk_-5166040941538436352 of size 427894 from /10.2.13.1
2008-05-24 22:35:00,313 INFO org.apache.hadoop.dfs.DataNode: PacketResponder 2 for block blk_-5166040941538436352 terminating
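For what it's worth, the blank-looking tail of those messages appears to be an empty firstbadlink field, which would mean no bad link was reported during pipeline setup (the field only carries a datanode address on failure). A hedged sketch for pulling the full message sequence for one block out of a datanode log - the sample lines below are pasted from the post, and the log path and block id are placeholders for your own:

```shell
# Sketch: trace one block's pipeline/ack messages in a datanode log.
# Sample lines are copied from the thread; point the grep at your real log.
cat > /tmp/datanode_sample.log <<'EOF'
2008-05-24 22:35:00,170 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_-5166040941538436352 src: /10.2.13.1:41942 dest: /10.2.13.1:50010
2008-05-24 22:35:00,283 INFO org.apache.hadoop.dfs.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is
2008-05-24 22:35:00,313 INFO org.apache.hadoop.dfs.DataNode: Received block blk_-5166040941538436352 of size 427894 from /10.2.13.1
EOF
BLOCK=blk_-5166040941538436352
# show every line about the block, with the firstbadlink lines interleaved
grep -E "$BLOCK|firstbadlink" /tmp/datanode_sample.log
```

If the firstbadlink lines ever end with an address instead of nothing, that address is the datanode the pipeline considered bad.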
Re: 0.16.4 DFS dropping blocks, then won't restart...
Ugh, that solved the problem. Thanks Dhruba! Thanks, C G

Dhruba Borthakur <[EMAIL PROTECTED]> wrote: If you look at the log message starting with "STARTUP_MSG: build =..." you will see that the namenode and good datanode were built by CG whereas the bad datanodes were compiled by hadoopqa! thanks, dhruba

On Fri, May 23, 2008 at 9:01 AM, C G wrote:
> 2008-05-23 11:53:25,377 INFO org.apache.hadoop.dfs.NameNode: STARTUP_MSG:
> /
> STARTUP_MSG: Starting NameNode
> STARTUP_MSG:   host = primary/10.2.13.1
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 0.16.4-dev
> STARTUP_MSG:   build = svn+ssh://[EMAIL PROTECTED]/srv/svn/repositories/svnvmc/overdrive/trunk/hadoop-0.16.4 -r 2182; compiled by 'cg' on Mon May 19 17:47:05 EDT 2008
> /
> 2008-05-23 11:53:26,107 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=NameNode, port=54310
> 2008-05-23 11:53:26,136 INFO org.apache.hadoop.dfs.NameNode: Namenode up at: overdrive1-node-primary/10.2.13.1:54310
> 2008-05-23 11:53:26,146 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null
> 2008-05-23 11:53:26,149 INFO org.apache.hadoop.dfs.NameNodeMetrics: Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
> 2008-05-23 11:53:26,463 INFO org.apache.hadoop.fs.FSNamesystem: fsOwner=cg,cg
> 2008-05-23 11:53:26,463 INFO org.apache.hadoop.fs.FSNamesystem: supergroup=supergroup
> 2008-05-23 11:53:26,463 INFO org.apache.hadoop.fs.FSNamesystem: isPermissionEnabled=true
> 2008-05-23 11:53:36,064 INFO org.apache.hadoop.fs.FSNamesystem: Finished loading FSImage in 9788 msecs
> 2008-05-23 11:53:36,079 INFO org.apache.hadoop.dfs.StateChange: STATE* SafeModeInfo.enter: Safe mode is ON. Safe mode will be turned off automatically.
> 2008-05-23 11:53:36,115 INFO org.apache.hadoop.fs.FSNamesystem: Registered FSNamesystemStatusMBean
> 2008-05-23 11:53:36,339 INFO org.mortbay.util.Credential: Checking Resource aliases
> 2008-05-23 11:53:36,410 INFO org.mortbay.http.HttpServer: Version Jetty/5.1.4
> 2008-05-23 11:53:36,410 INFO org.mortbay.util.Container: Started HttpContext[/static,/static]
> 2008-05-23 11:53:36,410 INFO org.mortbay.util.Container: Started HttpContext[/logs,/logs]
> 2008-05-23 11:53:36,752 INFO org.mortbay.util.Container: Started [EMAIL PROTECTED]
> 2008-05-23 11:53:36,925 INFO org.mortbay.util.Container: Started WebApplicationContext[/,/]
> 2008-05-23 11:53:36,926 INFO org.mortbay.http.SocketListener: Started SocketListener on 0.0.0.0:50070
> 2008-05-23 11:53:36,926 INFO org.mortbay.util.Container: Started [EMAIL PROTECTED]
> 2008-05-23 11:53:36,926 INFO org.apache.hadoop.fs.FSNamesystem: Web-server up at: 0.0.0.0:50070
> 2008-05-23 11:53:36,927 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
> 2008-05-23 11:53:36,927 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 54310: starting
> 2008-05-23 11:53:36,939 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 54310: starting
> 2008-05-23 11:53:36,939 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 54310: starting
> 2008-05-23 11:53:36,939 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 54310: starting
> 2008-05-23 11:53:36,939 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 54310: starting
> 2008-05-23 11:53:36,939 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 54310: starting
> 2008-05-23 11:53:36,939 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 54310: starting
> 2008-05-23 11:53:36,939 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 54310: starting
> 2008-05-23 11:53:36,939 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 54310: starting
> 2008-05-23 11:53:36,939 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 54310: starting
> 2008-05-23 11:53:36,940 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 54310: starting
> 2008-05-23 11:53:37,096 INFO org.apache.hadoop.dfs.NameNode: Error report from worker9:50010: Incompatible build versions: namenode BV = 2182; datanode BV = 652614
> 2008-05-23 11:53:37,097 INFO org.apache.hadoop.dfs.NameNode: Error report from worker12:50010: Incompatible build versions: namenode BV = 2182; datanode BV = 652614
> [error above repeated for all nodes in system]
> 2008-05-23 11:53:42,082 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from 10.2.13.1:50010 storage DS-18
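The skew behind this error can be caught mechanically: every daemon logs its build string at startup, so comparing the `-r` revision across nodes shows who was compiled from what. A small sketch - the sample STARTUP_MSG line is pasted from this thread; in practice you would run the extraction over each node's logs:

```shell
# Sketch: extract the svn build revision from a daemon's STARTUP_MSG line.
# The sample is from the thread; point it at logs/*.log on each node.
cat > /tmp/startup_sample.log <<'EOF'
STARTUP_MSG:   version = 0.16.4
STARTUP_MSG:   build = http://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.16 -r 652614; compiled by 'hadoopqa' on Fri May 2 00:18:12 UTC 2008
EOF
rev=$(sed -n 's/.* -r \([0-9][0-9]*\);.*/\1/p' /tmp/startup_sample.log)
echo "build revision: $rev"   # a mismatch across nodes reproduces the BV error
```

Two different revisions in this output across the cluster means the namenode and datanodes came from different builds, which is exactly the "BV = 2182 vs BV = 652614" complaint above.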
Re: 0.16.4 DFS dropping blocks, then won't restart...
2008-05-23 11:53:40,786 INFO org.apache.hadoop.dfs.DataNode: Registered FSDatasetStatusMBean
2008-05-23 11:53:40,786 INFO org.apache.hadoop.dfs.DataNode: Opened server at 50010
2008-05-23 11:53:40,793 INFO org.apache.hadoop.dfs.DataNode: Balancing bandwith is 1048576 bytes/s
2008-05-23 11:53:41,838 INFO org.mortbay.util.Credential: Checking Resource aliases
2008-05-23 11:53:41,868 INFO org.mortbay.http.HttpServer: Version Jetty/5.1.4
2008-05-23 11:53:41,869 INFO org.mortbay.util.Container: Started HttpContext[/static,/static]
2008-05-23 11:53:41,869 INFO org.mortbay.util.Container: Started HttpContext[/logs,/logs]
2008-05-23 11:53:42,051 INFO org.mortbay.util.Container: Started [EMAIL PROTECTED]
2008-05-23 11:53:42,079 INFO org.mortbay.util.Container: Started WebApplicationContext[/,/]
2008-05-23 11:53:42,081 INFO org.mortbay.http.SocketListener: Started SocketListener on 0.0.0.0:50075
2008-05-23 11:53:42,081 INFO org.mortbay.util.Container: Started [EMAIL PROTECTED]
2008-05-23 11:53:42,101 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=DataNode, sessionId=null
2008-05-23 11:53:42,120 INFO org.apache.hadoop.dfs.DataNode: 10.2.13.1:50010In DataNode.run, data = FSDataset{dirpath='/data/HDFS/data/current'}
2008-05-23 11:53:42,121 INFO org.apache.hadoop.dfs.DataNode: using BLOCKREPORT_INTERVAL of 3368704msec Initial delay: 6msec
2008-05-23 11:53:46,169 INFO org.apache.hadoop.dfs.DataNode: BlockReport of 66383 blocks got processed in 3027 msecs
2008-05-23 11:54:47,033 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_672882539393226281
2008-05-23 11:54:47,070 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_4933623861101284298
2008-05-23 11:54:51,834 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_-4096375515223627412
2008-05-23 11:54:52,834 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_-4329313062145243554
2008-05-23 11:54:52,869 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_5951529374648563965
2008-05-23 11:54:53,033 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_5526809302368511891
2008-05-23 11:55:07,101 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_3384706504442100270
2008-05-23 11:56:23,966 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_-8100668927196678325
2008-05-23 11:56:24,165 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_-9045089577001067802
2008-05-23 11:56:53,365 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_-5156742068519955681
2008-05-23 11:56:53,375 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_8099933609289941991
2008-05-23 11:56:57,164 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_-519952963742834206
2008-05-23 11:56:57,565 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_-7514486773323267604
2008-05-23 11:56:59,366 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_5706035426017364787
2008-05-23 11:56:59,398 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_-8163260915256505245
2008-05-23 11:57:08,455 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_-3800057016159468929
2008-05-23 11:57:27,159 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_-1945776220462007170
2008-05-23 11:57:41,058 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_1059797111434771921
2008-05-23 11:57:50,107 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_335910613100888045
2008-05-23 11:58:04,999 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_-758702836140613218
2008-05-23 11:58:17,060 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_5680261036802662113
2008-05-23 11:58:31,128 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_6577967380328271133
2008-05-23 11:58:45,185 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_-7268945479231310134
2008-05-23 11:58:59,450 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_5582966652198891861
2008-05-23 11:59:14,499 INFO org.apache.hadoop.dfs.DataBlockScanner: Verification succeeded for blk_-8204668722708860846

Raghu Angadi <[EMAIL PROTECTED]> wrote: Can you attach the initialization part of the NameNode log? thanks, Raghu.

C G wrote:
> We've recently upgraded from 0.15.0 to 0.16.4. Two nights ago we had a problem where DFS nodes could not communicate. After not finding anything obviously wrong we decided to shut down DFS and restart. Following restart I was seeing a corrupted system with significant amounts of missing data.
> Furt
0.16.4 DFS dropping blocks, then won't restart...
We've recently upgraded from 0.15.0 to 0.16.4. Two nights ago we had a problem where DFS nodes could not communicate. After not finding anything obviously wrong we decided to shut down DFS and restart. Following restart I was seeing a corrupted system with significant amounts of missing data. Further checking showed that DataNodes on all slaves did not start due to what looks like a version skew issue. Our distribution is a straight 0.16.4 dist, so I'm having difficulty understanding what's causing this issue. Note that we haven't finalized the upgrade yet. Any help understanding this problem would be very much appreciated. We have several TB of data in our system and reloading from scratch would be a big problem. Here is the log from one of the failed nodes:

/
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = worker9/10.2.0.9
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.16.4
STARTUP_MSG:   build = http://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.16 -r 652614; compiled by 'hadoopqa' on Fri May 2 00:18:12 UTC 2008
/
2008-05-23 08:10:47,196 FATAL org.apache.hadoop.dfs.DataNode: Incompatible build versions: namenode BV = 2182; datanode BV = 652614
2008-05-23 08:10:47,202 ERROR org.apache.hadoop.dfs.DataNode: java.io.IOException: Incompatible build versions: namenode BV = 2182; datanode BV = 652614
        at org.apache.hadoop.dfs.DataNode.handshake(DataNode.java:342)
        at org.apache.hadoop.dfs.DataNode.startDataNode(DataNode.java:213)
        at org.apache.hadoop.dfs.DataNode.<init>(DataNode.java:162)
        at org.apache.hadoop.dfs.DataNode.makeInstance(DataNode.java:2512)
        at org.apache.hadoop.dfs.DataNode.run(DataNode.java:2456)
        at org.apache.hadoop.dfs.DataNode.createDataNode(DataNode.java:2477)
        at org.apache.hadoop.dfs.DataNode.main(DataNode.java:2673)
2008-05-23 08:10:47,203 INFO org.apache.hadoop.dfs.DataNode: SHUTDOWN_MSG:
/
SHUTDOWN_MSG: Shutting down DataNode at worker9/10.2.0.9
/
Re: When is HDFS really "corrupt"...(and can I upgrade a corrupt FS?)
Lohit: Awesome, thanks very much. I deleted that file (and the other spurious task files lying around) and the file system is now HEALTHY. I really appreciate the help! Thanks, C G

lohit <[EMAIL PROTECTED]> wrote: Yes, that file is a temp file used by one of your reducers. It is a file which was opened but never closed, hence the namenode does not know the location of the last block of such files. In hadoop-0.18 we have an option to filter out files which are open and not count them toward the filesystem being CORRUPT. Thanks, Lohit

- Original Message From: C G To: core-user@hadoop.apache.org Sent: Thursday, May 15, 2008 12:51:55 PM Subject: Re: When is HDFS really "corrupt"...(and can I upgrade a corrupt FS?)

I hadn't considered looking for the word MISSING...thanks for the heads-up. I did a search and found the following:

/output/ae/_task_200803191317_9183_r_08_1/part-8 0, 390547 block(s): MISSING 1 blocks of total size 0 B
0. -7099420740240431420 len=0 MISSING!

That's the only one found. Is it safe/sufficient to simply delete this file? There were MR jobs active when the master failed...it wasn't a clean shutdown by any means. I surmise this file is a remnant from an active job. Thanks, C G

Lohit wrote: The filesystem is considered corrupt if there are any missing blocks. Do you see MISSING in your output? We also see missing blocks for files not yet closed. When you stopped the MR cluster, were there any jobs running?

On May 15, 2008, at 12:15 PM, C G wrote: Earlier this week I wrote about a master node crash and our efforts to recover from the crash. We recovered from the crash and all systems are normal. However, I have a concern about what fsck is reporting and what it really means for a filesystem to be marked "corrupt." With the mapred engine shut down, I ran fsck / -files -blocks -locations to inspect the file system. The output looks clean with the exception of this at the end of the output:

Status: CORRUPT
Total size: 5113667836544 B
Total blocks: 1070996 (avg. block size 4774684 B)
Total dirs: 50012
Total files: 1027089
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Target replication factor: 3
Real replication factor: 3.0

The filesystem under path '/' is CORRUPT

In reviewing the fsck output, there are no obvious errors being reported. I see tons of output like this:

/foo/bar/part-5 3387058, 6308 block(s): OK
0. 4958936159429948772 len=3387058 repl=3 [10.2.14.5:50010, 10.2.14.20:50010, 10.2.14.8:50010]

and the only status ever reported is "OK." So this raises the question: what causes HDFS to declare the FS "corrupt", and how do I clear this up? The second question, assuming that I can't make the "corrupt" state go away, concerns running an upgrade. If every file in HDFS reports "OK" but the FS reports "corrupt", is it safe to undertake an upgrade from 0.15.x to 0.16.4? Thanks for any help, C G
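For anyone hitting the same situation: the check Lohit suggests can be scripted. A hedged sketch, assuming you save the fsck report to a file first (the two sample lines below are from this thread); the deletion step that resolved things here is shown commented out, since it's destructive:

```shell
# Sketch: list the files fsck marks MISSING. Sample output is from the
# thread; in practice: bin/hadoop fsck / -files -blocks > /tmp/fsck.out
cat > /tmp/fsck.out <<'EOF'
/foo/bar/part-5 3387058, 6308 block(s): OK
/output/ae/_task_200803191317_9183_r_08_1/part-8 0, 390547 block(s): MISSING 1 blocks of total size 0 B
EOF
grep ' MISSING' /tmp/fsck.out | awk '{print $1}'
# then, only for leftover _task_ temp files you are sure about:
#   bin/hadoop dfs -rm /output/ae/_task_200803191317_9183_r_08_1/part-8
```

As the thread notes, only delete files you can identify as orphaned task output; a MISSING block in a real data file means lost data, not just an unclosed temp file.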
Re: When is HDFS really "corrupt"...(and can I upgrade a corrupt FS?)
I hadn't considered looking for the word MISSING...thanks for the heads-up. I did a search and found the following:

/output/ae/_task_200803191317_9183_r_08_1/part-8 0, 390547 block(s): MISSING 1 blocks of total size 0 B
0. -7099420740240431420 len=0 MISSING!

That's the only one found. Is it safe/sufficient to simply delete this file? There were MR jobs active when the master failed...it wasn't a clean shutdown by any means. I surmise this file is a remnant from an active job. Thanks, C G

Lohit <[EMAIL PROTECTED]> wrote: The filesystem is considered corrupt if there are any missing blocks. Do you see MISSING in your output? We also see missing blocks for files not yet closed. When you stopped the MR cluster, were there any jobs running?

On May 15, 2008, at 12:15 PM, C G wrote: Earlier this week I wrote about a master node crash and our efforts to recover from the crash. We recovered from the crash and all systems are normal. However, I have a concern about what fsck is reporting and what it really means for a filesystem to be marked "corrupt." With the mapred engine shut down, I ran fsck / -files -blocks -locations to inspect the file system. The output looks clean with the exception of this at the end of the output:

Status: CORRUPT
Total size: 5113667836544 B
Total blocks: 1070996 (avg. block size 4774684 B)
Total dirs: 50012
Total files: 1027089
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Target replication factor: 3
Real replication factor: 3.0

The filesystem under path '/' is CORRUPT

In reviewing the fsck output, there are no obvious errors being reported. I see tons of output like this:

/foo/bar/part-5 3387058, 6308 block(s): OK
0. 4958936159429948772 len=3387058 repl=3 [10.2.14.5:50010, 10.2.14.20:50010, 10.2.14.8:50010]

and the only status ever reported is "OK." So this raises the question: what causes HDFS to declare the FS "corrupt", and how do I clear this up? The second question, assuming that I can't make the "corrupt" state go away, concerns running an upgrade. If every file in HDFS reports "OK" but the FS reports "corrupt", is it safe to undertake an upgrade from 0.15.x to 0.16.4? Thanks for any help, C G
When is HDFS really "corrupt"...(and can I upgrade a corrupt FS?)
Earlier this week I wrote about a master node crash and our efforts to recover from the crash. We recovered from the crash and all systems are normal. However, I have a concern about what fsck is reporting and what it really means for a filesystem to be marked "corrupt." With the mapred engine shut down, I ran fsck / -files -blocks -locations to inspect the file system. The output looks clean with the exception of this at the end of the output:

Status: CORRUPT
Total size: 5113667836544 B
Total blocks: 1070996 (avg. block size 4774684 B)
Total dirs: 50012
Total files: 1027089
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Target replication factor: 3
Real replication factor: 3.0

The filesystem under path '/' is CORRUPT

In reviewing the fsck output, there are no obvious errors being reported. I see tons of output like this:

/foo/bar/part-5 3387058, 6308 block(s): OK
0. 4958936159429948772 len=3387058 repl=3 [10.2.14.5:50010, 10.2.14.20:50010, 10.2.14.8:50010]

and the only status ever reported is "OK." So this raises the question: what causes HDFS to declare the FS "corrupt", and how do I clear this up? The second question, assuming that I can't make the "corrupt" state go away, concerns running an upgrade. If every file in HDFS reports "OK" but the FS reports "corrupt", is it safe to undertake an upgrade from 0.15.x to 0.16.4? Thanks for any help, C G
Re: HDFS corrupt...how to proceed?
Thanks to everyone who responded. Things are back on the air now - all the replication issues seem to have gone away. I am wading through a detailed fsck output now, looking for specific problems on a file-by-file basis. Just in case anybody is interested, we mirror our master nodes using DRBD. It performed very well in this first "real world" test. If there is interest I can write up how we protect our master nodes in more detail and share it with the community. Thanks, C G

Ted Dunning <[EMAIL PROTECTED]> wrote: You don't need to correct over-replicated files. The under-replicated files should cure themselves, but there is a problem on old versions where that doesn't happen quite right. You can use hadoop fsck / to get a list of the files that are broken, and there are options to copy what remains of them to lost+found or to delete them. Other than that, things should correct themselves fairly quickly.

On 5/11/08 8:23 PM, "C G" wrote:
> Hi All:
>
> We had a primary node failure over the weekend. When we brought the node back up and I ran Hadoop fsck, I see the file system is corrupt. I'm unsure how best to proceed. Any advice is greatly appreciated. If I've missed a Wiki page or documentation somewhere please feel free to tell me to RTFM and let me know where to look.
>
> Specific question: how to clear under- and over-replicated files? Is the correct procedure to copy the file locally, delete from HDFS, and then copy back to HDFS?
>
> The fsck output is long, but the final summary is:
>
> Total size: 4899680097382 B
> Total blocks: 994252 (avg. block size 4928006 B)
> Total dirs: 47404
> Total files: 952070
>
> CORRUPT FILES: 2
> MISSING BLOCKS: 24
> MISSING SIZE: 1501009630 B
>
> Over-replicated blocks: 1 (1.0057812E-4 %)
> Under-replicated blocks: 14958 (1.5044476 %)
> Target replication factor: 3
> Real replication factor: 2.9849212
>
> The filesystem under path '/' is CORRUPT
Re: HDFS corrupt...how to proceed?
Yes, several of our logging apps had accumulated backlogs of data and were "eager" to write to HDFS.

Dhruba Borthakur <[EMAIL PROTECTED]> wrote: Is it possible that new files were being created by running applications between the first and second fsck runs? thanks, dhruba

On Sun, May 11, 2008 at 8:55 PM, C G wrote:
> The system hosting the namenode experienced an OS panic and shut down; we subsequently rebooted it. Currently we don't believe there is/was a bad disk or other hardware problem.
>
> Something interesting: I've run fsck twice. The first time it gave the result I posted. The second time it still declared the FS to be corrupt, but said:
>
> [many rows of periods deleted]
> ..Status: CORRUPT
> Total size: 4900076384766 B
> Total blocks: 994492 (avg. block size 4927215 B)
> Total dirs: 47404
> Total files: 952310
> Over-replicated blocks: 0 (0.0 %)
> Under-replicated blocks: 0 (0.0 %)
> Target replication factor: 3
> Real replication factor: 3.0
>
> The filesystem under path '/' is CORRUPT
>
> So it seems like it's fixing some problems on its own?
>
> Thanks,
> C G
>
> Dhruba Borthakur wrote: Did one datanode fail or did the namenode fail? By "fail" do you mean that the system was rebooted or was there a bad disk that caused the problem?
>
> thanks,
> dhruba
>
> On Sun, May 11, 2008 at 7:23 PM, C G wrote:
> > Hi All:
> >
> > We had a primary node failure over the weekend. When we brought the node back up and I ran Hadoop fsck, I see the file system is corrupt. I'm unsure how best to proceed. Any advice is greatly appreciated. If I've missed a Wiki page or documentation somewhere please feel free to tell me to RTFM and let me know where to look.
> >
> > Specific question: how to clear under- and over-replicated files? Is the correct procedure to copy the file locally, delete from HDFS, and then copy back to HDFS?
> >
> > The fsck output is long, but the final summary is:
> >
> > Total size: 4899680097382 B
> > Total blocks: 994252 (avg. block size 4928006 B)
> > Total dirs: 47404
> > Total files: 952070
> >
> > CORRUPT FILES: 2
> > MISSING BLOCKS: 24
> > MISSING SIZE: 1501009630 B
> >
> > Over-replicated blocks: 1 (1.0057812E-4 %)
> > Under-replicated blocks: 14958 (1.5044476 %)
> > Target replication factor: 3
> > Real replication factor: 2.9849212
> >
> > The filesystem under path '/' is CORRUPT
Re: HDFS corrupt...how to proceed?
The system hosting the namenode experienced an OS panic and shut down; we subsequently rebooted it. Currently we don't believe there is/was a bad disk or other hardware problem.

Something interesting: I've run fsck twice. The first time it gave the result I posted. The second time it still declared the FS to be corrupt, but said:

[many rows of periods deleted]
..Status: CORRUPT
Total size: 4900076384766 B
Total blocks: 994492 (avg. block size 4927215 B)
Total dirs: 47404
Total files: 952310
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Target replication factor: 3
Real replication factor: 3.0

The filesystem under path '/' is CORRUPT

So it seems like it's fixing some problems on its own? Thanks, C G

Dhruba Borthakur <[EMAIL PROTECTED]> wrote: Did one datanode fail or did the namenode fail? By "fail" do you mean that the system was rebooted or was there a bad disk that caused the problem? thanks, dhruba

On Sun, May 11, 2008 at 7:23 PM, C G wrote:
> Hi All:
>
> We had a primary node failure over the weekend. When we brought the node back up and I ran Hadoop fsck, I see the file system is corrupt. I'm unsure how best to proceed. Any advice is greatly appreciated. If I've missed a Wiki page or documentation somewhere please feel free to tell me to RTFM and let me know where to look.
>
> Specific question: how to clear under- and over-replicated files? Is the correct procedure to copy the file locally, delete from HDFS, and then copy back to HDFS?
>
> The fsck output is long, but the final summary is:
>
> Total size: 4899680097382 B
> Total blocks: 994252 (avg. block size 4928006 B)
> Total dirs: 47404
> Total files: 952070
>
> CORRUPT FILES: 2
> MISSING BLOCKS: 24
> MISSING SIZE: 1501009630 B
>
> Over-replicated blocks: 1 (1.0057812E-4 %)
> Under-replicated blocks: 14958 (1.5044476 %)
> Target replication factor: 3
> Real replication factor: 2.9849212
>
> The filesystem under path '/' is CORRUPT
HDFS corrupt...how to proceed?
Hi All: We had a primary node failure over the weekend. When we brought the node back up and I ran Hadoop fsck, I see the file system is corrupt. I'm unsure how best to proceed. Any advice is greatly appreciated. If I've missed a Wiki page or documentation somewhere please feel free to tell me to RTFM and let me know where to look. Specific question: how to clear under- and over-replicated files? Is the correct procedure to copy the file locally, delete from HDFS, and then copy back to HDFS? The fsck output is long, but the final summary is:

Total size: 4899680097382 B
Total blocks: 994252 (avg. block size 4928006 B)
Total dirs: 47404
Total files: 952070

CORRUPT FILES: 2
MISSING BLOCKS: 24
MISSING SIZE: 1501009630 B

Over-replicated blocks: 1 (1.0057812E-4 %)
Under-replicated blocks: 14958 (1.5044476 %)
Target replication factor: 3
Real replication factor: 2.9849212

The filesystem under path '/' is CORRUPT
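Since the summary format is stable, the report can be boiled down to the counters that matter rather than eyeballed. A small sketch with the summary numbers above pasted in as sample input; in practice you would feed it the output of `bin/hadoop fsck /`:

```shell
# Sketch: reduce an fsck report to its corruption counters, so you can
# watch them trend toward zero between runs. Sample numbers are from the
# post; pipe in real fsck output in practice.
cat > /tmp/fsck_summary.txt <<'EOF'
Total size: 4899680097382 B
Total blocks: 994252 (avg. block size 4928006 B)
CORRUPT FILES: 2
MISSING BLOCKS: 24
MISSING SIZE: 1501009630 B
Under-replicated blocks: 14958 (1.5044476 %)
The filesystem under path '/' is CORRUPT
EOF
awk -F': ' '/CORRUPT FILES|MISSING BLOCKS|Under-replicated/ {print $1 "=" $2}' /tmp/fsck_summary.txt
```

Running this after each fsck makes it easy to confirm that under-replicated counts are actually dropping as re-replication proceeds.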
Re: Quick jar deployment question...
Yeah, everything is packaged into one jar...I've been copying those jars everywhere, which didn't seem right, hence the question. Thanks, C G

Ted Dunning <[EMAIL PROTECTED]> wrote: The easiest way is to package all of your code (classes and jars) into a single jar file which you then execute. When you instantiate a JobClient and run a job, your jar gets copied to all necessary nodes. The machine you use to launch the job need not even be in the cluster, just able to see the cluster.

On 4/3/08 11:23 AM, "C G" wrote:
> Hi All:
>
> When deploying a jar file containing code for a Hadoop job, is it necessary to copy the jar to the same path on all nodes in the grid, or just on the node which will launch the job?
>
> Thanks,
> C G
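A minimal sketch of what Ted describes: one self-contained jar built on the launching machine, which the JobClient then ships to the task nodes for you. All names here (myjob.jar, com.example.MyJob, the paths) are hypothetical, and the function just prints the commands so the sketch is inert off-cluster - drop the echos to run it for real:

```shell
# Dry-run sketch of single-jar deployment. launch_job only echoes the
# two commands; names are placeholders, not a real project layout.
launch_job() {
  jar=$1; main=$2; shift 2
  echo jar cf "$jar" -C build/classes .       # bundle classes (library jars can go under lib/ inside it)
  echo bin/hadoop jar "$jar" "$main" "$@"     # run from the launching node only
}
launch_job myjob.jar com.example.MyJob /input /output
```

The point being: the jar only needs to exist on the machine that runs `bin/hadoop jar`; nothing needs to be copied to the same path on every node.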
Quick jar deployment question...
Hi All: When deploying a jar file containing code for a Hadoop job, is it necessary to copy the jar to the same path on all nodes in the grid, or just on the node which will launch the job? Thanks, C G
Most excellent Hadoop Summit
Dear Yahoo, Amazon, and Hadoop Community: Thank you very much for a very well-done Hadoop Summit. It came as a complete surprise that a FREE conference would include breakfast, lunch, snacks, happy hour, and swag - very classy and very nice. All the presentations and discussions were interesting (and I do mean ALL), and I learned a great deal in a very short time. Thanks to the presenters for doing a good job preparing and an equally fine job presenting. It was also a lot of fun putting faces to the names we see on the email list and having some time to chat and network with everyone. The day-ending roundtable on the future of Hadoop made it clear that all the major players and participants are in this for the long haul. Thanks to all for presenting what I thought was a very clear, very logical roadmap. I'm back in Boston after having flown out to the valley literally just to attend the Summit. It was money and time well spent to attend (and I can't wait until the airlines provide in-flight internet access...). I would like to encourage everyone to consider future Summits, with perhaps an even larger venue if possible - I know that I would be willing to pay the usual conference registration fees to attend. In short: Well done...bravo. Thanks, Chris
RE: Solving the "hang" problem in dfs -copyToLocal/-cat...
I think HTTP access is read-only...you'll need to continue to use copyFromLocalFile. C G

Phillip Wu <[EMAIL PROTECTED]> wrote: Very helpful information. Is there any way to put files into DFS remotely, like an HTTP POST? Or do I have to keep using copyFromLocalFile? Thanks, Phil
mobile . 626.234.7515 . yim . heliophillip www.helio.com

-Original Message-
From: C G [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, February 27, 2008 2:46 PM
To: core-user@hadoop.apache.org
Subject: RE: Solving the "hang" problem in dfs -copyToLocal/-cat...

I haven't looked at the source code to see how -cat is implemented, but I was pretty surprised at the results as well. When I sat down to do this experiment I figured I was wasting my time...surprisingly I was not. C G

Joydeep Sen Sarma wrote: This is amazing... Wouldn't dfs -cat use the same dfs client codepath that an actual map-reduce program would? (If so, should it also start using the http client instead? (at least for the non-local case)) Or maybe it already does?

-Original Message-
From: Ted Dunning [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, February 27, 2008 12:10 PM
To: core-user@hadoop.apache.org
Subject: Re: Solving the "hang" problem in dfs -copyToLocal/-cat...

Have you tried using http to fetch the file instead? http://<namenode>/data/<path> This will get redirected to one of the datanodes to handle and should be pretty fast. It would be interesting to find out if this alternative path is subject to the same hangs that you are seeing.

On 2/27/08 12:05 PM, "C G" wrote:
> Hi All:
>
> The following write-up is offered to help out anybody else who has seen performance problems and "hangs" while using dfs -copyToLocal/-cat.
>
> One of the performance problems that has been causing big problems for us has been using the dfs commands -copyToLocal and -cat to move data from HDFS to a local file system. We do this in order to populate a data warehouse that is HDFS-unaware.
>
> The "pattern" I've been using is:
>
> rm -f loadfile.dat
> fileList=`bin/hadoop dfs -ls /foo | grep part | awk '{print $1}'`
> for x in `echo ${fileList}`
> do
>   bin/hadoop dfs -cat ${x} >> loadfile.dat
> done
>
> This pattern repeats several times, ultimately cat-ing 353 files into several load files. This process is extremely slow, often taking 20-30 minutes to transfer 142M of data. More frustrating is that the system simply "pauses" during cat operations. There is no I/O activity, no CPU activity, nothing written to the log files on any node. Things just stop. I changed the pattern to use -copyToLocal instead of -cat and had the same results. We observe this "pause" behavior without respect to where the -copyToLocal or -cat originates - I've tried running directly on the grid, and also directly on the DB server which is not part of the grid proper. I've tried many different releases of Hadoop, including 0.16.0, and all exhibit this problem.
>
> I decided to try a different approach and use the HTTP interface to the namenode to transfer the data:
>
> rm -f loadfile.dat
> fileList=`bin/hadoop dfs -ls /foo | grep part | awk '{print $1}'`
> for x in `echo ${fileList}`
> do
>   wget -q http://mynamenodeserver:50070/data${x}
> done
>
> There is a trivial step to merge the individual part files into one file preparatory to loading data.
>
> I ran this experiment across 10,850 files containing an aggregate total of 4.6G of data. It ran in under 2 hours, which while not great is significantly better than the 18 hours it previously took -copyToLocal/-cat to run.
>
> I found it surprising that this solution works better than -copyToLocal/-cat.
>
> Hope this helps...
> C G
RE: Solving the "hang" problem in dfs -copyToLocal/-cat...
I haven't looked at the source code to see how -cat is implemented, but I was pretty surprised at the results as well. When I sat down to do this experiment I figured I was wasting my time...surprisingly, I was not. C G

Joydeep Sen Sarma <[EMAIL PROTECTED]> wrote: This is amazing... Wouldn't dfs -cat use the same DFS client codepath that an actual map-reduce program would? (If so, should it also start using an HTTP client, at least for the non-local case? Or maybe it already does?)

-----Original Message----- From: Ted Dunning Sent: Wednesday, February 27, 2008 12:10 PM To: core-user@hadoop.apache.org Subject: Re: Solving the "hang" problem in dfs -copyToLocal/-cat...

Have you tried using http to fetch the file instead? http:///data/ This will get redirected to one of the datanodes to handle and should be pretty fast. It would be interesting to find out if this alternative path is subject to the same hangs that you are seeing.
Re: Add your project or company to the powered by page?
Here is my contribution to the Hadoop Powered-by page: Visible Measures Corporation (www.visiblemeasures.com) uses Hadoop as a component in our Scalable Data Pipeline, which ultimately powers VisibleSuite and other products. We use Hadoop to aggregate, store, and analyze data related to in-stream viewing behavior of Internet video audiences. Our current grid contains more than 128 CPU cores and in excess of 100 terabytes of storage, and we plan to grow that substantially during 2008. Thanks, C G --- Christopher Gillett Chief Software Architect Visible Measures Corporation 25 Kingston Street, 5th Floor Boston, MA 02111 http://www.visiblemeasures.com Eric Baldeschwieler <[EMAIL PROTECTED]> wrote: Hi Folks, Let's get the word out that Hadoop is being used and is useful in your organizations, ok? Please add yourselves to the Hadoop powered by page, or reply to this email with what details you would like to add and I'll do it. http://wiki.apache.org/hadoop/PoweredBy Thanks! E14 --- eric14 a.k.a. Eric Baldeschwieler senior director, grid computing Yahoo! Inc.
Solving the "hang" problem in dfs -copyToLocal/-cat...
Hi All: The following write-up is offered to help out anybody else who has seen performance problems and "hangs" while using dfs -copyToLocal/-cat.

One of the performance problems that has been causing big problems for us has been using the dfs commands -copyToLocal and -cat to move data from HDFS to a local file system. We do this in order to populate a data warehouse that is HDFS-unaware.

The "pattern" I've been using is:

  rm -f loadfile.dat
  fileList=`bin/hadoop dfs -ls /foo | grep part | awk '{print $1}'`
  for x in `echo ${fileList}`
  do
    bin/hadoop dfs -cat ${x} >> loadfile.dat
  done

This pattern repeats several times, ultimately cat-ing 353 files into several load files. This process is extremely slow, often taking 20-30 minutes to transfer 142M of data. More frustrating is that the system simply "pauses" during cat operations. There is no I/O activity, no CPU activity, nothing written to the log files on any node. Things just stop. I changed the pattern to use -copyToLocal instead of -cat and had the same results. We observe this "pause" behavior without respect for where the -copyToLocal or -cat originates - I've tried running directly on the grid, and also directly on the DB server which is not part of the grid proper. I've tried many different releases of Hadoop, including 0.16.0, and all exhibit this problem.

I decided to try a different approach and use the HTTP interface to the namenode to transfer the data:

  rm -f loadfile.dat
  fileList=`bin/hadoop dfs -ls /foo | grep part | awk '{print $1}'`
  for x in `echo ${fileList}`
  do
    wget -q http://mynamenodeserver:50070/data${x}
  done

There is a trivial step to merge the individual part files into one file preparatory for loading data.

I ran this experiment across 10,850 files containing an aggregate total of 4.6G of data. It ran in under 2 hours, which while not great is significantly better than the 18 hours it previously took -copyToLocal/-cat to run.

I found it surprising that this solution works better than -copyToLocal/-cat. Hope this helps... C G
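The list-filter-append pattern in the write-up above can be exercised locally without a cluster. Here is a minimal, self-contained sketch of the merge step using ordinary local files in place of bin/hadoop dfs -cat (the parts/ directory and file names are made up for illustration):

```shell
# Simulate the merge pattern locally: create a couple of stand-in
# "part" files, then append them into a single load file, just as the
# HDFS loop does with dfs -cat. Paths here are illustrative only.
mkdir -p parts
printf 'alpha\n' > parts/part-00000
printf 'beta\n'  > parts/part-00001

rm -f loadfile.dat
# Same shape as the HDFS pattern: list, filter on "part", append each.
fileList=`ls parts | grep part | awk '{print $1}'`
for x in `echo ${fileList}`
do
    cat parts/${x} >> loadfile.dat
done
```

In the HTTP variant described above, the cat line would be replaced by a wget against the namenode's /data URL, with output appended to the load file.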
RE: Questions regarding configuration parameters...
Guys: Thanks for the information...I've gotten some pretty good results twiddling some parameters. I've also reminded myself about the pitfalls of oversubscribing resources (like the number of reducers). Here's what I learned, written up here to hopefully help somebody later...

I set up one of my apps on a 4-node test grid. Each grid member is a 4-way box. The configuration had the default values (2) for mapred.tasktracker.(map,reduce).tasks.maximum. The values for mapred.map.tasks and mapred.reduce.tasks were 29 and 3 respectively (using the prime number recommendations in the docs). The initial run took 00:23:21...not so good. I changed (map,reduce).tasks.maximum to 4 and the time fell to 19:40. Then I tried 7 and it fell to 14:37. So far so good.

I then looked at my code and realized that I was specifying 32 for the number of reducers (damned hard-coded constants...I bop myself on the head and call myself a moron). The large value was based on running on a much larger grid. So I backed that value down to 3, and my execution time fell to 09:17. Then I changed (map,reduce).tasks.maximum from 7 to 4 and ran again in 06:48. w00t!

Bottom line: carefully setting configuration parameters, and paying attention to map/reduce task values relative to the size of the grid, is VERY important in achieving good performance. Thanks, C G

Joydeep Sen Sarma <[EMAIL PROTECTED]> wrote:

> The default value are 2 so you might only see 2 cores used by Hadoop per node/host.

That's 2 each for map and reduce, so theoretically one could fully utilize a 4-core box with this setting. In practice, a little bit of oversubscription (3 each on a 4-core) seems to be working out well for us (maybe overlapping some compute and I/O, but mostly we are trading off a higher number of concurrent jobs against per-job latency). It's unlikely that these settings are causing slowness in processing small amounts of data. Send more details - what's slow (map/shuffle/reduce)? Check CPU consumption when a map task is running, etc.
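For reference, the settings that worked in the experiment above would look roughly like this in hadoop-site.xml (a sketch only, in the 0.15/0.16-era config format; the values are the ones reported for this particular 4-node grid of 4-way boxes, not universal recommendations):

```xml
<!-- Sketch: values reported above for a 4-node grid of 4-way machines.
     Tune for your own hardware. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>4</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>4</value>
</property>
<property>
  <name>mapred.map.tasks</name>
  <value>29</value>
</property>
<property>
  <name>mapred.reduce.tasks</name>
  <value>3</value>
</property>
```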
RE: Questions regarding configuration parameters...
My performance problems fall into 2 categories:

1. Extremely slow reduce phases - our map phases march along at impressive speed, but during reduce phases most nodes go idle...the active machines mostly clunk along at 10-30% CPU. Compare this to the map phase, where I get all grid nodes cranking away at > 100% CPU. This is a vague explanation, I realize.

2. Pregnant pauses during dfs -copyToLocal and -cat operations. Frequently I'll be iterating over a list of HDFS files, cat-ing them into one file to bulk load into a database. Many times I'll see one of the copies/cats sit for anywhere from 2-5 minutes. During that time no data is transferred, all nodes are idle, and absolutely nothing is written to any of the logs. The file sizes being copied are relatively small...less than 1G each in most cases.

Both of these issues persist in 0.16.0 and definitely have me puzzled. I'm sure that I'm doing something wrong/non-optimal w/r/t the slow reduce phases, but the long pauses during a dfs command line operation seem like a bug to me. Unfortunately I've not seen anybody else report this. Any thoughts/ideas most welcome... Thanks, C G

-----Original Message----- From: Andy Li Sent: Thu 2/21/2008 2:36 PM To: core-user@hadoop.apache.org Subject: Re: Questions regarding configuration parameters...

Try these 2 parameters to utilize all the cores per node/host:

  mapred.tasktracker.map.tasks.maximum (e.g. 7) - the maximum number of map tasks that will be run simultaneously by a task tracker.

  mapred.tasktracker.reduce.tasks.maximum (e.g. 7) - the maximum number of reduce tasks that will be run simultaneously by a task tracker.

The default value is 2, so you might only see 2 cores used by Hadoop per node/host. If each system/machine has 4 cores (dual dual-core), then you can change them to 3. Hope this works for you. -Andy
Questions regarding configuration parameters...
Hi All: The documentation for the configuration parameters mapred.map.tasks and mapred.reduce.tasks discusses these values in terms of the number of available hosts in the grid. This description strikes me as a bit odd given that a host could be anything from a uniprocessor to an N-way box, where values for N could vary from 2..16 or more. The documentation is also vague about computing the actual value. For example, for mapred.map.tasks the doc says "...a prime number several times greater...". I'm curious about how people are interpreting the descriptions and what values people are using. Specifically, I'm wondering if I should be using "core count" instead of "host count" to set these values.

In the specific case of my system, we have 24 hosts where each host is a 4-way system (i.e. 96 cores total). For mapred.map.tasks I chose the value 173, as that is a prime number which is near 7*24. For mapred.reduce.tasks I chose 23 since that is a prime number close to 24. Is this what was intended?

Beyond curiosity, I'm concerned about setting these values and other configuration parameters correctly because I am pursuing some performance issues where it is taking a very long time to process small amounts of data. I am hoping that some amount of tuning will resolve the problems.

Any thoughts and insights most appreciated. Thanks, C G
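The "prime near a multiple of the node count" arithmetic above is easy to script. Here's a throwaway helper, not part of Hadoop, that searches upward for the next prime (it assumes GNU coreutils' factor is installed; note the thread picked 23 by rounding 24 down, while this helper rounds up):

```shell
# Find the smallest prime >= n using factor(1): for a prime p,
# "factor p" prints exactly two fields ("p: p").
nearest_prime() {
    n=$1
    while [ "$(factor "$n" | awk '{print NF}')" -ne 2 ]; do
        n=$((n + 1))
    done
    echo "$n"
}

# 24 hosts: a reduce-task count near 24, and a map-task count near 7*24.
nearest_prime 24     # -> 29
nearest_prime 168    # -> 173
```

The second call reproduces the 173 chosen above for mapred.map.tasks on a 24-host grid.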
Re: Hadoop summit / workshop at Yahoo!
Hey All: Is this going forward? I'd like to make plans to attend and the sooner I can get plane tickets the happier the bean counters will be :-). Thx, C G > Ajay Anand wrote: >> Yahoo plans to host a summit / workshop on Apache Hadoop at our >> Sunnyvale campus on March 25th. Given the interest we are seeing from >> developers in a broad range of organizations, this seems like a good >> time to get together and brief each other on the progress that is >> being made. >> We would like to cover topics in the areas of extensions being >> developed for Hadoop, innovative applications being built and deployed on >> Hadoop, and future extensions to the platform. Some of the speakers who have >> already committed to present are from organizations such as IBM, >> Intel, Carnegie Mellon University, UC Berkeley, Facebook and Yahoo!, and >> we are actively recruiting other leaders in the space. >> If you have an innovative application you would like to talk about, >> please let us know. Although there are limitations on the amount of >> time we have, we would love to hear from you. You can contact me at >> [EMAIL PROTECTED] >> Thanks and looking forward to hearing about your cool apps, >> Ajay
Re: Low complexity way to write a file to hdfs?
Ted: I am curious about how you read files without installing anything. Can you share your wisdom? Thanks, C G

Ted Dunning <[EMAIL PROTECTED]> wrote: I am looking for a way for scripts to write data to HDFS without having to install anything. The /data and /listPaths URLs on the nameserver are ideal for reading files, but I can't find anything comparable to write files. Am I missing something? If not, I think I will file a JIRA and make /data accept POST events. If anybody has an opinion about that, please let me know (or put a comment on the JIRA, if and when).