I have an idea on how to make splits suck less... What we do is this on split:
- do the split in the RS, reopen the regions w/o reassigning them
- report the split to META
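Roughly what I'm picturing on the RS side -- just a sketch, and none of these types are real HBase classes (Region, MetaTable and the method names are placeholders for whatever we'd actually end up with):

{code}
// Sketch only: placeholder types standing in for "region" and "META".
// The ordering of the steps is the whole point of this snippet.
interface Region {
  byte[] midKey();                    // row to split at
  Region[] splitAt(byte[] splitRow);  // write out the two daughter regions
  void close();                       // stop serving
  void open();                        // start serving again
}

interface MetaTable {
  void offlineParentAndAddDaughters(Region parent, Region a, Region b);
}

class SplitSketch {
  // Do the whole split inside the region server: close the parent, open both
  // daughters locally right away, and only then report the split to META.
  // No reassignment round trip through the master.
  static void splitWithoutReassignment(Region parent, MetaTable meta) {
    byte[] splitRow = parent.midKey();
    parent.close();
    Region[] daughters = parent.splitAt(splitRow);
    daughters[0].open();
    daughters[1].open();
    meta.offlineParentAndAddDaughters(parent, daughters[0], daughters[1]);
  }
}
{code}

The ordering is the point: the daughters are already open and serving on the same RS before META ever hears about the split, so a slow or failed META edit doesn't leave the key range unserved.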
Also fix the way the regionserver works, and don't depend on the region name to do a put/get. If the put is for a region this RS controls, then just do it, even if the client has the 'wrong' RS. While you are at it, send an OOB or some other signal that the client needs to reload those regions (or not even, there is no 'failure' here). This would require changing a bunch of things, but we shouldn't need to go down just because of a split. (Rough sketch of what I mean at the very bottom of this mail, below the quotes.)

On Sun, Dec 13, 2009 at 12:16 AM, Lars George <[email protected]> wrote:
> Just as a note, I think I had the same issue. This is on my 7+1 cluster
> during a MR import job:
>
> 2009-12-08 01:15:45,772 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on ma-docs,cb48e6aa06cd2937e095bfefbec7c357,1260256286643
> 2009-12-08 01:15:45,772 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for region ma-docs,cb48e6aa06cd2937e095bfefbec7c357,1260256286643. Current region memstore size 64.2m
> 2009-12-08 01:15:57,409 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.net.SocketTimeoutException: 10000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/192.168.99.38:51729 remote=/192.168.99.38:50010]
>         at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>         at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:146)
>         at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:107)
>         at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
>         at java.io.DataOutputStream.write(DataOutputStream.java:90)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2290)
>
> 2009-12-08 01:15:57,409 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 bad datanode[0] 192.168.99.38:50010
> 2009-12-08 01:15:57,410 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 in pipeline 192.168.99.38:50010, 192.168.99.37:50010: bad datanode 192.168.99.38:50010
> 2009-12-08 01:15:58,567 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 failed because recovery from primary datanode 192.168.99.37:50010 failed 1 times. Pipeline was 192.168.99.38:50010, 192.168.99.37:50010. Will retry...
> 2009-12-08 01:15:58,569 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 bad datanode[0] 192.168.99.38:50010
> 2009-12-08 01:15:58,569 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 in pipeline 192.168.99.38:50010, 192.168.99.37:50010: bad datanode 192.168.99.38:50010
> 2009-12-08 01:15:58,583 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 failed because recovery from primary datanode 192.168.99.37:50010 failed 2 times. Pipeline was 192.168.99.38:50010, 192.168.99.37:50010. Will retry...
> 2009-12-08 01:15:58,585 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 bad datanode[0] 192.168.99.38:50010
> 2009-12-08 01:15:58,585 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 in pipeline 192.168.99.38:50010, 192.168.99.37:50010: bad datanode 192.168.99.38:50010
> 2009-12-08 01:15:58,591 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 failed because recovery from primary datanode 192.168.99.37:50010 failed 3 times. Pipeline was 192.168.99.38:50010, 192.168.99.37:50010. Will retry...
> 2009-12-08 01:15:58,593 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 bad datanode[0] 192.168.99.38:50010
> 2009-12-08 01:15:58,593 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 in pipeline 192.168.99.38:50010, 192.168.99.37:50010: bad datanode 192.168.99.38:50010
> 2009-12-08 01:15:58,598 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 failed because recovery from primary datanode 192.168.99.37:50010 failed 4 times. Pipeline was 192.168.99.38:50010, 192.168.99.37:50010. Will retry...
> 2009-12-08 01:15:58,600 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 bad datanode[0] 192.168.99.38:50010
> 2009-12-08 01:15:58,600 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 in pipeline 192.168.99.38:50010, 192.168.99.37:50010: bad datanode 192.168.99.38:50010
> 2009-12-08 01:15:58,608 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 failed because recovery from primary datanode 192.168.99.37:50010 failed 5 times. Pipeline was 192.168.99.38:50010, 192.168.99.37:50010. Will retry...
> 2009-12-08 01:15:58,610 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 bad datanode[0] 192.168.99.38:50010
> 2009-12-08 01:15:58,610 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 in pipeline 192.168.99.38:50010, 192.168.99.37:50010: bad datanode 192.168.99.38:50010
> 2009-12-08 01:15:58,615 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 failed because recovery from primary datanode 192.168.99.37:50010 failed 6 times. Pipeline was 192.168.99.38:50010, 192.168.99.37:50010. Marking primary datanode as bad.
> 2009-12-08 01:15:58,625 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 failed because recovery from primary datanode 192.168.99.38:50010 failed 1 times. Pipeline was 192.168.99.38:50010. Will retry...
> 2009-12-08 01:15:58,637 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 failed because recovery from primary datanode 192.168.99.38:50010 failed 2 times. Pipeline was 192.168.99.38:50010. Will retry...
> 2009-12-08 01:15:58,654 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 failed because recovery from primary datanode 192.168.99.38:50010 failed 3 times. Pipeline was 192.168.99.38:50010. Will retry...
> 2009-12-08 01:15:58,668 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 failed because recovery from primary datanode 192.168.99.38:50010 failed 4 times. Pipeline was 192.168.99.38:50010. Will retry...
> 2009-12-08 01:15:58,678 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 failed because recovery from primary datanode 192.168.99.38:50010 failed 5 times. Pipeline was 192.168.99.38:50010. Will retry...
> 2009-12-08 01:15:58,685 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 failed because recovery from primary datanode 192.168.99.38:50010 failed 6 times. Pipeline was 192.168.99.38:50010. Aborting...
> 2009-12-08 01:15:58,685 FATAL org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Replay of hlog required. Forcing server shutdown
> org.apache.hadoop.hbase.DroppedSnapshotException: region: ma-docs,cb48e6aa06cd2937e095bfefbec7c357,1260256286643
>         at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:946)
>         at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:839)
>         at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:241)
>         at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:149)
> Caused by: java.io.IOException: Error Recovery for block blk_2400329754585253075_931440 failed because recovery from primary datanode 192.168.99.38:50010 failed 6 times. Pipeline was 192.168.99.38:50010. Aborting...
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2584)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2078)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2241)
> 2009-12-08 01:15:58,688 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: request=11.9, regions=37, stores=296, storefiles=697, storefileIndexSize=66, memstoreSize=1184, usedHeap=3319, maxHeap=4087, blockCacheSize=7033392, blockCacheFree=850216848, blockCacheCount=0, blockCacheHitRatio=0
> 2009-12-08 01:15:58,688 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: regionserver/192.168.99.38:60020.cacheFlusher exiting
> 2009-12-08 01:15:58,779 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 5 on 60020, call put([...@7237791f, [Lorg.apache.hadoop.hbase.client.Put;@17f11cce) from 192.168.99.34:34211: error: java.io.IOException: Server not running, aborting
> java.io.IOException: Server not running, aborting
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2351)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.put(HRegionServer.java:1828)
>         at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-12-08 01:15:58,796 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 13 on 60020, call put([...@569a24a9, [Lorg.apache.hadoop.hbase.client.Put;@21dcffaa) from 192.168.99.36:42492: error: java.io.IOException: Server not running, aborting
> java.io.IOException: Server not running, aborting
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2351)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.put(HRegionServer.java:1828)
>         at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-12-08 01:16:00,151 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 60020
> 2009-12-08 01:16:00,151 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 2 on 60020: exiting
> 2009-12-08 01:16:00,152 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC Server Responder
> 2009-12-08 01:16:00,152 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 8 on 60020: exiting
> ...
>
> On Sun, Dec 13, 2009 at 1:03 AM, stack <[email protected]> wrote:
>
>> I wrote hdfs-dev to see how to proceed. We could try running a vote to get
>> it committed to 0.21.
>> St.Ack
>>
>>
>> On Sat, Dec 12, 2009 at 1:37 PM, Andrew Purtell <[email protected]> wrote:
>>
>> > I do. I think I saw it just last week with a failure case as follows on a
>> > small testbed (aren't they all? :-/ ) that some of our devs are working with:
>> >
>> > - Local RS and datanode are talking
>> >
>> > - Something happens to the datanode
>> > org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream
>> > java.net.SocketTimeoutException: 69000 millis timeout while waiting for
>> > channel to be ready for read. ch : java.nio.channels.SocketChannel
>> > org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception:
>> > java.io.IOException: Unable to create new block.
>> >
>> > - RS won't try talking to other datanodes elsewhere on the cluster
>> > org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_7040605219500907455_6449696
>> > org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-5367929502764356875_6449620
>> > org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_7075535856966512941_6449680
>> > org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_77095304474221514_6449685
>> >
>> > - RS goes down
>> > org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Replay of hlog required.
>> > Forcing server shutdown
>> > org.apache.hadoop.hbase.DroppedSnapshotException ...
>> >
>> > Not a blocker in that the downed RS with working sync in 0.21 won't lose
>> > data and can be restarted. But, a critical issue because it will be
>> > frequently encountered and will cause processes on the cluster to shut down.
>> > Without some kind of "god" monitor or human intervention eventually there
>> > will be insufficient resources to carry all regions.
>> >
>> > - Andy
>> >
>> >
>> > ________________________________
>> > From: Stack <[email protected]>
>> > To: "[email protected]" <[email protected]>
>> > Sent: Sat, December 12, 2009 1:01:49 PM
>> > Subject: Re: [jira] Resolved: (HBASE-1972) Failed split results in closed
>> > region and non-registration of daughters; fix the order in which things are run
>> >
>> > So we think this critical to hbase?
>> > Stack
>> >
>> >
>> > On Dec 12, 2009, at 12:43 PM, Andrew Purtell <[email protected]> wrote:
>> >
>> > > All HBase committers should jump on that issue and +1. We should make
>> > > that kind of statement for the record.
>> > >
>> > >
>> > >
>> > > ________________________________
>> > > From: stack (JIRA) <[email protected]>
>> > > To: [email protected]
>> > > Sent: Sat, December 12, 2009 12:39:18 PM
>> > > Subject: [jira] Resolved: (HBASE-1972) Failed split results in closed region and non-registration of daughters; fix the order in which things are run
>> > >
>> > >
>> > > [ https://issues.apache.org/jira/browse/HBASE-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>> > >
>> > > stack resolved HBASE-1972.
>> > > --------------------------
>> > >
>> > >     Resolution: Won't Fix
>> > >
>> > > Marking as invalid, addressed by hdfs-630. Thanks for looking at this, Cosmin. Want to open an issue on getting 630 into 0.21? There will be pushback I'd imagine since not "critical" but might make 0.21.1
>> > >
>> > >> Failed split results in closed region and non-registration of daughters; fix the order in which things are run
>> > >> --------------------------------------------------------------------------------------------------------------
>> > >>
>> > >>                 Key: HBASE-1972
>> > >>                 URL: https://issues.apache.org/jira/browse/HBASE-1972
>> > >>             Project: Hadoop HBase
>> > >>          Issue Type: Bug
>> > >>            Reporter: stack
>> > >>            Priority: Blocker
>> > >>             Fix For: 0.21.0
>> > >>
>> > >>
>> > >> As part of a split, we go to close the region. The close fails because flush failed -- a DN was down and HDFS refuses to move past it -- so we jump up out of the close with an IOE. But the region has been closed, yet it's still in the .META. as online.
>> > >> Here is where the hole is:
>> > >> 1. CompactSplitThread calls split.
>> > >> 2. This calls HRegion splitRegion.
>> > >> 3. splitRegion calls close(false).
>> > >> 4. Down the end of the close, we get as far as the LOG.info("Closed " + this)..... but a DFSClient running thread throws an exception because it can't allocate block for the flush made as part of the close (ain't sure how... we should add more try/catch in here):
>> > >> {code}
>> > >> 2009-11-12 00:47:17,865 [regionserver/208.76.44.142:60020.compactor] DEBUG org.apache.hadoop.hbase.regionserver.Store: Added hdfs://aa0-000-12.u.powerset.com:9002/hbase/TestTable/868626151/info/5071349140567656566, entries=46975, sequenceid=2350017, memsize=52.0m, filesize=46.5m to TestTable,,1257986664542
>> > >> 2009-11-12 00:47:17,866 [regionserver/208.76.44.142:60020.compactor] DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~52.0m for region TestTable,,1257986664542 in 7985ms, sequence id=2350017, compaction requested=false
>> > >> 2009-11-12 00:47:17,866 [regionserver/208.76.44.142:60020.compactor] DEBUG org.apache.hadoop.hbase.regionserver.Store: closed info
>> > >> 2009-11-12 00:47:17,866 [regionserver/208.76.44.142:60020.compactor] INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed TestTable,,1257986664542
>> > >> 2009-11-12 00:47:17,906 [Thread-315] INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
>> > >> 2009-11-12 00:47:17,906 [Thread-315] INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_1351692500502810095_1391
>> > >> 2009-11-12 00:47:23,918 [Thread-315] INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
>> > >> 2009-11-12 00:47:23,918 [Thread-315] INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-3310646336307339512_1391
>> > >> 2009-11-12 00:47:29,982 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
>> > >> 2009-11-12 00:47:29,982 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_3070440586900692765_1393
>> > >> 2009-11-12 00:47:35,997 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
>> > >> 2009-11-12 00:47:35,997 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-5656011219762164043_1393
>> > >> 2009-11-12 00:47:42,007 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
>> > >> 2009-11-12 00:47:42,007 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-2359634393837722978_1393
>> > >> 2009-11-12 00:47:48,017 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
>> > >> 2009-11-12 00:47:48,017 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-1626727145091780831_1393
>> > >> 2009-11-12 00:47:54,022 [Thread-318] WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
>> > >>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSClient.java:3100)
>> > >>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2681)
>> > >> 2009-11-12 00:47:54,022 [Thread-318] WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file "/hbase/TestTable/868626151/splits/1211221550/info/5071349140567656566.868626151" - Aborting...
>> > >> 2009-11-12 00:47:54,029 [regionserver/208.76.44.142:60020.compactor] ERROR org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction/Split failed for region TestTable,,1257986664542
>> > >> java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
>> > >>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.createBlockOutputStream(DFSClient.java:3160)
>> > >>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSClient.java:3080)
>> > >>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2681)
>> > >> {code}
>> > >> Marking this as blocker.
>> > >
>> > > --
>> > > This message is automatically generated by JIRA.
>> > > -
>> > > You can reply to this email to add a comment to the issue online.
>> > >
>> > >
>> >
>> >
>>
>
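--
Rough sketch of the put routing I mentioned at the top (again placeholder code, not the real HRegionServer internals -- the bookkeeping map and the PutResult values are made up for illustration):

{code}
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Sketch: route the put by row key against the regions this server actually
// has open, instead of rejecting it because the client-supplied region name
// went stale after a split.
class PutRoutingSketch {
  enum PutResult { APPLIED, APPLIED_STALE_REGION, NOT_SERVING }

  // region start key -> region name, for the regions currently open on this RS
  private final NavigableMap<byte[], String> onlineByStartKey =
      new TreeMap<>(PutRoutingSketch::compareRows);

  PutResult put(String clientRegionName, byte[] row) {
    // Region whose start key is the greatest one <= row. (A real version
    // would also check that row sorts below that region's end key.)
    Map.Entry<byte[], String> e = onlineByStartKey.floorEntry(row);
    if (e == null) {
      return PutResult.NOT_SERVING;   // we genuinely don't carry this row
    }
    String actualRegion = e.getValue();
    // ... apply the edit to actualRegion's memstore/WAL here ...
    if (!actualRegion.equals(clientRegionName)) {
      // Client addressed the old parent (or some other stale name). Apply the
      // edit anyway and hand back a hint to refresh its region cache -- the
      // OOB-style signal, not a failure.
      return PutResult.APPLIED_STALE_REGION;
    }
    return PutResult.APPLIED;
  }

  private static int compareRows(byte[] a, byte[] b) {
    // unsigned lexicographic compare, the usual row key ordering
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int d = (a[i] & 0xff) - (b[i] & 0xff);
      if (d != 0) return d;
    }
    return a.length - b.length;
  }
}
{code}

The write is keyed off the row and off what this server is actually carrying; a stale region name from the client just produces a cache-refresh hint instead of a NotServingRegionException.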
