So we think this is critical to HBase?
Stack
On Dec 12, 2009, at 12:43 PM, Andrew Purtell <[email protected]> wrote:
All HBase committers should jump on that issue and +1. We should
make that kind of statement for the record.
________________________________
From: stack (JIRA) <[email protected]>
To: [email protected]
Sent: Sat, December 12, 2009 12:39:18 PM
Subject: [jira] Resolved: (HBASE-1972) Failed split results in closed region and non-registration of daughters; fix the order in which things are run
[ https://issues.apache.org/jira/browse/HBASE-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack resolved HBASE-1972.
--------------------------
Resolution: Won't Fix
Marking as invalid; addressed by HDFS-630. Thanks for looking at this,
Cosmin. Want to open an issue on getting 630 into 0.21? There will be
pushback, I'd imagine, since it's not "critical", but it might make
0.21.1.
Failed split results in closed region and non-registration of daughters; fix the order in which things are run
--------------------------------------------------------------------
Key: HBASE-1972
URL: https://issues.apache.org/jira/browse/HBASE-1972
Project: Hadoop HBase
Issue Type: Bug
Reporter: stack
Priority: Blocker
Fix For: 0.21.0
As part of a split, we go to close the region. The close fails because
the flush failed -- a DN was down and HDFS refuses to move past it --
so we jump up out of the close with an IOE. But the region has been
closed, yet it's still in .META. as online.
Here is where the hole is:
1. CompactSplitThread calls split.
2. This calls HRegion.splitRegion.
3. splitRegion calls close(false).
4. Near the end of the close, we get as far as the LOG.info("Closed
" + this)... but a DFSClient thread throws an exception because it
can't allocate a block for the flush made as part of the close (ain't
sure how... we should add more try/catch in here):
{code}
2009-11-12 00:47:17,865 [regionserver/208.76.44.142:60020.compactor] DEBUG org.apache.hadoop.hbase.regionserver.Store: Added hdfs://aa0-000-12.u.powerset.com:9002/hbase/TestTable/868626151/info/5071349140567656566, entries=46975, sequenceid=2350017, memsize=52.0m, filesize=46.5m to TestTable,,1257986664542
2009-11-12 00:47:17,866 [regionserver/208.76.44.142:60020.compactor] DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~52.0m for region TestTable,,1257986664542 in 7985ms, sequence id=2350017, compaction requested=false
2009-11-12 00:47:17,866 [regionserver/208.76.44.142:60020.compactor] DEBUG org.apache.hadoop.hbase.regionserver.Store: closed info
2009-11-12 00:47:17,866 [regionserver/208.76.44.142:60020.compactor] INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed TestTable,,1257986664542
2009-11-12 00:47:17,906 [Thread-315] INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
2009-11-12 00:47:17,906 [Thread-315] INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_1351692500502810095_1391
2009-11-12 00:47:23,918 [Thread-315] INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
2009-11-12 00:47:23,918 [Thread-315] INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-3310646336307339512_1391
2009-11-12 00:47:29,982 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
2009-11-12 00:47:29,982 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_3070440586900692765_1393
2009-11-12 00:47:35,997 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
2009-11-12 00:47:35,997 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-5656011219762164043_1393
2009-11-12 00:47:42,007 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
2009-11-12 00:47:42,007 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-2359634393837722978_1393
2009-11-12 00:47:48,017 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
2009-11-12 00:47:48,017 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-1626727145091780831_1393
2009-11-12 00:47:54,022 [Thread-318] WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSClient.java:3100)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2681)
2009-11-12 00:47:54,022 [Thread-318] WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file "/hbase/TestTable/868626151/splits/1211221550/info/5071349140567656566.868626151" - Aborting...
2009-11-12 00:47:54,029 [regionserver/208.76.44.142:60020.compactor] ERROR org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction/Split failed for region TestTable,,1257986664542
java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.createBlockOutputStream(DFSClient.java:3160)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSClient.java:3080)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2681)
{code}
Marking this as blocker.
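The hole described above is that close() can throw after the region has
already stopped serving, leaving .META. saying "online" for a region that
is really closed. A minimal sketch of the try/catch suggested above -- all
class and field names here are illustrative, not the actual HBase API --
would catch the IOE from the failed flush and restore the region's state
so the split either completes or leaves things as they were:

{code}
import java.io.IOException;

public class SplitSketch {
    // Stand-in for a region; failFlush simulates the DFSClient
    // block-allocation failure seen in the log above.
    static class Region {
        boolean online = true;
        final boolean failFlush;
        Region(boolean failFlush) { this.failFlush = failFlush; }

        void close() throws IOException {
            online = false;  // region stops serving before the flush completes
            if (failFlush) {
                throw new IOException("Unable to create new block");
            }
        }
    }

    // Attempt the split; if the close fails, reopen the region so its
    // real state stays consistent with what .META. says about it.
    static boolean splitRegion(Region r) {
        try {
            r.close();       // steps 3-4 in the sequence above
            return true;     // daughters would be registered only here
        } catch (IOException e) {
            r.online = true; // roll back: region serves again, no half-split
            return false;
        }
    }

    public static void main(String[] args) {
        Region bad = new Region(true);
        boolean split = splitRegion(bad);
        // A failed split must leave the region online, not half-closed.
        System.out.println(split + " " + bad.online);  // prints "false true"
    }
}
{code}

The point of the ordering fix is the same either way: nothing should be
torn down (or registered in .META.) until every step that can throw has
already succeeded.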
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.