Rajeshbabu Chintaguntla created HBASE-12791:
-----------------------------------------------

             Summary: HBase does not attempt to clean up an aborted split when the regionserver shutting down
                 Key: HBASE-12791
                 URL: https://issues.apache.org/jira/browse/HBASE-12791
             Project: HBase
          Issue Type: Bug
          Components: regionserver
    Affects Versions: 0.98.0
            Reporter: Rajeshbabu Chintaguntla
            Assignee: Rajeshbabu Chintaguntla
            Priority: Critical
             Fix For: 2.0.0, 0.98.10, 1.0.1


HBase does not clean up the daughter region directories in HDFS if the region server shuts down after creating them during a split.

Here are the logs.

-> The RS shuts down after creating the daughter regions.
{code}
2014-12-31 09:05:41,406 DEBUG [regionserver60020-splits-1419996941385] zookeeper.ZKAssign: regionserver:60020-0x14a9701e53100d1, quorum=localhost:2181, baseZNode=/hbase Transitioned node 80c665138d4fa32da4d792d8ed13206f from RS_ZK_REQUEST_REGION_SPLIT to RS_ZK_REQUEST_REGION_SPLIT
2014-12-31 09:05:41,514 DEBUG [regionserver60020-splits-1419996941385] regionserver.HRegion: Closing t,,1419996880699.80c665138d4fa32da4d792d8ed13206f.: disabling compactions & flushes
2014-12-31 09:05:41,514 DEBUG [regionserver60020-splits-1419996941385] regionserver.HRegion: Updates disabled for region t,,1419996880699.80c665138d4fa32da4d792d8ed13206f.
2014-12-31 09:05:41,516 INFO [StoreCloserThread-t,,1419996880699.80c665138d4fa32da4d792d8ed13206f.-1] regionserver.HStore: Closed f
2014-12-31 09:05:41,518 INFO [regionserver60020-splits-1419996941385] regionserver.HRegion: Closed t,,1419996880699.80c665138d4fa32da4d792d8ed13206f.
2014-12-31 09:05:49,922 DEBUG [regionserver60020-splits-1419996941385] regionserver.MetricsRegionSourceImpl: Creating new MetricsRegionSourceImpl for table t dd9731ee43b104da565257ca1539aa8c
2014-12-31 09:05:49,922 DEBUG [regionserver60020-splits-1419996941385] regionserver.HRegion: Instantiated t,,1419996941401.dd9731ee43b104da565257ca1539aa8c.
2014-12-31 09:05:49,929 DEBUG [regionserver60020-splits-1419996941385] regionserver.MetricsRegionSourceImpl: Creating new MetricsRegionSourceImpl for table t 2e40a44511c0e187d357d651f13a1dab
2014-12-31 09:05:49,929 DEBUG [regionserver60020-splits-1419996941385] regionserver.HRegion: Instantiated t,row2,1419996941401.2e40a44511c0e187d357d651f13a1dab.
Wed Dec 31 09:06:30 IST 2014 Terminating regionserver
2014-12-31 09:06:30,465 INFO [Thread-8] regionserver.ShutdownHook: Shutdown hook starting; hbase.shutdown.hook=true; fsShutdownHook=org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer@42d2282e
{code}

-> Rollback is skipped if the RS is stopped or stopping, so we end up with dirty daughter regions in HDFS.
{code}
2014-12-31 09:07:49,547 INFO [regionserver60020-splits-1419996941385] regionserver.SplitRequest: Skip rollback/cleanup of failed split of t,,1419996880699.80c665138d4fa32da4d792d8ed13206f. because server is stopped
java.io.InterruptedIOException: Interrupted after 0 tries on 350
	at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:156)
{code}

Because of this, hbck always shows inconsistencies.
{code}
ERROR: Region { meta => null, hdfs => hdfs://localhost:9000/hbase/data/default/t/2e40a44511c0e187d357d651f13a1dab, deployed => } on HDFS, but not listed in hbase:meta or deployed on any region server
ERROR: Region { meta => null, hdfs => hdfs://localhost:9000/hbase/data/default/t/dd9731ee43b104da565257ca1539aa8c, deployed => } on HDFS, but not listed in hbase:meta or deployed on any region server
{code}

If we try to repair, we end up with overlapping regions in hbase:meta.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)