[ https://issues.apache.org/jira/browse/HBASE-12791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rajeshbabu Chintaguntla updated HBASE-12791:
--------------------------------------------
    Description: 
HBase does not clean up the daughter region directories in HDFS if the region 
server shuts down after creating them during a split.

Here are the logs.

-> The RS is shut down after the daughter regions are created.
{code}
2014-12-31 09:05:41,406 DEBUG [regionserver60020-splits-1419996941385] 
zookeeper.ZKAssign: regionserver:60020-0x14a9701e53100d1, 
quorum=localhost:2181, baseZNode=/hbase Transitioned node 
80c665138d4fa32da4d792d8ed13206f from RS_ZK_REQUEST_REGION_SPLIT to 
RS_ZK_REQUEST_REGION_SPLIT
2014-12-31 09:05:41,514 DEBUG [regionserver60020-splits-1419996941385] 
regionserver.HRegion: Closing 
t,,1419996880699.80c665138d4fa32da4d792d8ed13206f.: disabling compactions & 
flushes
2014-12-31 09:05:41,514 DEBUG [regionserver60020-splits-1419996941385] 
regionserver.HRegion: Updates disabled for region 
t,,1419996880699.80c665138d4fa32da4d792d8ed13206f.
2014-12-31 09:05:41,516 INFO  
[StoreCloserThread-t,,1419996880699.80c665138d4fa32da4d792d8ed13206f.-1] 
regionserver.HStore: Closed f
2014-12-31 09:05:41,518 INFO  [regionserver60020-splits-1419996941385] 
regionserver.HRegion: Closed t,,1419996880699.80c665138d4fa32da4d792d8ed13206f.
2014-12-31 09:05:49,922 DEBUG [regionserver60020-splits-1419996941385] 
regionserver.MetricsRegionSourceImpl: Creating new MetricsRegionSourceImpl for 
table t dd9731ee43b104da565257ca1539aa8c
2014-12-31 09:05:49,922 DEBUG [regionserver60020-splits-1419996941385] 
regionserver.HRegion: Instantiated 
t,,1419996941401.dd9731ee43b104da565257ca1539aa8c.
2014-12-31 09:05:49,929 DEBUG [regionserver60020-splits-1419996941385] 
regionserver.MetricsRegionSourceImpl: Creating new MetricsRegionSourceImpl for 
table t 2e40a44511c0e187d357d651f13a1dab
2014-12-31 09:05:49,929 DEBUG [regionserver60020-splits-1419996941385] 
regionserver.HRegion: Instantiated 
t,row2,1419996941401.2e40a44511c0e187d357d651f13a1dab.
Wed Dec 31 09:06:30 IST 2014 Terminating regionserver
2014-12-31 09:06:30,465 INFO  [Thread-8] regionserver.ShutdownHook: Shutdown 
hook starting; hbase.shutdown.hook=true; 
fsShutdownHook=org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer@42d2282e
{code}
-> Rollback is skipped if the RS is stopped or stopping, so stale daughter 
region directories are left behind in HDFS.
{code}
2014-12-31 09:07:49,547 INFO  [regionserver60020-splits-1419996941385] 
regionserver.SplitRequest: Skip rollback/cleanup of failed split of 
t,,1419996880699.80c665138d4fa32da4d792d8ed13206f. because server is stopped
java.io.InterruptedIOException: Interrupted after 0 tries  on 350
        at 
org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:156)
{code}
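
The behaviour in the log above can be modelled with a minimal, self-contained 
Java sketch. It is not the HBase source: the class SplitRollbackSketch, the 
method runSplit, and the in-memory set standing in for HDFS are hypothetical 
names chosen for illustration. It only shows why skipping rollback when the 
server is stopped leaves the daughter directories behind.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Illustrative model only; none of these names are HBase APIs.
final class SplitRollbackSketch {

    /**
     * Simulates a split that fails after the daughter directories exist.
     * Returns the directories left on the simulated file system.
     */
    static Set<String> runSplit(boolean serverStopped) {
        Set<String> fs = new LinkedHashSet<>();
        fs.add("parent");
        try {
            // Step 1: the daughter region directories are created.
            fs.add("daughterA");
            fs.add("daughterB");
            // Step 2: a later step fails, e.g. because the server is going down.
            throw new IllegalStateException("split interrupted");
        } catch (IllegalStateException e) {
            if (serverStopped) {
                // Models "Skip rollback/cleanup of failed split ... because
                // server is stopped": nothing is removed.
                return fs;
            }
            // Rollback: remove what the failed split created.
            fs.remove("daughterA");
            fs.remove("daughterB");
            return fs;
        }
    }

    public static void main(String[] args) {
        System.out.println("server running: " + runSplit(false));
        System.out.println("server stopped: " + runSplit(true));
    }
}
```

In this model only the "server stopped" case keeps daughterA and daughterB, 
which corresponds to the orphaned directories that hbck later reports.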

Because of this, hbck always reports inconsistencies. 
{code}
ERROR: Region { meta => null, hdfs => 
hdfs://localhost:9000/hbase/data/default/t/2e40a44511c0e187d357d651f13a1dab, 
deployed =>  } on HDFS, but not listed in hbase:meta or deployed on any region 
server
ERROR: Region { meta => null, hdfs => 
hdfs://localhost:9000/hbase/data/default/t/dd9731ee43b104da565257ca1539aa8c, 
deployed =>  } on HDFS, but not listed in hbase:meta or deployed on any region 
server
{code}
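
To list the affected directories from a saved hbck report, a small Java 
sketch such as the following may help. The class HbckOrphanSketch and the 
regular expression are assumptions derived only from the two ERROR lines 
quoted above; they are not part of hbck, and the message format may differ in 
other versions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the pattern is inferred from the sample output only.
final class HbckOrphanSketch {

    // Whitespace is matched loosely so that line-wrapped reports still parse.
    private static final Pattern ORPHAN = Pattern.compile(
        "hdfs\\s*=>\\s*(\\S+),\\s*deployed\\s*=>\\s*\\}\\s*on\\s+HDFS,"
            + "\\s*but\\s+not\\s+listed");

    /** Returns the HDFS paths of regions reported as present only on HDFS. */
    static List<String> orphanDirs(String report) {
        List<String> dirs = new ArrayList<>();
        Matcher m = ORPHAN.matcher(report);
        while (m.find()) {
            dirs.add(m.group(1));
        }
        return dirs;
    }

    public static void main(String[] args) {
        String sample = "ERROR: Region { meta => null, hdfs => \n"
            + "hdfs://localhost:9000/hbase/data/default/t/"
            + "2e40a44511c0e187d357d651f13a1dab, \n"
            + "deployed =>  } on HDFS, but not listed in hbase:meta or "
            + "deployed on any region \nserver\n";
        orphanDirs(sample).forEach(System.out::println);
    }
}
```

The sketch only extracts paths for inspection; it does not delete or repair 
anything.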

If we try to repair, we end up with overlapping regions in hbase:meta, and 
both the daughter regions and the parent are online.

  was:
HBase does not clean up the daughter region directories in HDFS if the region 
server shuts down after creating them during a split.

Here are the logs.

-> The RS is shut down after the daughter regions are created.
{code}
2014-12-31 09:05:41,406 DEBUG [regionserver60020-splits-1419996941385] 
zookeeper.ZKAssign: regionserver:60020-0x14a9701e53100d1, 
quorum=localhost:2181, baseZNode=/hbase Transitioned node 
80c665138d4fa32da4d792d8ed13206f from RS_ZK_REQUEST_REGION_SPLIT to 
RS_ZK_REQUEST_REGION_SPLIT
2014-12-31 09:05:41,514 DEBUG [regionserver60020-splits-1419996941385] 
regionserver.HRegion: Closing 
t,,1419996880699.80c665138d4fa32da4d792d8ed13206f.: disabling compactions & 
flushes
2014-12-31 09:05:41,514 DEBUG [regionserver60020-splits-1419996941385] 
regionserver.HRegion: Updates disabled for region 
t,,1419996880699.80c665138d4fa32da4d792d8ed13206f.
2014-12-31 09:05:41,516 INFO  
[StoreCloserThread-t,,1419996880699.80c665138d4fa32da4d792d8ed13206f.-1] 
regionserver.HStore: Closed f
2014-12-31 09:05:41,518 INFO  [regionserver60020-splits-1419996941385] 
regionserver.HRegion: Closed t,,1419996880699.80c665138d4fa32da4d792d8ed13206f.
2014-12-31 09:05:49,922 DEBUG [regionserver60020-splits-1419996941385] 
regionserver.MetricsRegionSourceImpl: Creating new MetricsRegionSourceImpl for 
table t dd9731ee43b104da565257ca1539aa8c
2014-12-31 09:05:49,922 DEBUG [regionserver60020-splits-1419996941385] 
regionserver.HRegion: Instantiated 
t,,1419996941401.dd9731ee43b104da565257ca1539aa8c.
2014-12-31 09:05:49,929 DEBUG [regionserver60020-splits-1419996941385] 
regionserver.MetricsRegionSourceImpl: Creating new MetricsRegionSourceImpl for 
table t 2e40a44511c0e187d357d651f13a1dab
2014-12-31 09:05:49,929 DEBUG [regionserver60020-splits-1419996941385] 
regionserver.HRegion: Instantiated 
t,row2,1419996941401.2e40a44511c0e187d357d651f13a1dab.
Wed Dec 31 09:06:30 IST 2014 Terminating regionserver
2014-12-31 09:06:30,465 INFO  [Thread-8] regionserver.ShutdownHook: Shutdown 
hook starting; hbase.shutdown.hook=true; 
fsShutdownHook=org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer@42d2282e
{code}
-> Rollback is skipped if the RS is stopped or stopping, so stale daughter 
region directories are left behind in HDFS.
{code}
2014-12-31 09:07:49,547 INFO  [regionserver60020-splits-1419996941385] 
regionserver.SplitRequest: Skip rollback/cleanup of failed split of 
t,,1419996880699.80c665138d4fa32da4d792d8ed13206f. because server is stopped
java.io.InterruptedIOException: Interrupted after 0 tries  on 350
        at 
org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:156)
{code}

Because of this, hbck always reports inconsistencies. 
{code}
ERROR: Region { meta => null, hdfs => 
hdfs://localhost:9000/hbase/data/default/t/2e40a44511c0e187d357d651f13a1dab, 
deployed =>  } on HDFS, but not listed in hbase:meta or deployed on any region 
server
ERROR: Region { meta => null, hdfs => 
hdfs://localhost:9000/hbase/data/default/t/dd9731ee43b104da565257ca1539aa8c, 
deployed =>  } on HDFS, but not listed in hbase:meta or deployed on any region 
server
{code}

If we try to repair, we end up with overlapping regions in hbase:meta. 


> HBase does not attempt to clean up an aborted split when the regionserver 
> shutting down
> ---------------------------------------------------------------------------------------
>
>                 Key: HBASE-12791
>                 URL: https://issues.apache.org/jira/browse/HBASE-12791
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.98.0
>            Reporter: Rajeshbabu Chintaguntla
>            Assignee: Rajeshbabu Chintaguntla
>            Priority: Critical
>             Fix For: 2.0.0, 0.98.10, 1.0.1


