Enis Soztutar created HBASE-11036:
-------------------------------------

             Summary: Online schema change with region merge may cause data 
loss 
                 Key: HBASE-11036
                 URL: https://issues.apache.org/jira/browse/HBASE-11036
             Project: HBase
          Issue Type: Bug
            Reporter: Enis Soztutar
             Fix For: 0.99.0, 0.98.2


We have found out that online schema change and region merges may still cause 
issues about merged regions coming back online, and thus causing data loss. 

Recently ITBLL failed reporting 800K missing rows out of 720M. We've been 
running this test for some extended period of time, and this is the first we 
are seeing it, meaning that it is more rare. But it is still concerning. 

>From master's log:
The merge has happened:
{code}
2014-04-16 18:26:37,247 INFO  [AM.ZK.Worker-pool2-t73] 
master.AssignmentManager: Handled MERGED event; 
merged=IntegrationTestBigLinkedList,\xB2\xFE\x03s,1397672795119.80159738a0167e20a2e29fb2c46702f2.,
       
        
region_a=IntegrationTestBigLinkedList,\xB2\xFE\x03s,1397670978959.0ac116e4d7da87922d3a8f218ca21079.,
 
        
region_b=IntegrationTestBigLinkedList,\xB8\x03\x94\x15,1397672587456.1265d06478082ced65dd9a0c1c2b63c2.,
 on hor13n03.gq1.ygridcore.net,60020,1397672668647
{code}

The region server shows the merge is complete: 
{code}
2014-04-16 18:26:37,352 INFO  [regionserver60020-merges-1397672794972] 
regionserver.RegionMergeRequest: Regions merged, hbase:meta updated, and report 
to master. 
region_a=IntegrationTestBigLinkedList,\xB2\xFE\x03s,1397670978959.0ac116e4d7da87922d3a8f218ca21079.,
 
region_b=IntegrationTestBigLinkedList,\xB8\x03\x94\x15,1397672587456.1265d06478082ced65dd9a0c1c2b63c2..
 Region merge took 2sec
{code}

The new region was online on the region server for some time: 
{code}
2014-04-16 18:31:22,858 DEBUG [RS_OPEN_REGION-hor13n03:60020-1] 
handler.OpenRegionHandler: Opened 
IntegrationTestBigLinkedList,\xB2\xFE\x03s,1397672795119.80159738a0167e20a2e29fb2c46702f2.
 on hor13n03.gq1.ygridcore.net,60020,1397672668647
{code}
Then the region server was killed around 2014-04-16 18:31:26,254. 

The master started log splitting etc for the dead RS:
{code}
2014-04-16 18:31:28,942 INFO  [MASTER_SERVER_OPERATIONS-hor13n02:60000-3] 
handler.ServerShutdownHandler: Splitting logs for 
hor13n03.gq1.ygridcore.net,60020,1397672668647 before assignment.
..
2014-04-16 18:31:40,887 INFO  [MASTER_SERVER_OPERATIONS-hor13n02:60000-3] 
master.RegionStates: Transitioned {80159738a0167e20a2e29fb2c46702f2 state=OPEN, 
ts=1397673082874, server=hor13n03.gq1.ygridcore.net,60020,1397672668647} to 
{80159738a0167e20a2e29fb
2014-04-16 18:31:40,887 INFO  [MASTER_SERVER_OPERATIONS-hor13n02:60000-3] 
master.RegionStates: Offlined 80159738a0167e20a2e29fb2c46702f2 from 
hor13n03.gq1.ygridcore.net,60020,1397672668647
{code}

But this region was not assigned again at all. Instead, the master restarted 
shortly after, and reassigned the regions that were merged already: 
{code}
2014-04-16 18:34:02,569 INFO  [master:hor13n02:60000] 
master.ActiveMasterManager: Registered Active 
Master=hor13n02.gq1.ygridcore.net,60000,1397673241215
...
2014-04-16 18:34:10,412 INFO  [master:hor13n02:60000] master.AssignmentManager: 
Found regions out on cluster or in RIT; presuming failover
2014-04-16 18:34:10,412 WARN  [master:hor13n02:60000] master.ServerManager: 
Expiration of hor13n03.gq1.ygridcore.net,60020,1397671753021 but server not 
online
..
2014-04-16 18:34:10,880 INFO  [MASTER_SERVER_OPERATIONS-hor13n02:60000-3] 
master.AssignmentManager: Bulk assigning 28 region(s) across 4 server(s), 
round-robin=true
..
2014-04-16 18:34:10,882 DEBUG 
[hor13n02.gq1.ygridcore.net,60000,1397673241215-GeneralBulkAssigner-1] 
zookeeper.ZKAssign: master:60000-0x2456a4863640255, 
quorum=hor13n04.gq1.ygridcore.net:2181,hor13n03.gq1.ygridcore.net:2181,hor13n20.gq1.ygridcore.net:2181,
 baseZNode=/hbase Async create of unassigned node 0ac116e4d7da8792..
..
2014-04-16 18:34:13,077 INFO  [AM.ZK.Worker-pool2-t7] master.RegionStates: 
Onlined 0ac116e4d7da87922d3a8f218ca21079 on 
hor13n04.gq1.ygridcore.net,60020,1397672685370
{code}


Before the master went down, there were other logs that indicate something 
funky: 
{code}
2014-04-16 18:32:42,113 DEBUG 
[hor13n02.gq1.ygridcore.net,60000,1397672191204-org.apache.hadoop.hbase.master.BulkReOpen-0]
 master.AssignmentManager: Starting unassign of 
IntegrationTestBigLinkedList,\xB2\xFE\x03s,1397670978959.0ac116e4d7da87922d3a8f218ca
2014-04-16 18:32:42,113 INFO  
[hor13n02.gq1.ygridcore.net,60000,1397672191204-org.apache.hadoop.hbase.master.BulkReOpen-0]
 master.AssignmentManager: Attempting to unassign 
{0ac116e4d7da87922d3a8f218ca21079 state=MERGED, ts=1397672797223, 
server=hor13n03.gq1.ygridcore.net,60020,1397672668647}, ignored
{code}

This above log is due to bulk reopening because of online schema change: 
{code}
2014-04-16 18:32:32,081 DEBUG [FifoRpcScheduler.handler1-thread-27] 
handler.TableEventHandler: Ignoring table not disabled exception for supporting 
online schema changes.
2014-04-16 18:32:32,082 INFO  [FifoRpcScheduler.handler1-thread-27] 
handler.TableEventHandler: Handling table operation C_M_MODIFY_TABLE on table 
IntegrationTestBigLinkedList
2014-04-16 18:32:32,150 DEBUG [FifoRpcScheduler.handler1-thread-27] 
util.FSTableDescriptors: Wrote descriptor into: 
hdfs://hor13n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestBigLinkedList/.tabledesc/.tableinfo.0000000051
2014-04-16 18:32:32,159 DEBUG [FifoRpcScheduler.handler1-thread-27] 
util.FSTableDescriptors: Deleted table descriptor at 
hdfs://hor13n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestBigLinkedList/.tabledesc/.tableinfo.0000000050
2014-04-16 18:32:32,159 INFO  [FifoRpcScheduler.handler1-thread-27] 
util.FSTableDescriptors: Updated 
tableinfo=hdfs://hor13n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestBigLinkedList/.tabledesc/.tableinfo.0000000051
2014-04-16 18:32:32,160 INFO  [FifoRpcScheduler.handler1-thread-27] 
handler.TableEventHandler: Bucketing regions by region server...
2014-04-16 18:32:32,191 INFO  [FifoRpcScheduler.handler1-thread-27] 
handler.TableEventHandler: Skip {ENCODED => e81668b7b64a9c92dd3689db96a3adb0, 
NAME => 
'IntegrationTestBigLinkedList,\xC0\x03O,1397669732121.e81668b7b64a9c92dd3689db96a3adb0.',
 STARTKEY =
2014-04-16 18:32:32,191 INFO  [FifoRpcScheduler.handler1-thread-27] 
handler.TableEventHandler: Skip {ENCODED => a51376af5b2aaef5bde980d48b05d75d, 
NAME => 
'IntegrationTestBigLinkedList,\xE8\x01\x99,,1397671676090.a51376af5b2aaef5bde980d48b05d75d.',
 STARTK
2014-04-16 18:32:32,191 INFO  [FifoRpcScheduler.handler1-thread-27] 
handler.TableEventHandler: Reopening 51 regions on 5 region servers.
{code}

My theory is that this may have left the regions in RIT state, and when the 
master was restarted it saw those in RIT and assigned. 

Also notice that this is not an issue about a concurrent merge and online 
schema change that can be prevented with table locks. The merge happened 
minutes ago, however the online schema change still wanted to reassign the 
merged regions. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to