Devaraj Das created HBASE-12319:
-----------------------------------

             Summary: Inconsistencies during region recovery due to close/open of a region during recovery
                 Key: HBASE-12319
                 URL: https://issues.apache.org/jira/browse/HBASE-12319
             Project: HBase
          Issue Type: Bug
            Reporter: Devaraj Das


In one of my test runs, I saw the following:
{noformat}
2014-10-14 13:45:30,782 DEBUG [StoreOpener-51af4bd23dc32a940ad2dd5435f00e1d-1] regionserver.HStore: loaded hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d/test_cf/d6df5cfe15ca41d68c619489fbde4d04, isReference=false, isBulkLoadResult=false, seqid=141197, majorCompaction=true
2014-10-14 13:45:30,788 DEBUG [RS_OPEN_REGION-hor9n01:60020-1] regionserver.HRegion: Found 3 recovered edits file(s) under hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d
.............
.............
2014-10-14 13:45:31,916 WARN  [RS_OPEN_REGION-hor9n01:60020-1] regionserver.HRegion: Null or non-existent edits file: hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d/recovered.edits/0000000000000198080
{noformat}

The above logs are from a regionserver, say RS2. From the initial analysis, it 
seems the master asked a certain regionserver (say RS1) to open the region and, 
for some reason, asked it to close the region soon after. The open was still 
proceeding on RS1 when the master reassigned the region to RS2. RS2 also 
started the recovery, but it ended up with an inconsistent view of the 
recovered-edits files (it reports missing files, as per the logs above) because 
RS1 deleted some of those files after it completed its own recovery. When RS2 
finally opens the region, it might not see the recent data that was written by 
flushes on hor9n10 during the recovery process, so reads of that data would be 
inconsistent.
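
To make the suspected interleaving concrete, here is a minimal, single-threaded 
sketch of it. This is a toy model and not HBase code: the class 
RecoveredEditsRace, the SharedEditsDir stand-in for the region's 
recovered.edits directory, and the file names edits-1..edits-3 are all 
hypothetical, and the exact ordering of steps is an assumption based on the 
analysis above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of the suspected race. Not HBase code; all names are hypothetical. */
class RecoveredEditsRace {

    /** Stand-in for the region's recovered.edits directory shared via HDFS. */
    static final class SharedEditsDir {
        private final Set<String> files = new TreeSet<>();
        void add(String name) { files.add(name); }
        List<String> list() { return new ArrayList<>(files); }
        boolean exists(String name) { return files.contains(name); }
        void delete(String name) { files.remove(name); }
    }

    /** Runs the assumed interleaving; returns files RS2 listed but could not find. */
    static List<String> simulate() {
        SharedEditsDir dir = new SharedEditsDir();
        dir.add("edits-1");
        dir.add("edits-2");
        dir.add("edits-3");

        // RS1 starts opening the region and lists the recovered-edits files.
        List<String> rs1Listing = dir.list();

        // The master asks RS1 to close and reassigns the region to RS2 while
        // RS1's open is still in progress. RS2 takes its own listing.
        List<String> rs2Listing = dir.list();

        List<String> missingOnRs2 = new ArrayList<>();

        // RS2 gets through its first file while everything is still present.
        replayOrRecordMissing(dir, rs2Listing.get(0), missingOnRs2);

        // RS1 completes its recovery and deletes the files it replayed.
        for (String file : rs1Listing) {
            dir.delete(file);
        }

        // RS2 continues from its now-stale listing and finds files gone.
        for (String file : rs2Listing.subList(1, rs2Listing.size())) {
            replayOrRecordMissing(dir, file, missingOnRs2);
        }
        return missingOnRs2;
    }

    /** A real replay would apply the edits; here we only check for presence. */
    private static void replayOrRecordMissing(SharedEditsDir dir, String file,
                                              List<String> missing) {
        if (!dir.exists(file)) {
            missing.add(file);
        }
    }

    public static void main(String[] args) {
        System.out.println("Files missing on RS2: " + simulate());
    }
}
```

In this model RS2 silently skips the files that RS1 removed, which mirrors the 
"Null or non-existent edits file" warning. The sketch does not model the 
flushes RS1 performed during its replay, which are the data RS2 might then 
fail to see.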



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
