Viraj Jasani created PHOENIX-7039:
-------------------------------------

             Summary: Snapshot scanner should skip replay WAL and update seqid while opening region
                 Key: PHOENIX-7039
                 URL: https://issues.apache.org/jira/browse/PHOENIX-7039
             Project: Phoenix
          Issue Type: Bug
    Affects Versions: 5.1.3
            Reporter: Viraj Jasani
            Assignee: Viraj Jasani
             Fix For: 5.2.0, 5.1.4


When PhoenixRecordReader needs to iterate over the records of a table restored 
from a snapshot, it uses TableSnapshotResultIterator to retrieve the snapshot 
manifest and the corresponding region manifests from the snapshot. 

TableSnapshotResultIterator#next initializes ScanningResultIterator using 
SnapshotScanner, which in turn opens the given region to perform the scan. 
However, the region is opened by a client, not by a regionserver. Hence, if the 
original region was split or merged, the current region holds references to the 
parent regions in the hbase archive dir. If the region has already been removed 
from meta as well as from the file system after a successful split/merge 
operation, region initialization by the client still creates a new seqid file 
under the region's root dir. The region data is read from the archive dir, but 
because the region dir gets created, we end up with a new orphan region that 
contains only a .seqid file and no store files. At the same time, the hbase 
archive dir still contains the old region dir with the reference to the parent 
region.
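
To spot such orphan region dirs, a check along the following lines could help. This is only a minimal sketch: it walks a local directory with java.nio as a stand-in for HDFS, the class and method names (OrphanRegionCheck, isSeqIdOnly, demoRegionDir) are hypothetical and not part of HBase or Phoenix, and it treats any regular file other than a recovered.edits/*.seqid marker as evidence that the region is not an orphan.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for illustration; not part of HBase or Phoenix.
class OrphanRegionCheck {

    // True when the region dir holds at least one recovered.edits/*.seqid
    // marker and no other regular file (hence no store files).
    static boolean isSeqIdOnly(Path regionDir) {
        if (!Files.isDirectory(regionDir)) {
            return false;
        }
        try (Stream<Path> walk = Files.walk(regionDir)) {
            List<Path> files = walk.filter(Files::isRegularFile)
                    .collect(Collectors.toList());
            if (files.isEmpty()) {
                return false;
            }
            for (Path f : files) {
                String parent = f.getParent().getFileName().toString();
                String name = f.getFileName().toString();
                if (!parent.equals("recovered.edits") || !name.endsWith(".seqid")) {
                    return false;
                }
            }
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a throwaway local region dir for the demo. The family dir "0"
    // matches the logs; the store file name "hfile1" is made up.
    static Path demoRegionDir(boolean withStoreFile) {
        try {
            Path region = Files.createTempDirectory("region-");
            Path edits = Files.createDirectories(region.resolve("recovered.edits"));
            Files.write(edits.resolve("17042749.seqid"), new byte[0]);
            if (withStoreFile) {
                Path family = Files.createDirectories(region.resolve("0"));
                Files.write(family.resolve("hfile1"), new byte[0]);
            }
            return region;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isSeqIdOnly(demoRegionDir(false))); // true
        System.out.println(isSeqIdOnly(demoRegionDir(true)));  // false
    }
}
```

Against a real cluster the same idea would have to go through the Hadoop FileSystem API instead of java.nio, and the result should be cross-checked with meta before anything is cleaned up.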

 

1. Snapshot creation:
{code:java}
2023-09-13 01:01:50,103 DEBUG [557)-snapshot-pool-2] snapshot.SnapshotManifest - Storing 'TABLE1,00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe,1684558830177.b5d1b622ef045b52aede650db8690d53.' region-info for snapshot=SNAPSHOT_TABLE1_1694566851085_1694566876390_0
 {code}
2. Region getting archived after merge:
{code:java}
2023-09-13 02:46:58,177 DEBUG [gionserver-4:60020-8] backup.HFileArchiver - Archived from FileableStoreFile, hdfs://cluster1/hbase/data/default/TABLE1/53161e6b59b7a2dcdb85b26e676fd72a/0/aa5058a23c024463bb33bbb2abc68577.b5d1b622ef045b52aede650db8690d53 to hdfs://cluster1/hbase/archive/data/default/TABLE1/53161e6b59b7a2dcdb85b26e676fd72a/0/aa5058a23c024463bb33bbb2abc68577.b5d1b622ef045b52aede650db8690d53
 {code}
3. Region is deleted from meta and file system:
{code:java}
2023-09-13 02:50:26,054 DEBUG [PEWorker-53] backup.HFileArchiver - Deleted hdfs://cluster1/hbase/data/default/TABLE1/b5d1b622ef045b52aede650db8690d53

2023-09-13 02:50:26,123 INFO [PEWorker-53] hbase.MetaTableAccessor - Deleted TABLE1,00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe,1684558830177.b5d1b622ef045b52aede650db8690d53.

2023-09-13 02:50:26,340 INFO [PEWorker-58] procedure2.ProcedureExecutor - Finished pid=1006984, state=SUCCESS; GCMultipleMergedRegionsProcedure child=53161e6b59b7a2dcdb85b26e676fd72a, parents:[b5d1b622ef045b52aede650db8690d53], [cbf697faee6a0c3eaf8c17e1bf12239a] in 434 msec

2023-09-13 02:50:26,269 INFO [PEWorker-58] hbase.MetaTableAccessor - Deleted merge references in TABLE1,00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe,1685345080046.53161e6b59b7a2dcdb85b26e676fd72a., deleted qualifiers merge0000, merge0001
 {code}
4. Snapshot scanner region init:
{code:java}
2023-09-13 04:06:27,637 INFO [main] org.apache.phoenix.iterate.SnapshotScanner: Creating SnapshotScanner for region: {ENCODED => b5d1b622ef045b52aede650db8690d53, NAME => 'TABLE1,00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe,1684558830177.b5d1b622ef045b52aede650db8690d53.', STARTKEY => '00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe', ENDKEY => '00DAG00000005sXa07\x80\x00\x01\x87\x80\x02P@a07AG0000183cN3017AG00002lPrRe'}
 {code}
5. Region dir with seqid gets created:
{code:java}
2023-09-13 04:06:28,431 INFO [on default port 9000] hdfs.StateChange - DIR* completeFile: /hbase/data/default/TABLE1/b5d1b622ef045b52aede650db8690d53/recovered.edits/17042749.seqid is closed by DFSClient_attempt_1692995189831_25389_m_000797_0_-1558517803_1
 {code}
6. Remaining region init with store init completion:
{code:java}
2023-09-13 04:06:28,354 INFO [StoreOpener-b5d1b622ef045b52aede650db8690d53-1] org.apache.hadoop.hbase.regionserver.HStore: Store=b5d1b622ef045b52aede650db8690d53/0, memstore type=DefaultMemStore, storagePolicy=HOT, verifyBulkLoads=false, parallelPutCountPrintThreshold=50, encoding=FAST_DIFF, compression=NONE

2023-09-13 04:06:28,439 INFO [main] org.apache.hadoop.hbase.regionserver.HRegion: Opened b5d1b622ef045b52aede650db8690d53; next sequenceid=17042750; SteppingSplitPolicysuper{IncreasingToUpperBoundRegionSplitPolicy{initialSize=536870912, ConstantSizeRegionSplitPolicy{desiredMaxFileSize=11007665920, jitterRate=0.025168776512145996}}}, FlushLargeStoresPolicy{flushSizeLowerBound=-1}
 {code}
When opening the region from the client side, we should set a flag, as per 
HBASE-21977, to ensure that the seqid file is not generated.
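
The intended behavior can be illustrated with a toy model. This is only a sketch under stated assumptions, not HBase or Phoenix code: RegionOpenModel, open and missingRegionDir are hypothetical names, the local file system stands in for HDFS, and the relationship between the seqid file name and the next sequence id simply mirrors the values seen in steps 5 and 6. In HBase itself the switch is understood to be a restored-region flag on HRegion added by HBASE-21977 (believed to be HRegion#setRestoredRegion; please verify against the HBase version in use).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Toy model only; class and method names are hypothetical, not HBase APIs.
class RegionOpenModel {

    // Models a region open and returns the next sequence id. Unless the
    // region is flagged as restored from a snapshot, it also writes
    // recovered.edits/<maxSeqId>.seqid under regionDir, creating the
    // region dir as a side effect.
    static long open(Path regionDir, long maxSeqId, boolean restoredRegion) {
        long nextSeqId = maxSeqId + 1;
        if (!restoredRegion) {
            try {
                Path edits = Files.createDirectories(
                        regionDir.resolve("recovered.edits"));
                Files.write(edits.resolve(maxSeqId + ".seqid"), new byte[0]);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return nextSeqId;
    }

    // Returns a region dir path that does not exist yet, like a region
    // that was already deleted after a merge.
    static Path missingRegionDir() {
        try {
            return Files.createTempDirectory("table-")
                    .resolve("b5d1b622ef045b52aede650db8690d53");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path plain = missingRegionDir();
        open(plain, 17042749L, false);
        System.out.println(Files.exists(plain));    // true: orphan dir created

        Path restored = missingRegionDir();
        long next = open(restored, 17042749L, true);
        System.out.println(Files.exists(restored)); // false: nothing written
        System.out.println(next);                   // 17042750
    }
}
```

With the flag set, the client-side open leaves the file system untouched, so a region that was already cleaned up after a split/merge is not recreated.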



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
