Viraj Jasani created PHOENIX-7039:
-------------------------------------
Summary: Snapshot scanner should skip replay WAL and update seqid
while opening region
Key: PHOENIX-7039
URL: https://issues.apache.org/jira/browse/PHOENIX-7039
Project: Phoenix
Issue Type: Bug
Affects Versions: 5.1.3
Reporter: Viraj Jasani
Assignee: Viraj Jasani
Fix For: 5.2.0, 5.1.4
When PhoenixRecordReader iterates over the records of a table restored from a
snapshot, it uses TableSnapshotResultIterator to retrieve the snapshot manifest
and the corresponding region manifests. TableSnapshotResultIterator#next
initializes a ScanningResultIterator using SnapshotScanner, which in turn opens
the given region to perform the scan. However, this region is opened by a
client rather than by a regionserver. If the original region has since been
split or merged, the current region holds references to the parent regions in
the HBase archive dir. If the region has already been removed from meta as well
as from the file system after a successful split/merge operation, region
initialization by the client still creates a new seqid file in the region's
root dir. The region data is read from the archive dir, but because the region
dir gets created, we end up with a new orphan region that contains only a
.seqid file and no store files. At the same time, the HBase archive dir still
contains the old region dir with references to the parent region.
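To make the symptom concrete, here is a minimal sketch of how such an orphan region dir could be recognized. It runs against a local directory tree using java.nio only; a real cluster check would go through the Hadoop FileSystem API instead. The class and method names (OrphanRegionCheck, looksOrphaned) are hypothetical and not part of Phoenix or HBase.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper, local filesystem only; not a Phoenix or HBase class. */
final class OrphanRegionCheck {

    private OrphanRegionCheck() {}

    /**
     * Returns true when the only regular files under regionDir are
     * recovered.edits/*.seqid markers, i.e. there are no store files.
     * Empty sub-directories (such as empty column family dirs) are ignored.
     */
    static boolean looksOrphaned(Path regionDir) {
        try (Stream<Path> walk = Files.walk(regionDir)) {
            List<Path> files = walk.filter(Files::isRegularFile)
                                   .collect(Collectors.toList());
            if (files.isEmpty()) {
                return false; // no seqid marker at all, so not the case described here
            }
            for (Path f : files) {
                Path parent = f.getParent();
                boolean inRecoveredEdits = parent != null
                        && parent.getFileName().toString().equals("recovered.edits")
                        && regionDir.equals(parent.getParent());
                boolean isSeqIdMarker = f.getFileName().toString().endsWith(".seqid");
                if (!inRecoveredEdits || !isSeqIdMarker) {
                    return false; // some other file exists, e.g. a store file
                }
            }
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Under these assumptions, a dir shaped like the one created in step 5 below (only recovered.edits/17042749.seqid) would be reported as orphaned, while a region dir that still holds store files would not.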
1. Snapshot creation:
{code:java}
2023-09-13 01:01:50,103 DEBUG [557)-snapshot-pool-2] snapshot.SnapshotManifest
- Storing
'TABLE1,00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe,1684558830177.b5d1b622ef045b52aede650db8690d53.'
region-info for snapshot=SNAPSHOT_TABLE1_1694566851085_1694566876390_0
{code}
2. Region getting archived after merge:
{code:java}
2023-09-13 02:46:58,177 DEBUG [gionserver-4:60020-8] backup.HFileArchiver -
Archived from FileableStoreFile,
hdfs://cluster1/hbase/data/default/TABLE1/53161e6b59b7a2dcdb85b26e676fd72a/0/aa5058a23c024463bb33bbb2abc68577.b5d1b622ef045b52aede650db8690d53
to
hdfs://cluster1/hbase/archive/data/default/TABLE1/53161e6b59b7a2dcdb85b26e676fd72a/0/aa5058a23c024463bb33bbb2abc68577.b5d1b622ef045b52aede650db8690d53
{code}
3. Region is deleted from meta and file system:
{code:java}
2023-09-13 02:50:26,054 DEBUG [PEWorker-53] backup.HFileArchiver - Deleted
hdfs://cluster1/hbase/data/default/TABLE1/b5d1b622ef045b52aede650db8690d53
2023-09-13 02:50:26,123 INFO [PEWorker-53] hbase.MetaTableAccessor - Deleted
TABLE1,00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe,1684558830177.b5d1b622ef045b52aede650db8690d53.
2023-09-13 02:50:26,340 INFO [PEWorker-58] procedure2.ProcedureExecutor -
Finished pid=1006984, state=SUCCESS; GCMultipleMergedRegionsProcedure
child=53161e6b59b7a2dcdb85b26e676fd72a,
parents:[b5d1b622ef045b52aede650db8690d53], [cbf697faee6a0c3eaf8c17e1bf12239a]
in 434 msec
2023-09-13 02:50:26,269 INFO [PEWorker-58] hbase.MetaTableAccessor - Deleted
merge references in
TABLE1,00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe,1685345080046.53161e6b59b7a2dcdb85b26e676fd72a.,
deleted qualifiers merge0000, merge0001
{code}
4. Snapshot scanner region init:
{code:java}
2023-09-13 04:06:27,637 INFO [main] org.apache.phoenix.iterate.SnapshotScanner:
Creating SnapshotScanner for region:
{ENCODED => b5d1b622ef045b52aede650db8690d53, NAME =>
'TABLE1,00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe,1684558830177.b5d1b622ef045b52aede650db8690d53.',
STARTKEY =>
'00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe',
ENDKEY =>
'00DAG00000005sXa07\x80\x00\x01\x87\x80\x02P@a07AG0000183cN3017AG00002lPrRe'}
{code}
5. Region dir with seqid gets created:
{code:java}
2023-09-13 04:06:28,431 INFO [on default port 9000] hdfs.StateChange - DIR*
completeFile:
/hbase/data/default/TABLE1/b5d1b622ef045b52aede650db8690d53/recovered.edits/17042749.seqid
is closed by DFSClient_attempt_1692995189831_25389_m_000797_0_-1558517803_1
{code}
6. Remaining region init with store init completion:
{code:java}
2023-09-13 04:06:28,354 INFO [StoreOpener-b5d1b622ef045b52aede650db8690d53-1]
org.apache.hadoop.hbase.regionserver.HStore:
Store=b5d1b622ef045b52aede650db8690d53/0, memstore type=DefaultMemStore,
storagePolicy=HOT, verifyBulkLoads=false, parallelPutCountPrintThreshold=50,
encoding=FAST_DIFF, compression=NONE
2023-09-13 04:06:28,439 INFO [main]
org.apache.hadoop.hbase.regionserver.HRegion:
Opened b5d1b622ef045b52aede650db8690d53;
next sequenceid=17042750;
SteppingSplitPolicysuper{IncreasingToUpperBoundRegionSplitPolicy{initialSize=536870912,
ConstantSizeRegionSplitPolicy{desiredMaxFileSize=11007665920,
jitterRate=0.025168776512145996}}},
FlushLargeStoresPolicy{flushSizeLowerBound=-1}
{code}
While opening the region from the client side, we should provide a flag to
ensure that the seqid file is not generated, as per HBASE-21977.
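The intended behavior can be illustrated with a toy model. It is only a sketch of the idea, not the actual patch: ToyRegion and its methods are hypothetical stand-ins written against the local filesystem, and the real change would use whatever restored-region flag HBASE-21977 exposes on HRegion (believed to be a setter along the lines of setRestoredRegion; confirm against the HBase version in use).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical stand-in for a region opened by a client; not an HBase class. */
final class ToyRegion {
    private final Path regionDir;
    private final long maxStoreSeqId;
    private boolean restoredRegion; // models the flag proposed in this issue

    ToyRegion(Path regionDir, long maxStoreSeqId) {
        this.regionDir = regionDir;
        this.maxStoreSeqId = maxStoreSeqId;
    }

    void setRestoredRegion(boolean restoredRegion) {
        this.restoredRegion = restoredRegion;
    }

    /** Opens the region and returns the next sequence id. */
    long initialize() {
        long nextSeqId = maxStoreSeqId + 1;
        if (!restoredRegion) {
            // Default open path: persist a seqid marker, which creates the
            // region dir as a side effect even if it was already cleaned up.
            try {
                Path edits = regionDir.resolve("recovered.edits");
                Files.createDirectories(edits);
                Path marker = edits.resolve((nextSeqId - 1) + ".seqid");
                if (!Files.exists(marker)) {
                    Files.createFile(marker);
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        // Restored (snapshot) open path: read-only, nothing is written.
        return nextSeqId;
    }
}
```

With the flag left unset, opening a region whose highest store sequence id is 17042749 creates recovered.edits/17042749.seqid, matching steps 5 and 6 above. With the flag set, the same call returns the same next sequence id but leaves the file system untouched, so no orphan region dir appears.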