[jira] [Resolved] (PHOENIX-7039) Snapshot scanner should skip replay WAL and update seqid while opening region

Viraj Jasani (Jira) Sat, 30 Sep 2023 12:00:06 -0700


     [ https://issues.apache.org/jira/browse/PHOENIX-7039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Viraj Jasani resolved PHOENIX-7039.
-----------------------------------
    Resolution: Fixed

> Snapshot scanner should skip replay WAL and update seqid while opening region
> -----------------------------------------------------------------------------
>
>                 Key: PHOENIX-7039
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-7039
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 5.1.3
>            Reporter: Viraj Jasani
>            Assignee: Viraj Jasani
>            Priority: Major
>             Fix For: 5.2.0, 5.1.4
>
>
> When PhoenixRecordReader needs to iterate over the records of a 
> snapshot-restored table, it uses TableSnapshotResultIterator to retrieve the 
> snapshot manifest and the corresponding region manifests from the snapshot. 
> TableSnapshotResultIterator#next initializes ScanningResultIterator using 
> SnapshotScanner, which in turn opens the given region to perform the scan. 
> However, this region is opened by a client rather than by a regionserver. 
> Hence, if the original region was split or merged, the current region would 
> be holding references to parent regions in the hbase archive dir. If the 
> region has already been removed from meta as well as from the file system 
> (hbase data dir) after a successful split/merge operation, region 
> initialization by the client still creates a new .seqid file in the region's 
> data dir (on the WAL filesystem). Although the region data is read from the 
> archive dir, creating the region dir in the hbase data dir leaves a new 
> orphan region that has only a .seqid file and no store files. At the same 
> time, the hbase archive dir still contains the old region dir with a 
> reference to the parent region.
>  
> 1. Snapshot creation:
> {code:java}
> 2023-09-13 01:01:50,103 DEBUG [557)-snapshot-pool-2] 
> snapshot.SnapshotManifest - Storing 
> 'TABLE1,00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe,1684558830177.b5d1b622ef045b52aede650db8690d53.'
>  region-info for snapshot=SNAPSHOT_TABLE1_1694566851085_1694566876390_0
>  {code}
> 2. Region getting archived after merge:
> {code:java}
> 2023-09-13 02:46:58,177 DEBUG [gionserver-4:60020-8] backup.HFileArchiver - 
> Archived from FileableStoreFile, 
> hdfs://cluster1/hbase/data/default/TABLE1/53161e6b59b7a2dcdb85b26e676fd72a/0/aa5058a23c024463bb33bbb2abc68577.b5d1b622ef045b52aede650db8690d53
>  
> to 
> hdfs://cluster1/hbase/archive/data/default/TABLE1/53161e6b59b7a2dcdb85b26e676fd72a/0/aa5058a23c024463bb33bbb2abc68577.b5d1b622ef045b52aede650db8690d53
>  
>  {code}
> 3. Region is deleted from meta and file system:
> {code:java}
> 2023-09-13 02:50:26,054 DEBUG [PEWorker-53] backup.HFileArchiver - Deleted 
> hdfs://cluster1/hbase/data/default/TABLE1/b5d1b622ef045b52aede650db8690d53
> 2023-09-13 02:50:26,123 INFO [PEWorker-53] hbase.MetaTableAccessor - Deleted 
> TABLE1,00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe,1684558830177.b5d1b622ef045b52aede650db8690d53.
> 2023-09-13 02:50:26,340 INFO [PEWorker-58] procedure2.ProcedureExecutor - 
> Finished pid=1006984, state=SUCCESS; GCMultipleMergedRegionsProcedure 
> child=53161e6b59b7a2dcdb85b26e676fd72a, 
> parents:[b5d1b622ef045b52aede650db8690d53], 
> [cbf697faee6a0c3eaf8c17e1bf12239a] in 434 msec
> 2023-09-13 02:50:26,269 INFO [PEWorker-58] hbase.MetaTableAccessor - Deleted 
> merge references in 
> TABLE1,00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe,1685345080046.53161e6b59b7a2dcdb85b26e676fd72a.,
>  deleted qualifiers merge0000, merge0001
>  {code}
> 4. Snapshot scanner region init:
> {code:java}
> 2023-09-13 04:06:27,637 INFO [main] 
> org.apache.phoenix.iterate.SnapshotScanner: 
> Creating SnapshotScanner for region: 
> {ENCODED => b5d1b622ef045b52aede650db8690d53, NAME => 
> 'TABLE1,00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe,1684558830177.b5d1b622ef045b52aede650db8690d53.',
>  STARTKEY => 
> '00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe', 
> ENDKEY => 
> '00DAG00000005sXa07\x80\x00\x01\x87\x80\x02P@a07AG0000183cN3017AG00002lPrRe'}
>  {code}
> 5. Region dir with seqid gets created:
> {code:java}
> 2023-09-13 04:06:28,431 INFO [on default port 9000] hdfs.StateChange - DIR* 
> completeFile: 
> /hbase/data/default/TABLE1/b5d1b622ef045b52aede650db8690d53/recovered.edits/17042749.seqid
>  is closed by DFSClient_attempt_1692995189831_25389_m_000797_0_-1558517803_1
>  {code}
> 6. Remaining region init with store init completion:
> {code:java}
> 2023-09-13 04:06:28,354 INFO [StoreOpener-b5d1b622ef045b52aede650db8690d53-1] 
> org.apache.hadoop.hbase.regionserver.HStore: 
> Store=b5d1b622ef045b52aede650db8690d53/0, memstore type=DefaultMemStore, 
> storagePolicy=HOT, verifyBulkLoads=false, parallelPutCountPrintThreshold=50, 
> encoding=FAST_DIFF, compression=NONE
> 2023-09-13 04:06:28,439 INFO [main] 
> org.apache.hadoop.hbase.regionserver.HRegion: 
> Opened b5d1b622ef045b52aede650db8690d53; 
> next sequenceid=17042750; 
> SteppingSplitPolicysuper{IncreasingToUpperBoundRegionSplitPolicy{initialSize=536870912,
>  ConstantSizeRegionSplitPolicy{desiredMaxFileSize=11007665920, 
> jitterRate=0.025168776512145996}}}, 
> FlushLargeStoresPolicy{flushSizeLowerBound=-1}
>  {code}
> While opening the region from the client side, we should provide a flag to 
> ensure that the seqid file is not generated, as per HBASE-21977.
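
The failure mode described above can be illustrated with a small, self-contained toy model. It is a sketch under stated assumptions, not the HBase or Phoenix implementation: the class OrphanRegionSketch, the methods openRegion and leavesOrphanDir, and the restoredRegion parameter are all hypothetical names chosen for illustration, and the sketch calls no HBase API. It models only the seqid file side effect (not WAL replay), using the encoded region name and sequence id from the logs above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Toy model only: none of these names are HBase or Phoenix APIs. */
public class OrphanRegionSketch {

    /**
     * Hypothetical stand-in for a region open. Store files are assumed to be
     * read from the archive dir, so nothing here touches them.
     *
     * @return the next sequence id, i.e. maxSeqId + 1
     */
    public static long openRegion(Path dataDir, String encodedName,
                                  long maxSeqId, boolean restoredRegion) {
        if (!restoredRegion) {
            // Regular open: persist the max sequence id as an empty marker
            // file, which creates the region dir as a side effect.
            Path edits = dataDir.resolve(encodedName).resolve("recovered.edits");
            try {
                Files.createDirectories(edits);
                Files.createFile(edits.resolve(maxSeqId + ".seqid"));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        // Restored (snapshot) open: skip the marker file entirely.
        return maxSeqId + 1;
    }

    /**
     * Opens a region that does not exist under a fresh, empty data dir and
     * reports whether a region dir appeared there afterwards.
     */
    public static boolean leavesOrphanDir(boolean restoredRegion) {
        String encodedName = "b5d1b622ef045b52aede650db8690d53";
        try {
            Path dataDir = Files.createTempDirectory("hbase-data-sketch");
            openRegion(dataDir, encodedName, 17042749L, restoredRegion);
            return Files.exists(dataDir.resolve(encodedName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("regular open leaves orphan dir:  " + leavesOrphanDir(false));
        System.out.println("restored open leaves orphan dir: " + leavesOrphanDir(true));
    }
}
```

Running main prints true for the regular open and false for the restored open: in this model the orphan region dir appears only when the open is not marked as restored from a snapshot, which is the behavior the issue asks the client-side open to adopt.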



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
