[ 
https://issues.apache.org/jira/browse/HBASE-24273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17094195#comment-17094195
 ] 

Huaxiang Sun commented on HBASE-24273:
--------------------------------------

[~stack], yeah, in both cases we should not add these regions to the orphans 
list. It seems that the only source of truth is the new merged region's row in 
meta, which lists the parent regions in its mergeXXX columns; those columns are 
cleaned up during GC. The in-memory state in the master is reliable only until 
the master restarts. As [~timoha] pointed out, parent regions are removed from 
meta at an early stage, so they need to be recovered from the merged region's 
mergeXXX columns (assuming that procedures restart after the states are 
recovered in the new master; I will check the details to confirm). 

The cached info in hbck_chore seems to give stale info even after the above 
case is fixed. I am thinking of moving the reporting of these orphanInFS 
regions to HBCK2, since checking whether a region on the FS is a real orphan 
now depends on two sources: the meta table and the regions on the FS. What do 
you think? 
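
A minimal sketch of the check discussed above, written in Python for brevity 
rather than HBase's Java. It assumes a region directory on the FS counts as an 
orphan only if it is neither a row in meta nor listed as a merge parent of some 
merged region. MetaRow, merge_parents and find_fs_orphans are hypothetical 
names used for illustration; they are not HBase APIs.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MetaRow:
    """Hypothetical stand-in for one row of the meta table."""
    region: str
    # Parent regions recorded on a merged region's row (assumed to mirror
    # the mergeXXX columns); empty for regions that are not merge results.
    merge_parents: frozenset = field(default_factory=frozenset)


def find_fs_orphans(fs_region_dirs, meta_rows):
    """Return region dirs on the FS that meta does not account for.

    A region is accounted for if it has its own row in meta, or if some
    merged region still lists it as a parent.
    """
    known = set()
    for row in meta_rows:
        known.add(row.region)
        known.update(row.merge_parents)
    return sorted(set(fs_region_dirs) - known)
```

With this rule, merge parents that were already removed from meta but are still 
referenced by the merged region would not be reported, while a directory that 
nothing in meta refers to still would be.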

> HBCK's "Orphan Regions on FileSystem" reports regions with referenced HFiles
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-24273
>                 URL: https://issues.apache.org/jira/browse/HBASE-24273
>             Project: HBase
>          Issue Type: Bug
>          Components: hbck2
>    Affects Versions: 2.2.4
>         Environment: HBase 2.2.4
> Hadoop 3.1.3
>            Reporter: Andrey Elenskiy
>            Priority: Critical
>             Fix For: 3.0.0, 2.3.0
>
>
> This issue came up after merging regions. MergeTableRegionsProcedure removes 
> the parent regions from hbase:meta and creates HFile references in the child 
> region to the old parent regions. Running `hbck_chore_run` right after the 
> `merge_region` will show the parent regions in "Orphan Regions on FileSystem" 
> until a major compaction is run on the child region, which removes the HFile 
> references and causes the Catalog Janitor to clean up the parent regions.
> There are probably other situations which can cause the same issue (maybe 
> region split?)
> Having "Orphan Regions on FileSystem" list the parent regions and suggest 
> "_hbase completebulkload_" is dangerous in this case, as completing the bulk 
> load will lead to stale HFile references in the child region, which will cause 
> its OPEN to fail because the referenced HFile doesn't exist.
> Figuring out these things for database administrators is tedious, so I think 
> it would be reasonable to not consider regions with referenced HFiles to be 
> orphans (or maybe could give an extra hint saying that it has referenced 
> HFiles).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
