[ 
https://issues.apache.org/jira/browse/HBASE-25322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaolin Ha updated HBASE-25322:
-------------------------------
    Priority: Minor  (was: Blocker)

> Redundant Reference file in bottom region of split
> --------------------------------------------------
>
>                 Key: HBASE-25322
>                 URL: https://issues.apache.org/jira/browse/HBASE-25322
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 3.0.0-alpha-2
>            Reporter: Xiaolin Ha
>            Assignee: Xiaolin Ha
>            Priority: Minor
>             Fix For: 3.0.0-alpha-2, 2.4.9
>
>
> When we split a region whose key range is (,), the bottom region should 
> contain the keys in (, split key), and the top region should contain the 
> keys in [split key, ).
> Currently, if we do the following operations:
>  # put rowkeys 100,101,102,103,104,105 to a table, and flush the memstore to 
> make an hfile with rowkeys 100,101,102,103,104,105;
>  # put rowkeys 200,201,202,203,204,205 to the table, and flush the memstore 
> to make an hfile with rowkeys 200,201,202,203,204,205;
>  # split the table region, using split key 200;
>  # then the bottom region will have two Reference files, while the top region 
> has only one.
> But we expect the bottom region to have only one Reference file, like the top 
> region.
> That's because, when generating Reference files in the child regions, the 
> bottom region uses the `PrivateCellUtil.createLastOnRow(splitRow)` cell to 
> compare to the first keys in the hfiles, while the top region uses the 
> `PrivateCellUtil.createFirstOnRow(splitRow)` cell to compare to the last keys 
> in the hfiles.
> `LastOnRow(splitRow)` is the largest possible key on the split row, while 
> `FirstOnRow(splitRow)` is the smallest possible key on the split row. The 
> split row should be in the top region, so we should use 
> `FirstOnRow(splitRow)` to compare to the hfile first and last keys in both 
> the bottom and top regions.
> Though the redundant Reference file will not be read by the bottom region, 
> compacting it will produce an empty file if this redundant Reference file is 
> the only file participating in the compaction.
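
The comparison described above can be sketched with a small self-contained
Java model. This is not HBase code: `Key`, `HFileInfo`, `bottomReferences`
and `topReferences` are hypothetical names, rows are plain integers, and the
`pos` field only mimics the way `PrivateCellUtil.createFirstOnRow` and
`PrivateCellUtil.createLastOnRow` yield keys that sort before and after every
real cell on a row. Under those assumptions it reproduces the two hfiles from
the steps above and the effect of the split key choice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SplitReferenceSketch {
    // Simplified stand-in for a cell key (assumption, not the HBase Cell type).
    // pos = -1 models FirstOnRow, 0 a real cell, +1 models LastOnRow.
    static final class Key implements Comparable<Key> {
        final int row;
        final int pos;
        Key(int row, int pos) { this.row = row; this.pos = pos; }
        static Key firstOnRow(int row) { return new Key(row, -1); }
        static Key cell(int row) { return new Key(row, 0); }
        static Key lastOnRow(int row) { return new Key(row, 1); }
        @Override public int compareTo(Key o) {
            int c = Integer.compare(row, o.row);
            return c != 0 ? c : Integer.compare(pos, o.pos);
        }
    }

    // Hypothetical summary of an hfile: its name plus first and last keys.
    static final class HFileInfo {
        final String name;
        final Key firstKey;
        final Key lastKey;
        HFileInfo(String name, Key firstKey, Key lastKey) {
            this.name = name; this.firstKey = firstKey; this.lastKey = lastKey;
        }
    }

    // Bottom region: skip a file only if the split key sorts before its first key.
    static List<String> bottomReferences(List<HFileInfo> files, Key splitKey) {
        List<String> refs = new ArrayList<>();
        for (HFileInfo f : files) {
            if (splitKey.compareTo(f.firstKey) < 0) continue;
            refs.add(f.name);
        }
        return refs;
    }

    // Top region: skip a file only if the split key sorts after its last key.
    static List<String> topReferences(List<HFileInfo> files, Key splitKey) {
        List<String> refs = new ArrayList<>();
        for (HFileInfo f : files) {
            if (splitKey.compareTo(f.lastKey) > 0) continue;
            refs.add(f.name);
        }
        return refs;
    }

    // The two flushed files from the reproduction steps.
    static List<HFileInfo> sampleFiles() {
        return Arrays.asList(
            new HFileInfo("hfile-1", Key.cell(100), Key.cell(105)),
            new HFileInfo("hfile-2", Key.cell(200), Key.cell(205)));
    }

    public static void main(String[] args) {
        List<HFileInfo> files = sampleFiles();
        // Reported behaviour: bottom compares with LastOnRow(200).
        System.out.println(bottomReferences(files, Key.lastOnRow(200)));  // [hfile-1, hfile-2]
        // Proposed behaviour: bottom compares with FirstOnRow(200).
        System.out.println(bottomReferences(files, Key.firstOnRow(200))); // [hfile-1]
        // Top already compares with FirstOnRow(200).
        System.out.println(topReferences(files, Key.firstOnRow(200)));    // [hfile-2]
    }
}
```

In this model the second hfile starts exactly on the split row, so
`LastOnRow(200)` sorts after its first key and the bottom region keeps a
reference to it, while `FirstOnRow(200)` sorts before that first key and the
file is skipped.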



--
This message was sent by Atlassian Jira
(v8.20.1#820001)