sharmaar12 commented on PR #7149: URL: https://github.com/apache/hbase/pull/7149#issuecomment-3269427414
> Tested with the following steps:
>
> 1. Create new table 'andor' and added a row and flushed
>
> ```
> hbase:002:0> create 'andor', 'cf1'
> Created table andor
> Took 0.6623 seconds
> => Hbase::Table - andor
> hbase:005:0> put 'andor', 'r1', 'cf1', 'bela1'
> Took 0.0099 seconds
> hbase:007:0> flush 'andor'
> Took 0.3376 seconds
>
> $ ls -al cf1/
> total 16
> drwxr-xr-x 3 andor staff 96 Sep 2 14:42 .
> drwxr-xr-x 6 andor staff 192 Sep 2 14:23 ..
> -rw-r--r-- 1 andor staff 4959 Sep 2 14:23 56eb73a0801c4c9c91164e74dfaecebe
> ```
>
> 2. Added another row and flushed again
>
> ```
> hbase:020:0> scan 'andor'
> ROW COLUMN+CELL
> r1 column=cf1:, timestamp=2025-09-02T14:23:00.052, value=bela1
> r2 column=cf1:, timestamp=2025-09-02T14:42:42.472, value=bela2
> 2 row(s)
> Took 0.0029 seconds
>
> $ ls -al cf1/
> total 32
> drwxr-xr-x 4 andor staff 128 Sep 2 14:43 .
> drwxr-xr-x 6 andor staff 192 Sep 2 14:23 ..
> -rw-r--r-- 1 andor staff 4959 Sep 2 14:23 56eb73a0801c4c9c91164e74dfaecebe
> -rw-r--r-- 1 andor staff 4959 Sep 2 14:42 ecd71932ecb44b639ede20b42bc7b397
> ```
>
> 3. Moved away the second HFile (ecd71932ecb44b639ede20b42bc7b397) and done **refresh_hfiles** successfully. I only see the first row in the table.
>
> ```
> hbase:024:0> scan 'andor'
> ROW COLUMN+CELL
> r1 column=cf1:, timestamp=2025-09-02T14:23:00.052, value=bela1
> 1 row(s)
> ```
>
> 4. Moved back the second HFile (ecd71932ecb44b639ede20b42bc7b397) and done **refresh_hfiles** again, but I can't see the second row no matter what I do.
>
> ```
> $ ls -al cf1/
> total 32
> drwxr-xr-x 4 andor staff 128 Sep 2 14:43 .
> drwxr-xr-x 6 andor staff 192 Sep 2 14:23 ..
> -rw-r--r-- 1 andor staff 4959 Sep 2 14:23 56eb73a0801c4c9c91164e74dfaecebe
> -rw-r--r-- 1 andor staff 4959 Sep 2 14:42 ecd71932ecb44b639ede20b42bc7b397
>
> hbase:025:0> refresh_hfiles
> Took 0.0017 seconds
> => 21
> hbase:026:0> scan 'andor'
> ROW COLUMN+CELL
> r1 column=cf1:, timestamp=2025-09-02T14:23:00.052, value=bela1
> 1 row(s)
> Took 0.0023 seconds
> ```
>
> 5. Restarting HBase solves the problem, I can see the second row again.

(After discussion and help from @wchevreuil we have arrived at the following conclusion)

**TL;DR:** The above scenario is an invalid test expectation and will never arise in real-world active and read-replica setups. The issue comes from an HFile name conflict: a file is removed from the store (marked as compacted) and then the same file, with the same name, is added back. In the active and read-replica case the HFile names are always unique, so this name conflict never arises.

**Detailed Explanation:**

We can break refreshHFiles into two parts:

**Step1:** Detecting/loading the newly added HFiles
**Step2:** Making the newly added HFiles available for reading

**Our test scenario is:**

1. Create a table with two rows, executing a flush after each insert so that we have two HFiles.
2. Move one HFile to another directory.
3. Run refreshHFiles.
4. Move the file from 2 back to the CF (column family/store) directory.
5. Run refreshHFiles again (expected behavior: we should see the two rows added in 1).

In 3 and 5 we detect that one file was deleted and that one file was newly added, respectively. The data is not available for reading after 5 because of what happens in 3.

Let's look at 3 in detail. Say there are two HFiles, `hf1` and `hf2`, and you remove `hf1`. When we execute refreshHFiles, we detect that `hf1` is gone, so we remove it from our SFT structure. Internally, however, HBase does not remove it; it marks it as a compacted file.
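To make the name conflict concrete, here is a minimal toy model of the two refresh steps. It is only a sketch under stated assumptions, not HBase code: `ToyStore`, `tracked`, `compacted` and `refresh` are hypothetical names, and the real store file tracking is considerably more involved. The only behavior it models is the one described in this comment: a file that disappears is marked as compacted, and a file whose name is marked as compacted is not opened for reading.

```python
class ToyStore:
    """Toy stand-in for one column-family store (hypothetical, not an HBase class)."""

    def __init__(self, files_on_disk):
        # Names the store currently serves reads from.
        self.tracked = set(files_on_disk)
        # Names retired from the store and marked as compacted (in memory only).
        self.compacted = set()

    def refresh(self, files_on_disk):
        """Return (added, opened): names detected as new, and names made readable."""
        on_disk = set(files_on_disk)

        # Step 1: detect files that disappeared and files that are new.
        removed = self.tracked - on_disk
        added = on_disk - self.tracked

        # A removed file is not dropped outright; its name is marked as compacted.
        self.tracked -= removed
        self.compacted |= removed

        # Step 2: open new files for reading, unless the name is marked compacted.
        opened = {name for name in added if name not in self.compacted}
        self.tracked |= opened
        return added, opened


store = ToyStore(["hf1", "hf2"])

# Scenario step 3: hf1 was moved away, then refresh.
store.refresh(["hf2"])

# Scenario step 5: hf1 was moved back under the same name, then refresh.
added, opened = store.refresh(["hf1", "hf2"])
# added == {"hf1"} (detected), opened == set() (blocked by the compacted mark)

# Moving the file back under a different name avoids the conflict.
added, opened = store.refresh(["hf1_renamed", "hf2"])
# opened == {"hf1_renamed"}
```

In this model a freshly constructed `ToyStore(["hf1", "hf2"])` starts with an empty `compacted` set, which is consistent with the restart in the quoted test making the second row visible again. That correspondence is an assumption of the sketch.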
This is done so that transactions that started before the refresh still have access to the file, while transactions that start after the refresh do not. Files marked as compacted are cleaned up later by background chores. Note that the in-memory structure has marked the file (with file name `hf1`) as compacted, so nothing coming after this point will be able to access it.

In 5, **Step1** of refreshHFiles successfully determines that `hf1` has been newly added, but when we try to open it for reading (**Step2**), that is not allowed because `hf1` (same name) is marked as compacted. So refreshing the HFiles is not the issue; reading the file is. If we change the name of the file before copying it back, it works properly.

Hence, we can safely assume that this scenario will not arise in the active and read-replica case, as the HFile names will always be unique and will not have such conflicts.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
