Wellington Chevreuil created HBASE-27021:
--------------------------------------------

             Summary: StoreFileInfo should set its initialPath in a consistent way
                 Key: HBASE-27021
                 URL: https://issues.apache.org/jira/browse/HBASE-27021
             Project: HBase
          Issue Type: Bug
            Reporter: Wellington Chevreuil
            Assignee: Wellington Chevreuil


Currently, StoreFileInfo provides overloaded public constructors where the 
related file path can be passed as either a Path or a FileStatus instance. As a 
result, StoreFileInfo instances for the same file entry can end up with 
different representations of the file path, which can cause problems for 
functions that rely on equality to compare store files. One example I could 
find is the StoreEngine.refreshStoreFiles method, which lists files from the 
SFT (StoreFileTracker), then compares them against the list of files from the 
SFM (StoreFileManager) to decide how it should update the SFM's internal 
cache. Here's a sample output from TestHStore.testRefreshStoreFiles:

-------

2022-05-10T15:06:42,831 INFO [Time-limited test] regionserver.StoreEngine(399): 
Refreshing store files for 
org.apache.hadoop.hbase.regionserver.DefaultStoreEngine@69d58ac1 files to add: 
[file:/hbase/hbase-server/target/test-data/e3eac5ce-9bdf-8624-bcec-09c89790d682/TestStoretestRefreshStoreFiles/data/default/table/da6a3cf38941b37cd16438d554b13bbc/family/6e92c2f5cf1f40f7b8c6b6b34a176fa5,
 
file:/hbase/hbase-server/target/test-data/e3eac5ce-9bdf-8624-bcec-09c89790d682/TestStoretestRefreshStoreFiles/data/default/table/da6a3cf38941b37cd16438d554b13bbc/family/{*}fa4d5909da644d94873cbfdc6b5a07da{*}]
 files to remove: 
[/hbase/hbase-server/target/test-data/e3eac5ce-9bdf-8624-bcec-09c89790d682/TestStoretestRefreshStoreFiles/data/default/table/da6a3cf38941b37cd16438d554b13bbc/family/{*}fa4d5909da644d94873cbfdc6b5a07da{*}]

-------

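In the output above, the same file (fa4d5909da644d94873cbfdc6b5a07da) shows up 
in both lists: with the "file:" scheme under "files to add" and without it 
under "files to remove". Since the two paths do not compare as equal, the file 
is wrongly added to SFM's list of compacted files, making a valid file 
potentially eligible for deletion and data loss.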
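
To make the mismatch concrete, here is a minimal, self-contained sketch. It 
is not HBase code: it uses java.net.URI as a stand-in for the Hadoop Path 
type, the file paths are shortened and made up, and the missingFrom helper is 
hypothetical. It only illustrates how an equality-based comparison treats a 
scheme-qualified location and a bare location of the same file as two 
different entries.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

class PathMismatchSketch {

    // Hypothetical helper: entries of `cached` that have no equal entry in `listed`.
    // Mimics an equality-based "what should be removed from the cache" check.
    static List<URI> missingFrom(List<URI> cached, List<URI> listed) {
        List<URI> result = new ArrayList<>(cached);
        result.removeAll(listed); // relies on URI.equals()
        return result;
    }

    public static void main(String[] args) {
        // Same file, two representations (illustrative paths, not from the test run).
        URI qualified = URI.create("file:/data/table/region/family/storefile1");
        URI bare = URI.create("/data/table/region/family/storefile1");

        // The path components match, but the objects are not equal
        // because only one of them carries a scheme.
        System.out.println("same path component: " + qualified.getPath().equals(bare.getPath()));
        System.out.println("equal as objects:    " + qualified.equals(bare));

        // The cached (bare) entry is reported as missing from the listing,
        // even though the listing contains the very same file.
        System.out.println("to remove: " + missingFrom(List.of(bare), List.of(qualified)));
    }
}
```

If both sides used the same representation, the "to remove" list in this 
sketch would be empty.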

I think we can avoid that by always converting Path instances passed to 
StoreFileInfo constructors into a FileStatus, so that the internal 
StoreFileInfo path is always built the same way.
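
A rough sketch of that idea, again outside of HBase and with every name being 
hypothetical (StatusSketch, statusOf and FileInfoSketch are illustrative 
stand-ins, and java.net.URI replaces the Hadoop types): the path-based 
constructor delegates to the status-based one, so both entry points end up 
storing the same representation. The "file" default scheme is an assumption 
made only for this example.

```java
import java.net.URI;

class ConsistentPathSketch {

    // Hypothetical stand-in for a file status: always holds a scheme-qualified location.
    static final class StatusSketch {
        final URI qualified;

        StatusSketch(URI qualified) {
            this.qualified = qualified;
        }
    }

    // Hypothetical stand-in for a file system lookup: qualifies bare paths
    // with an assumed default scheme ("file") and leaves qualified ones as-is.
    static StatusSketch statusOf(URI path) {
        URI q = path.getScheme() != null ? path : URI.create("file:" + path.getPath());
        return new StatusSketch(q);
    }

    static final class FileInfoSketch {
        private final URI initialPath;

        // Single place where the internal path is set.
        FileInfoSketch(StatusSketch status) {
            this.initialPath = status.qualified;
        }

        // Path-based constructor converts to a status first instead of storing the path directly.
        FileInfoSketch(URI path) {
            this(statusOf(path));
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof FileInfoSketch
                && initialPath.equals(((FileInfoSketch) o).initialPath);
        }

        @Override
        public int hashCode() {
            return initialPath.hashCode();
        }
    }

    public static void main(String[] args) {
        FileInfoSketch fromPath =
            new FileInfoSketch(URI.create("/data/table/region/family/storefile1"));
        FileInfoSketch fromStatus =
            new FileInfoSketch(statusOf(URI.create("file:/data/table/region/family/storefile1")));

        // Both instances now compare as equal regardless of how they were constructed.
        System.out.println(fromPath.equals(fromStatus));
    }
}
```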



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
