guluo created HBASE-28956:
-----------------------------

             Summary: RSMobFileCleanerChore may close the StoreFileReader object which is being used by Compaction thread
                 Key: HBASE-28956
                 URL: https://issues.apache.org/jira/browse/HBASE-28956
             Project: HBase
          Issue Type: Bug
          Components: Compaction, mob
            Reporter: guluo
            Assignee: guluo

For a MOB table, RSMobFileCleanerChore is responsible for cleaning MOB files that are no longer referenced by the regions located on the current RegionServer.

RSMobFileCleanerChore gets the information about MOB files by reading the store file, as follows.

```java
// RSMobFileCleanerChore.chore()
sf.initReader();
byte[] mobRefData = sf.getMetadataValue(HStoreFile.MOB_FILE_REFS);
byte[] bulkloadMarkerData = sf.getMetadataValue(HStoreFile.BULKLOAD_TASK_KEY);
// close store file to avoid memory leaks
sf.closeStoreFile(true);
```

There is an issue here: if the StoreFileReader was not created by RSMobFileCleanerChore but RSMobFileCleanerChore closes it, the thread that created the reader can no longer use it, which eventually results in an ERROR.

Reproduction:

This problem occurs only occasionally, but the probability of hitting it can be increased with the following steps.

1. Set hbase.master.mob.cleaner.period from 24h to 10s and restart HBase.
2. Put some MOB data into a MOB table.
3. At the same time, run the compaction command on the MOB table. The problem may then occur.

The error log is as follows.

ERROR: java.io.IOException: Cannot invoke "org.apache.hadoop.hbase.regionserver.StoreFileReader.getMaxTimestamp()" because the return value of "org.apache.hadoop.hbase.regionserver.HStoreFile.getReader()" is null
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:512)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124)
    at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102)
    at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82)
Caused by: java.lang.NullPointerException: Cannot invoke "org.apache.hadoop.hbase.regionserver.StoreFileReader.getMaxTimestamp()" because the return value of "org.apache.hadoop.hbase.regionserver.HStoreFile.getReader()" is null
    at org.apache.hadoop.hbase.regionserver.DefaultStoreFileManager.lambda$getUnneededFiles$3(DefaultStoreFileManager.java:235)
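
The ownership problem described in this report can be illustrated with a small, self-contained sketch. It does not use any HBase classes: SharedFile, readMetadataUnsafe, and readMetadataSafe are hypothetical names invented for illustration, and the "close only what you opened" rule shown here is just one possible mitigation, not necessarily the fix adopted for this issue.

```java
public class ReaderOwnershipSketch {

  /** Hypothetical stand-in for a store file that lazily holds one shared reader. */
  static final class SharedFile {
    private Object reader; // null means no reader is currently open

    /** Opens the reader if needed; returns true only if this call created it. */
    synchronized boolean initReader() {
      if (reader != null) {
        return false; // someone else already opened it
      }
      reader = new Object();
      return true;
    }

    synchronized Object getReader() {
      return reader;
    }

    synchronized void closeReader() {
      reader = null;
    }
  }

  /** Mirrors the pattern in the report: always close, regardless of who opened. */
  static void readMetadataUnsafe(SharedFile sf) {
    sf.initReader();
    // ... metadata would be read here ...
    sf.closeReader();
  }

  /** Assumed mitigation: close the reader only if this call opened it. */
  static void readMetadataSafe(SharedFile sf) {
    boolean openedHere = sf.initReader();
    try {
      // ... metadata would be read here ...
    } finally {
      if (openedHere) {
        sf.closeReader();
      }
    }
  }

  public static void main(String[] args) {
    SharedFile sf = new SharedFile();

    sf.initReader(); // plays the role of the compaction thread opening the reader
    readMetadataUnsafe(sf); // plays the role of the cleaner chore
    System.out.println("reader lost after unsafe chore: " + (sf.getReader() == null));

    sf.initReader(); // the other user opens the reader again
    readMetadataSafe(sf);
    System.out.println("reader lost after safe chore: " + (sf.getReader() == null));
  }
}
```

The demo is single-threaded so that the outcome is deterministic: the first line prints true and the second prints false. An ownership flag alone does not cover the case where the chore opens the reader first and another thread starts using it before the chore closes it; handling that would need something like reference counting or a private reader for the chore.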