junegunn opened a new pull request, #8001:
URL: https://github.com/apache/hbase/pull/8001

   ## Context
   
   HBASE-30036 (#7993) consolidates redundant delete markers on flush, which keeps them from growing without bound in HFiles. However, markers still accumulate in the memstore before a flush and degrade read performance. HBASE-29039 addresses this on the read path, and both changes are needed for full coverage. There is an open PR (#6557), but its review has stalled. This PR is an alternative approach with fewer code changes, which will hopefully make it easier to reach consensus.
   
   ## Test result
   
   Using the test code in https://issues.apache.org/jira/browse/HBASE-30036.
   
   ### `DeleteFamily`
   
   <img width="2304" height="1920" alt="image" src="https://github.com/user-attachments/assets/7908b841-0eba-41f2-ac06-be57abd90d79" />
   
   - Substantial read performance improvement before flushes.
   - Without HBASE-30036, delete markers still accumulate in store files.
   
   ### `DeleteColumnContiguous`
   
   <img width="2304" height="1920" alt="image" src="https://github.com/user-attachments/assets/4b012b94-1659-4e37-8000-cabab28d898a" />
   
   - Substantial read performance improvement before flushes.
   - Without HBASE-30036, delete markers still accumulate in store files.
   
   ### `DeleteColumnInterleaved`
   
   <img width="2304" height="1920" alt="image" src="https://github.com/user-attachments/assets/cafa0ba9-3f93-496f-8a77-9da6d903aa1d" />
   
   - No difference, as expected: this case already triggers SEEK_NEXT_COL via the masked put.
   
   ## Description
   
   When a DeleteColumn or DeleteFamily marker is encountered during a normal 
user scan, the matcher currently returns SKIP, forcing the scanner to advance 
one cell at a time. This causes read latency to degrade linearly with the 
number of accumulated delete markers for the same row or column.
   
   Since these are range deletes that mask all remaining versions of the 
column, seek past the entire column immediately via 
columns.getNextRowOrNextColumn(). This is safe because cells arrive in 
timestamp descending order, so any puts newer than the delete have already been 
processed.
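
The difference between SKIP and seeking past the column can be illustrated with a toy model. The sketch below is not HBase code: `ToyCell`, `buildRow`, `seekToNextColumn`, and `cellsExamined` are hypothetical names, and a binary search over a sorted list stands in for a real scanner seek. It only counts how many cells a matcher would have to look at when a column holds many DeleteColumn markers.

```java
import java.util.ArrayList;
import java.util.List;

final class DeleteMarkerSeekSketch {
    enum Type { PUT, DELETE_COLUMN }

    // Toy stand-in for a cell; not an HBase Cell.
    static final class ToyCell {
        final String qualifier;
        final long timestamp;
        final Type type;

        ToyCell(String qualifier, long timestamp, Type type) {
            this.qualifier = qualifier;
            this.timestamp = timestamp;
            this.type = type;
        }
    }

    // One row: `markers` DeleteColumn markers on column "a" (descending
    // timestamps), followed by a single put on column "b".
    static List<ToyCell> buildRow(int markers) {
        List<ToyCell> cells = new ArrayList<>();
        for (int i = 0; i < markers; i++) {
            cells.add(new ToyCell("a", 1000L - i, Type.DELETE_COLUMN));
        }
        cells.add(new ToyCell("b", 1L, Type.PUT));
        return cells;
    }

    // Index of the first cell after `from` with a greater qualifier.
    // Binary search is an assumed stand-in for a scanner seek.
    static int seekToNextColumn(List<ToyCell> cells, int from) {
        String current = cells.get(from).qualifier;
        int lo = from + 1;
        int hi = cells.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (cells.get(mid).qualifier.compareTo(current) <= 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Counts the cells the matcher looks at. With seekPastDelete == false,
    // every delete marker is visited one by one (the SKIP behavior).
    static int cellsExamined(List<ToyCell> cells, boolean seekPastDelete) {
        int examined = 0;
        int i = 0;
        while (i < cells.size()) {
            examined++;
            ToyCell cell = cells.get(i);
            if (seekPastDelete && cell.type == Type.DELETE_COLUMN) {
                i = seekToNextColumn(cells, i);
            } else {
                i++;
            }
        }
        return examined;
    }

    public static void main(String[] args) {
        List<ToyCell> row = buildRow(1000);
        System.out.println("SKIP: " + cellsExamined(row, false));
        System.out.println("seek: " + cellsExamined(row, true));
    }
}
```

In this model, a row with 1000 markers costs 1001 cell visits under SKIP and 2 under the seek, which mirrors the linear degradation described above. A real seek is not free, so the actual gain depends on how many markers are skipped.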
   
   For DeleteFamily, also fix getKeyForNextColumn in ScanQueryMatcher to bypass 
the empty-qualifier guard (HBASE-18471) when the cell is a DeleteFamily marker. 
Without this, the seek barely advances past the current cell instead of jumping 
to the first real qualified column.
   
   The optimization is skipped when:
   - seePastDeleteMarkers is true (KEEP_DELETED_CELLS)
   - newVersionBehavior is enabled (sequence IDs determine visibility)
   - the delete marker is not tracked (visibility labels)
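
The conditions above can be read as a single guard. The following is a minimal sketch of that logic, not the code in this PR: the class, the enum, and the `markerTracked` parameter are hypothetical names chosen for illustration, and the real matcher may structure the check differently.

```java
final class SeekPastDeleteGuard {
    // Hypothetical enum that mirrors the delete marker kinds discussed here.
    enum MarkerType { DELETE, DELETE_COLUMN, DELETE_FAMILY, DELETE_FAMILY_VERSION }

    // Returns true only when seeking past the marker is assumed safe:
    // the marker is a range delete and none of the listed exclusions apply.
    static boolean canSeekPast(MarkerType type,
                               boolean seePastDeleteMarkers,
                               boolean newVersionBehavior,
                               boolean markerTracked) {
        boolean rangeDelete =
            type == MarkerType.DELETE_COLUMN || type == MarkerType.DELETE_FAMILY;
        return rangeDelete
            && !seePastDeleteMarkers   // KEEP_DELETED_CELLS must still see the cells
            && !newVersionBehavior     // sequence IDs decide visibility there
            && markerTracked;          // untracked markers (visibility labels) mask nothing
    }

    public static void main(String[] args) {
        System.out.println(canSeekPast(MarkerType.DELETE_FAMILY, false, false, true));
        System.out.println(canSeekPast(MarkerType.DELETE_COLUMN, true, false, true));
    }
}
```

Any case that fails the guard would fall back to the existing SKIP behavior.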


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
