hudi-bot opened a new issue, #16986:
URL: https://github.com/apache/hudi/issues/16986
The HFile reader in HBase has two primary modes:
1. **Pread mode**: positioned (random-access) reads
2. **Streaming mode**: sequential reads
**Pread mode** (point lookup, already supported):
* Uses random-access reads (positioned reads)
* Good for seeking to specific positions in the file
* Optimized for random access patterns
* Typically used for point lookups or when reading non-contiguous parts of
a file
* More efficient when you need to jump around in the file
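As a rough illustration of the pread access pattern (not Hudi or HBase code), the sketch below uses the JDK's `FileChannel.read(ByteBuffer, long)`, which reads at an explicit offset without moving the channel's own position. The class name `PreadSketch` and the helpers `pread`, `preadFile`, and `writeDemoFile` are hypothetical names introduced only for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical sketch of a positioned read; not the HBase or Hudi implementation.
final class PreadSketch {

    // Reads up to `length` bytes starting at `offset`. The channel's own
    // position is never changed, so calls can target arbitrary offsets.
    static byte[] pread(FileChannel channel, long offset, int length) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(length);
        long pos = offset;
        while (buf.hasRemaining()) {
            int n = channel.read(buf, pos); // positioned read at an explicit offset
            if (n <= 0) {
                break; // end of file reached before `length` bytes were available
            }
            pos += n;
        }
        return Arrays.copyOf(buf.array(), buf.position());
    }

    // Convenience wrapper: open the file, do one positioned read, close it.
    static byte[] preadFile(Path file, long offset, int length) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            return pread(ch, offset, length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes a temporary file whose byte at index i is (byte) i.
    static Path writeDemoFile(int size) {
        try {
            byte[] data = new byte[size];
            for (int i = 0; i < size; i++) {
                data[i] = (byte) i;
            }
            Path file = Files.createTempFile("pread-sketch", ".bin");
            Files.write(file, data);
            file.toFile().deleteOnExit();
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = writeDemoFile(1024);
        // Two non-contiguous reads, in arbitrary order, as a point lookup would issue.
        byte[] late = preadFile(file, 900, 4);
        byte[] early = preadFile(file, 10, 4);
        System.out.println(Arrays.toString(late) + " " + Arrays.toString(early));
    }
}
```

Each lookup here costs its own I/O call, which is cheap for a handful of keys but adds up when many nearby positions are read.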
**Streaming mode**:
* Uses sequential reads
* Optimized for reading large contiguous sections of data
* More efficient when reading entire blocks or scanning through data
sequentially
* Better performance for scan operations or when reading a file from
beginning to end
* Reduces the number of I/O operations
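For comparison, a minimal sketch of the streaming pattern: the file is read front to back through one large buffer, so many records are served by each underlying I/O call. The length-prefixed record layout and the names `StreamingScanSketch`, `writeRecords`, and `scanAll` are assumptions made for illustration only; this is not the HFile format or any HBase or Hudi API.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a sequential (streaming) scan; not the HFile format.
final class StreamingScanSketch {

    // Illustrative layout: [int length][payload bytes], repeated until end of file.
    static Path writeRecords(List<String> records) {
        try {
            Path file = Files.createTempFile("stream-sketch", ".bin");
            file.toFile().deleteOnExit();
            try (DataOutputStream out = new DataOutputStream(
                    new BufferedOutputStream(Files.newOutputStream(file)))) {
                for (String record : records) {
                    byte[] payload = record.getBytes(StandardCharsets.UTF_8);
                    out.writeInt(payload.length);
                    out.write(payload);
                }
            }
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Scans every record in file order. The buffer is refilled sequentially,
    // so a larger `bufferSize` means fewer underlying read calls.
    static List<String> scanAll(Path file, int bufferSize) {
        List<String> result = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(file), bufferSize))) {
            while (true) {
                int length;
                try {
                    length = in.readInt();
                } catch (EOFException endOfFile) {
                    break; // no more records
                }
                byte[] payload = new byte[length];
                in.readFully(payload);
                result.add(new String(payload, StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        Path file = writeRecords(List.of("key1", "key2", "key3"));
        System.out.println(scanAll(file, 64 * 1024));
    }
}
```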
We should add similar streaming-mode support to the Hudi HFile reader.
## JIRA info
- Link: https://issues.apache.org/jira/browse/HUDI-9353
- Type: Improvement
---
## Comments
05/May/25 16:59;daviszhang;next steps:
1. Need to showcase when a batch lookup on a single index file group will happen.
2. In that case, the old HFile reader's streaming mode outperforms point lookup mode.