BiteTheDDDDt opened a new pull request, #63544:
URL: https://github.com/apache/doris/pull/63544
### What problem does this PR solve?
Issue Number: None
Related PR: #63436
Problem Summary: Revert #63436 because recent profiling shows the row id
read path spending significant CPU in `RleDecoder<bool>::GetNextRun`,
`BitReader::GetValue/GetVlqInt`, `FileColumnIterator::read_by_rowids`,
`BitShufflePageDecoder::read_by_rowids`, and LZ4 decompression. The reverted
change groups sparse row ids by segment and reads them in larger sorted
batches. For sparse nullable columns this can make
`FileColumnIterator::read_by_rowids` advance through large null-map ranges and
spend more CPU in RLE/bit decoding. Restore the previous per-row and
adjacent-file batching behavior while the sparse nullable row id access pattern
is investigated.
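
To illustrate the cost pattern described above, here is a minimal toy model. It is not the Doris implementation: `Run`, `ReadCost`, `make_alternating_runs`, and `read_null_flags` are hypothetical names, and the null map is assumed to be a plain list of RLE runs read by a forward-only cursor. Under those assumptions, the work done between two requested row ids grows with the number of runs in the gap, not with the number of rows actually requested.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy model (not Doris code): one RLE run of a null map.
struct Run {
    bool is_null;     // value shared by every row in the run
    uint32_t length;  // number of rows covered by the run
};

// Counters describing how much work a sorted read had to do.
struct ReadCost {
    size_t runs_decoded = 0;  // runs the forward-only cursor stepped past
    size_t rows_skipped = 0;  // rows between requested ids that nobody asked for
};

// Build `n_runs` runs of `run_len` rows each, alternating non-null / null.
std::vector<Run> make_alternating_runs(size_t n_runs, uint32_t run_len) {
    std::vector<Run> runs;
    runs.reserve(n_runs);
    for (size_t i = 0; i < n_runs; ++i) {
        runs.push_back({i % 2 == 1, run_len});
    }
    return runs;
}

// Read the null flag for each row id. Row ids must be strictly ascending.
// The cursor can only move forward, so a large gap between two ids forces
// it to walk every run inside that gap.
ReadCost read_null_flags(const std::vector<Run>& runs,
                         const std::vector<uint32_t>& sorted_rowids,
                         std::vector<bool>* out) {
    ReadCost cost;
    size_t run_idx = 0;      // index of the run under the cursor
    uint64_t run_start = 0;  // ordinal of the first row in that run
    uint64_t next_row = 0;   // ordinal right after the last requested row
    for (uint32_t rowid : sorted_rowids) {
        assert(rowid >= next_row && "row ids must be strictly ascending");
        // Step past every run that ends at or before the requested row.
        while (run_idx < runs.size() &&
               rowid >= run_start + runs[run_idx].length) {
            run_start += runs[run_idx].length;
            ++run_idx;
            ++cost.runs_decoded;
        }
        if (run_idx == runs.size()) {
            break;  // requested row lies beyond the end of the null map
        }
        cost.rows_skipped += rowid - next_row;
        next_row = static_cast<uint64_t>(rowid) + 1;
        out->push_back(runs[run_idx].is_null);
    }
    return cost;
}
```

With 1000 alternating runs of 4 rows each, reading row ids `{0, 3999}` in this model steps past 999 runs to return two flags, while reading `{0, 1}` steps past none. Whether the real decoder behaves this way for a given page layout is exactly what the profiling above is meant to establish.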
### Release note
None
### Check List (For Author)
- Test: Manual test
- `build-support/check-format.sh be/src/exec/rowid_fetcher.cpp be/src/service/point_query_executor.cpp be/src/storage/segment/segment.cpp be/src/storage/segment/segment.h`
- `ninja -C be/build_Release src/exec/CMakeFiles/Exec.dir/rowid_fetcher.cpp.o src/service/CMakeFiles/Service.dir/point_query_executor.cpp.o src/storage/CMakeFiles/Storage.dir/segment/segment.cpp.o`
- Behavior changed: Yes. Restore row id fetch behavior before #63436.
- Does this need documentation: No
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]