BiteTheDDDDt opened a new pull request, #63544:
URL: https://github.com/apache/doris/pull/63544

   ### What problem does this PR solve?
   
   Issue Number: None
   
   Related PR: #63436
   
   Problem Summary: Revert #63436 because recent profiling shows the row id 
read path spending significant CPU in `RleDecoder<bool>::GetNextRun`, 
`BitReader::GetValue/GetVlqInt`, `FileColumnIterator::read_by_rowids`, 
`BitShufflePageDecoder::read_by_rowids`, and LZ4 decompression. The reverted 
change groups sparse row ids by segment and reads them in larger sorted 
batches. For sparse nullable columns this can make 
`FileColumnIterator::read_by_rowids` advance through large null-map ranges and 
spend more CPU in RLE/bit decoding. Restore the previous per-row and 
adjacent-file batching behavior while the sparse nullable row id access pattern 
is investigated.
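
A minimal sketch of the access pattern described above, not the Doris implementation: `RowLoc`, `group_by_segment`, `Run`, and `runs_decoded` are hypothetical names, and the run-length null map is a toy model rather than the real on-disk format. It groups row ids by segment, sorts each batch, and counts how many null-map runs a forward-only decoder would step through to reach the requested rows.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical row locator: (segment id, row ordinal inside that segment).
struct RowLoc {
    uint32_t segment_id;
    uint32_t row_id;
};

// Group row locations by segment and sort the row ids inside each group,
// mimicking the "larger sorted batch" shape of the reverted change.
std::map<uint32_t, std::vector<uint32_t>> group_by_segment(
        const std::vector<RowLoc>& locs) {
    std::map<uint32_t, std::vector<uint32_t>> batches;
    for (const RowLoc& loc : locs) {
        batches[loc.segment_id].push_back(loc.row_id);
    }
    for (auto& entry : batches) {
        std::sort(entry.second.begin(), entry.second.end());
    }
    return batches;
}

// Toy run-length null map: each run is (is_null, length in rows).
// This is an assumption for illustration, not the Doris page format.
using Run = std::pair<bool, uint32_t>;

// Count the runs a forward-only decoder consumes to reach every row id in
// one sorted batch. Sparse row ids over a long null map still pay for all
// the runs that lie between them.
uint64_t runs_decoded(const std::vector<Run>& runs,
                      const std::vector<uint32_t>& sorted_row_ids) {
    assert(std::is_sorted(sorted_row_ids.begin(), sorted_row_ids.end()));
    uint64_t decoded = 0;
    std::size_t run_idx = 0;
    uint64_t run_end = 0;  // exclusive end ordinal of the runs consumed so far
    for (uint32_t row_id : sorted_row_ids) {
        while (run_idx < runs.size() && run_end <= row_id) {
            run_end += runs[run_idx].second;
            ++run_idx;
            ++decoded;
        }
    }
    return decoded;
}
```

In this toy model, a null map of 1,000 single-row runs costs 901 run steps to reach row ids 3 and 900, while ten 100-row runs cost 10. The point is only that the decoding work tracks the span between sparse row ids, not the number of rows requested.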
   
   ### Release note
   
   None
   
   ### Check List (For Author)
   
   - Test: Manual test
       - `build-support/check-format.sh be/src/exec/rowid_fetcher.cpp be/src/service/point_query_executor.cpp be/src/storage/segment/segment.cpp be/src/storage/segment/segment.h`
       - `ninja -C be/build_Release src/exec/CMakeFiles/Exec.dir/rowid_fetcher.cpp.o src/service/CMakeFiles/Service.dir/point_query_executor.cpp.o src/storage/CMakeFiles/Storage.dir/segment/segment.cpp.o`
   - Behavior changed: Yes. Restore row id fetch behavior before #63436.
   - Does this need documentation: No


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
