github-actions[bot] commented on code in PR #63434:
URL: https://github.com/apache/doris/pull/63434#discussion_r3272152208
##########
be/src/exec/rowid_fetcher.cpp:
##########
@@ -1072,22 +1077,17 @@ Status RowIdStorageReader::read_doris_format_row(
return Status::InternalError("Tablet {} does not have row store
for all columns",
tablet->tablet_id());
}
- for (auto row_id : row_ids) {
- RowLocation loc(rowset_id, segment->id(),
cast_set<uint32_t>(row_id));
- row_store_read_struct.row_store_buffer.clear();
- RETURN_IF_ERROR(scope_timer_run(
- [&]() {
- return tablet->lookup_row_data({}, loc, rowset, stats,
-
row_store_read_struct.row_store_buffer);
- },
- lookup_row_data_ms));
+ auto row_store_rows = ColumnString::create();
+ RETURN_IF_ERROR(scope_timer_run(
+ [&]() {
+ return tablet->lookup_row_data({}, segment_id, row_ids,
rowset, stats,
+ *row_store_rows);
Review Comment:
This batched row-store read can return wrong rows when `row_ids` are not
monotonically increasing. `read_batch_doris_format_row()` groups only
contiguous equal `file_id`s from the request; it does not sort by row id. The
request is built in result-row order
(`MaterializationSharedState::create_muiltget_result()` appends
`row_location.row_id` as rows arrive), so a batch for the same segment can be
in arbitrary order such as `[100, 1]`. `FileColumnIterator::read_by_rowids()`
seeks to `rowids[total_read_count]` and then advances within the current page,
so it assumes the remaining row ids are ordered within/after that page; the old
per-row loop did not have this assumption. Please either preserve the old
one-by-one path for unordered input, or sort `(row_id, original_index)` for the
iterator and restore the original output order before `jsonb_to_block()`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]