github-actions[bot] commented on code in PR #62661:
URL: https://github.com/apache/doris/pull/62661#discussion_r3417688818


##########
be/src/cloud/cloud_tablets_channel.cpp:
##########
@@ -64,6 +64,25 @@ Status CloudTabletsChannel::add_batch(const PTabletWriterAddBlockRequest& reques
         return Status::OK();
     }
 
+    if (request.is_receiver_side_random_bucket()) {
+        std::unordered_map<int64_t, DorisVector<uint32_t>> partition_to_rowidxs;
+        RETURN_IF_ERROR(_build_partition_to_rowidxs_for_receiver_side_random_bucket(
+                request, &partition_to_rowidxs));
+        if (!partition_to_rowidxs.empty()) {
+            std::unordered_set<int64_t> partition_ids;
+            partition_ids.reserve(partition_to_rowidxs.size());
+            for (const auto& [partition_id, _] : partition_to_rowidxs) {
+                partition_ids.insert(partition_id);
+            }
+            {
+                std::lock_guard<std::mutex> l(_tablet_writers_lock);
+                RETURN_IF_ERROR(_init_writers_by_partition_ids(partition_ids));

Review Comment:
   This adaptive cloud path initializes every writer in each touched partition 
before the receiver chooses the current adaptive tablet. In the normal cloud 
path, when `skip_writing_empty_rowset_metadata` is true (the default), only 
writers that actually receive rows are `batch_init`'d. Untouched writers stay 
uninitialized, so close goes through `_commit_empty_rowset()` and skips writing 
empty rowset metadata.

   Here, a batch with a single row for partition P calls 
`_init_writers_by_partition_ids(P)`, so every tablet writer for P becomes 
`is_init=true`. At close, writers that never received rows take the normal 
initialized `commit_rowset()` path instead of the skip-empty path, which 
creates empty rowset metadata and rowset builders for every bucket on the 
receiver. That negates the cloud lazy/skip-empty optimization and can 
reintroduce the per-partition memory/metadata blow-up that adaptive routing is 
trying to avoid.

   Please initialize only the current adaptive tablet writer(s) selected for 
this request, and leave the other partition writers uninitialized until they 
actually receive rows or until close applies the existing empty-rowset 
handling.
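
   To make the difference concrete, here is a minimal standalone sketch of 
the two initialization strategies. It is a toy model, not Doris code: 
`ToyWriter`, `ToyChannel`, `CloseStats`, and `run_one_row_batch` are 
hypothetical names invented for illustration, and `close()` only counts which 
path each writer would take (initialized commit versus skip-empty) rather than 
touching any rowset metadata.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Toy stand-in for a tablet writer (hypothetical, not the Doris class).
struct ToyWriter {
    int64_t partition_id = 0;
    bool is_init = false;
    int64_t rows = 0;
};

// How many writers would take each close path.
struct CloseStats {
    int committed = 0;      // initialized writers: normal commit path
    int skipped_empty = 0;  // uninitialized writers: skip-empty path
};

class ToyChannel {
public:
    void add_writer(int64_t tablet_id, int64_t partition_id) {
        _writers[tablet_id].partition_id = partition_id;
    }

    // Eager strategy: initialize every writer that belongs to the partition.
    void init_partition(int64_t partition_id) {
        for (auto& entry : _writers) {
            if (entry.second.partition_id == partition_id) {
                entry.second.is_init = true;
            }
        }
    }

    // Lazy strategy: initialize only the tablets selected for this request.
    void init_tablets(const std::vector<int64_t>& tablet_ids) {
        for (int64_t tablet_id : tablet_ids) {
            auto it = _writers.find(tablet_id);
            assert(it != _writers.end());
            it->second.is_init = true;
        }
    }

    // Rows may only be written to a writer that has been initialized.
    void write(int64_t tablet_id, int64_t rows) {
        auto it = _writers.find(tablet_id);
        assert(it != _writers.end() && it->second.is_init);
        it->second.rows += rows;
    }

    // Count the close path per writer; a real channel would commit here.
    CloseStats close() const {
        CloseStats stats;
        for (const auto& entry : _writers) {
            if (entry.second.is_init) {
                ++stats.committed;
            } else {
                ++stats.skipped_empty;
            }
        }
        return stats;
    }

private:
    std::unordered_map<int64_t, ToyWriter> _writers;
};

// One partition (id 100) with four tablets; a single row goes to tablet 2.
CloseStats run_one_row_batch(bool eager) {
    ToyChannel channel;
    for (int64_t tablet_id = 0; tablet_id < 4; ++tablet_id) {
        channel.add_writer(tablet_id, 100);
    }
    if (eager) {
        channel.init_partition(100);
    } else {
        channel.init_tablets({2});
    }
    channel.write(2, 1);
    return channel.close();
}
```

   In this model, `run_one_row_batch(true)` sends all four writers down the 
initialized commit path even though only one received a row, while 
`run_one_row_batch(false)` commits one writer and leaves the other three on 
the skip-empty path, which is the behavior requested above.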



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

