gavinchou commented on PR #64167:
URL: https://github.com/apache/doris/pull/64167#issuecomment-4849865735

   Potential Nereids execution correctness issue for tenant-level colocate when 
a slave table's bucket count is a multiple of the master's.
   
   `UnassignedScanBucketOlapTableJob` derives a `colocateBucketNum` from 
`fragment.getColocateData()`, and `ThriftPlansBuilder` uses that colocate 
bucket count for fragment params. However, 
`DistributePlanner#getDestinationsByBuckets()` still builds and sorts receiver 
destinations using the bucket count of the first scan node's table.
   
   Relevant code paths:
   - colocate bucket count is derived here: 
https://github.com/apache/doris/blob/cde59482ce5a548a2652c3aead57096a9c832f22/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/distribute/worker/job/UnassignedScanBucketOlapTableJob.java#L81-L85
   - thrift params use `getColocateBucketNum()`: 
https://github.com/apache/doris/blob/cde59482ce5a548a2652c3aead57096a9c832f22/fe/fe-core/src/main/java/org/apache/doris/qe/runtime/ThriftPlansBuilder.java#L440-L443
   - destination sorting still uses `getBucketNum()`: 
https://github.com/apache/doris/blob/cde59482ce5a548a2652c3aead57096a9c832f22/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/distribute/DistributePlanner.java#L232-L237
   
   If the first scan node on the receiver side is a slave table with more 
buckets than the master, for example a slave with 16 buckets and a master with 
8, destinations can be built with 16 slots while the colocate fragment uses 8 
logical buckets. That looks like it can misroute bucket-shuffle rows. Should 
this path use `bucketJob.getColocateBucketNum()` when tenant-level colocate 
data is present?
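
   To make the question concrete, here is a minimal standalone sketch of the 
selection rule being suggested. It is not Doris code: 
`DestinationBucketSketch`, `destinationBucketNum`, and `buildDestinationSlots` 
are hypothetical names, and the assumption is simply that the destination list 
should have one slot per logical bucket the fragment actually uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalInt;

// Hypothetical illustration only; none of these names exist in Doris.
final class DestinationBucketSketch {

    // Prefer the colocate bucket count when tenant-level colocate data is
    // present; otherwise fall back to the first scan node's table bucket count.
    static int destinationBucketNum(OptionalInt colocateBucketNum,
                                    int firstScanTableBucketNum) {
        return colocateBucketNum.orElse(firstScanTableBucketNum);
    }

    // Stand-in for building receiver destinations: one slot per bucket index.
    static List<Integer> buildDestinationSlots(int bucketNum) {
        List<Integer> slots = new ArrayList<>();
        for (int bucketIndex = 0; bucketIndex < bucketNum; bucketIndex++) {
            slots.add(bucketIndex);
        }
        return slots;
    }
}
```

   With the example above (slave 16 buckets, master 8 buckets), 
`destinationBucketNum(OptionalInt.of(8), 16)` yields 8 slots, matching the 
colocate fragment, whereas sizing by the table bucket count alone yields 16.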


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
