gavinchou commented on PR #64167: URL: https://github.com/apache/doris/pull/64167#issuecomment-4849865735
Potential Nereids execution correctness issue for tenant-level colocate when a slave table's bucket count is a multiple of the master's.

`UnassignedScanBucketOlapTableJob` derives a `colocateBucketNum` from `fragment.getColocateData()`, and `ThriftPlansBuilder` uses that colocate bucket count for the fragment params. However, `DistributePlanner#getDestinationsByBuckets()` still builds and sorts receiver destinations using the bucket count of the first scan node's table.

Relevant code paths:

- colocate bucket count is derived here: https://github.com/apache/doris/blob/cde59482ce5a548a2652c3aead57096a9c832f22/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/distribute/worker/job/UnassignedScanBucketOlapTableJob.java#L81-L85
- thrift params use `getColocateBucketNum()`: https://github.com/apache/doris/blob/cde59482ce5a548a2652c3aead57096a9c832f22/fe/fe-core/src/main/java/org/apache/doris/qe/runtime/ThriftPlansBuilder.java#L440-L443
- destination sorting still uses `getBucketNum()`: https://github.com/apache/doris/blob/cde59482ce5a548a2652c3aead57096a9c832f22/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/distribute/DistributePlanner.java#L232-L237

If the first scan node on the receiver side is a slave table with more buckets, for example a slave with 16 buckets and a master with 8, destinations can be built with 16 slots while the colocate fragment uses 8 logical buckets. That looks like it can misroute bucket-shuffle rows.

Should this path use `bucketJob.getColocateBucketNum()` when tenant-level colocate data is present?
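
To make the concern concrete, here is a minimal, self-contained sketch of the mismatch. It is not Doris code: the class `BucketRoutingSketch`, its methods, and the plain modulo routing are all hypothetical simplifications, assuming only that one side indexes rows by the first scan table's bucket count (16) while the colocate fragment works with 8 logical buckets.

```java
/**
 * Hypothetical, simplified model of the bucket count mismatch described above.
 * None of these names exist in Doris; modulo routing is an assumption made
 * purely for illustration.
 */
final class BucketRoutingSketch {

    /** Maps a row hash to a bucket index under a given bucket count. */
    static int bucketOf(long hash, int bucketNum) {
        return (int) Math.floorMod(hash, (long) bucketNum);
    }

    /**
     * Counts sample hashes (0 .. sampleSize-1) whose bucket index differs
     * between the destination side and the colocate fragment side.
     */
    static int countMismatches(int destinationBucketNum, int fragmentBucketNum, int sampleSize) {
        int mismatches = 0;
        for (long hash = 0; hash < sampleSize; hash++) {
            if (bucketOf(hash, destinationBucketNum) != bucketOf(hash, fragmentBucketNum)) {
                mismatches++;
            }
        }
        return mismatches;
    }

    /**
     * The selection the comment asks about: prefer the colocate bucket count
     * when tenant-level colocate data is present (assumed flag), otherwise
     * fall back to the first scan table's bucket count.
     */
    static int chooseBucketNum(int firstScanTableBucketNum, int colocateBucketNum,
                               boolean hasTenantColocateData) {
        return hasTenantColocateData ? colocateBucketNum : firstScanTableBucketNum;
    }

    public static void main(String[] args) {
        int slaveBuckets = 16;  // first scan node on the receiver side
        int masterBuckets = 8;  // logical buckets used by the colocate fragment

        // Destinations sized by the slave table disagree with the fragment.
        System.out.println("mismatches with table bucket count: "
                + countMismatches(slaveBuckets, masterBuckets, 16));

        // Sizing destinations by the colocate bucket count removes the disagreement.
        int chosen = chooseBucketNum(slaveBuckets, masterBuckets, true);
        System.out.println("mismatches with colocate bucket count: "
                + countMismatches(chosen, masterBuckets, 16));
    }
}
```

In this toy model, every hash that lands in slots 8 to 15 of the 16-slot layout has no matching logical bucket on the fragment side. Whether the real planner misroutes in the same way depends on how `getDestinationsByBuckets()` consumes the count, which this sketch does not model.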
