github-actions[bot] commented on code in PR #64964:
URL: https://github.com/apache/doris/pull/64964#discussion_r3492405800


##########
fe/fe-core/src/main/java/org/apache/doris/qe/runtime/ThriftPlansBuilder.java:
##########
@@ -844,6 +833,20 @@ private static Set<Integer> 
setParamsForRecursiveCteNode(List<PipelineDistribute
         return fragmentToNotifyClose;
     }
 
+    static List<TRecCTETarget> buildRecCTETargets(List<AssignedJob> 
fragmentAssignedJobs,
+            RecursiveCteScanNode recursiveCteScanNode) {
+        List<TRecCTETarget> targets = new 
ArrayList<>(fragmentAssignedJobs.size());
+        for (AssignedJob assignedJob : fragmentAssignedJobs) {

Review Comment:
   This now includes every assigned job as a recursive CTE target, but that 
over-targets serial-source/local-shuffle fragments. When 
`ignore_storage_data_distribution` makes the recursive side use 
`LocalShuffleAssignedJob`s, `instanceToThrift()` sets `parallel_instances=1`; 
on BE the recursive scan is a serial operator, so 
`_build_pipeline_tasks_for_instance()` only builds the one-task scan pipeline 
for `instance_idx == 0`. `RecCTEScanLocalState` therefore registers only that 
fragment instance id, and `send_block_to_cte_scan()` will return `RecCTEScan 
not found` for the extra local-shuffle ids emitted here. The old address-dedup 
behavior accidentally selected the one receiver per BE that exists in this 
mode. Please keep the per-instance targets for normal parallel scans, but 
filter/collapse local-shuffle groups to the instance that actually owns the 
recursive scan task, and cover the `ignore_storage_data_distribution=true` 
local-shuffle recursive-side path.



##########
regression-test/suites/recursive_cte/rec_cte_parallel_targets.groovy:
##########
@@ -0,0 +1,69 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+suite("rec_cte_parallel_targets", "rec_cte") {
+    sql """set enable_local_shuffle=true"""
+    sql """set enable_local_shuffle_planner=true"""
+    sql """set parallel_pipeline_task_num=4"""
+
+    sql """drop table if exists rec_cte_parallel_targets"""
+    sql """
+        create table rec_cte_parallel_targets (
+            id int,
+            parent_id int,
+            dept_name varchar(50),
+            budget decimal(18, 2)
+        )
+        duplicate key(id)
+        distributed by hash(id) buckets 4
+        properties ("replication_num" = "1")
+    """
+
+    sql """
+        insert into rec_cte_parallel_targets values
+        (1, null, 'headquarters', 10000.00),
+        (10, 1, 'r_and_d', 5000.00),
+        (11, 1, 'marketing', 4000.00),
+        (101, 10, 'backend', 2000.00),
+        (102, 10, 'frontend', 1500.00),
+        (111, 11, 'promotion', 2000.00)
+    """
+
+    order_qt_rec_cte_parallel_targets """

Review Comment:
   This `order_qt_` block needs a checked-in expected-output file. In normal 
regression mode, `order_qt_rec_cte_parallel_targets` becomes the tag 
`rec_cte_parallel_targets`, and `SuiteContext` derives the expected file as 
`regression-test/data/recursive_cte/rec_cte_parallel_targets.out`. Since this 
PR only adds the `.groovy` suite and not that `.out` file, `Suite.groovy` will 
throw `Missing outputFile`/`Missing output block` before the query can validate 
the recursive CTE fix. Please add the generated `.out` file for this suite, or 
use a test helper that does not require a qt output block.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to