github-actions[bot] commented on code in PR #64633:
URL: https://github.com/apache/doris/pull/64633#discussion_r3466290098


##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/SplitLimit.java:
##########
@@ -38,8 +40,13 @@ public Rule build() {
                 .then(limit -> {
                     long l = limit.getLimit();
                     long o = limit.getOffset();
+                    if (Utils.addOverflows(l, o)) {

Review Comment:
   Throwing here changes valid SQL into an analysis error. For a plan like 
`LogicalLimit(limit=Long.MAX_VALUE, offset=Long.MAX_VALUE) -> UnionAll(1,2,3)`, 
the offset just skips every available row, so `count(*)` should return `0`; the 
query is not invalid. `SplitLimit` is only a rewrite, so on overflow it should 
either leave the limit unsplit or use an unbounded local phase such as 
`Long.MAX_VALUE` and let the global limit/offset apply. The new regression test 
currently expects this exception, so it codifies the regression instead of 
preserving the empty-result semantics.



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/implementation/LogicalTopNToPhysicalTopN.java:
##########
@@ -46,13 +48,30 @@ public Rule build() {
      *     mergeTopN(limit, off, require gather) -> localTopN(off+limit, 0, 
require any)
      */
     private List<PhysicalTopN<? extends Plan>> twoPhaseSort(LogicalTopN<? 
extends Plan> logicalTopN) {
-        PhysicalTopN<Plan> localSort = new 
PhysicalTopN<>(logicalTopN.getOrderKeys(),
-                logicalTopN.getLimit() + logicalTopN.getOffset(), 0, 
SortPhase.LOCAL_SORT,
-                logicalTopN.getLogicalProperties(), logicalTopN.child(0));
         int sortPhaseNum = 0;
         if (ConnectContext.get() != null) {
             sortPhaseNum = 
ConnectContext.get().getSessionVariable().sortPhaseNum;
         }
+        // The two-phase local sort keeps limit + offset rows. When that 
overflows the long range
+        // (e.g. LIMIT and OFFSET both BIGINT_MAX), only the single-phase 
gather sort is safe: it
+        // applies limit and offset separately and cannot overflow. The 
two-phase MERGE_SORT cannot be
+        // used because ChildrenPropertiesRegulator forbids inserting a gather 
Exchange under
+        // GATHER_SORT, so the optimizer cannot produce a valid distributed 
plan with only one-phase.
+        // If sort_phase_num != 1 (i.e. the user did not explicitly request 
single-phase), report the
+        // error so the user knows to set sort_phase_num = 1 for this extreme 
boundary.
+        if (Utils.addOverflows(logicalTopN.getLimit(), 
logicalTopN.getOffset())) {
+            if (sortPhaseNum != 1) {

Review Comment:
   This still turns a valid query into a user-visible error under the default 
session. With `sort_phase_num` unset, `sortPhaseNum` is `0`, so `ORDER BY k 
LIMIT Long.MAX_VALUE OFFSET Long.MAX_VALUE` reaches this branch and throws even 
though the same method can build the semantic single-phase `GATHER_SORT` below. 
The default planner should keep a legal physical alternative, for example by 
returning the single-phase gather sort on overflow or by using a legal 
two-phase shape whose local phase keeps all rows and whose global phase applies 
the original limit/offset, rather than requiring users to know to set 
`sort_phase_num = 1`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to