alamb commented on code in PR #17563:
URL: https://github.com/apache/datafusion/pull/17563#discussion_r2353207049


##########
datafusion/sqllogictest/test_files/window.slt:
##########
@@ -5310,15 +5310,16 @@ logical_plan
 05)--------TableScan: t1 projection=[c1, c2]
 physical_plan
 01)SortPreservingMergeExec: [c1@0 ASC NULLS LAST, c2@1 ASC NULLS LAST, rank@2 
ASC NULLS LAST]
-02)--ProjectionExec: expr=[c1@0 as c1, c2@1 as c2, rank() PARTITION BY [t1.c1] 
ORDER BY [t1.c2 ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT 
ROW@2 as rank]
-03)----BoundedWindowAggExec: wdw=[rank() PARTITION BY [t1.c1] ORDER BY [t1.c2 
ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW: Field { 
name: "rank() PARTITION BY [t1.c1] ORDER BY [t1.c2 ASC NULLS LAST] RANGE 
BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW", data_type: UInt64, nullable: 
false, dict_id: 0, dict_is_ordered: false, metadata: {} }, frame: RANGE BETWEEN 
UNBOUNDED PRECEDING AND CURRENT ROW], mode=[Sorted]
-04)------SortExec: expr=[c1@0 ASC NULLS LAST, c2@1 ASC NULLS LAST], 
preserve_partitioning=[true]
-05)--------CoalesceBatchesExec: target_batch_size=1
-06)----------RepartitionExec: partitioning=Hash([c1@0], 2), input_partitions=2
-07)------------CoalesceBatchesExec: target_batch_size=1
-08)--------------FilterExec: c1@0 = 2 OR c1@0 = 3
-09)----------------RepartitionExec: partitioning=RoundRobinBatch(2), 
input_partitions=1
-10)------------------DataSourceExec: partitions=1, partition_sizes=[1]
+02)--SortExec: expr=[c1@0 ASC NULLS LAST, c2@1 ASC NULLS LAST, rank@2 ASC 
NULLS LAST], preserve_partitioning=[true]

Review Comment:
   Won't this change cause a regression in performance for this query? The plan 
now has an extra sort where the plan before hand had no sort after the 
`BoundedWindowAggExec`



##########
datafusion/sqllogictest/test_files/window.slt:
##########
@@ -5310,15 +5310,16 @@ logical_plan
 05)--------TableScan: t1 projection=[c1, c2]
 physical_plan
 01)SortPreservingMergeExec: [c1@0 ASC NULLS LAST, c2@1 ASC NULLS LAST, rank@2 
ASC NULLS LAST]
-02)--ProjectionExec: expr=[c1@0 as c1, c2@1 as c2, rank() PARTITION BY [t1.c1] 
ORDER BY [t1.c2 ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT 
ROW@2 as rank]
-03)----BoundedWindowAggExec: wdw=[rank() PARTITION BY [t1.c1] ORDER BY [t1.c2 
ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW: Field { 
name: "rank() PARTITION BY [t1.c1] ORDER BY [t1.c2 ASC NULLS LAST] RANGE 
BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW", data_type: UInt64, nullable: 
false, dict_id: 0, dict_is_ordered: false, metadata: {} }, frame: RANGE BETWEEN 
UNBOUNDED PRECEDING AND CURRENT ROW], mode=[Sorted]
-04)------SortExec: expr=[c1@0 ASC NULLS LAST, c2@1 ASC NULLS LAST], 
preserve_partitioning=[true]
-05)--------CoalesceBatchesExec: target_batch_size=1
-06)----------RepartitionExec: partitioning=Hash([c1@0], 2), input_partitions=2
-07)------------CoalesceBatchesExec: target_batch_size=1
-08)--------------FilterExec: c1@0 = 2 OR c1@0 = 3
-09)----------------RepartitionExec: partitioning=RoundRobinBatch(2), 
input_partitions=1
-10)------------------DataSourceExec: partitions=1, partition_sizes=[1]
+02)--SortExec: expr=[c1@0 ASC NULLS LAST, c2@1 ASC NULLS LAST, rank@2 ASC 
NULLS LAST], preserve_partitioning=[true]

Review Comment:
   I think the same pattern (an additional SortExec) appears in the other plans 
below too



##########
datafusion/physical-plan/src/windows/mod.rs:
##########
@@ -636,7 +636,8 @@ pub fn get_window_mode(
 
 fn sort_options_resolving_constant(expr: Arc<dyn PhysicalExpr>) -> 
Vec<PhysicalSortExpr> {
     vec![
-        PhysicalSortExpr::new(Arc::clone(&expr), SortOptions::new(false, 
false)),
+        // TODO restore while avoiding exponential planning time

Review Comment:
   If we are going to go with this change, can we please file a ticket for this 
and leave a link to the ticket here?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to