adriangb commented on code in PR #17028:
URL: https://github.com/apache/datafusion/pull/17028#discussion_r2252643695


##########
datafusion/physical-plan/src/sorts/sort.rs:
##########
@@ -1116,10 +1127,46 @@ impl ExecutionPlan for SortExec {
         self: Arc<Self>,
         children: Vec<Arc<dyn ExecutionPlan>>,
     ) -> Result<Arc<dyn ExecutionPlan>> {
-        let mut new_sort = SortExec::new(self.expr.clone(), 
Arc::clone(&children[0]))
-            .with_fetch(self.fetch)
-            .with_preserve_partitioning(self.preserve_partitioning);
-        new_sort.filter = self.filter.clone();
+        let mut new_sort = self.cloned();
+        assert!(
+            children.len() == 1,
+            "SortExec should have exactly one child"
+        );
+        new_sort.input = Arc::clone(&children[0]);
+        // Recompute the properties based on the new input since they may have 
changed.
+        let (cache, sort_prefix) = Self::compute_properties(
+            &new_sort.input,
+            new_sort.expr.clone(),
+            new_sort.preserve_partitioning,
+        )
+        .expect(concat!(
+            "Safety: we had already been calling 
`compute_properties(...).unwrap()` in `new()` ",
+            "and it seems to be okay",
+            "\n",
+            "We assumed that doing the same thing here directly instead ",
+            "of calling `new()` (as we did before this commit) is also okay 
but it's possible that ",
+            "implementations have drifted and this is no longer safe even if 
`new()` still works, ",
+            "for example if `new()` now does something different than just 
calling `compute_properties(...).unwrap()`",
+            "\n",
+            "This is clearly a bug, please report it!"

Review Comment:
   @alamb this already had the possibility to panic in it because it called 
`SortExec::new()`:
   
   
https://github.com/apache/datafusion/blob/eb2b8c07b2f564fbd9046c67de41ac4ebfe7ae84/datafusion/physical-plan/src/sorts/sort.rs#L865-L872
   
   
https://github.com/apache/datafusion/blob/eb2b8c07b2f564fbd9046c67de41ac4ebfe7ae84/datafusion/physical-plan/src/sorts/sort.rs#L1115-L1119
   
   Returning a `Result` would be a major breaking change to the 
`ExecutionPlan::with_new_children` API which I don't think we should do in this 
PR.
   
   I do think `ExecutionPlan::with_new_children` returning a Result would be a 
good thing. In general I think trait methods should err on the side of 
returning a result in case some implementation needs to. If none of them do I'd 
expect compilation to make it pretty much a non issue for performance. But 
maybe let's do that as it's own PR if we really want to.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to