alamb commented on issue #10313: URL: https://github.com/apache/datafusion/issues/10313#issuecomment-2089129451
> Would this ticket be an appropriate place to add tickets related to pushing down sorts to federated query engines? I know that this was discussed previously (i.e. #7871) and it seems that writing a custom optimizer is the current way to handle that.

I added #7871 to the list above -- thank you. Yes, I think this would be a good place to discuss it.

> I will need to do this soon (federated sort pushdown) and it initially wasn't clear to me how to make this work in DataFusion. I can volunteer to write some docs on how to do this once I have an implementation that works.

That would be great, thanks @phillipleblanc

Right now, once `TableProvider::scan` gets called, the returned `ExecutionPlan` can report how it is already sorted. What we don't have is any way for the optimizer to tell an `ExecutionPlan` that it could reduce the work required in the DataFusion plan if its output were already sorted.

I wonder if we could add something to the `ExecutionPlan` trait similar to [`ExecutionPlan::repartitioned`](https://docs.rs/datafusion/latest/datafusion/physical_plan/trait.ExecutionPlan.html#method.repartitioned), like

```rust
trait ExecutionPlan {
    ...
    /// Return other possible orders that this ExecutionPlan could return
    /// (the DataFusion optimizer will use this information to potentially
    /// push Sorts into the node)
    fn pushable_sorts(&self) -> Result<Option<PotentialSortOrders>> {
        Ok(None)
    }

    /// Return a node like this one, except that its output is sorted
    /// according to `exprs`
    fn resorted(&self, exprs: &[PhysicalSortExpr]) -> Result<Option<Arc<dyn ExecutionPlan>>> {
        Ok(None)
    }
}
```

And then add a new optimizer pass that tries to push sorts into the plan nodes that report they can provide sorted data 🤔

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
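To make the proposed optimizer pass concrete, here is a minimal, self-contained sketch of the idea using toy types rather than DataFusion's real API. Everything in it is an assumption made for illustration: the `Plan` trait, `SortKey`, `SortNode`, `RemoteScan`, and `push_down_sort` are hypothetical names, and `pushable_sorts` returns a plain list of column names instead of the undefined `PotentialSortOrders`. Because the doc comment on `resorted` mentions `exprs`, the sketch passes the requested sort keys explicitly. It only handles a sort sitting directly above a source node.

```rust
use std::any::Any;
use std::fmt::Debug;
use std::sync::Arc;

/// Hypothetical sort key: a column name plus a direction.
#[derive(Clone, Debug, PartialEq)]
pub struct SortKey {
    pub column: String,
    pub descending: bool,
}

/// Toy stand-in for `ExecutionPlan` (NOT the real DataFusion trait).
pub trait Plan: Debug {
    fn as_any(&self) -> &dyn Any;
    fn describe(&self) -> String;

    /// Columns this node could sort by if asked; `None` means "cannot sort".
    fn pushable_sorts(&self) -> Option<Vec<String>> {
        None
    }

    /// A node like this one whose output is sorted by `keys`, if supported.
    fn resorted(&self, _keys: &[SortKey]) -> Option<Arc<dyn Plan>> {
        None
    }
}

/// Hypothetical sort operator that runs inside the local engine.
#[derive(Debug)]
pub struct SortNode {
    pub keys: Vec<SortKey>,
    pub input: Arc<dyn Plan>,
}

impl Plan for SortNode {
    fn as_any(&self) -> &dyn Any {
        self
    }
    fn describe(&self) -> String {
        format!("Sort -> {}", self.input.describe())
    }
}

/// Hypothetical federated source that can sort on some of its columns.
#[derive(Debug)]
pub struct RemoteScan {
    pub table: String,
    pub sortable: Vec<String>,
    pub order_by: Vec<SortKey>,
}

impl Plan for RemoteScan {
    fn as_any(&self) -> &dyn Any {
        self
    }
    fn describe(&self) -> String {
        if self.order_by.is_empty() {
            return format!("RemoteScan({})", self.table);
        }
        let keys: Vec<String> = self
            .order_by
            .iter()
            .map(|k| format!("{} {}", k.column, if k.descending { "DESC" } else { "ASC" }))
            .collect();
        format!("RemoteScan({} ORDER BY {})", self.table, keys.join(", "))
    }
    fn pushable_sorts(&self) -> Option<Vec<String>> {
        Some(self.sortable.clone())
    }
    fn resorted(&self, keys: &[SortKey]) -> Option<Arc<dyn Plan>> {
        Some(Arc::new(RemoteScan {
            table: self.table.clone(),
            sortable: self.sortable.clone(),
            order_by: keys.to_vec(),
        }))
    }
}

/// Toy optimizer rule: if a `SortNode` sits directly on an input that can
/// produce the requested order itself, replace the sort with the re-sorted
/// input. Otherwise return the plan unchanged.
pub fn push_down_sort(plan: Arc<dyn Plan>) -> Arc<dyn Plan> {
    if let Some(sort) = plan.as_any().downcast_ref::<SortNode>() {
        if let Some(cols) = sort.input.pushable_sorts() {
            if sort.keys.iter().all(|k| cols.contains(&k.column)) {
                if let Some(new_input) = sort.input.resorted(&sort.keys) {
                    return new_input;
                }
            }
        }
    }
    plan
}
```

In this sketch, a `SortNode` on `ts DESC` over a `RemoteScan` that lists `ts` as sortable collapses into a single scan carrying the `ORDER BY`, while a sort on a column the source does not advertise is left in place. A real pass would also need to walk the whole plan tree and reason about order-preserving operators between the sort and the source.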