alamb commented on code in PR #17337:
URL: https://github.com/apache/datafusion/pull/17337#discussion_r2448213631
##########
datafusion/expr/src/logical_plan/plan.rs:
##########
@@ -2593,6 +2608,68 @@ impl PartialOrd for Window {
}
}
+/// Communicates the desired ordering of the output of a scan operation.
+///
+/// Preferred orderings can potentially help DataFusion optimize queries, even
in cases
+/// when the output does not completely follow that order. This is information
passed
+/// to the scan about what might help.
+///
+/// For example, a query with `ORDER BY time DESC LIMIT 10`, DataFusion's
dynamic
Review Comment:
How about also linking to the blog that explains this in more detail:
https://datafusion.apache.org/blog/2025/09/10/dynamic-filters/
##########
datafusion/optimizer/src/optimizer.rs:
##########
@@ -247,6 +248,8 @@ impl Optimizer {
Arc::new(EliminateOuterJoin::new()),
// Filters can't be pushed down past Limits, we should do
PushDownFilter after PushDownLimit
Arc::new(PushDownLimit::new()),
+ // Sort pushdown should happen before filter pushdown to maximize
optimization opportunities
Review Comment:
Most of the other sort operations happen at the ExecutionPlan layer. Can you
remind me why this pushdown is happening at the LogicalLevel?
##########
datafusion/expr/src/logical_plan/plan.rs:
##########
@@ -2593,6 +2608,68 @@ impl PartialOrd for Window {
}
}
+/// Communicates the desired ordering of the output of a scan operation.
Review Comment:
Most of this text is about the preferred_ordering field, so maybe it would
be best moved closer there.
I can imagine potentially adding other fields like `required_ordering` in
the future, which could communicate if the scan was required (if/when we extend
the ExecutionPlan API to communicate what type of sort pushdowns are supported
🤔 )
##########
datafusion/expr/src/logical_plan/plan.rs:
##########
@@ -2593,6 +2608,68 @@ impl PartialOrd for Window {
}
}
+/// Communicates the desired ordering of the output of a scan operation.
+///
+/// Preferred orderings can potentially help DataFusion optimize queries, even
in cases
+/// when the output does not completely follow that order. This is information
passed
+/// to the scan about what might help.
+///
+/// For example, a query with `ORDER BY time DESC LIMIT 10`, DataFusion's
dynamic
+/// predicates and TopK operator will work better if the data is roughly
ordered by descending
+/// time (more recent data first).
+///
+/// Implementers of [`TableProvider`] should use this information to optimize
the order in which data is output from the scan.
+///
+/// It is a hint and not a requirement:
+/// - If this information is completely ignored, e.g. data is scanned
randomly, the query will still be correct because a sort will be applied to the
data.
+/// - Partially ordered data will also be re-sorted but this may result in
optimizations like early stopping, additional data pruning, reduced memory
usage during the sort, etc.
+/// - If the scan produces exactly the requested ordering, and sets it's
properties to reflect this, upstream sorts may be optimized away.
+///
+/// Actually removing unnecessary sorts is done at the physical plan level:
logical operators like a join may or may not preserve ordering
+/// depending on what physical operator is chosen (e.g. HashJoin vs.
SortMergeJoin).
+/// If you as a [`TableProvider`] implementer would like to eliminiate
unnecessary sorts you should make sure the [`ExecutionPlan`]
+/// you produce reflects the ordering in it's properties.
+///
+/// [`TableProvider`]:
https://docs.rs/datafusion/latest/datafusion/catalog/trait.TableProvider.html
+/// [`ExecutionPlan`]:
https://docs.rs/datafusion/latest/datafusion/physical_plan/trait.ExecutionPlan.html
+#[derive(Clone, PartialEq, Eq, Hash, PartialOrd, Default)]
+pub struct ScanOrdering {
+ /// Optional preferred ordering for the scan that matches the output order
of upstream query nodes.
+ /// It is optional / best effort for the scan to produce this ordering.
+ /// If the scan produces this exact ordering and sets it's properties to
reflect this upstream sorts may be optimized away.
+ /// Otherwise the sorts may remain in place but partial ordering may be
exploited e.g. to do early stopping or reduce complexity of the sort.
+ /// Thus it is recommended for the scan to also do a best effort to
produce partially sorted data if possible.
+ pub preferred_ordering: Option<Vec<SortExpr>>,
Review Comment:
It comes down to what does a `Vec` of zero length mean? That would imply to
me there is no preferred ordering, so an Option<Vec<SortExpr>> is redundant
I would personally recommend making this non `pub` and adding an accessor
like `fn preferred_ordering(&self) -> Option<&Vec<...>> {}` and documenting in
doc comments what the invariants are
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]