wiedld commented on code in PR #13986:
URL: https://github.com/apache/datafusion/pull/13986#discussion_r1914292749
##########
datafusion/core/src/physical_planner.rs:
##########
@@ -2006,6 +2007,71 @@ fn tuple_err<T, R>(value: (Result<T>, Result<R>)) ->
Result<(T, R)> {
}
}
+/// Confirms that a given [`PhysicalOptimizerRule`] run conforms
+/// to the invariants per rule, and per [`ExecutionPlan`] invariants.
+struct InvariantChecker<'a> {
+ options: &'a OptimizerOptions,
+ rule: &'a Arc<dyn PhysicalOptimizerRule + Send + Sync>,
+}
+
+impl<'a> InvariantChecker<'a> {
+ /// Create an [`InvariantChecker`].
+ pub fn new(
+ options: &'a OptimizerOptions,
+ rule: &'a Arc<dyn PhysicalOptimizerRule + Send + Sync>,
+ ) -> Self {
+ Self { options, rule }
+ }
+
+ /// Checks that the plan change is permitted, returning an Error if not.
+ ///
+ /// In debug mode, this recursively walks the entire physical plan and
+ /// performs additional checks using Datafusions's [`check_plan_sanity`]
+ /// and any user defined [`ExecutionPlan::check_node_invariants`]
extensions.
+ pub fn check(
+ &mut self,
+ plan: &Arc<dyn ExecutionPlan>,
+ previous_schema: Arc<Schema>,
+ input_plan_is_valid: bool,
+ ) -> Result<()> {
+ // if the rule is not permitted to change the schema, confirm that it
did not change.
+ if self.rule.schema_check() && plan.schema() != previous_schema {
+ internal_err!("PhysicalOptimizer rule '{}' failed, due to generate
a different schema, original schema: {:?}, new schema: {:?}",
+ self.rule.name(),
+ previous_schema,
+ plan.schema()
+ )?
+ }
+
+ // if the rule requires that the new plan is executable, confirm that
it is.
+ #[cfg(debug_assertions)]
+ if self.rule.executable_check(input_plan_is_valid) {
+ plan.visit(self)?;
+ }
Review Comment:
In follow up PR, **with benchmarking impact checked**, I plan to have the
"executableness" check be run once in release build AFTER all optimizer runs
are completed.
As it stands now, the current implementation does no additional
work/checking in release mode.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]