andygrove commented on code in PR #1906:
URL:
https://github.com/apache/datafusion-ballista/pull/1906#discussion_r3488107332
##########
ballista/core/src/extension.rs:
##########
@@ -789,6 +789,22 @@ impl SessionConfigHelperExt for SessionConfig {
//
// See https://github.com/apache/datafusion-ballista/issues/1648
.set_bool("datafusion.optimizer.prefer_hash_join", false)
+ //
+ // DataFusion 54 plans uncorrelated scalar subqueries as a physical
+ // `ScalarSubqueryExec` wrapping a `ScalarSubqueryExpr` that reads an
+ // in-process shared results container. That container cannot cross
+ // process or stage boundaries, and `datafusion-proto` can only
+ // deserialize the expr inside its surrounding exec, so when Ballista
+ // splits a plan into stages the expr is serialized without its exec
+ // and the executor fails to decode it. Disabling this option makes
+ // the optimizer rewrite uncorrelated scalar subqueries to joins,
+ // which Ballista distributes correctly.
+ //
+ // See https://github.com/apache/datafusion-ballista/issues/1909
+ .set_bool(
+ "datafusion.optimizer.enable_physical_uncorrelated_scalar_subquery",
+ false,
+ )
Review Comment:
I filed a follow-on issue to optimize this:
https://github.com/apache/datafusion-ballista/issues/1910
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]