andygrove opened a new issue, #1611:
URL: https://github.com/apache/datafusion-ballista/issues/1611

   ## Describe the bug
   
   When `ballista.planner.adaptive.enabled` is `true`, the 
`BALLISTA_SHUFFLE_SORT_BASED_ENABLED` flag is silently ignored and stages are 
always built with the hash-based `ShuffleWriterExec`.
   
   The static planner (`ballista/scheduler/src/planner.rs`) routes shuffle 
writer construction through `create_shuffle_writer_with_config`, which picks 
`SortShuffleWriterExec` when sort-based shuffle is enabled and the partitioning 
is `Hash`. The AQE path does not have an equivalent: 
`BallistaAdapter::adapt_to_ballista` in 
`ballista/scheduler/src/state/aqe/adapter.rs` unconditionally constructs 
`ShuffleWriterExec` (lines 90 and 116).
   
   Downstream AQE code reinforces this — `AdaptiveStageInfo::plan` 
(`aqe/planner.rs:526`) and `AdaptiveExecutionGraph::create_resolved_stage` 
(`aqe/mod.rs:196`) are typed as `Arc<ShuffleWriterExec>` rather than `Arc<dyn 
ShuffleWriter>`, so even fixing the adapter alone is not sufficient.
   
   This is also called out in `ballista/client/tests/sort_shuffle.rs:57`, where 
the integration tests disable AQE with the comment "AQE does not support sort 
shuffle at the moment".
   
   ## To Reproduce
   
   Enable both flags on a `SessionConfig` and run any query that introduces a 
shuffle:
   
   ```rust
   SessionConfig::new_with_ballista()
       .set_str(BALLISTA_SHUFFLE_SORT_BASED_ENABLED, "true")
       .set_bool(BALLISTA_ADAPTIVE_PLANNER_ENABLED, true);
   ```
   
   The resulting plan uses `ShuffleWriterExec` rather than 
`SortShuffleWriterExec`.
   
   ## Expected behavior
   
   When sort-based shuffle is enabled, the adaptive planner should produce 
`SortShuffleWriterExec` for hash-partitioned shuffles, matching the behavior of 
the static planner.
   
   ## Additional context
   
   Suggested approach:
   - Generalize `AdaptiveStageInfo::plan` and 
`AdaptiveExecutionGraph::create_resolved_stage` over the `ShuffleWriter` trait.
   - Have `BallistaAdapter::adapt_to_ballista` use the same writer-selection 
logic as `create_shuffle_writer_with_config`, or factor that logic into a 
shared helper.
   - Re-enable the sort-shuffle integration tests with AQE on once the change 
lands.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to