Kevin-Li-2025 commented on code in PR #23066:
URL: https://github.com/apache/datafusion/pull/23066#discussion_r3485982574


##########
datafusion/physical-plan/src/sorts/multi_level_merge.rs:
##########
@@ -373,13 +373,23 @@ impl MultiLevelMergeBuilder {
     ) -> Result<(Vec<SortedSpillFile>, usize)> {
         assert_ne!(buffer_len, 0, "Buffer length must be greater than 0");
         let mut number_of_spills_to_read_for_current_phase = 0;
+        let configured_fan_in = self
+            .spill_manager
+            .env()
+            .disk_manager
+            .max_spill_merge_fan_in();
+        let max_spill_files = effective_spill_merge_fan_in(configured_fan_in);
         // Track total memory needed for spill file buffers. When the
         // reservation has pre-reserved bytes (from sort_spill_reservation_bytes),
         // those bytes cover the first N spill files without additional pool
         // allocation, preventing starvation under memory pressure.
         let mut total_needed: usize = 0;
 
         for spill in &self.sorted_spill_files {
+            if number_of_spills_to_read_for_current_phase >= max_spill_files {

Review Comment:
   Addressed in e564acd51: added 
`spill_merge_phase_respects_configured_fan_in`, which builds four 
`SortedSpillFile`s with `max_spill_merge_fan_in = 2` and asserts one merge 
phase selects exactly two spill inputs while leaving two for the next phase.
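
   For readers without the commit at hand, the behavior the test checks can be sketched in isolation. This is a minimal illustration and not the code from the PR: `FakeSpill` and `take_phase_inputs` are hypothetical names standing in for `SortedSpillFile` and the selection loop, and the memory-reservation accounting from the hunk above is left out.

```rust
/// Hypothetical stand-in for a sorted spill file; only a label is kept.
#[derive(Debug, Clone, PartialEq)]
struct FakeSpill(&'static str);

/// Splits `spills` into the inputs for the current merge phase (at most
/// `max_fan_in` of them, in order) and the files left for later phases.
/// Assumption: the cap is the only limit; memory pressure is ignored here.
fn take_phase_inputs(
    spills: Vec<FakeSpill>,
    max_fan_in: usize,
) -> (Vec<FakeSpill>, Vec<FakeSpill>) {
    let mut selected = Vec::new();
    let mut remaining = Vec::new();
    for spill in spills {
        if selected.len() >= max_fan_in {
            // Cap reached: defer this file to the next merge phase.
            remaining.push(spill);
        } else {
            selected.push(spill);
        }
    }
    (selected, remaining)
}
```

   With four inputs and a cap of 2, this sketch returns the first two files as the current phase and the last two as the remainder, which mirrors the assertion described above.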



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]