Re: [PR] Enable repartitioning on MemTable. [datafusion]

via GitHub Mon, 21 Apr 2025 21:31:55 -0700


wiedld commented on code in PR #15409:
URL: https://github.com/apache/datafusion/pull/15409#discussion_r2053131373



##########
datafusion/core/tests/fuzz_cases/aggregate_fuzz.rs:
##########
@@ -520,7 +520,9 @@ async fn group_by_string_test(
     let expected = compute_counts(&input, column_name);
 
     let schema = input[0].schema();
-    let session_config = SessionConfig::new().with_batch_size(50);
+    let session_config = SessionConfig::new()
+        .with_batch_size(50)
+        .with_repartition_file_scans(false);

Review Comment:
   When updating the docs (see this commit: 
https://github.com/apache/datafusion/pull/15409/commits/d8929dc9d179e89868964f5f33dd2e387e54c299),
 it felt like we don't want to split this into 2 configurations IMO, since the 
repartitioning config variable is used both to (a) repartition at the data source 
(which could be a file or a MemTable) and to (b) insert a repartition 
operator later in the plan.
   
   Please take a look at the added docs (in the commit linked above), then lmk 
what you think.
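
   To illustrate the point about one flag gating both decisions, here is a 
minimal toy sketch. It is not DataFusion's code: `Config`, `Plan`, 
`plan_source`, and `enforce_partitioning` are hypothetical names made up for 
this example, and the real optimizer logic is more involved.

```rust
/// Hypothetical stand-in for the session option; not DataFusion's real type.
#[derive(Clone, Copy)]
struct Config {
    repartition_scans: bool,
    target_partitions: usize,
}

/// Toy plan nodes, only to show where the flag is consulted.
#[derive(Debug, PartialEq)]
enum Plan {
    Source { partitions: usize },
    Repartition { input: Box<Plan>, partitions: usize },
}

/// (a) At the data source: split the scan (file or in-memory table) up front.
fn plan_source(cfg: Config, source_can_split: bool) -> Plan {
    let partitions = if cfg.repartition_scans && source_can_split {
        cfg.target_partitions
    } else {
        1
    };
    Plan::Source { partitions }
}

/// (b) Later in the plan: add a repartition operator when the input still
/// has fewer partitions than the target.
fn enforce_partitioning(cfg: Config, input: Plan) -> Plan {
    let current = match &input {
        Plan::Source { partitions } | Plan::Repartition { partitions, .. } => *partitions,
    };
    if cfg.repartition_scans && current < cfg.target_partitions {
        Plan::Repartition {
            input: Box::new(input),
            partitions: cfg.target_partitions,
        }
    } else {
        input
    }
}
```

   In this toy model, turning the single flag off disables both (a) and (b), 
which is why splitting it into two separate settings would change behavior 
at two places in the plan rather than one.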



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

