[PR] perf: Improve benchmarks for row-to-columnar conversion in JVM shuffle [datafusion-comet]

via GitHub Mon, 26 Jan 2026 09:45:14 -0800


andygrove opened a new pull request, #3290:
URL: https://github.com/apache/datafusion-comet/pull/3290


   Add `jvm_shuffle.rs` benchmark that covers the full range of data types 
processed by `process_sorted_row_partition()` in JVM shuffle:
   
   - Primitive columns (100 Int64 columns)
   - Struct (flat with 5/10/20 fields)
   - Nested struct (2 levels deep)
   - Deeply nested struct (3 levels deep)
   - List<Int64>
   - Map<Int64, Int64>
   
   This replaces the old `row_columnar.rs`, which only tested primitive columns.
   
   These benchmarks help measure the performance of the row-to-columnar 
conversion used by CometColumnarShuffle when writing shuffle data.
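
To make the term concrete, here is a minimal, standard-library-only sketch of what a row-to-columnar conversion does and how it might be timed. It is not the code in `jvm_shuffle.rs` and not how `process_sorted_row_partition()` is implemented: the `Row` alias, the `rows_to_columns` helper, and the row count are hypothetical, and a real benchmark would more likely use a harness such as Criterion with Arrow builders rather than `Instant` and `Vec<i64>`.

```rust
use std::time::Instant;

/// Hypothetical row representation: one Vec<i64> per row.
/// (An assumption for illustration; not Spark's or Comet's row format.)
type Row = Vec<i64>;

/// Transpose row-major data into one Vec per column.
fn rows_to_columns(rows: &[Row], num_cols: usize) -> Vec<Vec<i64>> {
    // Pre-size every column so the inner loop never reallocates.
    let mut columns: Vec<Vec<i64>> = (0..num_cols)
        .map(|_| Vec::with_capacity(rows.len()))
        .collect();
    for row in rows {
        assert_eq!(row.len(), num_cols, "row width must match column count");
        // Append the i-th value of this row to the i-th column.
        for (col, value) in columns.iter_mut().zip(row.iter()) {
            col.push(*value);
        }
    }
    columns
}

fn main() {
    let num_cols = 100; // mirrors the "100 Int64 columns" case
    let num_rows = 10_000; // arbitrary size chosen for this sketch
    let rows: Vec<Row> = (0..num_rows)
        .map(|r| (0..num_cols).map(|c| (r * num_cols + c) as i64).collect())
        .collect();

    // Single wall-clock measurement; a real harness would repeat and aggregate.
    let start = Instant::now();
    let columns = rows_to_columns(&rows, num_cols);
    let elapsed = start.elapsed();

    println!(
        "converted {} rows x {} columns in {:?}",
        num_rows,
        columns.len(),
        elapsed
    );
}
```

The nested cases in the list above (struct, list, map) stress the same transpose step, but each row value then fans out into child columns, which is why benchmarking only primitive columns can miss much of the cost.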
   
   The same benchmarks are also added in 
https://github.com/apache/datafusion-comet/pull/3289. I would like to add 
them to the main branch as well, to make it easier to run comparative benchmarks.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

