This is an automated email from the ASF dual-hosted git repository.
alamb pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-rs.git
The following commit(s) were added to refs/heads/main by this push:
new f5d6dc3846 feat(parquet): add sparse-column writer benchmarks (#9654)
f5d6dc3846 is described below
commit f5d6dc3846c2321a4b97cb481ec6ae0907b08657
Author: Hippolyte Barraud <[email protected]>
AuthorDate: Tue Apr 7 09:15:09 2026 -0400
feat(parquet): add sparse-column writer benchmarks (#9654)
# Which issue does this PR close?
- None, but relates to #9652
# Rationale for this change
Add benchmark coverage for sparse (99% null) and all-null columns so that
write performance on these cases can be measured.
# What changes are included in this PR?
Add three new benchmark cases to the arrow_writer benchmark suite for
evaluating write performance on sparse and all-null data:
- `primitive_sparse_99pct_null`: a flat primitive column with 99% nulls,
exercising long RLE runs in definition levels.
- `list_primitive_sparse_99pct_null`: a list-of-primitive column with
99% nulls, exercising null batching in the list level builder.
- `primitive_all_null`: a flat primitive column with 100% nulls,
exercising the uniform_levels fast path for entirely-null columns.
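
To illustrate why these null densities matter, here is a minimal std-only
sketch of how null density affects the number of runs in a flat column's
definition levels. It is not the parquet crate's encoder (Parquet uses an
RLE/bit-packing hybrid); `Lcg`, `definition_levels`, and `rle_runs` are
hypothetical names introduced only for this illustration.

```rust
/// Tiny deterministic generator (an LCG) so the sketch needs no external crates.
/// Hypothetical helper, not part of arrow-rs.
struct Lcg(u64);

impl Lcg {
    /// Returns a pseudo-random value in [0, 1).
    fn next_f64(&mut self) -> f64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        // Keep the top 53 bits so the result fits an f64 mantissa exactly.
        (self.0 >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Definition levels for a flat nullable column: 0 = null, 1 = value present.
fn definition_levels(len: usize, null_density: f64, seed: u64) -> Vec<i16> {
    let mut rng = Lcg(seed);
    (0..len)
        .map(|_| if rng.next_f64() < null_density { 0 } else { 1 })
        .collect()
}

/// Collapses consecutive equal levels into (level, run_length) pairs.
fn rle_runs(levels: &[i16]) -> Vec<(i16, usize)> {
    let mut runs: Vec<(i16, usize)> = Vec::new();
    for &level in levels {
        if let Some((last, count)) = runs.last_mut() {
            if *last == level {
                *count += 1;
                continue;
            }
        }
        runs.push((level, 1));
    }
    runs
}

fn main() {
    let len = 4096;
    for density in [0.25, 0.99, 1.0] {
        let levels = definition_levels(len, density, 42);
        let runs = rle_runs(&levels);
        println!("null_density={density}: {} runs over {len} levels", runs.len());
    }
}
```

In this sketch an all-null column collapses to a single run, and a 99% null
column produces far fewer, longer runs than a 25% null one. These are the
shapes the new benchmark cases are meant to exercise.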
# Are these changes tested?
N/A
# Are there any user-facing changes?
None.
Signed-off-by: Hippolyte Barraud <[email protected]>
---
parquet/benches/arrow_writer.rs | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/parquet/benches/arrow_writer.rs b/parquet/benches/arrow_writer.rs
index 0ee25873ab..909d419825 100644
--- a/parquet/benches/arrow_writer.rs
+++ b/parquet/benches/arrow_writer.rs
@@ -391,6 +391,15 @@ fn create_batches() -> Vec<(&'static str, RecordBatch)> {
let batch = create_list_primitive_bench_batch_non_null(BATCH_SIZE, 0.25, 0.75).unwrap();
batches.push(("list_primitive_non_null", batch));
+ let batch = create_primitive_bench_batch(BATCH_SIZE, 0.99, 0.75).unwrap();
+ batches.push(("primitive_sparse_99pct_null", batch));
+
+ let batch = create_list_primitive_bench_batch(BATCH_SIZE, 0.99, 0.75).unwrap();
+ batches.push(("list_primitive_sparse_99pct_null", batch));
+
+ let batch = create_primitive_bench_batch(BATCH_SIZE, 1.0, 0.75).unwrap();
+ batches.push(("primitive_all_null", batch));
+
batches
}