[ https://issues.apache.org/jira/browse/SPARK-57112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated SPARK-57112:
-----------------------------------
Labels: pull-request-available (was: )
> Add putNotNulls case to WritableColumnVectorBulkFillBenchmark
> -------------------------------------------------------------
>
> Key: SPARK-57112
> URL: https://issues.apache.org/jira/browse/SPARK-57112
> Project: Spark
> Issue Type: Improvement
> Components: Tests
> Affects Versions: 4.3.0
> Reporter: L. C. Hsieh
> Assignee: L. C. Hsieh
> Priority: Minor
> Labels: pull-request-available
>
> SPARK-57042 added WritableColumnVectorBulkFillBenchmark covering the
> constant-value bulk-fill APIs on WritableColumnVector. The benchmark
> does not yet cover putNotNulls(rowId, count), which is the inverse of
> putNulls and runs once per batch from WritableColumnVector.reset()
> when numNulls > 0.
>
> This change adds a putNotNulls case mirroring the existing putNulls
> case. It seeds one null into the column vector at setup so that
> putNotNulls' `if (!hasNull()) return;` early-out does not skip the
> fill during measurement.
>
> The benchmark addition lands separately so that an upstream-master
> baseline exists before SPARK-57111 (which switches putNotNulls to
> Arrays.fill / Platform.setMemory) is measured. With this case in
> master, the GHA `Run benchmarks` workflow will produce a like-for-like
> before/after comparison when SPARK-57111 is benchmarked.
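
To illustrate why the setup seeds one null, here is a minimal, self-contained Java sketch. It is not Spark code: `ToyColumnVector`, its `fillsExecuted` counter, and the `measure` helper are hypothetical names invented for this example. The sketch assumes a byte-per-row null mask and a `putNotNulls` that leaves the null counter untouched, so the guard stays open across iterations once a null has been seeded. Only the `if (!hasNull()) return;` guard is taken from the description above.

```java
// Hypothetical toy model; NOT Spark's WritableColumnVector.
final class PutNotNullsSeedingSketch {

    static final class ToyColumnVector {
        private final byte[] nulls;   // 1 = null, 0 = not null (assumed layout)
        private int numNulls;         // number of putNull calls since creation
        int fillsExecuted;            // fills that got past the early-out

        ToyColumnVector(int capacity) {
            nulls = new byte[capacity];
        }

        boolean hasNull() {
            return numNulls > 0;
        }

        boolean isNullAt(int rowId) {
            return nulls[rowId] == 1;
        }

        void putNull(int rowId) {
            nulls[rowId] = 1;
            numNulls++;
        }

        // Clears the null flag for rows [rowId, rowId + count).
        // Assumption: the counter is not decremented here, so hasNull()
        // stays true until the vector is reset (reset is omitted).
        void putNotNulls(int rowId, int count) {
            if (!hasNull()) return;   // early-out quoted in the issue text
            for (int i = 0; i < count; i++) {
                nulls[rowId + i] = 0;
            }
            fillsExecuted++;
        }
    }

    // Runs `iters` putNotNulls calls and reports how many actually filled.
    static int measure(boolean seedNull, int capacity, int iters) {
        ToyColumnVector v = new ToyColumnVector(capacity);
        if (seedNull) {
            v.putNull(0);             // seed one null at setup
        }
        for (int i = 0; i < iters; i++) {
            v.putNotNulls(0, capacity);
        }
        return v.fillsExecuted;
    }

    public static void main(String[] args) {
        System.out.println("unseeded fills: " + measure(false, 4096, 100)); // 0
        System.out.println("seeded fills:   " + measure(true, 4096, 100));  // 100
    }
}
```

In this toy model, the unseeded run never enters the loop, so a benchmark built that way would time only the guard. The seeded run performs the fill on every iteration, which is the work the new benchmark case is meant to measure.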