[PR] [SPARK-57036][SQL] Use intrinsic bulk-fill APIs for constant-value WritableColumnVector methods [spark]

via GitHub Sat, 23 May 2026 17:19:30 -0700


viirya opened a new pull request, #56082:
URL: https://github.com/apache/spark/pull/56082


   ### What changes were proposed in this pull request?
   
   Six bulk-fill methods on the column vectors implement constant-value
   fills with degenerate per-element loops. This PR replaces them with
   intrinsic substitutions:
   
   | Method | Substitution |
   | --- | --- |
   | `OnHeapColumnVector.putBooleans(rowId, count, value)` | `Arrays.fill(byte[], ..., (byte) v)` |
   | `OnHeapColumnVector.putBytes(rowId, count, value)` | `Arrays.fill(byte[], ...)` |
   | `OnHeapColumnVector.putShorts(rowId, count, value)` | `Arrays.fill(short[], ...)` |
   | `OnHeapColumnVector.putLongs(rowId, count, value)` | `Arrays.fill(long[], ...)` |
   | `OffHeapColumnVector.putBooleans(rowId, count, value)` | `Platform.setMemory` with small-count fallback |
   | `OffHeapColumnVector.putBytes(rowId, count, value)` | `Platform.setMemory` with small-count fallback |
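
As a rough illustration of the on-heap rows in the table, the sketch below contrasts a per-element loop with a ranged `Arrays.fill`. The class `OnHeapFillSketch` and its method names are hypothetical stand-ins, not Spark's actual code, and storing booleans as 0/1 bytes is an assumption made for the example.

```java
import java.util.Arrays;

// Hypothetical stand-in for an on-heap column buffer; not Spark's actual class.
final class OnHeapFillSketch {
    // Baseline shape: one store per element.
    static void putBytesLoop(byte[] data, int rowId, int count, byte value) {
        for (int i = 0; i < count; i++) {
            data[rowId + i] = value;
        }
    }

    // Replacement shape: a single ranged fill over [rowId, rowId + count).
    static void putBytesFill(byte[] data, int rowId, int count, byte value) {
        Arrays.fill(data, rowId, rowId + count, value);
    }

    // Assumption: booleans are stored as 0/1 bytes in a byte[] buffer.
    static void putBooleansFill(byte[] data, int rowId, int count, boolean value) {
        Arrays.fill(data, rowId, rowId + count, (byte) (value ? 1 : 0));
    }

    // Same pattern for a long[] buffer.
    static void putLongsFill(long[] data, int rowId, int count, long value) {
        Arrays.fill(data, rowId, rowId + count, value);
    }
}
```

One difference to keep in mind: `Arrays.fill` validates the whole range before writing anything and throws on an invalid range, whereas the loop writes elements until it hits the first out-of-bounds index. For valid ranges the two produce the same array contents.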
   
   The two OffHeap methods share a `SET_MEMORY_THRESHOLD = 128` constant.
   Below the threshold, an inline byte loop avoids the JNI fixed cost of
   `Unsafe.setMemory`; at or above it, `setMemory` dominates and the gain
   grows with count: ~3x at `count = 512`, ~7x at 4,096, and ~10x at
   65,536 (see the OffHeap benchmark table below).
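
The threshold dispatch described above could look roughly like the following. This is a minimal sketch, not the PR's code: it calls `sun.misc.Unsafe` directly instead of Spark's `Platform` wrapper so that it is self-contained, and `OffHeapFillSketch`, `putBytes`, and `fillAndRead` are hypothetical names. Only the constant `SET_MEMORY_THRESHOLD = 128` comes from the description.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical sketch of a threshold-dispatched off-heap byte fill.
final class OffHeapFillSketch {
    // Value taken from the PR description.
    static final int SET_MEMORY_THRESHOLD = 128;
    static final Unsafe UNSAFE = loadUnsafe();

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Fills count bytes starting at base + rowId with value.
    static void putBytes(long base, int rowId, int count, byte value) {
        long start = base + rowId;
        if (count < SET_MEMORY_THRESHOLD) {
            // Small counts: a plain byte loop avoids the fixed cost of setMemory.
            for (int i = 0; i < count; i++) {
                UNSAFE.putByte(start + i, value);
            }
        } else {
            // Large counts: one bulk memset-style call.
            UNSAFE.setMemory(start, count, value);
        }
    }

    // Helper for experimenting: allocates a zeroed buffer, fills a range,
    // copies the result to a byte[] and frees the native memory.
    static byte[] fillAndRead(int capacity, int rowId, int count, byte value) {
        long base = UNSAFE.allocateMemory(capacity);
        try {
            UNSAFE.setMemory(base, capacity, (byte) 0);
            putBytes(base, rowId, count, value);
            byte[] out = new byte[capacity];
            for (int i = 0; i < capacity; i++) {
                out[i] = UNSAFE.getByte(base + i);
            }
            return out;
        } finally {
            UNSAFE.freeMemory(base);
        }
    }
}
```

Calling `fillAndRead(512, 4, 8, (byte) 7)` exercises the loop path and `fillAndRead(512, 4, 300, (byte) 7)` the `setMemory` path; both should leave bytes outside the requested range untouched. Newer JDKs may print deprecation warnings for the `sun.misc.Unsafe` memory-access methods.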
   
   ### Why are the changes needed?
   
   The bulk-fill APIs on `WritableColumnVector` are the natural call to
   make from any column writer, but their implementations were per-element
   loops. Switching to intrinsics:
   
   - `Arrays.fill` is backed by HotSpot's `_jbyte_fill` / `_jshort_fill` /
     `_jlong_fill` intrinsic stubs; on byte/short arrays C2 can usually
     auto-vectorize the original loop and gains are modest, but for
     `long[]` and at small counts the intrinsic is meaningfully faster.
   - `Unsafe.setMemory` lowers to a native memset. For OffHeap byte fills
     the original per-byte `Platform.putByte` loop cannot be vectorized
     through the JNI call, so the gain is dramatic at large counts.
   
   Measured on Apple M4 Max + OpenJDK 21.0.8, using a new
   `WritableColumnVectorBulkFillBenchmark` (added in a separate change,
   not part of this PR), Rate (M elements/s):
   
   **OffHeap byte fills (putBytes / putBooleans)**, threshold path:
   
   | count   | baseline | patched | delta |
   | ------: | -------: | ------: | ----- |
   | 8       | ~1,900   | ~1,840  | parity (small-count fallback) |
   | 64      | ~3,800   | ~3,760  | parity |
   | 512     | ~4,150   | ~13,100 | +3.2x |
   | 4,096   | ~4,340   | ~31,900 | +7.4x |
   | 65,536  | ~4,275   | ~43,700 | +10.2x |
   
   **OnHeap byte fills**:
   
   | count   | baseline | patched | delta |
   | ------: | -------: | ------: | ----- |
   | 8       | ~2,620   | ~3,230  | +23%  |
   | 64      | ~19,000  | ~25,400 | +33%  |
   | 512     | ~68,800  | ~86,200 | +25%  |
   | 4,096   | ~128,400 | ~133,300| +4%   |
   | 65,536  | ~143,200 | ~143,600| saturated (byte memory bandwidth) |
   
   **OnHeap longs**: +1-14% in the small/medium range, saturated by
   memory bandwidth at large counts. Included for consistency with the
   byte methods.
   
   OffHeap multi-byte fills (putShorts / putInts / putLongs / putFloats /
   putDoubles) are out of scope: `Platform.setMemory` is byte-only and a
   value=0 short-circuit alternative was tried and showed no measurable
   gain.
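
To illustrate why a byte-only memset does not generalize to multi-byte fills: it can only reproduce values whose bytes are all identical (0 and -1 being the common cases). The helper below is a hypothetical check, not part of the PR or of Spark, that makes the restriction concrete for 64-bit values.

```java
// Hypothetical helper illustrating the byte-only restriction of a memset-style fill.
final class ByteOnlyFillSketch {
    // True when all eight bytes of v are identical, which is the only case
    // a byte-wise fill can reproduce for a long value.
    static boolean isSingleRepeatedByte(long v) {
        long lowByte = v & 0xFFL;
        // Multiplying by 0x0101...01 replicates the low byte into every byte lane.
        return v == lowByte * 0x0101010101010101L;
    }
}
```

For example, `0L`, `-1L`, and `0x0707070707070707L` pass the check, while `1L` and `0x0102L` do not, so a general constant-value long fill cannot be routed through a byte-wise fill.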
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Existing tests; no behavior change. Ran locally:
   
   - `VectorizedRleValuesReaderSuite`
   - `ColumnVectorSuite`
   - `ColumnarBatchSuite`
   - `ParquetIOSuite`
   
   237 tests, all pass.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   Generated-by: Claude Code (Claude Opus 4.7)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

