[jira] [Updated] (SPARK-57111) Use intrinsic bulk-fill APIs for putNotNulls

ASF GitHub Bot (Jira) Wed, 27 May 2026 14:28:18 -0700


     [ 
https://issues.apache.org/jira/browse/SPARK-57111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


ASF GitHub Bot updated SPARK-57111:
-----------------------------------
    Labels: pull-request-available  (was: )

> Use intrinsic bulk-fill APIs for putNotNulls
> --------------------------------------------
>
>                 Key: SPARK-57111
>                 URL: https://issues.apache.org/jira/browse/SPARK-57111
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 4.3.0
>            Reporter: L. C. Hsieh
>            Assignee: L. C. Hsieh
>            Priority: Major
>              Labels: pull-request-available
>
> Follow-up to SPARK-57024 / SPARK-57036.
>   
>   WritableColumnVector exposes a putNotNulls(rowId, count) method that
>   clears a run of the nulls bitmap. It's called once per batch from
>   WritableColumnVector.reset() (when numNulls > 0) and from the
>   appendNotNulls() path. Both OnHeap and OffHeap implementations are
>   per-element loops:
>   
>     // OnHeap
>     for (int i = 0; i < count; ++i) {
>       nulls[rowId + i] = (byte) 0;
>     }
>   
>     // OffHeap
>     long offset = nulls + rowId;
>     for (int i = 0; i < count; ++i, ++offset) {
>       Platform.putByte(null, offset, (byte) 0);
>     }
>   
>   This is the same pattern fixed for putNulls in SPARK-57024 and for
>   putBytes / putBooleans in SPARK-57036. The same intrinsic substitutions
>   apply:
>     - OnHeap.putNotNulls -> Arrays.fill(byte[], ..., (byte) 0)
>     - OffHeap.putNotNulls -> Platform.setMemory(addr, (byte) 0, count)
>       with the existing SET_MEMORY_THRESHOLD = 128 fallback to an inline
>       byte loop for small counts.
>   
>   Measured locally on Apple M4 Max + OpenJDK 21 via
>   WritableColumnVectorBulkFillBenchmark (a new putNotNulls case added by
>   parity (the C2 compiler already auto-vectorizes the original byte
>   loop).
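
The substitutions listed in the description can be sketched roughly as follows. This is an illustrative, self-contained example and not the code from the pull request: the class name PutNotNullsSketch and its method names are hypothetical, a plain byte[] stands in for the off-heap buffer so the example runs without Spark's Platform class, and the exact comparison against the threshold (< versus <=) is an assumption.

```java
import java.util.Arrays;

// Hypothetical sketch; names here are illustrative and not taken from Spark.
final class PutNotNullsSketch {
    // Value quoted in the issue; how Spark compares against it is assumed here.
    static final int SET_MEMORY_THRESHOLD = 128;

    // Per-element loop, mirroring the on-heap code quoted in the issue.
    static void putNotNullsLoop(byte[] nulls, int rowId, int count) {
        for (int i = 0; i < count; ++i) {
            nulls[rowId + i] = (byte) 0;
        }
    }

    // Bulk variant: Arrays.fill clears the half-open range [rowId, rowId + count).
    static void putNotNullsFill(byte[] nulls, int rowId, int count) {
        Arrays.fill(nulls, rowId, rowId + count, (byte) 0);
    }

    // Threshold dispatch in the style described for the off-heap path:
    // short runs use the inline loop, longer runs use the bulk call.
    // In Spark the bulk call would be Platform.setMemory on a raw address;
    // Arrays.fill is used here only as a runnable stand-in.
    static void putNotNulls(byte[] nulls, int rowId, int count) {
        if (count < SET_MEMORY_THRESHOLD) {
            putNotNullsLoop(nulls, rowId, count);
        } else {
            putNotNullsFill(nulls, rowId, count);
        }
    }

    public static void main(String[] args) {
        byte[] viaLoop = new byte[1024];
        byte[] viaBulk = new byte[1024];
        Arrays.fill(viaLoop, (byte) 1);   // mark every row as null
        Arrays.fill(viaBulk, (byte) 1);

        putNotNullsLoop(viaLoop, 100, 500);
        putNotNulls(viaBulk, 100, 500);

        // Both variants should leave the bitmap in the same state.
        System.out.println(Arrays.equals(viaLoop, viaBulk));
    }
}
```

All three methods are meant to be behaviorally equivalent; only the way the zero bytes are written differs, which is why the change can be checked with a simple array comparison.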



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
