malinjawi opened a new pull request, #12016:
URL: https://github.com/apache/gluten/pull/12016
### What changes were proposed in this pull request?
This patch fixes the native Delta write path for partitioned optimized
writes in the Velox backend.
The change:
- Writes each native partition stripe as its own accounting unit.
- Enforces `maxRecordsPerFile` within native partition stripes by slicing
columnar batches when needed.
- Updates `recordsInFile` by the actual written chunk row count instead of
the original input batch row count.
- Preserves partition columns in split output only when Delta's write
contract includes partition columns in `dataColumns`.
- Ensures native Delta stats aggregation receives only Delta data columns
when the written batch contains extra partition columns.
- Adds Delta 4.0 tests for optimized partitioned native writes and
Iceberg-compatible Delta partitioned writes with stats enabled.
The same writer/stat fixes are applied to Delta 3.3 and Delta 4.0 sources.
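The accounting described in the list above can be sketched roughly as follows. This is a simplified Python model, not the Gluten Scala code: `StripeWriter`, `write_stripe`, `file_sizes`, and the list-of-rows batch are hypothetical stand-ins, and only `maxRecordsPerFile` and `recordsInFile` correspond to names used in this description.
```python
from dataclasses import dataclass, field


@dataclass
class StripeWriter:
    """Hypothetical per-partition writer that rolls files at a record limit."""
    max_records_per_file: int  # assumption of this sketch: <= 0 means "no limit"
    records_in_file: int = 0
    files: list = field(default_factory=lambda: [[]])

    def write_stripe(self, rows):
        """Write one partition stripe, slicing it so no file exceeds the limit."""
        offset = 0
        while offset < len(rows):
            if self.max_records_per_file > 0:
                # Roll to a new file once the current one is full.
                if self.records_in_file >= self.max_records_per_file:
                    self.files.append([])
                    self.records_in_file = 0
                room = self.max_records_per_file - self.records_in_file
            else:
                room = len(rows) - offset
            chunk = rows[offset:offset + room]
            self.files[-1].extend(chunk)
            # Count the rows actually written, not the size of the input batch.
            self.records_in_file += len(chunk)
            offset += len(chunk)


def file_sizes(stripe_rows, max_records_per_file):
    """Return the row count of each file produced for one stripe."""
    writer = StripeWriter(max_records_per_file)
    writer.write_stripe(list(range(stripe_rows)))
    return [len(f) for f in writer.files]
```
If the benchmark's 2,000,000 rows are assumed to be spread evenly over the 16 partition values (125,000 rows each, which the description does not state), `file_sizes(125000, 100000)` yields two files per partition value, which would be consistent with the 32 files reported in the benchmark table.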
### Why are the changes needed?
The existing native partitioned writer split incoming Velox batches by
partition, but it tracked per-file record counts against the original input
batch rather than against each partition stripe. As a result, partitioned
optimized writes could violate `maxRecordsPerFile` when a single native
partition stripe was larger than the file limit.
There is also a conditional stats-schema issue when Delta keeps partition
columns in the writer batch: the native stats tracker must still compute Delta
AddFile stats over the data columns only.
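A minimal sketch of the stats-schema point, assuming a batch modeled as a plain dict of column name to values. The helpers `project_data_columns` and `add_file_stats` are hypothetical and do not mirror the native stats tracker; the field names `numRecords`, `minValues`, `maxValues`, and `nullCount` follow Delta's per-file statistics layout.
```python
def project_data_columns(batch, data_columns):
    """Keep only the columns named in data_columns, in that order."""
    return {name: batch[name] for name in data_columns}


def add_file_stats(batch, data_columns):
    """Compute simplified AddFile-style stats over the data columns only."""
    data = project_data_columns(batch, data_columns)
    num_records = len(next(iter(data.values()))) if data else 0
    stats = {
        "numRecords": num_records,
        "minValues": {},
        "maxValues": {},
        "nullCount": {},
    }
    for name, values in data.items():
        non_null = [v for v in values if v is not None]
        stats["nullCount"][name] = len(values) - len(non_null)
        if non_null:
            stats["minValues"][name] = min(non_null)
            stats["maxValues"][name] = max(non_null)
    return stats


# The written batch carries an extra partition column ("part"),
# but stats are computed over the data column ("id") only.
written_batch = {"id": [3, 1, None], "part": ["a", "a", "a"]}
stats = add_file_stats(written_batch, data_columns=["id"])
```
Projecting before aggregating keeps the stats schema independent of whether the writer batch happens to include partition columns.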
### Does this PR introduce any user-facing change?
No public API change. This improves correctness/layout behavior for native
Delta writes.
### How was this patch tested?
Built locally and ran:
```bash
JAVA_HOME=/Library/Java/JavaVirtualMachines/zulu-17.jdk/Contents/Home \
./dev/run-scala-test.sh \
-Pjava-17,spark-4.0,scala-2.13,backends-velox,hadoop-3.3,spark-ut,delta \
-pl backends-velox \
-s org.apache.spark.sql.delta.DeltaNativeWriteSuite
```
Result: 10 tests passed, 0 failures.
I also ran a targeted local benchmark for partitioned optimized Delta writes
with stats enabled, comparing native Delta write enabled vs disabled on the
same branch.
Setup: 2,000,000 rows, 2 warmup runs, 5 measured runs, 8 input partitions, 16
partition values, `maxRecordsPerFile=100000`.
| workload | mode | median ms | avg ms | rows/sec | files | bytes | speedup |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| partitioned | native | 1557.4 | 1617.6 | 1,284,217 | 32 | 28,856,342 | 1.66x |
| partitioned | native disabled | 2588.1 | 2588.5 | 772,758 | 32 | 21,744,978 | baseline |
| iceberg-v1 | native | 1545.8 | 1525.8 | 1,293,817 | 32 | 28,878,678 | 1.64x |
| iceberg-v1 | native disabled | 2538.0 | 2558.3 | 788,012 | 32 | 21,790,000 | baseline |
Note: the targeted benchmark shows native output files are larger in this
local setup; this PR focuses on partitioned layout correctness and native write
accounting, not file-size/compression tuning.
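For reference, the speedup column and the file-size difference can be rechecked from the reported numbers. The sketch below assumes speedup is the baseline median divided by the native median, which the description does not state explicitly; the `speedup` and `size_ratio` helpers are illustrative only.
```python
# Median wall time (ms) and total bytes written, copied from the table above.
median_ms = {
    ("partitioned", "native"): 1557.4,
    ("partitioned", "native disabled"): 2588.1,
    ("iceberg-v1", "native"): 1545.8,
    ("iceberg-v1", "native disabled"): 2538.0,
}
bytes_written = {
    ("partitioned", "native"): 28_856_342,
    ("partitioned", "native disabled"): 21_744_978,
    ("iceberg-v1", "native"): 28_878_678,
    ("iceberg-v1", "native disabled"): 21_790_000,
}


def speedup(workload):
    """Baseline median time divided by native median time."""
    return median_ms[(workload, "native disabled")] / median_ms[(workload, "native")]


def size_ratio(workload):
    """Native output bytes divided by baseline output bytes."""
    return bytes_written[(workload, "native")] / bytes_written[(workload, "native disabled")]


for workload in ("partitioned", "iceberg-v1"):
    print(f"{workload}: speedup {speedup(workload):.2f}x, "
          f"size ratio {size_ratio(workload):.2f}x")
```
Under that assumption the recomputed speedups round to the values in the table, and native output comes out roughly a third larger than the baseline in both workloads.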
Related issue: #10215