iemejia opened a new pull request, #12390:
URL: https://github.com/apache/gluten/pull/12390
## What changes were proposed in this pull request?
Optimize the Delta Lake integration's planning-time performance, targeting
two hot paths: DV (Deletion Vector) materialization on the driver and
post-transform rule application.
### DV Materialization (`DeltaDeletionVectorScanInfo.normalize`)
- **Cache table path resolution**: resolve once per partition instead of
per-file. Eliminates N-1 redundant `FileSystem.exists()` calls (HTTP HEAD
requests on object stores).
- **Cache Hadoop Configuration**: create one instance per partition instead
of deep-cloning a new one for every file.
- **Read raw DV bytes directly**: for on-disk DVs, read the raw Portable
Roaring Bitmap bytes via `DeletionVectorStore.readRangeFromStream` (with
checksum verification) instead of deserializing into Java Roaring objects and
re-serializing. The on-disk format already matches what Velox expects.
- **(delta40)** Cache reflective method lookup for `parseDescriptor` in a
`lazy val`.
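
The per-partition caching idea in the first bullet can be sketched in isolation. This is a minimal illustration, not the code in `DeltaDeletionVectorScanInfo.normalize`: `DvFile`, `PathResolutionSketch`, `resolveTablePath`, and the `existsCalls` counter are hypothetical stand-ins, and the counter merely simulates the `FileSystem.exists()` round trips described above.

```scala
// Hypothetical stand-in for a DV-bearing file entry; not a Gluten or Delta type.
final case class DvFile(tablePath: String, name: String)

object PathResolutionSketch {
  // Counts simulated FileSystem.exists() round trips (assumption: one per resolution).
  var existsCalls: Int = 0

  // Hypothetical resolver: pretend each call costs one remote existence check.
  private def resolveTablePath(raw: String): String = {
    existsCalls += 1
    raw.stripSuffix("/")
  }

  // Before: the table path is resolved again for every file.
  def normalizePerFile(files: Seq[DvFile]): Seq[String] =
    files.map(f => s"${resolveTablePath(f.tablePath)}/${f.name}")

  // After: each distinct table path is resolved once and reused for all files.
  def normalizePerPartition(files: Seq[DvFile]): Seq[String] = {
    val cache = scala.collection.mutable.Map.empty[String, String]
    files.map { f =>
      // The second argument is by-name, so the resolver only runs on a cache miss.
      val resolved = cache.getOrElseUpdate(f.tablePath, resolveTablePath(f.tablePath))
      s"$resolved/${f.name}"
    }
  }
}
```

With 100 files that share one table path, the per-file variant performs 100 simulated checks and the cached variant performs one, which is the N-1 saving the bullet refers to.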
### Post-transform Rules (`DeltaPostTransformRules`)
- **Early-exit guard**: skip all Delta rules if no `DeltaScanTransformer` is
present. Eliminates 5 full plan traversals for non-Delta queries.
- **Fused rule execution**: combine 3 Delta rules under a single registered
rule.
- **Shallow `containsNativeDeltaScan`**: O(1) direct child/grandchild check
instead of O(n^2) subtree traversal.
- **Pre-computed `inputFileRelatedNames`**: static `Set[String]` instead of
allocating 3 Expression objects per call.
- **Batched `createPhysicalAttributes`**: single call with full attribute
list instead of per-column.
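
A rough sketch of the early-exit guard combined with fused rule execution follows. It uses a toy plan tree rather than Spark's `SparkPlan`, and every name in it (`Plan`, `DeltaScan`, `ParquetScan`, `PostTransformSketch`, `ruleApplications`) is hypothetical; the three placeholder rules only count how often they run. The guard here walks the whole toy tree once, which is an assumption of this sketch and not necessarily how `DeltaPostTransformRules` detects a `DeltaScanTransformer`.

```scala
// Toy plan tree; hypothetical stand-ins for Spark physical plan nodes.
sealed trait Plan { def children: Seq[Plan] }
final case class DeltaScan(table: String) extends Plan { val children: Seq[Plan] = Nil }
final case class ParquetScan(path: String) extends Plan { val children: Seq[Plan] = Nil }
final case class Project(child: Plan) extends Plan { val children: Seq[Plan] = Seq(child) }
final case class Filter(child: Plan) extends Plan { val children: Seq[Plan] = Seq(child) }

object PostTransformSketch {
  // Counts how many placeholder rules were actually applied.
  var ruleApplications: Int = 0

  // Guard: a single traversal that stops at the first Delta scan it finds.
  private def containsDeltaScan(plan: Plan): Boolean = plan match {
    case _: DeltaScan => true
    case other => other.children.exists(containsDeltaScan)
  }

  // Three placeholder rules standing in for the fused Delta rules.
  private val rules: Seq[Plan => Plan] = Seq.fill(3) { (p: Plan) =>
    ruleApplications += 1
    p
  }

  // Single entry point: return non-Delta plans untouched, otherwise run all rules in order.
  def apply(plan: Plan): Plan =
    if (!containsDeltaScan(plan)) plan
    else rules.foldLeft(plan)((p, rule) => rule(p))
}
```

For a plan such as `Project(Filter(ParquetScan("/data")))`, the guard returns the same instance without invoking any rule, so non-Delta queries pay for one cheap check instead of one traversal per rule.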
### Allocation Reduction
- **`scanFilters` as `lazy val`**: avoids rebuilding the `physicalByExprId`
map and re-walking the expression tree on every call (invoked 3+ times per scan node).
- **`UnsafeByteOperations.unsafeWrap`**: zero-copy ByteString for DV bytes
instead of `ByteString.copyFrom`.
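
The `lazy val` change relies on standard Scala semantics: the initializer runs on first access and the result is cached afterwards. A minimal sketch, assuming a hypothetical `ScanNodeSketch` class whose `builds` counter stands in for the map rebuild and expression-tree walk:

```scala
// Hypothetical scan node; the filter strings stand in for Catalyst expressions.
final class ScanNodeSketch(filters: Seq[String]) {
  // Counts how many times the (simulated) expensive computation runs.
  var builds: Int = 0

  // Evaluated on first access, then cached for the lifetime of this instance.
  lazy val scanFilters: Seq[String] = {
    builds += 1
    filters.map(_.trim).filter(_.nonEmpty)
  }
}
```

Repeated reads of `scanFilters` on the same node return the same cached sequence, and `builds` stays at 1 no matter how many callers access it.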
## Measured Results (local filesystem, 100 DV-bearing files)
```
Benchmark                        Before    After    Speedup
-------------------------------  -------   ------   -------
DV Materialization (100 files)   22 ms     7 ms     3.3x
Post-transform rules (Delta)     37 us     20 us    1.8x
Post-transform rules (parquet)   4908 ns   220 ns   22.3x
```
### Projected impact on object stores (100 DV files)
| Storage | Before | After | Speedup |
|---------|--------|-------|---------|
| Local FS | 22 ms | 7 ms | 3.3x |
| ABFS | 2-24 s | 1.0-1.1 s | 2-22x |
| GCS | 3-30 s | 1.0-1.1 s | 3-27x |
| S3 | 5-45 s | 1.1-1.2 s | 5-38x |
## How was this patch tested?
- All existing Delta tests pass (`VeloxDeltaSuite`,
`DeltaDeletionVectorScanInfoSuite`)
- Added targeted unit tests:
- `post-transform rules are no-op on non-Delta plans` (validates
early-exit guard)
- `post-transform rules produce DeltaScanTransformer for Delta tables`
(validates offloading)
- `scanFilters returns consistent results on repeated access` (validates
lazy val caching)
- Added `DeltaPlanningBenchmark` for reproducible before/after measurement
- Scalastyle, Checkstyle, Spotless: all pass with zero violations
## Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude claude-opus-4.6
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]