malinjawi opened a new pull request, #12216: URL: https://github.com/apache/gluten/pull/12216
## What changes

This is the next stacked Delta DV MoR slice after #12215. It adds the minimal persistent-DV DELETE correctness path for Velox Delta native execution while keeping Delta OSS semantics for action generation, stats, and transaction behavior.

Stack order:

1. #12197 - DV scan info extraction utility
2. #12198 - JVM Delta DV scan handoff
3. #12215 - DML row-index scan safety
4. This PR - persistent DV DELETE correctness path

This PR should remain draft until the earlier scan-safety PR has reviewer confidence and this branch has native CI signal.

## Scope

- Adds `GlutenDeleteCommand` for persistent-DV row-condition DELETEs.
- Routes only eligible persistent-DV DELETE commands through the Gluten Delta command wrapper.
- Uses Delta's existing DML deletion-vector helpers for touched-file discovery and action generation.
- Keeps ordinary DELETE, metadata-only DELETE, and full-table DELETE on the existing path.
- Adds Delta 3.3 and Delta 4.0 coverage.

## Intentionally deferred

- Native bitmap aggregation as the default DELETE bitmap construction path.
- Plain Parquet target-scan optimization.
- Checksum shortcuts or stats rewrites beyond Delta's existing behavior.
- DELETE diagnostics/benchmark suite, which stays in the follow-up branch.
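For reviewers, the routing rule described under Scope can be pictured roughly as follows. This is a minimal sketch, not the code in this PR: `DeleteRoutingSketch`, `DeleteInfo`, `Route`, and the three boolean flags are hypothetical names invented for illustration. The sketch also assumes that "eligible" means persistent DVs are enabled, a row condition is present, and the delete is not metadata-only. The real check in `GlutenDeleteCommand` may use different or additional criteria.

```java
// Hypothetical sketch: none of these names are Gluten or Delta APIs.
final class DeleteRoutingSketch {

    /** Where a DELETE command is sent. */
    enum Route { GLUTEN_DV_DELETE, EXISTING_PATH }

    /** Simplified, assumed description of a DELETE command. */
    static final class DeleteInfo {
        final boolean persistentDvEnabled; // table uses persistent deletion vectors
        final boolean hasCondition;        // false models a full-table DELETE
        final boolean metadataOnly;        // assumed: satisfiable without touching rows

        DeleteInfo(boolean persistentDvEnabled, boolean hasCondition, boolean metadataOnly) {
            this.persistentDvEnabled = persistentDvEnabled;
            this.hasCondition = hasCondition;
            this.metadataOnly = metadataOnly;
        }
    }

    /** Only persistent-DV row-condition DELETEs leave the existing path. */
    static Route route(DeleteInfo d) {
        if (!d.persistentDvEnabled) {
            return Route.EXISTING_PATH; // ordinary DELETE
        }
        if (!d.hasCondition) {
            return Route.EXISTING_PATH; // full-table DELETE
        }
        if (d.metadataOnly) {
            return Route.EXISTING_PATH; // metadata-only DELETE
        }
        return Route.GLUTEN_DV_DELETE;
    }

    public static void main(String[] args) {
        // Print the route chosen for a row-condition DELETE and a full-table DELETE.
        System.out.println(route(new DeleteInfo(true, true, false)));
        System.out.println(route(new DeleteInfo(true, false, false)));
    }
}
```

The checks are written as early returns so that every non-eligible case falls back to the existing path by default, which mirrors the intent of keeping this slice minimal.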
## Validation

Local validation after rebasing onto `origin/split/delta-dv-dml-scan-safety` at `26dcd2dd38887d7fef1076ff7d2f210b3390f69a`:

- `git diff --check origin/split/delta-dv-dml-scan-safety...HEAD`
- `env JAVA_HOME=/opt/homebrew/opt/openjdk@17/libexec/openjdk.jdk/Contents/Home PATH=/opt/homebrew/opt/openjdk@17/bin:$PATH ./build/mvn -q test-compile -pl backends-velox -am -Pjava-17,spark-3.5,backends-velox,hadoop-3.3,spark-ut,delta -DskipTests`
- `env JAVA_HOME=/opt/homebrew/opt/openjdk@17/libexec/openjdk.jdk/Contents/Home PATH=/opt/homebrew/opt/openjdk@17/bin:$PATH ./build/mvn -q test-compile -pl backends-velox -am -Pjava-17,spark-4.0,scala-2.13,backends-velox,hadoop-3.3,spark-ut,delta -DskipTests`

Focused local ScalaTest execution was attempted for `DeleteSQLWithDeletionVectorsSuite` but this Mac checkout cannot start the Velox backend because `darwin/aarch64/libgluten.dylib` is not available. The run reached Spark startup and aborted before executing tests with `FileNotFoundException: darwin/aarch64/libgluten.dylib`. Treat native CI as the runtime correctness gate for this draft.
