Zoltan Borok-Nagy has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/21099 )
Change subject: IMPALA-12860: Invoke validateDataFilesExist for RowDelta operations
......................................................................

IMPALA-12860: Invoke validateDataFilesExist for RowDelta operations

We must invoke validateDataFilesExist for RowDelta operations
(DELETE/UPDATE/MERGE). Without this, a concurrent RewriteFiles
(compaction) and RowDelta can corrupt a table.

IcebergBufferedDeleteSink now also collects the filenames of the data
files that are referenced in the position delete files. It adds them to
the DML exec state, which is then collected by the Coordinator. The
Coordinator passes the file paths to CatalogD, which executes Iceberg's
RowDelta operation and now invokes validateDataFilesExist() with the
file paths. It also invokes validateDeletedFiles().

This patch set also resolves IMPALA-12640, which is about replacing
IcebergDeleteSink with IcebergBufferedDeleteSink, as from now on we use
the buffered version for all DML operations that write position delete
files.
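To illustrate the race described above, here is a toy model in Python. It is only a sketch under stated assumptions: ToyTable, rewrite_files, row_delta and dangling_deletes are hypothetical names invented for this example, not Impala or Iceberg code, and the check only mimics the idea behind validateDataFilesExist() (validateDeletedFiles() is not modeled).

```python
class ValidationError(Exception):
    """Raised when a commit fails its pre-commit validation."""


class ToyTable:
    """Hypothetical in-memory stand-in for a table's current snapshot."""

    def __init__(self, data_files):
        self.data_files = set(data_files)
        # Maps a data file path to the row positions deleted in it.
        self.position_deletes = {}

    def rewrite_files(self, old_files, new_file):
        """Compaction: replace old_files with a single new_file."""
        self.data_files -= set(old_files)
        self.data_files.add(new_file)

    def row_delta(self, deletes, validate=True):
        """Commit position deletes, given as {data_file: [positions]}."""
        if validate:
            # Reject the commit if any referenced data file is gone.
            missing = sorted(set(deletes) - self.data_files)
            if missing:
                raise ValidationError(f"missing data files: {missing}")
        for path, positions in deletes.items():
            self.position_deletes.setdefault(path, set()).update(positions)

    def dangling_deletes(self):
        """Delete entries that point at files no longer in the table."""
        return sorted(set(self.position_deletes) - self.data_files)


def demo(validate):
    table = ToyTable(["a.parquet", "b.parquet"])
    # A DELETE is planned against a.parquet before compaction commits.
    planned = {"a.parquet": [0, 3]}
    # A concurrent compaction commits first and removes a.parquet.
    table.rewrite_files(["a.parquet", "b.parquet"], "c.parquet")
    try:
        table.row_delta(planned, validate=validate)
    except ValidationError:
        return "rejected"
    return table.dangling_deletes()
```

In this model, demo(False) leaves a delete entry pointing at a file that compaction already removed, so the rows it was meant to delete would reappear in c.parquet. demo(True) rejects the commit instead, so the DML can be retried against the new snapshot.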
Testing:
 * adds new stress test with DELETE + UPDATE + OPTIMIZE

Change-Id: I4869eb863ff0afe8f691ccf2fd681a92d36b405c
Reviewed-on: http://gerrit.cloudera.org:8080/21099
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Reviewed-by: Gabor Kaszab <gaborkas...@cloudera.com>
---
M be/src/exec/CMakeLists.txt
M be/src/exec/iceberg-buffered-delete-sink.cc
M be/src/exec/iceberg-buffered-delete-sink.h
M be/src/exec/iceberg-delete-sink-config.cc
D be/src/exec/iceberg-delete-sink.cc
D be/src/exec/iceberg-delete-sink.h
M be/src/exec/multi-table-sink.cc
M be/src/runtime/dml-exec-state.cc
M be/src/runtime/dml-exec-state.h
M be/src/service/client-request-state.cc
M common/protobuf/control_service.proto
M common/thrift/CatalogService.thrift
M common/thrift/DataSinks.thrift
M fe/src/main/java/org/apache/impala/analysis/IcebergDeleteImpl.java
M fe/src/main/java/org/apache/impala/planner/IcebergBufferedDeleteSink.java
D fe/src/main/java/org/apache/impala/planner/IcebergDeleteSink.java
M fe/src/main/java/org/apache/impala/planner/TableSink.java
M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java
M testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-delete.test
M tests/stress/test_update_stress.py
20 files changed, 163 insertions(+), 580 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Gabor Kaszab: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/21099
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I4869eb863ff0afe8f691ccf2fd681a92d36b405c
Gerrit-Change-Number: 21099
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>