Zoltan Borok-Nagy has submitted this change and it was merged.
( http://gerrit.cloudera.org:8080/20903 )

Change subject: IMPALA-12708: An UPDATE creates 2 new snapshots in Iceberg tables
......................................................................

IMPALA-12708: An UPDATE creates 2 new snapshots in Iceberg tables

The current implementation of UPDATE creates the delete file(s) and the
new data file(s) for the updated row(s). These files are committed in
one Iceberg transaction, but the transaction adds two snapshots to the
table: the first contains the delete file(s), and the second adds the
new data file(s) of the updated row(s). Only the final snapshot (which
holds the consistent table state) is observable by concurrent readers,
but these "phantom snapshots" still make the commit history look
strange.

So instead of performing a RowDelta and an AppendFiles operation in a
single transaction, this change performs a single RowDelta operation
only.
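
As a rough illustration of the difference, here is a toy Python model.
It is not Impala or Iceberg code: ToyTable and both update_* functions
are made-up names, and the model simply assumes that every operation
committed inside a transaction adds one snapshot to the history.

```python
# Toy model only: not Impala or Iceberg code. Each operation committed in a
# transaction is assumed to add exactly one snapshot to the table history.
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    snapshots: list = field(default_factory=list)

    def commit_transaction(self, operations):
        # Every operation in the transaction produces its own snapshot.
        for name, files in operations:
            self.snapshots.append((name, tuple(files)))


def update_with_two_operations(table, delete_files, data_files):
    # Old behaviour: a RowDelta for the deletes, then AppendFiles for new rows.
    table.commit_transaction([
        ("row_delta", delete_files),
        ("append_files", data_files),
    ])


def update_with_single_row_delta(table, delete_files, data_files):
    # New behaviour: one RowDelta carries both the deletes and the new rows.
    table.commit_transaction([
        ("row_delta", list(delete_files) + list(data_files)),
    ])


old, new = ToyTable(), ToyTable()
update_with_two_operations(old, ["del-1.parquet"], ["data-1.parquet"])
update_with_single_row_delta(new, ["del-1.parquet"], ["data-1.parquet"])
print(len(old.snapshots), len(new.snapshots))  # prints: 2 1
```

In the model, the two-operation path leaves an intermediate entry in the
history, while the single RowDelta leaves exactly one entry per UPDATE.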

Another issue was that we also committed empty operations (e.g. UPDATEs
with zero records). These created redundant snapshots in the table
history. This patch also fixes that.
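
One plausible way to avoid such redundant snapshots is to skip the
commit when there is nothing to commit. The sketch below only
illustrates that idea: commit_update and table_history are hypothetical
names, and the actual check in the patch may look different.

```python
# Hypothetical sketch: names are illustrative, not Impala's actual code.
table_history = []


def commit_update(delete_files, data_files):
    """Record one snapshot for the UPDATE, unless it wrote no files."""
    if not delete_files and not data_files:
        # Nothing was written, so a commit would only add an empty snapshot.
        return False
    table_history.append({"deletes": list(delete_files),
                          "data": list(data_files)})
    return True


commit_update([], [])                                 # UPDATE of zero records
commit_update(["del-1.parquet"], ["data-1.parquet"])  # UPDATE of some records
print(len(table_history))  # prints: 1
```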

Testing:
 * added e2e test that checks the table history

Change-Id: I2ceb80b939c644388707b21061bf55451234dcd3
Reviewed-on: http://gerrit.cloudera.org:8080/20903
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Reviewed-by: Zoltan Borok-Nagy <borokna...@cloudera.com>
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java
M tests/query_test/test_iceberg.py
4 files changed, 134 insertions(+), 60 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Zoltan Borok-Nagy: Looks good to me, approved

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I2ceb80b939c644388707b21061bf55451234dcd3
Gerrit-Change-Number: 20903
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <npaptak...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tma...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
