[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Zoltan Borok-Nagy has uploaded this change for review. ( http://gerrit.cloudera.org:8080/20677 )

Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
..

IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables

This patch adds limited UPDATE support for Iceberg tables. The limitations mean users cannot update Iceberg tables if any of the following is true:

* the UPDATE sets the value of a partition column
* the table went through partition evolution
* the table has SORT BY properties

The above limitations will be resolved by part 3.

This patch implements UPDATEs with the merge-on-read technique. This means the UPDATE statement writes both data files and delete files. Data files contain the updated records; delete files contain the position delete records of the old data records that have been touched.

To achieve the above, this patch introduces a new sink: MultiDataSink. We can configure multiple TableSinks for a single MultiDataSink object. During execution, the row batches sent to the MultiDataSink are forwarded to all the TableSinks that have been registered.

The UPDATE statement for an Iceberg table creates a source select statement with all table columns and the virtual columns INPUT__FILE__NAME and FILE__POSITION. E.g. imagine we have a table 'tbl' with schema (i int, s string, k int), and we update the table with:

  UPDATE tbl SET k = 5 WHERE i % 100 = 11;

The generated source statement will be:

  SELECT i, s, 5, INPUT__FILE__NAME, FILE__POSITION FROM tbl WHERE i % 100 = 11;

Then we create two table sinks that refer to expressions from the above source statement:

  Insert sink: (i, s, 5)
  Delete sink: (INPUT__FILE__NAME, FILE__POSITION)

The tuples in the row batch of MultiDataSink contain slots for all the above expressions (i, s, 5, INPUT__FILE__NAME, FILE__POSITION). MultiDataSink forwards each row batch to each registered TableSink. Each sink picks its relevant expressions from the tuple and writes data/delete files.
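The fan-out described above (one MultiDataSink forwarding every row batch to both an insert sink and a delete sink, each projecting only its own slots) can be sketched roughly as follows. This is a minimal, self-contained illustration with hypothetical FakeTableSink/FakeMultiSink types and string-based rows, not Impala's actual DataSink API:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a row batch: each row is a vector of values.
using Row = std::vector<std::string>;
using RowBatch = std::vector<Row>;

// Minimal stand-in for a table sink: it projects the slot indexes it cares
// about from each row and records the projected rows.
class FakeTableSink {
 public:
  explicit FakeTableSink(std::vector<std::size_t> column_indexes)
      : column_indexes_(std::move(column_indexes)) {}

  void Send(const RowBatch& batch) {
    for (const Row& row : batch) {
      Row projected;
      for (std::size_t idx : column_indexes_) projected.push_back(row.at(idx));
      written_.push_back(std::move(projected));
    }
  }

  const std::vector<Row>& written() const { return written_; }

 private:
  std::vector<std::size_t> column_indexes_;
  std::vector<Row> written_;
};

// Fan-out sink: forwards every incoming batch to all registered child sinks,
// which each pick the expressions (slots) relevant to them.
class FakeMultiSink {
 public:
  void Register(FakeTableSink* sink) { sinks_.push_back(sink); }
  void Send(const RowBatch& batch) {
    for (FakeTableSink* sink : sinks_) sink->Send(batch);
  }

 private:
  std::vector<FakeTableSink*> sinks_;
};
```

For the example UPDATE in the commit message, the insert sink would be registered with the indexes of (i, s, 5) and the delete sink with those of (INPUT__FILE__NAME, FILE__POSITION).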
The tuples are sorted by INPUT__FILE__NAME and FILE__POSITION because we need to write the delete records in this order.

For partitioned tables we need to shuffle and sort the input tuples. In this case we also add the virtual columns "PARTITION__SPEC__ID" and "ICEBERG__PARTITION__SERIALIZED" to the source statement, and shuffle and sort the rows based on them.

Data files and delete files are now separated in the DmlExecState, so at the end of the operation we'll have two sets of files. We use these two sets to create a new Iceberg snapshot.

Why does this patch have these limitations? Because we are shuffling and sorting rows based on the delete records and their partitions. This means that the new data files might not get written in an efficient way, e.g. there will be too many of them, or we will need to keep too many open file handles during writing. Also, if the table has SORT BY properties, we cannot respect them, as the input rows are ordered in a way that favors the position deletes.

Patch 3 will introduce a buffering writer for position delete files. This means we will shuffle and sort records based on the data records' partitions and SORT BY properties, while delete records get buffered and written out at the end (sorted by file_path and position). In some edge cases the delete records might not get written efficiently, but that is a smaller problem than inefficient data files.
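The (file_path, position) ordering of position delete records mentioned above can be illustrated with a small sketch. DeleteRecord and SortForPositionDeletes are hypothetical names for illustration, not Impala code; std::pair's lexicographic operator< already gives the required ordering:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// A position-delete record: which data file the deleted row lives in, and the
// row's ordinal position within that file.
using DeleteRecord = std::pair<std::string, long>;

// Position delete records must be written ordered by (file_path, position);
// std::pair compares the file path first, then the position, which is exactly
// the lexicographic order we need.
std::vector<DeleteRecord> SortForPositionDeletes(std::vector<DeleteRecord> recs) {
  std::sort(recs.begin(), recs.end());
  return recs;
}
```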
Testing:
* negative tests
* Basic e2e testing with all supported data types

Testing TODO:
* partitioned tables
* authz
* planner test
* Impala/Hive interop tests
* concurrent tests

Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08
---
M be/src/exec/CMakeLists.txt M be/src/exec/data-sink.cc M be/src/exec/iceberg-delete-sink.cc M be/src/exec/iceberg-delete-sink.h A be/src/exec/multi-table-sink.cc A be/src/exec/multi-table-sink.h M be/src/exec/table-sink-base.cc M be/src/exec/table-sink-base.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/service/client-request-state.cc M common/protobuf/control_service.proto M common/thrift/DataSinks.thrift M common/thrift/ImpalaService.thrift M common/thrift/Types.thrift M fe/src/main/java/org/apache/impala/analysis/DeleteStmt.java M fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java M fe/src/main/java/org/apache/impala/analysis/DmlStatementBase.java M fe/src/main/java/org/apache/impala/analysis/IcebergDeleteImpl.java M fe/src/main/java/org/apache/impala/analysis/IcebergModifyImpl.java A fe/src/main/java/org/apache/impala/analysis/IcebergUpdateImpl.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/analysis/KuduModifyImpl.java M fe/src/main/java/org/apache/impala/analysis/ModifyImpl.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/OptimizeStmt.java M fe/src/mai
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/14359/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20677 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 Gerrit-Change-Number: 20677 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 07 Nov 2023 17:46:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 )

Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
..

Patch Set 1: (11 comments)

I read through once and had some quick comments. This is not a proper review :-(

http://gerrit.cloudera.org:8080/#/c/20677/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20677/1//COMMIT_MSG@14
PS1, Line 14: * Table has SORT BY properties
Is there also a restriction that TBLPROPERTIES must specify merge-on-read? Edit: I see Impala does set this, but there may be other tables that don't have it? Also only on iceberg v2 tables?

http://gerrit.cloudera.org:8080/#/c/20677/1//COMMIT_MSG@83
PS1, Line 83: Basice
Spelling: "Basic"

http://gerrit.cloudera.org:8080/#/c/20677/1/be/src/exec/multi-table-sink.h
File be/src/exec/multi-table-sink.h:

http://gerrit.cloudera.org:8080/#/c/20677/1/be/src/exec/multi-table-sink.h@32
PS1, Line 32: DataSink* CreateSink(RuntimeState* state) const override;
It would be good at some point to have descriptions for all the methods.
http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java
File fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java:

http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java@65
PS1, Line 65: additinal
Spelling: additional

http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/analysis/IcebergUpdateImpl.java
File fe/src/main/java/org/apache/impala/analysis/IcebergUpdateImpl.java:

http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/analysis/IcebergUpdateImpl.java@52
PS1, Line 52: protected
This and the next line could be private, I think.

http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/analysis/IcebergUpdateImpl.java@68
PS1, Line 68: spesc
Spelling: "specs"

http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/analysis/IcebergUpdateImpl.java@160
PS1, Line 160: cast
Nit: "Cast"

http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/analysis/IcebergUpdateImpl.java@161
PS1, Line 161: table
Nit: "table."
http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/analysis/ModifyImpl.java
File fe/src/main/java/org/apache/impala/analysis/ModifyImpl.java:

http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/analysis/ModifyImpl.java@42
PS1, Line 42: abstract void buildAndValidateSelectExprs(Analyzer analyzer,
Add description

http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/planner/MultiDataSink.java
File fe/src/main/java/org/apache/impala/planner/MultiDataSink.java:

http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/planner/MultiDataSink.java@30
PS1, Line 30: public class MultiDataSink extends DataSink {
Add description

http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/planner/Planner.java@276
PS1, Line 276: repartition
Nit: "Repartition"

-- To view, visit http://gerrit.cloudera.org:8080/20677 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 Gerrit-Change-Number: 20677 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 09 Nov 2023 17:45:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Noemi Pap-Takacs has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 )

Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
..

Patch Set 1: (1 comment)

I could not read it all through, just one question.

http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/analysis/KuduModifyImpl.java
File fe/src/main/java/org/apache/impala/analysis/KuduModifyImpl.java:

http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/analysis/KuduModifyImpl.java@167
PS1, Line 167: sortExprs_.add(ref);
'sortExprs_' is defined in ModifyImpl.java and has the following comment: 'For every column of the target table that is referenced in the optional 'sort.columns' table property, this list will contain the corresponding result expr from 'resultExprs_'. Before insertion, all rows will be sorted by these exprs. If the list is empty, no additional sorting by non-partitioning columns will be performed. The column list must not contain partition columns and must be empty for non-Hdfs tables.'
This comment does not align with line 167. As far as I understand, the comment suggests that partitioning columns should not be added to 'sortExprs_'. Should 'sortExprs_' be empty in KuduModifyImpl, since Kudu is not an HDFS table? Does anything rely on this variable? Is it shown in the plan? Or maybe the comment needs to be updated?

-- To view, visit http://gerrit.cloudera.org:8080/20677 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 Gerrit-Change-Number: 20677 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Comment-Date: Mon, 13 Nov 2023 10:43:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Hello Andrew Sherman, Daniel Becker, Gabor Kaszab, Noemi Pap-Takacs, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20677 to look at the new patch set (#2).

Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
..

IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables

This patch adds limited UPDATE support for Iceberg tables. The limitations mean users cannot update Iceberg tables if any of the following is true:

* the UPDATE sets the value of a partitioning column
* the table went through partition evolution
* the table has SORT BY properties

The above limitations will be resolved by part 3. The usual limitations, like writing non-Parquet files, using copy-on-write, and modifying V1 tables, are out of scope of IMPALA-12313.

This patch implements UPDATEs with the merge-on-read technique. This means the UPDATE statement writes both data files and delete files. Data files contain the updated records; delete files contain the position delete records of the old data records that have been touched.

To achieve the above, this patch introduces a new sink: MultiDataSink. We can configure multiple TableSinks for a single MultiDataSink object. During execution, the row batches sent to the MultiDataSink are forwarded to all the TableSinks that have been registered.

The UPDATE statement for an Iceberg table creates a source select statement with all table columns and the virtual columns INPUT__FILE__NAME and FILE__POSITION. E.g.
imagine we have a table 'tbl' with schema (i int, s string, k int), and we update the table with:

  UPDATE tbl SET k = 5 WHERE i % 100 = 11;

The generated source statement will be:

  SELECT i, s, 5, INPUT__FILE__NAME, FILE__POSITION FROM tbl WHERE i % 100 = 11;

Then we create two table sinks that refer to expressions from the above source statement:

  Insert sink: (i, s, 5)
  Delete sink: (INPUT__FILE__NAME, FILE__POSITION)

The tuples in the row batch of MultiDataSink contain slots for all the above expressions (i, s, 5, INPUT__FILE__NAME, FILE__POSITION). MultiDataSink forwards each row batch to each registered TableSink. Each sink picks its relevant expressions from the tuple and writes data/delete files.

The tuples are sorted by INPUT__FILE__NAME and FILE__POSITION because we need to write the delete records in this order.

For partitioned tables we need to shuffle and sort the input tuples. In this case we also add the virtual columns "PARTITION__SPEC__ID" and "ICEBERG__PARTITION__SERIALIZED" to the source statement, and shuffle and sort the rows based on them.

Data files and delete files are now separated in the DmlExecState, so at the end of the operation we'll have two sets of files. We use these two sets to create a new Iceberg snapshot.

Why does this patch have these limitations? Because we are shuffling and sorting rows based on the delete records and their partitions. This means that the new data files might not get written in an efficient way, e.g. there will be too many of them, or we will need to keep too many open file handles during writing. Also, if the table has SORT BY properties, we cannot respect them, as the input rows are ordered in a way that favors the position deletes.

Patch 3 will introduce a buffering writer for position delete files. This means we will shuffle and sort records based on the data records' partitions and SORT BY properties, while delete records get buffered and written out at the end (sorted by file_path and position).
In some edge cases the delete records might not get written efficiently, but that is a smaller problem than inefficient data files.

Testing:
* negative tests
* planner tests
* Basic e2e testing with all supported data types

Testing TODO:
* partitioned tables
* authz
* Impala/Hive interop tests
* concurrent tests

Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08
---
M be/src/exec/CMakeLists.txt M be/src/exec/data-sink.cc M be/src/exec/iceberg-delete-sink.cc M be/src/exec/iceberg-delete-sink.h A be/src/exec/multi-table-sink.cc A be/src/exec/multi-table-sink.h M be/src/exec/table-sink-base.cc M be/src/exec/table-sink-base.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/service/client-request-state.cc M common/protobuf/control_service.proto M common/thrift/DataSinks.thrift M common/thrift/ImpalaService.thrift M common/thrift/Types.thrift M fe/src/main/java/org/apache/impala/analysis/DeleteStmt.java M fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java M fe/src/main/java/org/apache/impala/analysis/DmlStatementBase.java M fe/src/main/java/org/apache/impala/analysis/IcebergDeleteImpl.java M fe/src/main/java/org/apache/impala/analysis/IcebergModifyImpl.java A fe/src/main/java/org/apache/impala/analysis/IcebergUpdateImpl.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/s
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/20677/2/be/src/exec/multi-table-sink.h File be/src/exec/multi-table-sink.h: http://gerrit.cloudera.org:8080/#/c/20677/2/be/src/exec/multi-table-sink.h@51 PS2, Line 51: /// MultiTableSink has multiple child table sink objects. It sends the received row batches line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/20677/2/fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java File fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java: http://gerrit.cloudera.org:8080/#/c/20677/2/fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java@196 PS2, Line 196: for (Map.Entry tableIdEntry : additionalTargetTableIds_.entrySet()) { line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/20677 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 Gerrit-Change-Number: 20677 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Comment-Date: Fri, 17 Nov 2023 15:10:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 )

Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
..

Patch Set 2: (12 comments)

Thanks for the comments. I'll add the remaining tests next week.

http://gerrit.cloudera.org:8080/#/c/20677/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20677/1//COMMIT_MSG@14
PS1, Line 14: * Table has SORT BY properties
> Is there also a restriction that TBLPROPERTIES must specify merge-on-read?
I listed the limitations that I'm planning to resolve in the scope of IMPALA-12313. I added an extra sentence to make it clear.

http://gerrit.cloudera.org:8080/#/c/20677/1//COMMIT_MSG@83
PS1, Line 83: ting:
> Spelling: "Basic"
Done

http://gerrit.cloudera.org:8080/#/c/20677/1/be/src/exec/multi-table-sink.h
File be/src/exec/multi-table-sink.h:

http://gerrit.cloudera.org:8080/#/c/20677/1/be/src/exec/multi-table-sink.h@32
PS1, Line 32: class MultiTableSinkConfig : public DataSinkConfig {
> It would be good at some point to have descriptions for all the methods.
Added comments.
http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java
File fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java:

http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java@65
PS1, Line 65: additiona
> Spelling: additional
Done

http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/analysis/IcebergUpdateImpl.java
File fe/src/main/java/org/apache/impala/analysis/IcebergUpdateImpl.java:

http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/analysis/IcebergUpdateImpl.java@52
PS1, Line 52: private L
> This and next line could be private I think
Done

http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/analysis/IcebergUpdateImpl.java@68
PS1, Line 68: specs
> Spelling: "specs"
Done

http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/analysis/IcebergUpdateImpl.java@160
PS1, Line 160: targ
> Nit: "Cast"
Done

http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/analysis/IcebergUpdateImpl.java@161
PS1, Line 161: n> co
> Nit: "table."
Done

http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/analysis/KuduModifyImpl.java
File fe/src/main/java/org/apache/impala/analysis/KuduModifyImpl.java:

http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/analysis/KuduModifyImpl.java@167
PS1, Line 167: selectList.add(new S
> 'sortExprs_' is defined in ModifyImpl.java and has the following comment:
Thanks for catching this. It didn't cause a problem because:
* addPartitioningColumn() was actually unused => removed it
* addKeyColumn was always invoked with isSortingColumn=false => removed this parameter.
* Planner.createPreDmlSort() ignored the partition exprs, only looked at sort exprs => now it uses both to behave like the commit message suggests, and in IcebergDeleteImpl and IcebergUpdateImpl I don't add the partition exprs to 'sortExprs_' anymore.

http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/analysis/ModifyImpl.java
File fe/src/main/java/org/apache/impala/analysis/ModifyImpl.java:

http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/analysis/ModifyImpl.java@42
PS1, Line 42: /**
> Add description
Done

http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/planner/MultiDataSink.java
File fe/src/main/java/org/apache/impala/planner/MultiDataSink.java:

http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/planner/MultiDataSink.java@30
PS1, Line 30: /**
> Add description
Done

http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

http://gerrit.cloudera.org:8080/#/c/20677/1/fe/src/main/java/org/apache/impala/planner/Planner.java@276
PS1, Line 276: Repartition
> Nit: "Repartition"
Done

-- To view, visit http://gerrit.cloudera.org:8080/20677 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 Gerrit-Change-Number: 20677 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 17 Nov 2023 15:12:25 + Gerrit-HasComments: Yes
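The reworked pre-DML sort described in the reply (ordering rows by the partition exprs first, then the SORT BY exprs, so each writer sees its partition's rows contiguously and already in SORT BY order) can be sketched as follows. RowForSort and PreDmlSort are hypothetical names for illustration, not the actual Planner code:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical row for illustration: a partition key plus one sort-column value.
struct RowForSort {
  std::string partition_key;  // value produced by the partition exprs
  int sort_col;               // value produced by the SORT BY exprs
};

// Pre-DML sort: compare by the partition exprs first, then by the sort exprs.
// std::tie builds tuples of references, giving lexicographic comparison.
void PreDmlSort(std::vector<RowForSort>* rows) {
  std::sort(rows->begin(), rows->end(),
            [](const RowForSort& a, const RowForSort& b) {
              return std::tie(a.partition_key, a.sort_col) <
                     std::tie(b.partition_key, b.sort_col);
            });
}
```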
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/14473/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20677 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 Gerrit-Change-Number: 20677 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 17 Nov 2023 15:36:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Hello Andrew Sherman, Daniel Becker, Gabor Kaszab, Noemi Pap-Takacs, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20677 to look at the new patch set (#3).

Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
..

IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables

This patch adds limited UPDATE support for Iceberg tables. The limitations mean users cannot update Iceberg tables if any of the following is true:

* the UPDATE sets the value of a partitioning column
* the table went through partition evolution
* the table has SORT BY properties

The above limitations will be resolved by part 3. The usual limitations, like writing non-Parquet files, using copy-on-write, and modifying V1 tables, are out of scope of IMPALA-12313.

This patch implements UPDATEs with the merge-on-read technique. This means the UPDATE statement writes both data files and delete files. Data files contain the updated records; delete files contain the position delete records of the old data records that have been touched.

To achieve the above, this patch introduces a new sink: MultiDataSink. We can configure multiple TableSinks for a single MultiDataSink object. During execution, the row batches sent to the MultiDataSink are forwarded to all the TableSinks that have been registered.

The UPDATE statement for an Iceberg table creates a source select statement with all table columns and the virtual columns INPUT__FILE__NAME and FILE__POSITION. E.g.
imagine we have a table 'tbl' with schema (i int, s string, k int), and we update the table with:

  UPDATE tbl SET k = 5 WHERE i % 100 = 11;

The generated source statement will be:

  SELECT i, s, 5, INPUT__FILE__NAME, FILE__POSITION FROM tbl WHERE i % 100 = 11;

Then we create two table sinks that refer to expressions from the above source statement:

  Insert sink: (i, s, 5)
  Delete sink: (INPUT__FILE__NAME, FILE__POSITION)

The tuples in the row batch of MultiDataSink contain slots for all the above expressions (i, s, 5, INPUT__FILE__NAME, FILE__POSITION). MultiDataSink forwards each row batch to each registered TableSink. Each sink picks its relevant expressions from the tuple and writes data/delete files.

The tuples are sorted by INPUT__FILE__NAME and FILE__POSITION because we need to write the delete records in this order.

For partitioned tables we need to shuffle and sort the input tuples. In this case we also add the virtual columns "PARTITION__SPEC__ID" and "ICEBERG__PARTITION__SERIALIZED" to the source statement, and shuffle and sort the rows based on them.

Data files and delete files are now separated in the DmlExecState, so at the end of the operation we'll have two sets of files. We use these two sets to create a new Iceberg snapshot.

Why does this patch have these limitations? Because we are shuffling and sorting rows based on the delete records and their partitions. This means that the new data files might not get written in an efficient way, e.g. there will be too many of them, or we will need to keep too many open file handles during writing. Also, if the table has SORT BY properties, we cannot respect them, as the input rows are ordered in a way that favors the position deletes.

Patch 3 will introduce a buffering writer for position delete files. This means we will shuffle and sort records based on the data records' partitions and SORT BY properties, while delete records get buffered and written out at the end (sorted by file_path and position).
In some edge cases the delete records might not get written efficiently, but that is a smaller problem than inefficient data files.

Testing:
* negative tests
* planner tests
* update all supported data types
* partitioned tables
* Impala/Hive interop tests
* authz tests
* concurrent tests

Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08
---
M be/src/exec/CMakeLists.txt M be/src/exec/data-sink.cc M be/src/exec/iceberg-delete-sink.cc M be/src/exec/iceberg-delete-sink.h A be/src/exec/multi-table-sink.cc A be/src/exec/multi-table-sink.h M be/src/exec/table-sink-base.cc M be/src/exec/table-sink-base.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/service/client-request-state.cc M common/protobuf/control_service.proto M common/thrift/DataSinks.thrift M common/thrift/ImpalaService.thrift M common/thrift/Types.thrift M fe/src/main/java/org/apache/impala/analysis/DeleteStmt.java M fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java M fe/src/main/java/org/apache/impala/analysis/DmlStatementBase.java M fe/src/main/java/org/apache/impala/analysis/IcebergDeleteImpl.java M fe/src/main/java/org/apache/impala/analysis/IcebergModifyImpl.java A fe/src/main/java/org/apache/impala/analysis/IcebergUpdateImpl.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/i
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/14517/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20677 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 Gerrit-Change-Number: 20677 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 24 Nov 2023 20:13:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9958/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/20677 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 Gerrit-Change-Number: 20677 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 24 Nov 2023 20:14:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 3: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/9958/ -- To view, visit http://gerrit.cloudera.org:8080/20677 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 Gerrit-Change-Number: 20677 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Sat, 25 Nov 2023 00:41:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 3: (50 comments) http://gerrit.cloudera.org:8080/#/c/20677/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20677/3//COMMIT_MSG@10 PS3, Line 10: any of : the followings are Nit: "any of the following is". http://gerrit.cloudera.org:8080/#/c/20677/3//COMMIT_MSG@46 PS3, Line 46: Insert sink (i, s, 5) Just an idea: How difficult / expensive would it be to check the rows we update whether the new row value differs from the old one? We could skip rows that already have the desired value. Even if it's worth the effort it should probably be in a different change. http://gerrit.cloudera.org:8080/#/c/20677/3//COMMIT_MSG@66 PS3, Line 66: Why this patch has Nit: "Why does this patch have..." would be better. http://gerrit.cloudera.org:8080/#/c/20677/3//COMMIT_MSG@80 PS3, Line 80: smaller problem :then inefficient data files Why is it a smaller problem? Is it because we expect that there is more data in data files compared to delete files? http://gerrit.cloudera.org:8080/#/c/20677/3/be/src/exec/iceberg-delete-sink.cc File be/src/exec/iceberg-delete-sink.cc: http://gerrit.cloudera.org:8080/#/c/20677/3/be/src/exec/iceberg-delete-sink.cc@117 PS3, Line 117: existing_partition Not modified in this change, but does this variable do anything? http://gerrit.cloudera.org:8080/#/c/20677/3/be/src/exec/iceberg-delete-sink.cc@137 PS3, Line 137: dml_exec_state_ Does 'dml_exec_state_' need to be the same as 'state->dml_exec_state()', which was used before this patch? If so, we could add a DCHECK. http://gerrit.cloudera.org:8080/#/c/20677/3/be/src/exec/multi-table-sink.h File be/src/exec/multi-table-sink.h: http://gerrit.cloudera.org:8080/#/c/20677/3/be/src/exec/multi-table-sink.h@69 PS3, Line 69: /// END. 
I'd consider writing the explanatory comment of BEGIN also here ("Following methods just delegate...") because in case other methods are added it could become difficult to understand what END refers to. http://gerrit.cloudera.org:8080/#/c/20677/3/be/src/exec/multi-table-sink.cc File be/src/exec/multi-table-sink.cc: http://gerrit.cloudera.org:8080/#/c/20677/3/be/src/exec/multi-table-sink.cc@52 PS3, Line 52: (TableSinkBase*) Instead of a C-style cast we could use static_cast, maybe with a DCHECK containing a dynamic cast. http://gerrit.cloudera.org:8080/#/c/20677/3/be/src/exec/multi-table-sink.cc@94 PS3, Line 94: closed_ = true; Shouldn't DataSink::Close() already set 'closed_' to true? We could also replace it with DCHECK(closed_). http://gerrit.cloudera.org:8080/#/c/20677/3/be/src/exec/table-sink-base.cc File be/src/exec/table-sink-base.cc: http://gerrit.cloudera.org:8080/#/c/20677/3/be/src/exec/table-sink-base.cc@419 PS3, Line 419: if (dml_exec_state == nullptr) dml_exec_state = state->dml_exec_state(); When is it possible that 'dml_exec_state' is not null and is not the same as 'state->dml_exec_state()'? For example, is it possible if 'is_delete' is false? http://gerrit.cloudera.org:8080/#/c/20677/3/be/src/exec/table-sink-base.cc@430 PS3, Line 430: IsIceberg() If 'is_delete' is true, shouldn't this always be true also? We could add a DCHECK for it. http://gerrit.cloudera.org:8080/#/c/20677/3/be/src/runtime/dml-exec-state.cc File be/src/runtime/dml-exec-state.cc: http://gerrit.cloudera.org:8080/#/c/20677/3/be/src/runtime/dml-exec-state.cc@483 PS3, Line 483: ASDF What does ASDF mean? http://gerrit.cloudera.org:8080/#/c/20677/3/be/src/runtime/dml-exec-state.cc@565 PS3, Line 565: is_iceberg Shouldn't 'is_iceberg' always be true for delete files? 
http://gerrit.cloudera.org:8080/#/c/20677/3/common/protobuf/control_service.proto File common/protobuf/control_service.proto: http://gerrit.cloudera.org:8080/#/c/20677/3/common/protobuf/control_service.proto@95 PS3, Line 95: 8 Did you intentionally skip 7? http://gerrit.cloudera.org:8080/#/c/20677/3/common/thrift/DataSinks.thrift File common/thrift/DataSinks.thrift: http://gerrit.cloudera.org:8080/#/c/20677/3/common/thrift/DataSinks.thrift@187 PS3, Line 187: 10: optional list data_sinks Maybe 'child_data_sinks' would be better. Could also add a comment that it is used for the children of MULTI_DATA_SINKs. http://gerrit.cloudera.org:8080/#/c/20677/3/fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java File fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java: http://gerrit.cloudera.org:8080/#/c/20677/3/fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java@65 PS3, Line 65: private final Map additionalTargetTableIds_ = new HashMap<>(); Shouldn't this come directly after 'targetTable_'? http://gerrit.cloudera.org:8080/#/c/20677/3/fe/src/main/java/org/apache/impala/analysis/IcebergDeleteImpl.java Fi
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/20677/3/fe/src/main/java/org/apache/impala/util/IcebergUtil.java File fe/src/main/java/org/apache/impala/util/IcebergUtil.java: http://gerrit.cloudera.org:8080/#/c/20677/3/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@1155 PS3, Line 1155: getIcebergPartitionTransformExpr The two 'getIcebergPartitionTransformExpr()' functions are only used here. Do they need to be public? -- To view, visit http://gerrit.cloudera.org:8080/20677 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 Gerrit-Change-Number: 20677 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 30 Nov 2023 13:36:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 4: (4 comments) http://gerrit.cloudera.org:8080/#/c/20677/4/tests/query_test/test_iceberg.py File tests/query_test/test_iceberg.py: http://gerrit.cloudera.org:8080/#/c/20677/4/tests/query_test/test_iceberg.py@1321 PS4, Line 1321: d flake8: E999 SyntaxError: invalid syntax http://gerrit.cloudera.org:8080/#/c/20677/4/tests/stress/test_update_stress.py File tests/stress/test_update_stress.py: http://gerrit.cloudera.org:8080/#/c/20677/4/tests/stress/test_update_stress.py@100 PS4, Line 100: u flake8: F821 undefined name 'updates' http://gerrit.cloudera.org:8080/#/c/20677/4/tests/stress/test_update_stress.py@101 PS4, Line 101: u flake8: F821 undefined name 'updates' http://gerrit.cloudera.org:8080/#/c/20677/4/tests/stress/test_update_stress.py@102 PS4, Line 102: u flake8: F821 undefined name 'updates' -- To view, visit http://gerrit.cloudera.org:8080/20677 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 Gerrit-Change-Number: 20677 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 01 Dec 2023 14:49:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Hello Andrew Sherman, Tamas Mate, Daniel Becker, Gabor Kaszab, Noemi Pap-Takacs, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20677 to look at the new patch set (#4). Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables This patch adds limited UPDATE support for Iceberg tables. The limitations mean users cannot update Iceberg tables if any of the following is true: * UPDATE value of partitioning column * UPDATE table that went through partition evolution * Table has SORT BY properties The above limitations will be resolved by part 3. The usual limitations, such as writing non-Parquet files, using copy-on-write, or modifying V1 tables, are out of scope of IMPALA-12313. This patch implements UPDATEs with the merge-on-read technique. This means the UPDATE statement writes both data files and delete files. Data files contain the updated records, while delete files contain the position delete records of the old data records that have been touched. To achieve the above, this patch introduces a new sink: MultiDataSink. We can configure multiple TableSinks for a single MultiDataSink object. During execution, the row batches sent to the MultiDataSink will be forwarded to all the TableSinks that have been registered. The UPDATE statement for an Iceberg table creates a source select statement with all table columns and virtual columns INPUT__FILE__NAME and FILE__POSITION. E.g. 
imagine we have a table 'tbl' with schema (i int, s string, k int), and we update the table with: UPDATE tbl SET k = 5 WHERE i % 100 = 11; The generated source statement will be ==> SELECT i, s, 5, INPUT__FILE__NAME, FILE__POSITION FROM tbl WHERE i % 100 = 11; Then we create two table sinks that refer to expressions from the above source statement: Insert sink (i, s, 5) Delete sink (INPUT__FILE__NAME, FILE__POSITION) The tuples in the row batch of MultiDataSink contain slots for all the above expressions (i, s, 5, INPUT__FILE__NAME, FILE__POSITION). MultiDataSink forwards each row batch to each registered TableSink. They will pick their relevant expressions from the tuple and write data/delete files. The tuples are sorted by INPUT__FILE__NAME and FILE__POSITION because we need to write the delete records in this order. For partitioned tables we need to shuffle and sort the input tuples. In this case we also add virtual columns "PARTITION__SPEC__ID" and "ICEBERG__PARTITION__SERIALIZED" to the source statement and shuffle and sort the rows based on them. Data files and delete files are now separated in the DmlExecState, so at the end of the operation we'll have two sets of files. We use these two sets to create a new Iceberg snapshot. Why does this patch have these limitations? - Because we are shuffling and sorting rows based on the delete records and their partitions. This means that the new data files might not get written in an efficient way, e.g. there will be too many of them, or we will need to keep too many open file handles during writing. Also, if the table has SORT BY properties, we cannot respect them as the input rows are ordered in a way that favors the position deletes. Patch 3 will introduce a buffering writer for position delete files. This means we will shuffle and sort records based on the data records' partitions and SORT BY properties while delete records get buffered and written out at the end (sorted by file_path and position). 
In some edge cases the delete records might not get written efficiently, but it is a smaller problem than inefficient data files. Testing: * negative tests * planner tests * update all supported data types * partitioned tables * Impala/Hive interop tests * authz tests * concurrent tests Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 --- M be/src/exec/CMakeLists.txt M be/src/exec/data-sink.cc M be/src/exec/iceberg-delete-sink.cc M be/src/exec/iceberg-delete-sink.h A be/src/exec/multi-table-sink.cc A be/src/exec/multi-table-sink.h M be/src/exec/table-sink-base.cc M be/src/exec/table-sink-base.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/service/client-request-state.cc M common/protobuf/control_service.proto M common/thrift/DataSinks.thrift M common/thrift/ImpalaService.thrift M common/thrift/Types.thrift M fe/src/main/java/org/apache/impala/analysis/DeleteStmt.java M fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java M fe/src/main/java/org/apache/impala/analysis/DmlStatementBase.java M fe/src/main/java/org/apache/impala/analysis/IcebergDeleteImpl.java M fe/src/main/java/org/apache/impala/analysis/IcebergModifyImpl.java A fe/src/main/java/org/apache/impala/analysis/IcebergUpdateImpl.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/ja
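The UPDATE-to-source-SELECT rewrite described in the commit message above can be sketched in a few lines. This is a hypothetical Python illustration (the function name and the string-based approach are made up for clarity); Impala's planner performs this rewrite on expression trees, not on SQL text.

```python
# Illustrative sketch: substitute SET expressions into the table's column
# list and append the virtual columns needed to emit position deletes.
# Names here are hypothetical; see IcebergUpdateImpl.java for the real code.

VIRTUAL_COLS = ["INPUT__FILE__NAME", "FILE__POSITION"]

def build_source_select(table, columns, set_exprs, where):
    """Build the source statement for an UPDATE: each column is replaced by
    its SET expression if one exists, then the virtual columns are added."""
    select_list = [set_exprs.get(c, c) for c in columns] + VIRTUAL_COLS
    return "SELECT {} FROM {} WHERE {}".format(", ".join(select_list), table, where)

# UPDATE tbl SET k = 5 WHERE i % 100 = 11;  for schema (i int, s string, k int)
stmt = build_source_select("tbl", ["i", "s", "k"], {"k": "5"}, "i % 100 = 11")
print(stmt)
# SELECT i, s, 5, INPUT__FILE__NAME, FILE__POSITION FROM tbl WHERE i % 100 = 11
```

The insert sink then reads the first three expressions of each row and the delete sink the last two, matching the example in the commit message.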
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9977/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/20677 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 Gerrit-Change-Number: 20677 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 01 Dec 2023 14:56:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Hello Andrew Sherman, Tamas Mate, Daniel Becker, Gabor Kaszab, Noemi Pap-Takacs, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20677 to look at the new patch set (#5). Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables This patch adds limited UPDATE support for Iceberg tables. The limitations mean users cannot update Iceberg tables if any of the following is true: * UPDATE value of partitioning column * UPDATE table that went through partition evolution * Table has SORT BY properties The above limitations will be resolved by part 3. The usual limitations, such as writing non-Parquet files, using copy-on-write, or modifying V1 tables, are out of scope of IMPALA-12313. This patch implements UPDATEs with the merge-on-read technique. This means the UPDATE statement writes both data files and delete files. Data files contain the updated records, while delete files contain the position delete records of the old data records that have been touched. To achieve the above, this patch introduces a new sink: MultiDataSink. We can configure multiple TableSinks for a single MultiDataSink object. During execution, the row batches sent to the MultiDataSink will be forwarded to all the TableSinks that have been registered. The UPDATE statement for an Iceberg table creates a source select statement with all table columns and virtual columns INPUT__FILE__NAME and FILE__POSITION. E.g. 
imagine we have a table 'tbl' with schema (i int, s string, k int), and we update the table with: UPDATE tbl SET k = 5 WHERE i % 100 = 11; The generated source statement will be ==> SELECT i, s, 5, INPUT__FILE__NAME, FILE__POSITION FROM tbl WHERE i % 100 = 11; Then we create two table sinks that refer to expressions from the above source statement: Insert sink (i, s, 5) Delete sink (INPUT__FILE__NAME, FILE__POSITION) The tuples in the row batch of MultiDataSink contain slots for all the above expressions (i, s, 5, INPUT__FILE__NAME, FILE__POSITION). MultiDataSink forwards each row batch to each registered TableSink. They will pick their relevant expressions from the tuple and write data/delete files. The tuples are sorted by INPUT__FILE__NAME and FILE__POSITION because we need to write the delete records in this order. For partitioned tables we need to shuffle and sort the input tuples. In this case we also add virtual columns "PARTITION__SPEC__ID" and "ICEBERG__PARTITION__SERIALIZED" to the source statement and shuffle and sort the rows based on them. Data files and delete files are now separated in the DmlExecState, so at the end of the operation we'll have two sets of files. We use these two sets to create a new Iceberg snapshot. Why does this patch have these limitations? - Because we are shuffling and sorting rows based on the delete records and their partitions. This means that the new data files might not get written in an efficient way, e.g. there will be too many of them, or we will need to keep too many open file handles during writing. Also, if the table has SORT BY properties, we cannot respect them as the input rows are ordered in a way that favors the position deletes. Patch 3 will introduce a buffering writer for position delete files. This means we will shuffle and sort records based on the data records' partitions and SORT BY properties while delete records get buffered and written out at the end (sorted by file_path and position). 
In some edge cases the delete records might not get written efficiently, but it is a smaller problem than inefficient data files. Testing: * negative tests * planner tests * update all supported data types * partitioned tables * Impala/Hive interop tests * authz tests * concurrent tests Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 --- M be/src/exec/CMakeLists.txt M be/src/exec/data-sink.cc M be/src/exec/iceberg-delete-sink.cc M be/src/exec/iceberg-delete-sink.h A be/src/exec/multi-table-sink.cc A be/src/exec/multi-table-sink.h M be/src/exec/table-sink-base.cc M be/src/exec/table-sink-base.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/service/client-request-state.cc M common/protobuf/control_service.proto M common/thrift/DataSinks.thrift M common/thrift/ImpalaService.thrift M common/thrift/Types.thrift M fe/src/main/java/org/apache/impala/analysis/DeleteStmt.java M fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java M fe/src/main/java/org/apache/impala/analysis/DmlStatementBase.java M fe/src/main/java/org/apache/impala/analysis/IcebergDeleteImpl.java M fe/src/main/java/org/apache/impala/analysis/IcebergModifyImpl.java A fe/src/main/java/org/apache/impala/analysis/IcebergUpdateImpl.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/ja
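The MultiDataSink fan-out described in the commit message above (one parent sink forwarding each row batch to every registered child sink, each child reading only its own expressions) can be illustrated with a minimal sketch. This is hypothetical Python for illustration only; the real MultiDataSink in be/src/exec/multi-table-sink.cc operates on Impala row batches and TableSink objects.

```python
# Hypothetical sketch of MultiDataSink-style fan-out. Each child "sink" is
# configured with the slot positions it cares about and ignores the rest.

class SlicingSink:
    def __init__(self, slots):
        self.slots = slots  # positions of the row this sink reads
        self.rows = []      # stand-in for the data/delete files it would write

    def send(self, batch):
        for row in batch:
            self.rows.append(tuple(row[i] for i in self.slots))

class MultiSink:
    def __init__(self, sinks):
        self.sinks = sinks

    def send(self, batch):
        # Forward the same batch to every registered child sink.
        for sink in self.sinks:
            sink.send(batch)

# Row layout from the example: (i, s, 5, INPUT__FILE__NAME, FILE__POSITION)
insert_sink = SlicingSink([0, 1, 2])   # writes data files
delete_sink = SlicingSink([3, 4])      # writes position delete files
multi = MultiSink([insert_sink, delete_sink])
multi.send([(11, "a", 5, "f1.parq", 0), (111, "b", 5, "f1.parq", 7)])
```

After the call, the insert sink has seen only the updated column values and the delete sink only the (file, position) pairs, mirroring how each registered TableSink picks its relevant expressions from the shared tuple.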
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 4: (55 comments) Thanks for the comments! http://gerrit.cloudera.org:8080/#/c/20677/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20677/3//COMMIT_MSG@10 PS3, Line 10: any of : the following is t > Nit: "any of the following is". Done http://gerrit.cloudera.org:8080/#/c/20677/3//COMMIT_MSG@46 PS3, Line 46: Insert sink (i, s, 5) > Just an idea: How difficult / expensive would it be to check the rows we up That's a good idea. In simple cases this might be doable by adding extra predicates: E.g.: UPDATE tbl SET k = 3 WHERE i > 4; ==> UPDATE tbl SET k = 3 WHERE i > 4 AND k != 3; Other than these planner-side extra predicates, I wouldn't complicate the backend executors. But yeah, let's think more about this and do it separately. Filed IMPALA-12588. http://gerrit.cloudera.org:8080/#/c/20677/3//COMMIT_MSG@66 PS3, Line 66: Why does this patc > Nit: "Why does this patch have..." would be better. Done http://gerrit.cloudera.org:8080/#/c/20677/3//COMMIT_MSG@80 PS3, Line 80: smaller problem :then inefficient data files > Why is it a smaller problem? Is it because we expect that there is more dat Yes, I was thinking about having as few data files as possible and having the data in a good order, so you get good encoding and page filtering capabilities. I don't think the other way around, i.e. as few delete files as possible but more data files than necessary, would be beneficial. http://gerrit.cloudera.org:8080/#/c/20677/3/be/src/exec/iceberg-delete-sink.cc File be/src/exec/iceberg-delete-sink.cc: http://gerrit.cloudera.org:8080/#/c/20677/3/be/src/exec/iceberg-delete-sink.cc@117 PS3, Line 117: .size(), 2); > Not modified in this change, but does this variable do anything? Removed. 
http://gerrit.cloudera.org:8080/#/c/20677/3/be/src/exec/iceberg-delete-sink.cc@137 PS3, Line 137: "If thi > Does 'dml_exec_state_' need to be the same as 'state->dml_exec_state()', wh No, dml_exec_state_ is the delete sink's own object. When we are doing the UPDATE and having two sinks (insert sink, delete sink), it's not possible to use the same dml_exec_state object simultaneously without violating some preconditions. So IcebergDeleteSink now has its own which is merged into state->dml_exec_state() in Close(). Added a comment about this to make it clear. http://gerrit.cloudera.org:8080/#/c/20677/3/be/src/exec/multi-table-sink.h File be/src/exec/multi-table-sink.h: http://gerrit.cloudera.org:8080/#/c/20677/3/be/src/exec/multi-table-sink.h@69 PS3, Line 69: /// END: Methods above just delegate calls to the child sinks in 'table_sinks_'. > I'd consider writing the explanatory comment of BEGIN also here ("Following Done http://gerrit.cloudera.org:8080/#/c/20677/3/be/src/exec/multi-table-sink.cc File be/src/exec/multi-table-sink.cc: http://gerrit.cloudera.org:8080/#/c/20677/3/be/src/exec/multi-table-sink.cc@52 PS3, Line 52: > Instead of a C-style cast we could use static_cast, maybe with a DCHECK con Since this is not hot code I'm using dynamic_cast and DCHECK_NOTNULL. http://gerrit.cloudera.org:8080/#/c/20677/3/be/src/exec/multi-table-sink.cc@94 PS3, Line 94: DataSink::Close(state); > Shouldn't DataSink::Close() already set 'closed_' to true? We could also re Done http://gerrit.cloudera.org:8080/#/c/20677/3/be/src/exec/table-sink-base.cc File be/src/exec/table-sink-base.cc: http://gerrit.cloudera.org:8080/#/c/20677/3/be/src/exec/table-sink-base.cc@419 PS3, Line 419: if (dml_exec_state == nullptr) dml_exec_state = state->dml_exec_state(); > When is it possible that 'dml_exec_state' is not null and is not the same a By default we are using state->dml_exec_state(), unless the caller provides an explicit dml_exec_state. 
Currently the Iceberg Delete Sink provides explicit dml exec state because it has its own. But I don't want to make assumptions here like 'is_delete' <==> 'dml_exec_state != nullptr', because they are independent parameters, the connection between them might be just temporary. In fact, in part 3 the IcebergDeleteSink might lose its own dml_exec_state as UPDATEs will use a different delete sink. http://gerrit.cloudera.org:8080/#/c/20677/3/be/src/exec/table-sink-base.cc@430 PS3, Line 430: > If 'is_delete' is true, shouldn't this always be true also? We could add a Done http://gerrit.cloudera.org:8080/#/c/20677/3/be/src/runtime/dml-exec-state.cc File be/src/runtime/dml-exec-state.cc: http://gerrit.cloudera.org:8080/#/c/20677/3/be/src/runtime/dml-exec-state.cc@483 PS3, Line 483: et_n > What does ASDF mean? ASDF does not mean anything, it's just easy to type and search for in log messages during debugging. Removed. http://gerrit.cloudera.org:8080/#/c/20677/3
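The extra-predicate idea discussed in the review exchange above (skip rows that already hold the target value, filed as IMPALA-12588) can be illustrated with a small rewrite sketch. This is hypothetical and not part of this patch; the function name and string-based rewrite are made up for illustration.

```python
# Hypothetical illustration of the planner-side rewrite discussed above:
#   UPDATE tbl SET k = 3 WHERE i > 4  ==>  ... WHERE (i > 4) AND k != 3
# so rows that already have the desired value are never rewritten.
# Note: a real implementation would need NULL-aware semantics (e.g. an
# IS DISTINCT FROM comparison), since 'k != 3' filters out rows where k is NULL.

def add_noop_skip_predicates(where, set_exprs):
    """Append 'col != value' for each constant SET expression."""
    extra = ["{} != {}".format(col, val) for col, val in sorted(set_exprs.items())]
    return " AND ".join(["({})".format(where)] + extra)

print(add_noop_skip_predicates("i > 4", {"k": "3"}))
# (i > 4) AND k != 3
```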
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 4: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/14566/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/20677 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 Gerrit-Change-Number: 20677 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 01 Dec 2023 15:16:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/14567/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20677 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 Gerrit-Change-Number: 20677 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 01 Dec 2023 15:23:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 5: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/9977/ -- To view, visit http://gerrit.cloudera.org:8080/20677 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 Gerrit-Change-Number: 20677 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 01 Dec 2023 19:22:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9979/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/20677 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 Gerrit-Change-Number: 20677 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 04 Dec 2023 11:58:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Hello Andrew Sherman, Tamas Mate, Daniel Becker, Gabor Kaszab, Noemi Pap-Takacs, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20677 to look at the new patch set (#7). Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables This patch adds limited UPDATE support for Iceberg tables. The limitations mean users cannot update Iceberg tables if any of the following is true: * UPDATE value of partitioning column * UPDATE table that went through partition evolution * Table has SORT BY properties The above limitations will be resolved by part 3. The usual limitations, such as writing non-Parquet files, using copy-on-write, or modifying V1 tables, are out of scope of IMPALA-12313. This patch implements UPDATEs with the merge-on-read technique. This means the UPDATE statement writes both data files and delete files. Data files contain the updated records, while delete files contain the position delete records of the old data records that have been touched. To achieve the above, this patch introduces a new sink: MultiDataSink. We can configure multiple TableSinks for a single MultiDataSink object. During execution, the row batches sent to the MultiDataSink will be forwarded to all the TableSinks that have been registered. The UPDATE statement for an Iceberg table creates a source select statement with all table columns and virtual columns INPUT__FILE__NAME and FILE__POSITION. E.g. 
imagine we have a table 'tbl' with schema (i int, s string, k int), and we update the table with: UPDATE tbl SET k = 5 WHERE i % 100 = 11; The generated source statement will be ==> SELECT i, s, 5, INPUT__FILE__NAME, FILE__POSITION FROM tbl WHERE i % 100 = 11; Then we create two table sinks that refer to expressions from the above source statement: Insert sink (i, s, 5) Delete sink (INPUT__FILE__NAME, FILE__POSITION) The tuples in the row batch of MultiDataSink contain slots for all the above expressions (i, s, 5, INPUT__FILE__NAME, FILE__POSITION). MultiDataSink forwards each row batch to each registered TableSink. They will pick their relevant expressions from the tuple and write data/delete files. The tuples are sorted by INPUT__FILE__NAME and FILE__POSITION because we need to write the delete records in this order. For partitioned tables we need to shuffle and sort the input tuples. In this case we also add virtual columns "PARTITION__SPEC__ID" and "ICEBERG__PARTITION__SERIALIZED" to the source statement and shuffle and sort the rows based on them. Data files and delete files are now separated in the DmlExecState, so at the end of the operation we'll have two sets of files. We use these two sets to create a new Iceberg snapshot. Why does this patch have these limitations? - Because we are shuffling and sorting rows based on the delete records and their partitions. This means that the new data files might not get written in an efficient way, e.g. there will be too many of them, or we will need to keep too many open file handles during writing. Also, if the table has SORT BY properties, we cannot respect them as the input rows are ordered in a way that favors the position deletes. Patch 3 will introduce a buffering writer for position delete files. This means we will shuffle and sort records based on the data records' partitions and SORT BY properties while delete records get buffered and written out at the end (sorted by file_path and position). 
In some edge cases the delete records might not get written efficiently, but it is a smaller problem than inefficient data files. Testing: * negative tests * planner tests * update all supported data types * partitioned tables * Impala/Hive interop tests * authz tests * concurrent tests Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 --- M be/src/exec/CMakeLists.txt M be/src/exec/data-sink.cc M be/src/exec/iceberg-delete-sink.cc M be/src/exec/iceberg-delete-sink.h A be/src/exec/multi-table-sink.cc A be/src/exec/multi-table-sink.h M be/src/exec/table-sink-base.cc M be/src/exec/table-sink-base.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/service/client-request-state.cc M common/protobuf/control_service.proto M common/thrift/DataSinks.thrift M common/thrift/ImpalaService.thrift M common/thrift/Types.thrift M fe/src/main/java/org/apache/impala/analysis/DeleteStmt.java M fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java M fe/src/main/java/org/apache/impala/analysis/DmlStatementBase.java M fe/src/main/java/org/apache/impala/analysis/IcebergDeleteImpl.java M fe/src/main/java/org/apache/impala/analysis/IcebergModifyImpl.java A fe/src/main/java/org/apache/impala/analysis/IcebergUpdateImpl.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/ja
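The commit message above notes that delete records must be written sorted by file path and position. Since position delete records are (file_path, position) pairs, a lexicographic sort on that pair yields the required order; the sketch below illustrates this with made-up file names.

```python
# Position delete records must be written ordered by (file_path, position).
# Python tuples compare lexicographically, so a plain sort gives that order.
# The data values are invented for illustration.

deletes = [
    ("data/f2.parq", 3),
    ("data/f1.parq", 9),
    ("data/f1.parq", 2),
]
ordered = sorted(deletes)  # sorts by file_path first, then by position
```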
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/14580/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20677 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 Gerrit-Change-Number: 20677 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 04 Dec 2023 12:25:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 7: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/20677 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 Gerrit-Change-Number: 20677 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 04 Dec 2023 16:39:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 7: (18 comments) http://gerrit.cloudera.org:8080/#/c/20677/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20677/7//COMMIT_MSG@49 PS7, Line 49: The tuples in the rowbatch of MultiDataSink contain slots for all the Idea for the future: in case of a data and a delete sink, couldn't we only send the relevant fields to each sink, i.e. (i, s, 5) to the data sink and (INPUT__FILE__NAME, FILE__POSITION) to the delete sink? Or are we just passing pointers to them without copying the data? http://gerrit.cloudera.org:8080/#/c/20677/7//COMMIT_MSG@66 PS7, Line 66: has Nit: have. http://gerrit.cloudera.org:8080/#/c/20677/7/be/src/exec/iceberg-delete-sink.cc File be/src/exec/iceberg-delete-sink.cc: http://gerrit.cloudera.org:8080/#/c/20677/7/be/src/exec/iceberg-delete-sink.cc@120 PS7, Line 120: auto I think the concrete type (ScalarExpr*) would be more readable. Applies to the next line too. http://gerrit.cloudera.org:8080/#/c/20677/7/be/src/exec/iceberg-delete-sink.cc@125 PS7, Line 125: auto filepath_eval = output_expr_evals_[0]; I think the concrete type (ScalarExprEvaluator*) would be more readable. Applies to the next line too. http://gerrit.cloudera.org:8080/#/c/20677/7/be/src/exec/iceberg-delete-sink.cc@131 PS7, Line 131: val Why do we take the 'val' field? It is an int64_t which is then implicitly converted (back) to BigIntVal. Even if the original BigIntVal was NULL, the int64_t will be converted to a non-NULL BigIntVal. http://gerrit.cloudera.org:8080/#/c/20677/7/be/src/exec/iceberg-delete-sink.cc@133 PS7, Line 133: string filepath(reinterpret_cast<const char*>(filepath_sv.ptr), filepath_sv.len); Couldn't we store the StringVal instance in 'prev_file_path_' instead of an std::string? The buffer should be valid during this function, and then we wouldn't have to copy the string.
http://gerrit.cloudera.org:8080/#/c/20677/7/be/src/exec/iceberg-delete-sink.cc@137 PS7, Line 137: UPDATE statement We could write "UPDATE statement with a JOIN". http://gerrit.cloudera.org:8080/#/c/20677/7/be/src/exec/iceberg-delete-sink.cc@140 PS7, Line 140: prev_file_path_ = filepath; 'prev_file_path_' and 'prev_position_' are only used in this function, they could be local variables. Or are you concerned about the re-allocation between row batches? http://gerrit.cloudera.org:8080/#/c/20677/7/be/src/exec/table-sink-base.cc File be/src/exec/table-sink-base.cc: http://gerrit.cloudera.org:8080/#/c/20677/7/be/src/exec/table-sink-base.cc@430 PS7, Line 430: dml_exec_state->AddCreatedDeleteFile(*partition, Optional: we could still add DCHECK(IsIceberg()) here. http://gerrit.cloudera.org:8080/#/c/20677/7/be/src/runtime/dml-exec-state.h File be/src/runtime/dml-exec-state.h: http://gerrit.cloudera.org:8080/#/c/20677/7/be/src/runtime/dml-exec-state.h@82 PS7, Line 82: /// 'insert_stats' contains stats for the Iceberg delete file. Could mention that it has to be Iceberg. http://gerrit.cloudera.org:8080/#/c/20677/7/fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java File fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java: http://gerrit.cloudera.org:8080/#/c/20677/7/fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java@59 PS7, Line 59: Nit: unneeded spaces. http://gerrit.cloudera.org:8080/#/c/20677/7/fe/src/main/java/org/apache/impala/analysis/IcebergModifyImpl.java File fe/src/main/java/org/apache/impala/analysis/IcebergModifyImpl.java: http://gerrit.cloudera.org:8080/#/c/20677/7/fe/src/main/java/org/apache/impala/analysis/IcebergModifyImpl.java@74 PS7, Line 74: public List<Expr> getSortExprs() { Could add @Override.
http://gerrit.cloudera.org:8080/#/c/20677/7/fe/src/main/java/org/apache/impala/planner/MultiDataSink.java File fe/src/main/java/org/apache/impala/planner/MultiDataSink.java: http://gerrit.cloudera.org:8080/#/c/20677/7/fe/src/main/java/org/apache/impala/planner/MultiDataSink.java@132 PS7, Line 132: MAX_FS_WRITERS I think we should also mention that 'maxNumberOfMultiSinks_' is set to 'MAX_FS_WRITERS'. http://gerrit.cloudera.org:8080/#/c/20677/3/testdata/workloads/functional-query/queries/QueryTest/iceberg-update-basic.test File testdata/workloads/functional-query/queries/QueryTest/iceberg-update-basic.test: http://gerrit.cloudera.org:8080/#/c/20677/3/testdata/workloads/functional-query/queries/QueryTest/iceberg-update-basic.test@176 PS3, Line 176: update ice_alltypes set bigint_col = 33, int_col = 3, string_col = 'three'; > It wasn't intentional, but I'm thinking about keeping it anyway, especially Great, this query being duplicate was the thing that gave me the idea for the optimisation :) http://gerrit.cloudera.org:8080/#/c/20677/3/testdata/workloads/functional-query/queries
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 7: (1 comment) Great change Zoltan! Started to review and tested it a bit. The MultiTableSink counters in the profile seem to be off; the counters are not updated based on the sum of "child" sinks. http://gerrit.cloudera.org:8080/#/c/20677/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20677/7//COMMIT_MSG@75 PS7, Line 75: Patch nit: Part -- To view, visit http://gerrit.cloudera.org:8080/20677 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 Gerrit-Change-Number: 20677 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 06 Dec 2023 14:16:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Hello Andrew Sherman, Tamas Mate, Daniel Becker, Gabor Kaszab, Noemi Pap-Takacs, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20677 to look at the new patch set (#8). Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables This patch adds limited UPDATE support for Iceberg tables. The limitations mean users cannot update Iceberg tables if any of the following is true: * UPDATE value of partitioning column * UPDATE table that went through partition evolution * Table has SORT BY properties The above limitations will be resolved by part 3. The usual limitations like writing non-Parquet files, using copy-on-write, modifying V1 tables are out of scope of IMPALA-12313. This patch implements UPDATEs with the merge-on-read technique. This means the UPDATE statement writes both data files and delete files. Data files contain the updated records, delete files contain the position delete records of the old data records that have been touched. To achieve the above this patch introduces a new sink: MultiDataSink. We can configure multiple TableSinks for a single MultiDataSink object. During execution, the row batches sent to the MultiDataSink will be forwarded to all the TableSinks that have been registered. The UPDATE statement for an Iceberg table creates a source select statement with all table columns and virtual columns INPUT__FILE__NAME and FILE__POSITION. E.g. 
imagine we have a table 'tbl' with schema (i int, s string, k int), and we update the table with: UPDATE tbl SET k = 5 WHERE i % 100 = 11; The generated source statement will be ==> SELECT i, s, 5, INPUT__FILE__NAME, FILE__POSITION FROM tbl WHERE i % 100 = 11; Then we create two table sinks that refer to expressions from the above source statement: Insert sink (i, s, 5) Delete sink (INPUT__FILE__NAME, FILE__POSITION) The tuples in the rowbatch of MultiDataSink contain slots for all the above expressions (i, s, 5, INPUT__FILE__NAME, FILE__POSITION). MultiDataSink forwards each row batch to each registered TableSink. They will pick their relevant expressions from the tuple and write data/delete files. The tuples are sorted by INPUT__FILE__NAME and FILE__POSITION because we need to write the delete records in this order. For partitioned tables we need to shuffle and sort the input tuples. In this case we also add virtual columns "PARTITION__SPEC__ID" and "ICEBERG__PARTITION__SERIALIZED" to the source statement and shuffle and sort the rows based on them. Data files and delete files are now separated in the DmlExecState, so at the end of the operation we'll have two sets of files. We use these two sets to create a new Iceberg snapshot. Why does this patch have the limitations? - Because we are shuffling and sorting rows based on the delete records and their partitions. This means that the new data files might not get written in an efficient way, e.g. there will be too many of them, or we will need to keep too many open file handles during writing. Also, if the table has SORT BY properties, we cannot respect it as the input rows are ordered in a way to favor the position deletes. Patch 3 will introduce a buffering writer for position delete files. This means we will shuffle and sort records based on the data records' partitions and SORT BY properties while delete records get buffered and written out at the end (sorted by file_path and position).
In some edge cases the delete records might not get written efficiently, but it is a smaller problem than inefficient data files. Testing: * negative tests * planner tests * update all supported data types * partitioned tables * Impala/Hive interop tests * authz tests * concurrent tests Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 --- M be/src/exec/CMakeLists.txt M be/src/exec/data-sink.cc M be/src/exec/iceberg-delete-sink.cc M be/src/exec/iceberg-delete-sink.h A be/src/exec/multi-table-sink.cc A be/src/exec/multi-table-sink.h M be/src/exec/table-sink-base.cc M be/src/exec/table-sink-base.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/service/client-request-state.cc M common/protobuf/control_service.proto M common/thrift/DataSinks.thrift M common/thrift/ImpalaService.thrift M common/thrift/Types.thrift M fe/src/main/java/org/apache/impala/analysis/DeleteStmt.java M fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java M fe/src/main/java/org/apache/impala/analysis/DmlStatementBase.java M fe/src/main/java/org/apache/impala/analysis/IcebergDeleteImpl.java M fe/src/main/java/org/apache/impala/analysis/IcebergModifyImpl.java A fe/src/main/java/org/apache/impala/analysis/IcebergUpdateImpl.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/j
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Hello Andrew Sherman, Tamas Mate, Daniel Becker, Gabor Kaszab, Noemi Pap-Takacs, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20677 to look at the new patch set (#9). Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables This patch adds limited UPDATE support for Iceberg tables. The limitations mean users cannot update Iceberg tables if any of the following is true: * UPDATE value of partitioning column * UPDATE table that went through partition evolution * Table has SORT BY properties The above limitations will be resolved by part 3. The usual limitations like writing non-Parquet files, using copy-on-write, modifying V1 tables are out of scope of IMPALA-12313. This patch implements UPDATEs with the merge-on-read technique. This means the UPDATE statement writes both data files and delete files. Data files contain the updated records, delete files contain the position delete records of the old data records that have been touched. To achieve the above this patch introduces a new sink: MultiDataSink. We can configure multiple TableSinks for a single MultiDataSink object. During execution, the row batches sent to the MultiDataSink will be forwarded to all the TableSinks that have been registered. The UPDATE statement for an Iceberg table creates a source select statement with all table columns and virtual columns INPUT__FILE__NAME and FILE__POSITION. E.g. 
imagine we have a table 'tbl' with schema (i int, s string, k int), and we update the table with: UPDATE tbl SET k = 5 WHERE i % 100 = 11; The generated source statement will be ==> SELECT i, s, 5, INPUT__FILE__NAME, FILE__POSITION FROM tbl WHERE i % 100 = 11; Then we create two table sinks that refer to expressions from the above source statement: Insert sink (i, s, 5) Delete sink (INPUT__FILE__NAME, FILE__POSITION) The tuples in the rowbatch of MultiDataSink contain slots for all the above expressions (i, s, 5, INPUT__FILE__NAME, FILE__POSITION). MultiDataSink forwards each row batch to each registered TableSink. They will pick their relevant expressions from the tuple and write data/delete files. The tuples are sorted by INPUT__FILE__NAME and FILE__POSITION because we need to write the delete records in this order. For partitioned tables we need to shuffle and sort the input tuples. In this case we also add virtual columns "PARTITION__SPEC__ID" and "ICEBERG__PARTITION__SERIALIZED" to the source statement and shuffle and sort the rows based on them. Data files and delete files are now separated in the DmlExecState, so at the end of the operation we'll have two sets of files. We use these two sets to create a new Iceberg snapshot. Why does this patch have the limitations? - Because we are shuffling and sorting rows based on the delete records and their partitions. This means that the new data files might not get written in an efficient way, e.g. there will be too many of them, or we will need to keep too many open file handles during writing. Also, if the table has SORT BY properties, we cannot respect it as the input rows are ordered in a way to favor the position deletes. Part 3 will introduce a buffering writer for position delete files. This means we will shuffle and sort records based on the data records' partitions and SORT BY properties while delete records get buffered and written out at the end (sorted by file_path and position).
In some edge cases the delete records might not get written efficiently, but it is a smaller problem than inefficient data files. Testing: * negative tests * planner tests * update all supported data types * partitioned tables * Impala/Hive interop tests * authz tests * concurrent tests Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 --- M be/src/exec/CMakeLists.txt M be/src/exec/data-sink.cc M be/src/exec/iceberg-delete-sink.cc M be/src/exec/iceberg-delete-sink.h A be/src/exec/multi-table-sink.cc A be/src/exec/multi-table-sink.h M be/src/exec/table-sink-base.cc M be/src/exec/table-sink-base.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/service/client-request-state.cc M common/protobuf/control_service.proto M common/thrift/DataSinks.thrift M common/thrift/ImpalaService.thrift M common/thrift/Types.thrift M fe/src/main/java/org/apache/impala/analysis/DeleteStmt.java M fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java M fe/src/main/java/org/apache/impala/analysis/DmlStatementBase.java M fe/src/main/java/org/apache/impala/analysis/IcebergDeleteImpl.java M fe/src/main/java/org/apache/impala/analysis/IcebergModifyImpl.java A fe/src/main/java/org/apache/impala/analysis/IcebergUpdateImpl.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/ja
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 9: (15 comments) Thanks for the comments. I'll look at the profile counters. Maybe we can fix them in a separate patch. http://gerrit.cloudera.org:8080/#/c/20677/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20677/7//COMMIT_MSG@49 PS7, Line 49: The tuples in the rowbatch of MultiDataSink contain slots for all the > Idea for the future: in case of a data and a delete sink, couldn't we only Inside fragments we are just passing pointers. Across fragments we are serializing and shuffling the row batches. To send only the relevant fields, we would need a fork exchange operator. In the design doc I had a note about this (see "Fork the transmission..."): https://docs.google.com/document/d/1GuRiJ3jjqkwINsSCKYaWwcfXHzbMrsd3WEMDOB11Xqw/edit#heading=h.5bmfhbmb4qdk http://gerrit.cloudera.org:8080/#/c/20677/7//COMMIT_MSG@66 PS7, Line 66: hav > Nit: have. Done http://gerrit.cloudera.org:8080/#/c/20677/7//COMMIT_MSG@75 PS7, Line 75: Part > nit: Part Done http://gerrit.cloudera.org:8080/#/c/20677/7/be/src/exec/iceberg-delete-sink.cc File be/src/exec/iceberg-delete-sink.cc: http://gerrit.cloudera.org:8080/#/c/20677/7/be/src/exec/iceberg-delete-sink.cc@120 PS7, Line 120: Scal > I think the concrete type (ScalarExpr*) would be more readable. Applies to Done http://gerrit.cloudera.org:8080/#/c/20677/7/be/src/exec/iceberg-delete-sink.cc@125 PS7, Line 125: ScalarExprEvaluator* filepath_eval = output_expr_evals_[0]; > I think the concrete type (ScalarExprEvaluator*) would be more readable. Ap Done http://gerrit.cloudera.org:8080/#/c/20677/7/be/src/exec/iceberg-delete-sink.cc@131 PS7, Line 131: > Why do we take the 'val' field?
It is an int64_t which is then implicitly c Done http://gerrit.cloudera.org:8080/#/c/20677/7/be/src/exec/iceberg-delete-sink.cc@133 PS7, Line 133: string filepath(reinterpret_cast<const char*>(filepath_sv.ptr), filepath_sv.len); > Couldn't we store the StringVal instance in 'prev_file_path_' instead of an We need prev_file_path_ to be valid across invocations, e.g. when there is a duplicate between row batches. http://gerrit.cloudera.org:8080/#/c/20677/7/be/src/exec/iceberg-delete-sink.cc@137 PS7, Line 137: UPDATE statement > We could write "UPDATE statement with a JOIN". Done http://gerrit.cloudera.org:8080/#/c/20677/7/be/src/exec/iceberg-delete-sink.cc@140 PS7, Line 140: prev_file_path_ = filepath; > 'prev_file_path_' and 'prev_position_' are only used in this function, they We need them to keep their states across invocations, e.g. when the duplicates are at row batch boundaries. http://gerrit.cloudera.org:8080/#/c/20677/7/be/src/exec/table-sink-base.cc File be/src/exec/table-sink-base.cc: http://gerrit.cloudera.org:8080/#/c/20677/7/be/src/exec/table-sink-base.cc@430 PS7, Line 430: DCHECK(IsIceberg()); > Optional: we could still add DCHECK(IsIceberg()) here. Done http://gerrit.cloudera.org:8080/#/c/20677/7/be/src/runtime/dml-exec-state.h File be/src/runtime/dml-exec-state.h: http://gerrit.cloudera.org:8080/#/c/20677/7/be/src/runtime/dml-exec-state.h@82 PS7, Line 82: /// can only be called for Iceberg tables. > Could mention that it has to be Iceberg. Done http://gerrit.cloudera.org:8080/#/c/20677/7/fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java File fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java: http://gerrit.cloudera.org:8080/#/c/20677/7/fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java@59 PS7, Line 59: // > Nit: unneeded spaces.
Done http://gerrit.cloudera.org:8080/#/c/20677/7/fe/src/main/java/org/apache/impala/analysis/IcebergModifyImpl.java File fe/src/main/java/org/apache/impala/analysis/IcebergModifyImpl.java: http://gerrit.cloudera.org:8080/#/c/20677/7/fe/src/main/java/org/apache/impala/analysis/IcebergModifyImpl.java@74 PS7, Line 74: @Override > Could add @Override. Done http://gerrit.cloudera.org:8080/#/c/20677/7/fe/src/main/java/org/apache/impala/planner/MultiDataSink.java File fe/src/main/java/org/apache/impala/planner/MultiDataSink.java: http://gerrit.cloudera.org:8080/#/c/20677/7/fe/src/main/java/org/apache/impala/planner/MultiDataSink.java@132 PS7, Line 132: MAX_FS_WRITERS > I think we should also mention that 'maxNumberOfMultiSinks_' is set to 'MAX I added a comment to the variable. http://gerrit.cloudera.org:8080/#/c/20677/3/testdata/workloads/functional-query/queries/QueryTest/iceberg-update-basic.test File testdata/workloads/functional-query/queries/QueryTest/iceberg-update-basic.test: http://gerrit.cloudera.org:8080/#/c/20677/3/testdata/workloads/functional-query/queries/QueryTest/iceberg-update-basic.test@200 PS3, Line 200: insert into ref_table values (0, 111, 'IMPALA', '2023-11-07'), (3, 222, 'ICEBER
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/14598/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/20677 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 Gerrit-Change-Number: 20677 Gerrit-PatchSet: 8 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 06 Dec 2023 14:44:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 9: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/20677 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 Gerrit-Change-Number: 20677 Gerrit-PatchSet: 9 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 06 Dec 2023 15:23:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 9: (4 comments) I went through this patch mostly for my own education. Left some minor questions or comments, nothing serious. Great work Zoli! http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/analysis/ModifyImpl.java File fe/src/main/java/org/apache/impala/analysis/ModifyImpl.java: http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/analysis/ModifyImpl.java@54 PS9, Line 54: abstract void substituteResultExprs(ExprSubstitutionMap smap, Analyzer analyzer); nit: could you add a comment for this function http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/util/IcebergUtil.java File fe/src/main/java/org/apache/impala/util/IcebergUtil.java: http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@1214 PS9, Line 1214: private static Expr getIcebergPartitionTransformExpr(IcebergPartitionField partField, Peter's DROP PARTITION patch added some new functionality for converting strings into partition values (?) if I'm not mistaken. Just mentioning it here to see if any of those could be re-used. http://gerrit.cloudera.org:8080/#/c/20677/9/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test File testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test: http://gerrit.cloudera.org:8080/#/c/20677/9/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test@753 PS9, Line 753: AuthorizationException: User '$USER' does not have privileges to access: functional_parquet.iceberg_v2_delete_positional Isn't the error msg misleading? It says the user doesn't have permissions on the table, but in fact the issue is that there is column masking on the table.
Anyway, this use case made me think: wouldn't it make sense to allow the user to update these tables if they don't change the masked columns? I can understand that from an implementation point of view this could be difficult. http://gerrit.cloudera.org:8080/#/c/20677/9/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test File testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test: http://gerrit.cloudera.org:8080/#/c/20677/9/testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test@647 PS9, Line 647: masked tables nit: I guess these are not masked tables, but tables with row filtering. -- To view, visit http://gerrit.cloudera.org:8080/20677 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 Gerrit-Change-Number: 20677 Gerrit-PatchSet: 9 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 07 Dec 2023 13:36:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 9: (8 comments) http://gerrit.cloudera.org:8080/#/c/20677/9/be/src/exec/iceberg-delete-sink.h File be/src/exec/iceberg-delete-sink.h: http://gerrit.cloudera.org:8080/#/c/20677/9/be/src/exec/iceberg-delete-sink.h@81 PS9, Line 81: /// Verifies that the row batch does not contain duplicated rows. Could you add an explanation of intent, it is not clear for me why this is needed. http://gerrit.cloudera.org:8080/#/c/20677/9/be/src/exec/iceberg-delete-sink.h@82 PS9, Line 82: VerifyRowBatch Nit: For readability could you rename to something like: VerifyRowsNotDuplicated/VerifyRowDistinctiveness http://gerrit.cloudera.org:8080/#/c/20677/9/be/src/exec/multi-table-sink.cc File be/src/exec/multi-table-sink.cc: http://gerrit.cloudera.org:8080/#/c/20677/9/be/src/exec/multi-table-sink.cc@52 PS9, Line 52: tsinkBase nit: tsink_base http://gerrit.cloudera.org:8080/#/c/20677/9/be/src/runtime/dml-exec-state.h File be/src/runtime/dml-exec-state.h: http://gerrit.cloudera.org:8080/#/c/20677/9/be/src/runtime/dml-exec-state.h@72 PS9, Line 72: void UpdatePartition(const std::string& partition_name, : int64_t num_rows_delta, const DmlStatsPB* insert_stats, : bool is_delete = false); nit: should fit into 2 lines http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java File fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java: http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java@120 PS9, Line 120: id nit: ++nextTableId_ and return nextTableId_ http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/analysis/IcebergModifyImpl.java File fe/src/main/java/org/apache/impala/analysis/IcebergModifyImpl.java: 
http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/analysis/IcebergModifyImpl.java@87 PS9, Line 87: public List<Expr> getDeletePartitionExprs(Analyzer analyzer) : throws AnalysisException { nit: should fit into 1 line http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/analysis/IcebergModifyImpl.java@94 PS9, Line 94: public List<Expr> getDeleteResultExprs(Analyzer analyzer) : throws AnalysisException { nit: should fit into 1 line http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/planner/Planner.java File fe/src/main/java/org/apache/impala/planner/Planner.java: http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/planner/Planner.java@185 PS9, Line 185: nit: indentation -- To view, visit http://gerrit.cloudera.org:8080/20677 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 Gerrit-Change-Number: 20677 Gerrit-PatchSet: 9 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 07 Dec 2023 17:55:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 9: (1 comment) Looks good to me! I can plus two after these comments. http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/analysis/UpdateStmt.java File fe/src/main/java/org/apache/impala/analysis/UpdateStmt.java: http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/analysis/UpdateStmt.java@63 PS9, Line 63: // Currently only Kudu tables are supported. Not anymore :) -- To view, visit http://gerrit.cloudera.org:8080/20677 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 Gerrit-Change-Number: 20677 Gerrit-PatchSet: 9 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 08 Dec 2023 11:32:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Noemi Pap-Takacs has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 9: (2 comments) http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/planner/MultiDataSink.java File fe/src/main/java/org/apache/impala/planner/MultiDataSink.java: http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/planner/MultiDataSink.java@137 PS9, Line 137: // If there are more nodes than instances where the fragment was initially This comment is a bit hard to understand. It seems to me that the two parts of the sentence are about two different cases: 1. more and 2. less nodes than instances. Maybe it was supposed to mean the following: If there are less nodes than instances where the fragment was initially planned to run on, then the instances will be distributed evenly across them. (If there are more nodes, then I guess some nodes get exactly one instance and some get none, so the number of instances limit the number of nodes that run the fragment.) Do I not understand it correctly? It would be nice to add a clearer explanation. http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/planner/MultiDataSink.java@138 PS9, Line 138: then nit: on -- Gerrit-Comment-Date: Fri, 08 Dec 2023 16:24:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Hello Andrew Sherman, Tamas Mate, Daniel Becker, Gabor Kaszab, Noemi Pap-Takacs, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/20677 to look at the new patch set (#10). Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables This patch adds limited UPDATE support for Iceberg tables. The limitations mean users cannot update Iceberg tables if any of the following is true: * UPDATE value of partitioning column * UPDATE table that went through partition evolution * Table has SORT BY properties The above limitations will be resolved by part 3. The usual limitations like writing non-Parquet files, using copy-on-write, modifying V1 tables are out of scope of IMPALA-12313. This patch implements UPDATEs with the merge-on-read technique. This means the UPDATE statement writes both data files and delete files. Data files contain the updated records, delete files contain the position delete records of the old data records that have been touched. To achieve the above this patch introduces a new sink: MultiDataSink. We can configure multiple TableSinks for a single MultiDataSink object. During execution, the row batches sent to the MultiDataSink will be forwarded to all the TableSinks that have been registered. The UPDATE statement for an Iceberg table creates a source select statement with all table columns and virtual columns INPUT__FILE__NAME and FILE__POSITION. E.g. 
imagine we have a table 'tbl' with schema (i int, s string, k int), and we update the table with: UPDATE tbl SET k = 5 WHERE i % 100 = 11; The generated source statement will be ==> SELECT i, s, 5, INPUT__FILE__NAME, FILE__POSITION FROM tbl WHERE i % 100 = 11; Then we create two table sinks that refer to expressions from the above source statement: Insert sink (i, s, 5) Delete sink (INPUT__FILE__NAME, FILE__POSITION) The tuples in the row batch of MultiDataSink contain slots for all the above expressions (i, s, 5, INPUT__FILE__NAME, FILE__POSITION). MultiDataSink forwards each row batch to each registered TableSink. They will pick their relevant expressions from the tuple and write data/delete files. The tuples are sorted by INPUT__FILE__NAME and FILE__POSITION because we need to write the delete records in this order. For partitioned tables we need to shuffle and sort the input tuples. In this case we also add virtual columns "PARTITION__SPEC__ID" and "ICEBERG__PARTITION__SERIALIZED" to the source statement and shuffle and sort the rows based on them. Data files and delete files are now separated in the DmlExecState, so at the end of the operation we'll have two sets of files. We use these two sets to create a new Iceberg snapshot. Why does this patch have the limitations? - Because we are shuffling and sorting rows based on the delete records and their partitions. This means that the new data files might not get written in an efficient way, e.g. there will be too many of them, or we will need to keep too many open file handles during writing. Also, if the table has SORT BY properties, we cannot respect them as the input rows are ordered in a way to favor the position deletes. Part 3 will introduce a buffering writer for position delete files. This means we will shuffle and sort records based on the data records' partitions and SORT BY properties while delete records get buffered and written out at the end (sorted by file_path and position). 
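To make the sink layout concrete, here is a minimal, hypothetical Python model of the flow described in the commit message (the real implementation is Impala's C++/Java MultiDataSink and TableSink classes; all names below are illustrative only): each source tuple carries slots (i, s, 5, INPUT__FILE__NAME, FILE__POSITION), the multi-sink forwards every row batch to all registered sinks, and each sink projects out only the slots it cares about.

```python
class InsertSink:
    """Models the data-file writer: consumes the data columns (i, s, 5)."""
    def __init__(self):
        self.data_rows = []

    def send(self, batch):
        self.data_rows.extend(row[:3] for row in batch)


class DeleteSink:
    """Models the position-delete writer: consumes
    (INPUT__FILE__NAME, FILE__POSITION)."""
    def __init__(self):
        self.delete_rows = []

    def send(self, batch):
        self.delete_rows.extend(row[3:] for row in batch)


class MultiDataSink:
    """Forwards every incoming row batch to all registered sinks."""
    def __init__(self, *sinks):
        self.sinks = sinks

    def send(self, batch):
        for sink in self.sinks:
            sink.send(batch)


# Toy input for: UPDATE tbl SET k = 5 WHERE i % 100 = 11
# Each tuple is (i, s, k, file, position); file names are made up.
table = [
    (11, "a", 3, "f1.parq", 0),
    (42, "x", 1, "f1.parq", 3),   # not matched by the WHERE clause
    (111, "b", 7, "f1.parq", 4),
    (211, "c", 9, "f2.parq", 1),
]

# The rewritten source statement:
#   SELECT i, s, 5, INPUT__FILE__NAME, FILE__POSITION
#   FROM tbl WHERE i % 100 = 11
# with rows sorted by (file, position), because position deletes must be
# written in that order.
source = sorted(
    ((i, s, 5, f, p) for (i, s, _k, f, p) in table if i % 100 == 11),
    key=lambda r: (r[3], r[4]))

insert_sink, delete_sink = InsertSink(), DeleteSink()
MultiDataSink(insert_sink, delete_sink).send(source)
```

After the batch is forwarded, the insert sink holds the updated data records (11, "a", 5), (111, "b", 5), (211, "c", 5), while the delete sink holds the position deletes ("f1.parq", 0), ("f1.parq", 4), ("f2.parq", 1) in sorted order, matching the two file sets the snapshot commit needs.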
In some edge cases the delete records might not get written efficiently, but it is a smaller problem than inefficient data files. Testing: * negative tests * planner tests * update all supported data types * partitioned tables * Impala/Hive interop tests * authz tests * concurrent tests Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 --- M be/src/exec/CMakeLists.txt M be/src/exec/data-sink.cc M be/src/exec/iceberg-delete-sink.cc M be/src/exec/iceberg-delete-sink.h A be/src/exec/multi-table-sink.cc A be/src/exec/multi-table-sink.h M be/src/exec/table-sink-base.cc M be/src/exec/table-sink-base.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/service/client-request-state.cc M common/protobuf/control_service.proto M common/thrift/DataSinks.thrift M common/thrift/ImpalaService.thrift M common/thrift/Types.thrift M fe/src/main/java/org/apache/impala/analysis/DeleteStmt.java M fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java M fe/src/main/java/org/apache/impala/analysis/DmlStatementBase.java M fe/src/main/java/org/apache/impala/analysis/IcebergDeleteImpl.java M fe/src/main/java/org/apache/impala/analysis/IcebergModifyImpl.java A fe/src/main/java/org/apache/impala/analysis/IcebergUpdateImpl.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/j
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 10: (15 comments) Thanks for the comments! http://gerrit.cloudera.org:8080/#/c/20677/9/be/src/exec/iceberg-delete-sink.h File be/src/exec/iceberg-delete-sink.h: http://gerrit.cloudera.org:8080/#/c/20677/9/be/src/exec/iceberg-delete-sink.h@81 PS9, Line 81: /// Verifies that the row batch does not contain duplicated rows. > Could you add an explanation of intent, it is not clear for me why this is Added explanation. http://gerrit.cloudera.org:8080/#/c/20677/9/be/src/exec/iceberg-delete-sink.h@82 PS9, Line 82: the context of > Nit: For readability could you rename to something like: VerifyRowsNotDupli Done http://gerrit.cloudera.org:8080/#/c/20677/9/be/src/exec/multi-table-sink.cc File be/src/exec/multi-table-sink.cc: http://gerrit.cloudera.org:8080/#/c/20677/9/be/src/exec/multi-table-sink.cc@52 PS9, Line 52: tsink_bas > nit: tsink_base Done http://gerrit.cloudera.org:8080/#/c/20677/9/be/src/runtime/dml-exec-state.h File be/src/runtime/dml-exec-state.h: http://gerrit.cloudera.org:8080/#/c/20677/9/be/src/runtime/dml-exec-state.h@72 PS9, Line 72: void UpdatePartition(const std::string& partition_name, int64_t num_rows_delta, : const DmlStatsPB* insert_stats, bool is_delete = false); : > nit: should fit into 2 lines Done http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java File fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java: http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java@120 PS9, Line 120: id > nit: ++nextTableId_ and return nextTableId_ We want to use the original value (before the increment) in the put() and return stmt. 
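The DescriptorTable exchange above (use the id's value before the increment in both the put() call and the return statement) can be sketched with a toy allocator; the names below are illustrative, not Impala's actual code:

```python
class TableIdAllocator:
    """Hands out consecutive table ids. Both the map entry and the returned
    id must use the pre-increment value, which is why a plain
    'increment then return' rewrite would be wrong here."""
    def __init__(self):
        self.next_table_id = 0
        self.tables = {}

    def register(self, table):
        table_id = self.next_table_id  # capture the original value...
        self.next_table_id += 1        # ...then advance for the next caller
        self.tables[table_id] = table  # put() uses the original value
        return table_id                # and so does the return statement


alloc = TableIdAllocator()
first = alloc.register("tbl_a")
second = alloc.register("tbl_b")
```

Here first is 0 and second is 1, and the map keys match the returned ids; incrementing before the put() would shift every entry by one.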
http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/analysis/IcebergModifyImpl.java File fe/src/main/java/org/apache/impala/analysis/IcebergModifyImpl.java: http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/analysis/IcebergModifyImpl.java@87 PS9, Line 87: public List getDeletePartitionExprs(Analyzer analyzer) throws AnalysisException { : if (!originalTargetTable_.is > nit: should fit into 1 line Done http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/analysis/IcebergModifyImpl.java@94 PS9, Line 94: return getSlotRefs(analyzer, Lists.newArrayList( : "INPUT__FILE__NAME", "FI > nit: should fit into 1 line Done http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/analysis/ModifyImpl.java File fe/src/main/java/org/apache/impala/analysis/ModifyImpl.java: http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/analysis/ModifyImpl.java@54 PS9, Line 54: /** > nit: could you add a comment for this function Sure, added comment. http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/analysis/UpdateStmt.java File fe/src/main/java/org/apache/impala/analysis/UpdateStmt.java: http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/analysis/UpdateStmt.java@63 PS9, Line 63: // Currently Kudu and Iceberg tables are sup > Not anymore :) Done :) http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/planner/MultiDataSink.java File fe/src/main/java/org/apache/impala/planner/MultiDataSink.java: http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/planner/MultiDataSink.java@137 PS9, Line 137: > This comment is a bit hard to understand. It seems to me that the two parts Sorry about the confusion. This was copy-pasted from HdfsTableSink, and unfortunately not even being used by the planner. 
Filed IMPALA-12587 to fix that, until that I think it's best to remove these unused methods to avoid further confusion. http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/planner/MultiDataSink.java@138 PS9, Line 138: > nit: on Done http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/planner/Planner.java File fe/src/main/java/org/apache/impala/planner/Planner.java: http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/planner/Planner.java@185 PS9, Line 185: root > nit: indentation Done http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/util/IcebergUtil.java File fe/src/main/java/org/apache/impala/util/IcebergUtil.java: http://gerrit.cloudera.org:8080/#/c/20677/9/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@1214 PS9, Line 1214: private static Expr getIcebergPartitionTransformExpr(IcebergPartitionField partField, > Peter's DROP PARTITION patch added some new fun
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Noemi Pap-Takacs has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 10: Code-Review+1 Thanks, LGTM -- Gerrit-Comment-Date: Fri, 08 Dec 2023 16:56:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/14624/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- Gerrit-Comment-Date: Fri, 08 Dec 2023 17:08:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 11: Code-Review+2 -- Gerrit-Comment-Date: Fri, 08 Dec 2023 17:17:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 10: Code-Review+2 Fantastic change, thanks! -- Gerrit-Comment-Date: Fri, 08 Dec 2023 17:16:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 11: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9987/ DRY_RUN=false -- Gerrit-Comment-Date: Fri, 08 Dec 2023 17:17:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 11: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/9987/ -- Gerrit-Comment-Date: Fri, 08 Dec 2023 21:52:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 11: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9989/ DRY_RUN=false -- Gerrit-Comment-Date: Fri, 08 Dec 2023 22:38:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables This patch adds limited UPDATE support for Iceberg tables. The limitations mean users cannot update Iceberg tables if any of the following is true: * UPDATE value of partitioning column * UPDATE table that went through partition evolution * Table has SORT BY properties The above limitations will be resolved by part 3. The usual limitations like writing non-Parquet files, using copy-on-write, modifying V1 tables are out of scope of IMPALA-12313. This patch implements UPDATEs with the merge-on-read technique. This means the UPDATE statement writes both data files and delete files. Data files contain the updated records, delete files contain the position delete records of the old data records that have been touched. To achieve the above this patch introduces a new sink: MultiDataSink. We can configure multiple TableSinks for a single MultiDataSink object. During execution, the row batches sent to the MultiDataSink will be forwarded to all the TableSinks that have been registered. The UPDATE statement for an Iceberg table creates a source select statement with all table columns and virtual columns INPUT__FILE__NAME and FILE__POSITION. E.g. imagine we have a table 'tbl' with schema (i int, s string, k int), and we update the table with: UPDATE tbl SET k = 5 WHERE i % 100 = 11; The generated source statement will be ==> SELECT i, s, 5, INPUT__FILE__NAME, FILE__POSITION FROM tbl WHERE i % 100 = 11; Then we create two table sinks that refer to expressions from the above source statement: Insert sink (i, s, 5) Delete sink (INPUT__FILE__NAME, FILE__POSITION) The tuples in the rowbatch of MultiDataSink contain slots for all the above expressions (i, s, 5, INPUT__FILE__NAME, FILE__POSITION). 
MultiDataSink forwards each row batch to each registered TableSink. They will pick their relevant expressions from the tuple and write data/delete files. The tuples are sorted by INPUT__FILE__NAME and FILE__POSITION because we need to write the delete records in this order. For partitioned tables we need to shuffle and sort the input tuples. In this case we also add virtual columns "PARTITION__SPEC__ID" and "ICEBERG__PARTITION__SERIALIZED" to the source statement and shuffle and sort the rows based on them. Data files and delete files are now separated in the DmlExecState, so at the end of the operation we'll have two sets of files. We use these two sets to create a new Iceberg snapshot. Why does this patch have the limitations? - Because we are shuffling and sorting rows based on the delete records and their partitions. This means that the new data files might not get written in an efficient way, e.g. there will be too many of them, or we will need to keep too many open file handles during writing. Also, if the table has SORT BY properties, we cannot respect them as the input rows are ordered in a way to favor the position deletes. Part 3 will introduce a buffering writer for position delete files. This means we will shuffle and sort records based on the data records' partitions and SORT BY properties while delete records get buffered and written out at the end (sorted by file_path and position). In some edge cases the delete records might not get written efficiently, but it is a smaller problem than inefficient data files. 
Testing: * negative tests * planner tests * update all supported data types * partitioned tables * Impala/Hive interop tests * authz tests * concurrent tests Change-Id: Iff0ef6075a2b6ebe130d15daa389ac1a505a7a08 Reviewed-on: http://gerrit.cloudera.org:8080/20677 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/exec/CMakeLists.txt M be/src/exec/data-sink.cc M be/src/exec/iceberg-delete-sink.cc M be/src/exec/iceberg-delete-sink.h A be/src/exec/multi-table-sink.cc A be/src/exec/multi-table-sink.h M be/src/exec/table-sink-base.cc M be/src/exec/table-sink-base.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/service/client-request-state.cc M common/protobuf/control_service.proto M common/thrift/DataSinks.thrift M common/thrift/ImpalaService.thrift M common/thrift/Types.thrift M fe/src/main/java/org/apache/impala/analysis/DeleteStmt.java M fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java M fe/src/main/java/org/apache/impala/analysis/DmlStatementBase.java M fe/src/main/java/org/apache/impala/analysis/IcebergDeleteImpl.java M fe/src/main/java/org/apache/impala/analysis/IcebergModifyImpl.java A fe/src/main/java/org/apache/impala/analysis/IcebergUpdateImpl.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/
[Impala-ASF-CR] IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/20677 ) Change subject: IMPALA-12313: (part 2) Limited UPDATE support for Iceberg tables .. Patch Set 11: Verified+1 -- Gerrit-Comment-Date: Sat, 09 Dec 2023 03:04:04 + Gerrit-HasComments: No