Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/24093 )
Change subject: IMPALA-14592: Enable OPTIMIZE for Iceberg V3 tables
......................................................................


Patch Set 2:

(5 comments)

Thanks for the comments!

http://gerrit.cloudera.org:8080/#/c/24093/1/fe/src/main/java/org/apache/impala/analysis/OptimizeStmt.java
File fe/src/main/java/org/apache/impala/analysis/OptimizeStmt.java:

http://gerrit.cloudera.org:8080/#/c/24093/1/fe/src/main/java/org/apache/impala/analysis/OptimizeStmt.java@a172
PS1, Line 172: 
             : 
             : 
             : 
             : 
> Should we keep this statement in but changing just the version for comparis

Makes sense, done.


http://gerrit.cloudera.org:8080/#/c/24093/1/fe/src/main/java/org/apache/impala/analysis/OptimizeStmt.java@282
PS1, Line 282:   }
> nit: these methods will be useful for UPDATE/MERGE as well, currently I don

Yeah, we have similar auxiliary methods in IcebergUtil (e.g. getIcebergPartitionTransformExpr), though that class has already grown too large. We should probably introduce a new class, IcebergCommonExprs, and move these common expressions there.

To keep this change minimal and focused, I filed IMPALA-14837 to track the refactoring.


http://gerrit.cloudera.org:8080/#/c/24093/1/fe/src/main/java/org/apache/impala/analysis/OptimizeStmt.java@312
PS1, Line 312:     return createCoalesceFn(analyzer,
             :         createSlotRef(analyzer, "_file_last_updated_sequence_number"),
             :         createSlotRef(analyzer, "ICEBERG__DATA__SEQUENCE__NUMBER"));
             :   }
             : 
             :   /**
             :    * Creates a COALESCE function call with the given expressions as parameters.
             :    */
             :   private Expr createCoalesceFn(Analyzer analyzer, Expr... exprs)
             :       throws AnalysisException {
             : 
> nit: Most of this function is the same as getRowIdExpr(). Do you think it i

Makes sense, done.

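For readers unfamiliar with the COALESCE call quoted above, here is a minimal standalone sketch of its semantics. It is not Impala code: the class `CoalesceSketch` and its `coalesce` helper are hypothetical names used only for illustration. It assumes the intent is to use the per-row `_file_last_updated_sequence_number` when it is set and to fall back to the file's data sequence number when it is NULL.

```java
public class CoalesceSketch {
  /**
   * Returns the first non-null argument, or null if every argument is null.
   * This mirrors SQL COALESCE semantics; it is a hypothetical helper, not an Impala API.
   */
  @SafeVarargs
  static <T> T coalesce(T... values) {
    for (T v : values) {
      if (v != null) return v;
    }
    return null;
  }

  public static void main(String[] args) {
    // Assumed scenario: the row carries no explicit last-updated sequence
    // number, so the file's data sequence number is used instead.
    Long fileLastUpdatedSeq = null;
    Long dataSeq = 4L;
    System.out.println(coalesce(fileLastUpdatedSeq, dataSeq)); // prints 4

    // When the row has its own value, that value wins.
    System.out.println(coalesce(9L, dataSeq)); // prints 9
  }
}
```

In the actual patch the same fallback is expressed as an analyzed expression tree rather than evaluated eagerly, so the sketch only illustrates the value-level behaviour.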
http://gerrit.cloudera.org:8080/#/c/24093/1/testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-optimize.test
File testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-optimize.test:

http://gerrit.cloudera.org:8080/#/c/24093/1/testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-optimize.test@25
PS1, Line 25: SELECT _file_row_id, _file_last_updated_sequence_number, * FROM optimize_nopart;
> Shouldn't we check all the same fields as above?

The compacted data files get new first-row-id and data-sequence-number values, which looked a bit confusing at first sight, so I omitted them. I can add them back if you think they're worth checking.

Anyway, I've added a new test for upgraded tables that shows them.


http://gerrit.cloudera.org:8080/#/c/24093/1/testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-optimize.test@108
PS1, Line 108: 
> Maybe we could add a full table compaction after this to check how _file_ro

Good idea, done.



-- 
To view, visit http://gerrit.cloudera.org:8080/24093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1c3cc4b9aaa46e494e1aa4583c1a6aafecad48de
Gerrit-Change-Number: 24093
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Peter Rozsa <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Fri, 13 Mar 2026 15:23:15 +0000
Gerrit-HasComments: Yes
