Noemi Pap-Takacs has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/24093 )

Change subject: IMPALA-14592: Enable OPTIMIZE for Iceberg V3 tables
......................................................................


Patch Set 1:

(3 comments)

Thanks for extending OptimizeStmt for V3 tables!

It would be nice to have some tests with UPDATE/DELETE. I know that Impala 
cannot write V3 right now but we could add test cases with tables written by 
other engines, or as soon as writing is available in Impala.

http://gerrit.cloudera.org:8080/#/c/24093/1/fe/src/main/java/org/apache/impala/analysis/OptimizeStmt.java
File fe/src/main/java/org/apache/impala/analysis/OptimizeStmt.java:

http://gerrit.cloudera.org:8080/#/c/24093/1/fe/src/main/java/org/apache/impala/analysis/OptimizeStmt.java@312
PS1, Line 312: private Expr getLastUpdatedSeqNoExpr(Analyzer analyzer) throws 
AnalysisException {
             :     FunctionName fnName = new FunctionName(BuiltinsDb.NAME, 
"coalesce");
             :     List<Expr> paramList = new ArrayList<>();
             :     paramList.add(createSlotRef(analyzer, 
"_file_last_updated_sequence_number"));
             :     paramList.add(createSlotRef(analyzer, 
"ICEBERG__DATA__SEQUENCE__NUMBER"));
             :     FunctionCallExpr fnCall = new FunctionCallExpr(fnName, new 
FunctionParams(paramList));
             :     fnCall.setIsInternalFnCall(true);
             :     fnCall.analyze(analyzer);
             :     analyzer.materializeSlots(fnCall);
             :     return fnCall;
             :   }
nit: Most of this function is the same as getRowIdExpr(). Do you think it is 
worth extracting the common parts and calling it with the paramList?


http://gerrit.cloudera.org:8080/#/c/24093/1/testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-optimize.test
File 
testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-optimize.test:

http://gerrit.cloudera.org:8080/#/c/24093/1/testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-optimize.test@25
PS1, Line 25: SELECT _file_row_id, _file_last_updated_sequence_number, * FROM 
optimize_nopart;
Shouldn't we check all the same fields as above?


http://gerrit.cloudera.org:8080/#/c/24093/1/testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-optimize.test@108
PS1, Line 108: ====
Maybe we could add a full table compaction after this to check how _file_row_id 
changed.



--
To view, visit http://gerrit.cloudera.org:8080/24093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1c3cc4b9aaa46e494e1aa4583c1a6aafecad48de
Gerrit-Change-Number: 24093
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Peter Rozsa <[email protected]>
Gerrit-Comment-Date: Fri, 13 Mar 2026 13:37:39 +0000
Gerrit-HasComments: Yes

Reply via email to