Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/24093 )

Change subject: IMPALA-14592: Enable OPTIMIZE for Iceberg V3 tables
......................................................................


Patch Set 2:

(5 comments)

Thanks for the comments!

http://gerrit.cloudera.org:8080/#/c/24093/1/fe/src/main/java/org/apache/impala/analysis/OptimizeStmt.java
File fe/src/main/java/org/apache/impala/analysis/OptimizeStmt.java:

http://gerrit.cloudera.org:8080/#/c/24093/1/fe/src/main/java/org/apache/impala/analysis/OptimizeStmt.java@a172
PS1, Line 172:
             :
             :
             :
             :
> Should we keep this statement in but changing just the version for comparis
Makes sense, done.


http://gerrit.cloudera.org:8080/#/c/24093/1/fe/src/main/java/org/apache/impala/analysis/OptimizeStmt.java@282
PS1, Line 282:     }
> nit: these methods will be useful for UPDATE/MERGE as well, currently I don
Yeah, we have similar auxiliary methods in IcebergUtil (e.g. 
getIcebergPartitionTransformExpr), though that class has already grown too 
large. Probably we should introduce a new class, IcebergCommonExprs, and move 
these common expressions there.

To keep this change minimal and focused, I filed IMPALA-14837 to track the 
refactoring effort.


http://gerrit.cloudera.org:8080/#/c/24093/1/fe/src/main/java/org/apache/impala/analysis/OptimizeStmt.java@312
PS1, Line 312:   return createCoalesceFn(analyzer,
             :         createSlotRef(analyzer, "_file_last_updated_sequence_number"),
             :         createSlotRef(analyzer, "ICEBERG__DATA__SEQUENCE__NUMBER"));
             :   }
             :
             :   /**
             :    * Creates a COALESCE function call with the given expressions as parameters.
             :    */
             :   private Expr createCoalesceFn(Analyzer analyzer, Expr... exprs)
             :       throws AnalysisException {
             :
> nit: Most of this function is the same as getRowIdExpr(). Do you think it i
Makes sense, done.
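
For anyone following the thread, the fallback expressed by the quoted COALESCE call can be sketched in plain Java. This is only an illustration, not the Impala implementation: the class RowLineageSketch and its methods are hypothetical, and the two arguments stand in for the values of the _file_last_updated_sequence_number and ICEBERG__DATA__SEQUENCE__NUMBER slots from the quoted code.

```java
public class RowLineageSketch {
    // Hypothetical helper: returns the first non-null argument,
    // mirroring what SQL COALESCE does. Returns null if all are null.
    @SafeVarargs
    static <T> T coalesce(T... values) {
        for (T v : values) {
            if (v != null) {
                return v;
            }
        }
        return null;
    }

    // Assumption for illustration: 'stored' is the value read from the
    // _file_last_updated_sequence_number column (may be null), and
    // 'dataSeq' is the data sequence number of the containing file.
    static Long lastUpdatedSequenceNumber(Long stored, Long dataSeq) {
        return coalesce(stored, dataSeq);
    }

    public static void main(String[] args) {
        // No stored value: falls back to the file's data sequence number.
        System.out.println(lastUpdatedSequenceNumber(null, 7L));
        // Stored value present: it takes precedence.
        System.out.println(lastUpdatedSequenceNumber(3L, 7L));
    }
}
```

The argument order matters here: the stored per-row value is listed first so that it wins whenever it is present.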


http://gerrit.cloudera.org:8080/#/c/24093/1/testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-optimize.test
File testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-optimize.test:

http://gerrit.cloudera.org:8080/#/c/24093/1/testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-optimize.test@25
PS1, Line 25: SELECT _file_row_id, _file_last_updated_sequence_number, * FROM optimize_nopart;
> Shouldn't we check all the same fields as above?
The compacted data files have new first-row-id and data-sequence-number values, 
which were a bit confusing at first sight, so I omitted them. I can add them 
back if you think they're worth checking.

Anyway, I've added a new test for upgraded tables that shows them.


http://gerrit.cloudera.org:8080/#/c/24093/1/testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-optimize.test@108
PS1, Line 108:
> Maybe we could add a full table compaction after this to check how _file_ro
Good idea, done.



--
To view, visit http://gerrit.cloudera.org:8080/24093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1c3cc4b9aaa46e494e1aa4583c1a6aafecad48de
Gerrit-Change-Number: 24093
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Peter Rozsa <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Fri, 13 Mar 2026 15:23:15 +0000
Gerrit-HasComments: Yes
