Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/22339 )

Change subject: IMPALA-13609: Store Iceberg snapshot id for COMPUTE STATS
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/22339/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/22339/3//COMMIT_MSG@41
PS3, Line 41: 'impala.lastComputeStatsTime' table property from becoming too 
long, it
            : will only include information for 10 columns by default
> How many columns do you think we should include?
I think creating ranges from a Map<Long, Long> is fairly simple, and 
thoroughly unit-testable. E.g.:

* Transpose Map<Long, Long> to Map<Long, TreeSet<Long>>, i.e. the resulting 
Map is <snapshot id> --> <ordered set of field ids>
* For each entry in Map<Long, TreeSet<Long>>, create the ranges, e.g. 
1-12:5000,14:5000,17-19:5000, and 13:6000,15-16:6000
* Concatenate the lists of ranges
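
The three steps above could be sketched roughly as follows. This is only an 
illustration, not code from the patch: the class name FieldIdRangeEncoder and 
its methods are hypothetical, and the "<field ids>:<snapshot id>" text format 
is inferred from the example ranges above. It assumes the input map is 
<field id> --> <snapshot id>.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper for illustration; not part of the change under review.
final class FieldIdRangeEncoder {
  private FieldIdRangeEncoder() {}

  // Encodes <field id> --> <snapshot id> as e.g. "1-3:5000,5:5000,4:6000".
  static String encode(Map<Long, Long> fieldIdToSnapshotId) {
    // Step 1: transpose to <snapshot id> --> <ordered set of field ids>.
    // A TreeMap keeps the output deterministic (ordered by snapshot id).
    Map<Long, TreeSet<Long>> bySnapshot = new TreeMap<>();
    for (Map.Entry<Long, Long> e : fieldIdToSnapshotId.entrySet()) {
      bySnapshot.computeIfAbsent(e.getValue(), k -> new TreeSet<>())
          .add(e.getKey());
    }

    // Step 2: per snapshot, collapse consecutive field ids into ranges.
    List<String> ranges = new ArrayList<>();
    for (Map.Entry<Long, TreeSet<Long>> e : bySnapshot.entrySet()) {
      long snapshotId = e.getKey();
      Long start = null;  // first field id of the range being built
      Long prev = null;   // last field id seen so far
      for (long fieldId : e.getValue()) {
        if (start == null) {
          start = fieldId;
        } else if (fieldId != prev + 1) {
          // Gap found: close the current range and start a new one.
          ranges.add(formatRange(start, prev, snapshotId));
          start = fieldId;
        }
        prev = fieldId;
      }
      if (start != null) ranges.add(formatRange(start, prev, snapshotId));
    }

    // Step 3: concatenate the per-snapshot lists of ranges.
    return String.join(",", ranges);
  }

  private static String formatRange(long start, long end, long snapshotId) {
    String ids = (start == end) ? Long.toString(start) : start + "-" + end;
    return ids + ":" + snapshotId;
  }
}
```

With field ids 1-12, 14 and 17-19 mapped to snapshot 5000 and field ids 13, 
15 and 16 mapped to snapshot 6000, encode() yields 
1-12:5000,14:5000,17-19:5000,13:6000,15-16:6000.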

When a new COMPUTE STATS needs to partially overwrite existing values, we can 
just transform the stored values back to a Map<Long, Long>, update the map, 
then transform it back to ranges form. Am I missing some additional 
complexities?
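
For the reverse direction, a minimal sketch of turning the stored string back 
into a map might look like the one below. Again, FieldIdRangeDecoder is a 
hypothetical name, and the "<field ids>:<snapshot id>" format is an assumption 
based on the example ranges in this comment, not something taken from the 
patch.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; not part of the change under review.
final class FieldIdRangeDecoder {
  private FieldIdRangeDecoder() {}

  // Parses e.g. "1-3:5000,4:6000" into <field id> --> <snapshot id>.
  static Map<Long, Long> decode(String encoded) {
    Map<Long, Long> result = new TreeMap<>();
    if (encoded == null || encoded.trim().isEmpty()) return result;

    for (String part : encoded.split(",")) {
      // Each part is "<id>:<snapshot id>" or "<start>-<end>:<snapshot id>".
      String[] idsAndSnapshot = part.trim().split(":");
      if (idsAndSnapshot.length != 2) {
        throw new IllegalArgumentException("Malformed range: " + part);
      }
      long snapshotId = Long.parseLong(idsAndSnapshot[1]);

      String[] bounds = idsAndSnapshot[0].split("-");
      if (bounds.length < 1 || bounds.length > 2) {
        throw new IllegalArgumentException("Malformed range: " + part);
      }
      long start = Long.parseLong(bounds[0]);
      long end = Long.parseLong(bounds[bounds.length - 1]);
      if (end < start) {
        throw new IllegalArgumentException("Inverted range: " + part);
      }

      // Expand the range back into individual field ids.
      for (long id = start; id <= end; id++) {
        result.put(id, snapshotId);
      }
    }
    return result;
  }
}
```

A partial overwrite is then an ordinary map update, e.g. decoding 
"1-3:5000,5:5000,4:6000", calling put(4L, 7000L) for the column whose stats 
were recomputed, and serializing the map back to range form.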

I think it'd be much better than having non-intuitive/error-prone logic about 
what we should do with missing field ids.


http://gerrit.cloudera.org:8080/#/c/22339/5/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java:

http://gerrit.cloudera.org:8080/#/c/22339/5/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@817
PS5, Line 817:
I think here currentSnapshotId can be newer than the one actually being used by 
COMPUTE STATS.



--
To view, visit http://gerrit.cloudera.org:8080/22339
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id9998b84c4fd20d1cf5e97a34f3553832ec70ae7
Gerrit-Change-Number: 22339
Gerrit-PatchSet: 3
Gerrit-Owner: Daniel Becker <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Thu, 27 Feb 2025 15:45:24 +0000
Gerrit-HasComments: Yes
