gaborkaszab commented on PR #12629:
URL: https://github.com/apache/iceberg/pull/12629#issuecomment-2862964351
> CALL catalog_name.system.compute_partition_stats('db.sample'); -- does incremental compute if previous stats exist, else full compute.
I like this approach.
About the `full_compute => true` param, could you help me understand what the
user motivation would be for calling this version of the procedure? There is
already a version that decides between incremental and full compute, so this
one seems unnecessary. The only use case I can think of is that the user learns
that some previous stats are broken and hence wants to do a full recompute. But
if they know that the stats are broken, I agree with Peter that they should
drop the stats and then use the other variant of the `compute_partition_stats`
procedure, without the `full_compute` param. Is there anything I'm missing?
Following this logic, we might not want to expose
`PartitionStatsHandler.computeAndWriteStatsFileIncremental()` as public, since
the other `computeAndWriteStats` version should be able to figure out whether
to do an incremental or a full compute.
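To illustrate what I mean, here is a minimal sketch with hypothetical names
(`StatsComputeSketch`, `chooseMode` and `Mode` are made up for this comment and
are not the actual `PartitionStatsHandler` code). It assumes the only input to
the decision is whether a previous stats file exists:

```java
import java.util.Optional;

// Hypothetical sketch only; not Iceberg's PartitionStatsHandler.
public class StatsComputeSketch {
  enum Mode { FULL, INCREMENTAL }

  // Single public entry point: the mode is derived from whether previous
  // stats exist, so callers never have to pick incremental vs. full.
  public static Mode chooseMode(Optional<Long> previousStatsSnapshotId) {
    return previousStatsSnapshotId.isPresent() ? Mode.INCREMENTAL : Mode.FULL;
  }

  public static void main(String[] args) {
    // No previous stats: falls back to a full compute.
    System.out.println(chooseMode(Optional.empty()));
    // Previous stats exist for some snapshot: incremental compute.
    System.out.println(chooseMode(Optional.of(42L)));
  }
}
```

With this shape, a user who knows the stats are broken would drop them first,
and the same call would then take the full-compute path.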
Any thoughts?