ayushtkn commented on code in PR #4397:
URL: https://github.com/apache/hive/pull/4397#discussion_r1224024690
##########
ql/src/java/org/apache/hadoop/hive/ql/stats/ColStatsProcessor.java:
##########
@@ -220,8 +220,10 @@ public int persistColumnStats(Hive db, Table tbl) throws
HiveException, MetaExce
start = System.currentTimeMillis();
if (tbl != null && tbl.isNonNative() && tbl.getStorageHandler().canSetColStatistics(tbl)) {
tbl.getStorageHandler().setColStatistics(tbl, colStats);
+ } else {
+ // Set table or partition column statistics in metastore.
+ db.setPartitionColumnStatistics(request);
}
- db.setPartitionColumnStatistics(request);
Review Comment:
> If we cannot get stats from Puffin due to some exception, we can fall back
> to getting stats from the metastore. So I think writing stats to both
> places may be meaningful.
Storing in two places has an additional cost during writes. Currently we have
two modes, "iceberg" and "metastore", and each denotes where the stats are
stored. Storing in both places amounts to a third mode, something like "both".
We also don't have any fallback logic on the read side at present, i.e.
nothing that goes to the metastore when the Puffin files are inaccessible.
If we want such a thing, we can add a new mode at a later stage, if we feel
it is required.
As of now, I think "iceberg" mode should store only in Puffin and
"metastore" mode should store only in the metastore.
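To make the "exactly one destination per mode" idea concrete, here is a minimal standalone sketch. It is not Hive code: `StatsMode`, `StatsSink`, `RecordingSink`, and `persist` are hypothetical names invented for illustration, and the sinks only record what they are asked to write.

```java
import java.util.ArrayList;
import java.util.List;

public class StatsRoutingSketch {

    // Hypothetical stand-in for the two modes discussed above.
    public enum StatsMode { ICEBERG, METASTORE }

    // Hypothetical sink abstraction; this is not a Hive interface.
    public interface StatsSink {
        void write(String tableName, String colStats);
    }

    // Records each write so the routing decision can be inspected.
    public static class RecordingSink implements StatsSink {
        public final List<String> writes = new ArrayList<>();

        @Override
        public void write(String tableName, String colStats) {
            writes.add(tableName + ":" + colStats);
        }
    }

    // Each mode writes to exactly one sink; there is no "both" branch.
    public static void persist(StatsMode mode, StatsSink puffin, StatsSink metastore,
                               String tableName, String colStats) {
        switch (mode) {
            case ICEBERG:
                puffin.write(tableName, colStats);
                break;
            case METASTORE:
                metastore.write(tableName, colStats);
                break;
            default:
                throw new IllegalArgumentException("Unknown mode: " + mode);
        }
    }

    public static void main(String[] args) {
        RecordingSink puffin = new RecordingSink();
        RecordingSink metastore = new RecordingSink();
        persist(StatsMode.ICEBERG, puffin, metastore, "t1", "ndv=10");
        persist(StatsMode.METASTORE, puffin, metastore, "t2", "ndv=20");
        System.out.println("puffin writes: " + puffin.writes);
        System.out.println("metastore writes: " + metastore.writes);
    }
}
```

If a "both" mode were ever wanted, it would show up here as an explicit third enum constant with its own branch, together with a matching fallback on the read side, rather than as an unconditional second write.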
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]