ayushtkn commented on code in PR #4397:
URL: https://github.com/apache/hive/pull/4397#discussion_r1224024690


##########
ql/src/java/org/apache/hadoop/hive/ql/stats/ColStatsProcessor.java:
##########
@@ -220,8 +220,10 @@ public int persistColumnStats(Hive db, Table tbl) throws HiveException, MetaExce
       start = System. currentTimeMillis();
       if (tbl != null && tbl.isNonNative() && tbl.getStorageHandler().canSetColStatistics(tbl)) {
         tbl.getStorageHandler().setColStatistics(tbl, colStats);
+      } else {
+        // Set table or partition column statistics in metastore.
+        db.setPartitionColumnStatistics(request);
       }
-      db.setPartitionColumnStatistics(request);

Review Comment:
   > If we cannot get stats from Puffin due to some exception, we can fall back
   > to getting stats from the metastore. So I think writing stats to both
   > places may be meaningful.
   
   Storing in two places has an additional cost during writes, and we currently
   have two modes, "iceberg" and "metastore", each of which denotes where the
   stats are stored.
   
   Storing in both places amounts to a third mode, something like "both", and at
   present we have no fallback logic on the read side either, i.e. nothing that
   goes to the metastore if the Puffin files are inaccessible.
   
   If we want such a thing, we can add a new mode at a later stage, should we
   find that it is required.
   
   As of now, I think "iceberg" mode should store only in Puffin and
   "metastore" mode should store only in the metastore.
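   
   To make the intended behaviour concrete, here is a minimal sketch of the
   "exactly one destination per mode" rule. It is not the Hive implementation:
   `StatsModeSketch`, `StatsSource`, `StatsSink`, `RecordingSink` and `persist`
   are hypothetical names invented for illustration, and the sinks are
   in-memory stand-ins for the Puffin file and the metastore.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these types exist in Hive.
class StatsModeSketch {

    // The two modes named in the comment: each one denotes a single destination.
    enum StatsSource { ICEBERG, METASTORE }

    // Stand-in for "somewhere column stats can be written".
    interface StatsSink {
        void write(String table, String colStats);
    }

    // In-memory sink that records every write so the behaviour can be checked.
    static class RecordingSink implements StatsSink {
        final List<String> writes = new ArrayList<>();

        @Override
        public void write(String table, String colStats) {
            writes.add(table + ":" + colStats);
        }
    }

    // Writes to exactly one sink, chosen by the mode; never to both.
    static void persist(StatsSource mode, StatsSink puffin, StatsSink metastore,
                        String table, String colStats) {
        if (mode == StatsSource.ICEBERG) {
            puffin.write(table, colStats);
        } else {
            metastore.write(table, colStats);
        }
    }

    public static void main(String[] args) {
        RecordingSink puffin = new RecordingSink();
        RecordingSink metastore = new RecordingSink();

        persist(StatsSource.ICEBERG, puffin, metastore, "t1", "ndv=10");
        persist(StatsSource.METASTORE, puffin, metastore, "t2", "ndv=20");

        System.out.println("puffin writes: " + puffin.writes);
        System.out.println("metastore writes: " + metastore.writes);
    }
}
```

   A "both" mode with a read-side fallback would be a third enum value plus
   extra read logic, which is why it seems better treated as a separate,
   later change.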



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

