maheshk114 commented on a change in pull request #1834:
URL: https://github.com/apache/hive/pull/1834#discussion_r556266918
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/stats/ColStatsProcessor.java
##########
@@ -204,6 +206,54 @@ public int persistColumnStats(Hive db, Table tbl) throws HiveException, MetaExce
public void setDpPartSpecs(Collection<Partition> dpPartSpecs) {
}
+  public static boolean canSkipStatsGeneration(String dbName, String tblName, String partName,
+                                               long statsWriteId, String queryValidWriteIdList) {
+    if (queryValidWriteIdList != null) { // Can be null if its not an ACID table.
+      ValidWriteIdList validWriteIdList = new ValidReaderWriteIdList(queryValidWriteIdList);
+      // Just check if the write ID is valid. If it's valid (i.e. we are allowed to see it),
+      // that means it cannot possibly be a concurrent write. As stats optimization is enabled
+      // only in case auto gather is enabled. Thus the stats must be updated by a valid committed
+      // transaction and stats generation can be skipped.
+      if (validWriteIdList.isWriteIdValid(statsWriteId)) {
+        try {
+          IMetaStoreClient msc = Hive.get().getMSC();
+          TxnState state = msc.findStatStatusByWriteId(dbName, tblName, partName, statsWriteId);
Review comment:
This lookup is to make sure that the transaction has not been cleaned up by the compactor.
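
The two-step check under discussion can be sketched roughly as follows. This is only a minimal, self-contained illustration and not the code in the PR: `SkipStatsSketch`, its `TxnState` enum, and the `isWriteIdValid` / `findState` parameters are hypothetical stand-ins for `ValidWriteIdList.isWriteIdValid` and the metastore lookup shown in the diff. It also assumes that a transaction already cleaned up by the compactor shows up as a missing (null) state, and that only a committed state allows skipping.

```java
import java.util.function.Function;
import java.util.function.Predicate;

/** Hypothetical sketch of the skip decision; not Hive's actual implementation. */
final class SkipStatsSketch {

    /** Hypothetical transaction states; not Hive's own TxnState type. */
    enum TxnState { OPEN, COMMITTED, ABORTED }

    /**
     * Returns true only when the write ID that produced the stats is visible
     * to the query AND its transaction is still known and committed.
     *
     * @param statsWriteId   write ID recorded with the existing stats
     * @param isWriteIdValid stand-in for ValidWriteIdList.isWriteIdValid
     * @param findState      stand-in for the metastore lookup; assumed to
     *                       return null once the transaction was cleaned up
     */
    static boolean canSkip(long statsWriteId,
                           Predicate<Long> isWriteIdValid,
                           Function<Long, TxnState> findState) {
        // Step 1: a write ID the query cannot see may be a concurrent write.
        if (!isWriteIdValid.test(statsWriteId)) {
            return false;
        }
        // Step 2: confirm the transaction has not been cleaned up.
        // A null (missing) state compares unequal to COMMITTED, so it
        // falls through to "do not skip".
        TxnState state = findState.apply(statsWriteId);
        return state == TxnState.COMMITTED;
    }
}
```

With a valid write ID whose state is missing, the sketch returns false, so stats would be regenerated rather than trusted.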