sankarh commented on a change in pull request #579: HIVE-21109 : Support stats replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r269880708

 ##########
 File path: standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnCommonUtils.java
 ##########
 @@ -84,6 +86,73 @@ public static ValidTxnList createValidReadTxnList(GetOpenTxnsResponse txns, long
      return new ValidReadTxnList(exceptions, outAbortedBits, highWaterMark, minOpenTxnId);
    }
 +  /**
 +   * Transform a {@link org.apache.hadoop.hive.metastore.api.GetOpenTxnsResponse} to a
 +   * {@link org.apache.hadoop.hive.common.ValidTxnList}.  This assumes that the caller intends to
 +   * read the files, and thus treats both open and aborted transactions as invalid.
 +   *
 +   * This API is used by Hive replication which may have multiple transactions open at a time.
 +   *
 +   * @param txns open txn list from the metastore
 +   * @param currentTxns Current transactions that the replication has opened. If any of the
 +   *                    transactions is greater than 0 it will be removed from the exceptions
 +   *                    list so that the replication sees its own transaction as valid.
 +   * @return a valid txn list.
 +   */
 +  public static ValidTxnList createValidReadTxnList(GetOpenTxnsResponse txns,

 Review comment:
   Yes, I also think that for REPL LOAD we should always hardcode the ValidWriteIdList using the current writeId, so that the stats are always valid while applying the current event. Even if the stats are invalid, the subsequent alterTable/partition event would record that in the table/partition parameters.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services