sankarh commented on a change in pull request #579: HIVE-21109 : Support stats
replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r269880708
##########
File path:
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnCommonUtils.java
##########
@@ -84,6 +86,73 @@ public static ValidTxnList createValidReadTxnList(GetOpenTxnsResponse txns, long
return new ValidReadTxnList(exceptions, outAbortedBits, highWaterMark, minOpenTxnId);
}
+ /**
+ * Transform a {@link org.apache.hadoop.hive.metastore.api.GetOpenTxnsResponse} to a
+ * {@link org.apache.hadoop.hive.common.ValidTxnList}. This assumes that the caller intends to
+ * read the files, and thus treats both open and aborted transactions as invalid.
+ *
+ * This API is used by Hive replication which may have multiple transactions open at a time.
+ *
+ * @param txns open txn list from the metastore
+ * @param currentTxns Current transactions that the replication has opened. If any of the
+ * transactions is greater than 0 it will be removed from the exceptions
+ * list so that the replication sees its own transaction as valid.
+ * @return a valid txn list.
+ */
+ public static ValidTxnList createValidReadTxnList(GetOpenTxnsResponse txns,
Review comment:
Yes, I also think that for REPL LOAD we should always hardcode the
ValidWriteIdList using the current writeId, so that the stats are always
treated as valid while applying the current event. Even if they are actually
invalid, the subsequent alterTable/partition event would mark them as such in
the table/partition parameters.
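
For reference, the exceptions-list behaviour described in the Javadoc quoted
above can be sketched as follows. This is only an illustration, not the code
in the patch: the class ReplTxnSketch and the method exceptionsFor are
hypothetical names, plain long values stand in for the metastore types, and
the high-water mark, aborted bits and min open txn id handled by the real
method are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-in for the behaviour described in the Javadoc.
// It does not use or reproduce any Hive metastore classes.
public class ReplTxnSketch {

    /**
     * Returns the sorted txn ids that a reader should treat as invalid.
     *
     * @param openOrAborted txn ids reported as open or aborted (assumed input shape)
     * @param currentTxns   txn ids opened by the replication itself; entries > 0
     *                      are dropped from the result so the replication sees
     *                      its own transactions as valid
     */
    public static long[] exceptionsFor(List<Long> openOrAborted, long[] currentTxns) {
        List<Long> result = new ArrayList<>();
        for (long txn : openOrAborted) {
            boolean own = false;
            for (long cur : currentTxns) {
                // Only positive ids count as "a transaction we opened".
                if (cur > 0 && cur == txn) {
                    own = true;
                    break;
                }
            }
            if (!own) {
                result.add(txn);
            }
        }
        return result.stream().mapToLong(Long::longValue).sorted().toArray();
    }

    public static void main(String[] args) {
        // Txns 7 and 12 belong to the replication, so only 5 and 9 stay invalid.
        long[] exceptions = exceptionsFor(Arrays.asList(5L, 7L, 9L, 12L), new long[] {7L, 12L});
        System.out.println(Arrays.toString(exceptions)); // prints [5, 9]
    }
}
```

The point of the sketch is only that a replication task holding several open
transactions needs all of them excluded from the exceptions list, which is why
the new overload takes a collection of current transactions.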