vihangk1 commented on a change in pull request #1095:
URL: https://github.com/apache/hive/pull/1095#discussion_r442461795
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
##########
@@ -2426,19 +2423,13 @@ public static TableSnapshot getTableSnapshot(Configuration conf,
}
/**
- * This is called by Hive.java for all write operations (DDL). Advance write id
- * for the table via transaction manager, and store it in config. The write id
- * will be marked as committed instantly in config, as all DDL are auto
- * committed, there's no chance to rollback.
+ * This is called by Driver.java for all write operations (DDL). This updates the latest validWriteIdList in config,
+ * so that the same can be sent from HMS Client during invocation of get_* HMS APIs.
*/
- public static ValidWriteIdList advanceWriteId(HiveConf conf, Table tbl) throws LockException {
- if (!isTransactionalTable(tbl)) {
Review comment:
This if condition is needed, right?
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
##########
@@ -3241,6 +3231,19 @@ private static void printDirCacheEntries() {
}
}
+ /**
+ * Checks whether a given table is enabled for replication.
+ * @param tbl table
+ * @return true, if the table is enabled for replication. False, otherwise.
+ */
+ public static boolean inReplication(Table tbl) {
+ if (tbl.getParameters().get(ReplicationSpec.KEY.CURR_STATE_ID.toString()) != null) {
+ return true;
+ } else {
Review comment:
Nit: this could be simply rewritten as return tbl.getParameters().get(ReplicationSpec.KEY.CURR_STATE_ID.toString()) != null;
Also, is tbl.getParameters() guaranteed to always be non-null?
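The suggested one-liner, combined with a guard for the null question above, could look like the sketch below. The Map-based stand-in and the key string are hypothetical; the real code uses Hive's Table and ReplicationSpec.KEY.CURR_STATE_ID.

```java
import java.util.HashMap;
import java.util.Map;

public class InReplicationSketch {

    // Stand-in for ReplicationSpec.KEY.CURR_STATE_ID.toString(); the actual
    // key value comes from the Hive codebase and may differ.
    static final String CURR_STATE_ID = "repl.last.id";

    // One-line form of the reviewer's suggestion, with a null guard on the
    // parameter map in case getParameters() can return null.
    static boolean inReplication(Map<String, String> tableParams) {
        return tableParams != null && tableParams.get(CURR_STATE_ID) != null;
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        System.out.println(inReplication(null));   // false: no parameter map
        System.out.println(inReplication(params)); // false: key absent
        params.put(CURR_STATE_ID, "42");
        System.out.println(inReplication(params)); // true: key present
    }
}
```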
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
##########
@@ -6151,4 +6218,19 @@ public StorageHandlerInfo getStorageHandlerInfo(Table table)
throw new HiveException(e);
}
}
+
+ public void clearValidWriteIdList() {
+ if (metaStoreClient != null) {
+ metaStoreClient.clearValidWriteIdList();
+ }
+ }
+
+ boolean inReplication(Table tbl) {
+ if (tbl.getParameters().get(ReplicationSpec.KEY.CURR_STATE_ID.toString()) != null) {
Review comment:
This can be simply rewritten as return tbl.getParameters().get(ReplicationSpec.KEY.CURR_STATE_ID.toString()) != null;
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/ddl/table/constraint/add/AlterTableAddConstraintDesc.java
##########
@@ -41,6 +41,6 @@ public AlterTableAddConstraintDesc(TableName tableName, ReplicationSpec replicat
@Override
public boolean mayNeedWriteId() {
- return false;
+ return true;
Review comment:
Do all the AlterTable operations return true now? Maybe we can just add an override in the base class AbstractAlterTableDesc?
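If every subclass now returns true, the suggested override could live once in the shared base class, roughly as sketched below. Class names follow the PR discussion, but the real signatures in Hive's DDL descriptor hierarchy may differ.

```java
// Hypothetical sketch of moving mayNeedWriteId() into the base class so each
// AlterTable*Desc subclass no longer repeats "return true".
abstract class AbstractAlterTableDescSketch {
    // Assumes all ALTER TABLE operations need a write id, per the review.
    public boolean mayNeedWriteId() {
        return true;
    }
}

class AlterTableAddConstraintDescSketch extends AbstractAlterTableDescSketch {
    // No override needed: inherits mayNeedWriteId() == true.
}

class WriteIdSketchDemo {
    public static void main(String[] args) {
        System.out.println(new AlterTableAddConstraintDescSketch().mayNeedWriteId()); // true
    }
}
```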
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/Driver.java
##########
@@ -373,6 +374,10 @@ private void acquireLocks() throws CommandProcessorException {
String fqTableName = acidDdlDesc.getFullTableName();
final TableName tn = HiveTableName.ofNullableWithNoDefault(fqTableName);
long writeId = driverContext.getTxnManager().getTableWriteId(tn.getDb(), tn.getTable());
+ // This updates the latest validWriteIdList for the current table in the config, which latest will be sent
Review comment:
spell-check: s/latest/later
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
##########
@@ -824,6 +836,7 @@ public void alterTable(String catName, String dbName, String tblName, Table newT
throw new HiveException("Unable to alter table. " + e.getMessage(), e);
} catch (TException e) {
throw new HiveException("Unable to alter table. " + e.getMessage(), e);
+ } finally {
Review comment:
Is this needed anymore?
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
##########
@@ -905,6 +921,7 @@ public void alterPartition(String catName, String dbName, String tblName, Partit
throw new HiveException("Unable to alter partition. " + e.getMessage(), e);
} catch (TException e) {
throw new HiveException("Unable to alter partition. " + e.getMessage(), e);
+ } finally {
Review comment:
Needed? It looks like there are multiple places here with empty finally blocks. Can you please remove them?
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
##########
@@ -5560,12 +5604,16 @@ public void cacheFileMetadata(
public void dropConstraint(String dbName, String tableName, String constraintName)
throws HiveException, NoSuchObjectException {
+
try {
+ Table tbl = getTable(dbName, tableName);
Review comment:
It's unclear to me why we need to do this. Will this call throw an exception if the ValidWriteIdList doesn't match?
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
##########
@@ -12308,16 +12303,99 @@ else if(ast.getChild(0).getType() == HiveParser.TOK_FALSE) {
// if phase1Result false return
return false;
}
+
+ // 5. Set write id for HMS client
+ if (getTxnMgr().supportsAcid() && conf.get(ValidTxnWriteIdList.VALID_TABLES_WRITEIDS_KEY) == null) {
Review comment:
I am not very familiar with this code, so maybe you or Peter can respond to this for clarity. If I understand correctly, this is called during the parsing phase of the query. Does this happen before we call Driver.acquireLocks? If yes, does this mean that this code has now started to allocate writeIds instead of the acquireLocks method?
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
##########
@@ -1230,14 +1255,22 @@ public void dropTable(String dbName, String tableName, boolean deleteData,
*/
public void dropTable(String dbName, String tableName, boolean deleteData,
boolean ignoreUnknownTab, boolean ifPurge) throws HiveException {
+
try {
+ Table tbl = null;
Review comment:
why do we need this? If there is a specific need to add here may be good
to write a comment or two here about the same.
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
##########
@@ -12308,16 +12303,99 @@ else if(ast.getChild(0).getType() == HiveParser.TOK_FALSE) {
// if phase1Result false return
return false;
}
+
+ // 5. Set write id for HMS client
+ if (getTxnMgr().supportsAcid() && conf.get(ValidTxnWriteIdList.VALID_TABLES_WRITEIDS_KEY) == null) {
+
+ ValidTxnWriteIdList txnWriteIds = null;
+
+ if (conf.get(ValidTxnWriteIdList.COMPACTOR_VALID_TABLES_WRITEIDS_KEY) != null) {
+ txnWriteIds = new ValidTxnWriteIdList(conf.getLong(ValidTxnList.COMPACTOR_VALID_TXNS_ID_KEY, 0));
+ txnWriteIds.addTableValidWriteIdList(
+ new ValidReaderWriteIdList(conf.get(ValidTxnWriteIdList.COMPACTOR_VALID_TABLES_WRITEIDS_KEY)));
+ } else {
+ List<String> tabNames = new ArrayList<>();
+ for (String tabName : collectTables(qb)) {
+ String fullName = TableName
+ .fromString(tabName, SessionState.get().getCurrentCatalog(), SessionState.get().getCurrentDatabase())
+ .getDbTable();
+ tabNames.add(fullName);
+ }
+
+ if (!tabNames.isEmpty()) {
+ String txnString = conf.get(ValidTxnList.VALID_TXNS_KEY);
+
+ try {
+ if ((txnString == null) || (txnString.isEmpty())) {
+ txnString = getTxnMgr().getValidTxns().toString();
+ conf.set(ValidTxnList.VALID_TXNS_KEY, txnString);
+ }
+
+ txnWriteIds = getTxnMgr().getValidWriteIds(tabNames, txnString);
Review comment:
Is there a possibility of race condition here? The Driver calls
validTxnManager.recordValidWriteId in acquireLocks() method? For instance if
both these lines of code are recording the transactionIdList is it possible
that they have a different view of the validTransactionList?
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
##########
@@ -5057,7 +5096,10 @@ public static boolean isHadoop1() {
public List<Partition> exchangeTablePartitions(Map<String, String> partitionSpecs,
String sourceDb, String sourceTable, String destDb,
String destinationTableName) throws HiveException {
+
try {
+ Table srcTbl = getTable(sourceDb, sourceTable);
Review comment:
Is srcTbl used anywhere?
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
##########
@@ -12308,16 +12303,99 @@ else if(ast.getChild(0).getType() == HiveParser.TOK_FALSE) {
// if phase1Result false return
return false;
}
+
+ // 5. Set write id for HMS client
+ if (getTxnMgr().supportsAcid() && conf.get(ValidTxnWriteIdList.VALID_TABLES_WRITEIDS_KEY) == null) {
+
+ ValidTxnWriteIdList txnWriteIds = null;
+
+ if (conf.get(ValidTxnWriteIdList.COMPACTOR_VALID_TABLES_WRITEIDS_KEY) != null) {
+ txnWriteIds = new ValidTxnWriteIdList(conf.getLong(ValidTxnList.COMPACTOR_VALID_TXNS_ID_KEY, 0));
+ txnWriteIds.addTableValidWriteIdList(
+ new ValidReaderWriteIdList(conf.get(ValidTxnWriteIdList.COMPACTOR_VALID_TABLES_WRITEIDS_KEY)));
+ } else {
+ } else {
+ List<String> tabNames = new ArrayList<>();
+ for (String tabName : collectTables(qb)) {
+ String fullName = TableName
Review comment:
What happens if the query joins a transactional table and a non-transactional table? Does the code allocate a writeId for the non-transactional table too?
##########
File path: storage-api/src/java/org/apache/hadoop/hive/common/ValidTxnList.java
##########
@@ -31,6 +31,11 @@
*/
public static final String VALID_TXNS_KEY = "hive.txn.valid.txns";
+ /**
+ * Key used to store txn id for compactor in a
+ * {@link org.apache.hadoop.conf.Configuration} object.
+ */
+ public static final String COMPACTOR_VALID_TXNS_ID_KEY = "hive.compactor.txn.valid.txns.id";
Review comment:
I don't see the place in the code that sets this key in the configuration object.
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
##########
@@ -12308,16 +12303,99 @@ else if(ast.getChild(0).getType() == HiveParser.TOK_FALSE) {
// if phase1Result false return
return false;
}
+
+ // 5. Set write id for HMS client
+ if (getTxnMgr().supportsAcid() && conf.get(ValidTxnWriteIdList.VALID_TABLES_WRITEIDS_KEY) == null) {
+
+ ValidTxnWriteIdList txnWriteIds = null;
+
+ if (conf.get(ValidTxnWriteIdList.COMPACTOR_VALID_TABLES_WRITEIDS_KEY) != null) {
+ txnWriteIds = new ValidTxnWriteIdList(conf.getLong(ValidTxnList.COMPACTOR_VALID_TXNS_ID_KEY, 0));
+ txnWriteIds.addTableValidWriteIdList(
+ new ValidReaderWriteIdList(conf.get(ValidTxnWriteIdList.COMPACTOR_VALID_TABLES_WRITEIDS_KEY)));
+ } else {
+ List<String> tabNames = new ArrayList<>();
+ for (String tabName : collectTables(qb)) {
+ String fullName = TableName
+ .fromString(tabName, SessionState.get().getCurrentCatalog(), SessionState.get().getCurrentDatabase())
+ .getDbTable();
+ tabNames.add(fullName);
+ }
+
+ if (!tabNames.isEmpty()) {
+ String txnString = conf.get(ValidTxnList.VALID_TXNS_KEY);
+
+ try {
+ if ((txnString == null) || (txnString.isEmpty())) {
+ txnString = getTxnMgr().getValidTxns().toString();
+ conf.set(ValidTxnList.VALID_TXNS_KEY, txnString);
+ }
+
+ txnWriteIds = getTxnMgr().getValidWriteIds(tabNames, txnString);
+ } catch (LockException e) {
+ throw new SemanticException("Failed to fetch write Id from TxnManager", e);
+ }
+ }
+ }
+
+ if (txnWriteIds != null) {
Review comment:
I feel that this code logically should be done after the acquireLocks method in Driver, and we should let the Driver record the transaction information. But I am not an expert in this area, so maybe Peter should confirm whether this understanding is correct.
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
##########
@@ -12308,16 +12195,98 @@ else if(ast.getChild(0).getType() == HiveParser.TOK_FALSE) {
// if phase1Result false return
return false;
}
+
+ // 5. Set write id for HMS client
+ if (getTxnMgr().supportsAcid() && conf.get(ValidTxnWriteIdList.VALID_TABLES_WRITEIDS_KEY) == null) {
+
+ ValidTxnWriteIdList txnWriteIds = null;
+
+ if (conf.get(ValidTxnWriteIdList.COMPACTOR_VALID_TABLES_WRITEIDS_KEY) != null) {
+ txnWriteIds = new ValidTxnWriteIdList(conf.getLong(ValidTxnList.COMPACTOR_VALID_TXNS_ID_KEY, 0));
+ txnWriteIds.addTableValidWriteIdList(
+ new ValidReaderWriteIdList(conf.get(ValidTxnWriteIdList.COMPACTOR_VALID_TABLES_WRITEIDS_KEY)));
+ } else {
+ } else {
+ List<String> tabNames = new ArrayList<>();
+ for (String tabName : collectTables(qb)) {
+ String fullName = TableName
+ .fromString(tabName, SessionState.get().getCurrentCatalog(), SessionState.get().getCurrentDatabase())
+ .getDbTable();
+ tabNames.add(fullName);
+ }
+
+ if (!tabNames.isEmpty()) {
+ String txnString = conf.get(ValidTxnList.VALID_TXNS_KEY);
+
+ try {
+ if ((txnString == null) || (txnString.isEmpty())) {
+ txnString = getTxnMgr().getValidTxns().toString();
+ conf.set(ValidTxnList.VALID_TXNS_KEY, txnString);
+ }
+
+ txnWriteIds = getTxnMgr().getValidWriteIds(tabNames, txnString);
+ } catch (LockException e) {
+ throw new SemanticException("Failed to fetch write Id from TxnManager", e);
+ }
+ }
+ }
+
+ if (txnWriteIds != null) {
+ conf.set(ValidTxnWriteIdList.VALID_TABLES_WRITEIDS_KEY, txnWriteIds.toString());
+ try {
+ db.getMSC().setValidWriteIdList(txnWriteIds.toString());
+ Hive.get().getMSC().setValidWriteIdList(txnWriteIds.toString());
Review comment:
I guess Peter also asked this before. Hive.get() will already set the validWriteIdList internally, right? Why do we need to explicitly set it on the client here?
##########
File path: standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java
##########
@@ -4062,4 +4062,8 @@ void createOrDropTriggerToPoolMapping(String resourcePlanName, String triggerNam
ReplicationMetricList getReplicationMetrics(GetReplicationMetricsRequest replicationMetricsRequest)
throws MetaException, TException;
+ void setValidWriteIdList(String txnWriteIdList);
Review comment:
Can you rename the variable to validWriteIdList? That is more readable in my opinion.
Also, I think it would be good to document this method, especially since this interface is a public API and is used by non-Hive HMS clients. Do we plan to use this validWriteIdList in the HMS API requests when the input arguments don't explicitly provide it?
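The documentation asked for above might look roughly like the sketch below; the Javadoc wording, the renamed parameter, and the tiny recording implementation are all illustrative, not the actual committed API.

```java
// Hypothetical slice of the client interface, documenting the renamed
// parameter as suggested. The real IMetaStoreClient has many more methods.
interface ValidWriteIdAwareClient {

    /**
     * Caches the serialized ValidWriteIdList in this client so that later
     * get_* HMS calls can attach it when the request arguments do not carry
     * one explicitly (this intent is an assumption drawn from the review
     * discussion, not confirmed behavior).
     *
     * @param validWriteIdList serialized valid-write-id list for the query
     */
    void setValidWriteIdList(String validWriteIdList);
}

// Minimal implementation used only to demonstrate the contract.
class RecordingClient implements ValidWriteIdAwareClient {
    String cached;

    @Override
    public void setValidWriteIdList(String validWriteIdList) {
        cached = validWriteIdList;
    }
}

class ClientDocDemo {
    public static void main(String[] args) {
        RecordingClient client = new RecordingClient();
        client.setValidWriteIdList("default.tbl:5:9223372036854775807::");
        System.out.println(client.cached);
    }
}
```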
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]