Vipin Vishvkarma created HIVE-24020:
---------------------------------------

             Summary: Automatic Compaction not working in existing partitions 
for Streaming Ingest with Dynamic Partition
                 Key: HIVE-24020
                 URL: https://issues.apache.org/jira/browse/HIVE-24020
             Project: Hive
          Issue Type: Bug
    Affects Versions: 3.1.2, 4.0.0
            Reporter: Vipin Vishvkarma
            Assignee: Vipin Vishvkarma


This issue occurs when streaming ingest with dynamic partitioning writes to 
partitions that already exist. Looking at the code, AbstractRecordWriter has 
the following check:

{code:java}
PartitionInfo partitionInfo = conn.createPartitionIfNotExists(partitionValues);
// collect the newly added partitions. connection.commitTransaction() will
// report the dynamically added partitions to TxnHandler
if (!partitionInfo.isExists()) {
  addedPartitions.add(partitionInfo.getName());
} else {
  if (LOG.isDebugEnabled()) {
    LOG.debug("Partition {} already exists for table {}",
        partitionInfo.getName(), fullyQualifiedTableName);
  }
}
{code}
The *addedPartitions* set above is passed to *addDynamicPartitions* during 
TransactionBatch commit. When the partitions already exist, 
*addedPartitions* is empty, so *addDynamicPartitions* does not move entries 
from TXN_COMPONENTS to COMPLETED_TXN_COMPONENTS. As a result, the Initiator 
cannot trigger automatic compaction.
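
To make the effect concrete, here is a small standalone model of the check. It is only a sketch: PartitionReportSketch and the writtenPartitions set are hypothetical names I made up for illustration, not Hive classes, and tracking every written partition is an assumed direction for a fix, not something taken from the Hive code.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Toy model only: hypothetical stand-in, not Hive's AbstractRecordWriter.
class PartitionReportSketch {
  // Partitions that already exist in the simulated metastore.
  private final Set<String> metastore = new LinkedHashSet<>();
  // Mirrors the reported behaviour: only partitions created by this writer.
  final Set<String> addedPartitions = new LinkedHashSet<>();
  // Assumed alternative: every partition this writer wrote to.
  final Set<String> writtenPartitions = new LinkedHashSet<>();

  PartitionReportSketch(Set<String> existing) {
    metastore.addAll(existing);
  }

  // Simulates routing one record to the given partition.
  void write(String partition) {
    // Set.add returns true only if the partition did not exist yet.
    boolean created = metastore.add(partition);
    if (created) {
      addedPartitions.add(partition);
    }
    writtenPartitions.add(partition);
  }

  public static void main(String[] args) {
    PartitionReportSketch writer =
        new PartitionReportSketch(Set.of("dt=2020-08-01"));
    writer.write("dt=2020-08-01"); // partition already exists
    writer.write("dt=2020-08-02"); // partition is created by this write
    System.out.println("added   = " + writer.addedPartitions);
    System.out.println("written = " + writer.writtenPartitions);
  }
}
```

In this model the write to the existing partition never reaches addedPartitions, so a commit that reports only that set has nothing to say about it, which matches the behaviour described above.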

A second issue observed: *addedPartitions* is not cleared when the writer is 
closed, so partition information leaks from one transaction into the next.
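
A minimal sketch of what clearing the state on close could look like. WriterStateSketch and its methods are hypothetical names for illustration and do not reproduce the Hive writer; the only point is that the set is emptied in close() after the commit has taken its snapshot.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical writer used for illustration, not Hive's AbstractRecordWriter.
class WriterStateSketch {
  private final Set<String> addedPartitions = new HashSet<>();

  // Records a partition created during the current transaction.
  void recordNewPartition(String name) {
    addedPartitions.add(name);
  }

  // Returns a copy so the caller's snapshot survives a later close().
  Set<String> getPartitions() {
    return new HashSet<>(addedPartitions);
  }

  // Assumed fix: drop per-transaction state when the writer is closed.
  void close() {
    addedPartitions.clear();
  }

  public static void main(String[] args) {
    WriterStateSketch writer = new WriterStateSketch();
    writer.recordNewPartition("dt=2020-08-02");
    Set<String> reported = writer.getPartitions(); // snapshot taken at commit
    writer.close();
    System.out.println("reported at commit: " + reported);
    System.out.println("left after close:   " + writer.getPartitions());
  }
}
```

Because getPartitions() hands out a copy, the commit still sees the partitions of its own transaction, while nothing is carried over to the next one.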

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
