Rajesh Balamohan created HIVE-24367:
---------------------------------------

             Summary: Explore whether HiveAlterHandler::alterTable can be 
optimised for non-partitioned tables
                 Key: HIVE-24367
                 URL: https://issues.apache.org/jira/browse/HIVE-24367
             Project: Hive
          Issue Type: Improvement
          Components: HiveServer2
            Reporter: Rajesh Balamohan


{color:#222222}Writing a large number of deltas to a non-partitioned table 
causes runtime delays once many delta folders are present.{color}

{color:#222222}The following code in HiveAlterHandler is invoked for every 
insert operation. It computes {{updateTableStatsSlow}} for every insert, 
causing runtime delays.{color}

{noformat}
if (MetaStoreUtils.requireCalStats(null, null, newt, environmentContext) &&
    !isPartitionedTable) {
  Database db = msdb.getDatabase(catName, newDbName);
  assert(isReplicated == HiveMetaStore.HMSHandler.isDbReplicationTarget(db));
  // Update table stats. For partitioned table, we update stats in alterPartition()
  MetaStoreUtils.updateTableStatsSlow(db, newt, wh, false, true, environmentContext);
}
{noformat}
{color:#222222}It would be good to explore whether only the newly added delta 
can be listed when computing stats. This would avoid a huge listing call 
during stats collection.{color}
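
To make the idea concrete, here is a minimal sketch of incremental stats maintenance, under stated assumptions. It is not Hive code: IncrementalStatsSketch and addDeltaStats are hypothetical names, and java.nio.file on a local directory stands in for the Hadoop FileSystem listing that the metastore would actually perform. It assumes the existing "numFiles" and "totalSize" table parameters are already accurate and that the insert only adds one new delta directory; it also does not filter hidden or temporary files.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Stream;

/** Hypothetical helper for illustration only; not part of Hive. */
final class IncrementalStatsSketch {
    static final String NUM_FILES = "numFiles";
    static final String TOTAL_SIZE = "totalSize";

    /**
     * Lists only the newly added delta directory and adds its file count
     * and byte size to the stats already stored in the table parameters.
     * Returns a new map; the input map is left unchanged.
     */
    static Map<String, String> addDeltaStats(Map<String, String> params,
                                             Path newDeltaDir) throws IOException {
        long files = 0;
        long bytes = 0;
        // One listing call, scoped to the new delta rather than the whole table.
        try (Stream<Path> entries = Files.list(newDeltaDir)) {
            for (Path p : (Iterable<Path>) entries::iterator) {
                if (Files.isRegularFile(p)) {
                    files++;
                    bytes += Files.size(p);
                }
            }
        }
        Map<String, String> updated = new HashMap<>(params);
        updated.put(NUM_FILES, Long.toString(parse(params.get(NUM_FILES)) + files));
        updated.put(TOTAL_SIZE, Long.toString(parse(params.get(TOTAL_SIZE)) + bytes));
        return updated;
    }

    /** Treats a missing stat as zero (assumption: no prior stats means an empty table). */
    private static long parse(String value) {
        return value == null ? 0L : Long.parseLong(value);
    }
}
```

With this approach the cost of each insert depends on the size of the new delta, not on the number of deltas already in the table. Operations that remove or rewrite files (compaction, for example) would presumably still need a full listing or a separate correction.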

{color:#222222}E.g., queries to reproduce:{color}
{noformat}
CREATE TABLE IF NOT EXISTS test (name String, value int);
INSERT INTO test VALUES('K1',1);
INSERT INTO test VALUES('K2',2);
..
..
..
INSERT INTO test VALUES('K20000',2);
{noformat}
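
The elided statements above can be generated rather than typed by hand. The following is a small sketch that builds such a script; ReproScriptSketch and buildScript are hypothetical names, and using the row index as the inserted value is an assumption, since the original only shows the first two rows and the last one.

```java
/** Hypothetical generator for the repro script; not part of Hive. */
final class ReproScriptSketch {
    /**
     * Builds a script with one CREATE TABLE followed by `rows` single-row
     * INSERT statements, matching the shape of the queries in the report.
     */
    static String buildScript(int rows) {
        StringBuilder sb = new StringBuilder(
            "CREATE TABLE IF NOT EXISTS test (name String, value int);\n");
        for (int i = 1; i <= rows; i++) {
            // Assumption: the value column simply mirrors the row index.
            sb.append("INSERT INTO test VALUES('K").append(i)
              .append("',").append(i).append(");\n");
        }
        return sb.toString();
    }
}
```

Writing the result of buildScript(20000) to a file and running it through a Hive client should give one insert operation per statement, which is the access pattern the report describes.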
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
