Rajesh Balamohan created HIVE-24367:
---------------------------------------
Summary: Explore whether HiveAlterHandler::alterTable can be
optimised for non-partitioned tables
Key: HIVE-24367
URL: https://issues.apache.org/jira/browse/HIVE-24367
Project: Hive
Issue Type: Improvement
Components: HiveServer2
Reporter: Rajesh Balamohan
Writing a large number of deltas to a non-partitioned table causes runtime
issues once many delta folders are present.

The following code in HiveAlterHandler is invoked for every insert
operation. It calls {{updateTableStatsSlow}} on every insert, causing runtime
delays.

{noformat}
if (MetaStoreUtils.requireCalStats(null, null, newt, environmentContext)
    && !isPartitionedTable) {
  Database db = msdb.getDatabase(catName, newDbName);
  assert(isReplicated == HiveMetaStore.HMSHandler.isDbReplicationTarget(db));
  // Update table stats. For partitioned table, we update stats in alterPartition()
  MetaStoreUtils.updateTableStatsSlow(db, newt, wh, false, true,
      environmentContext);
}
{noformat}
It would be good to explore whether only the newly added delta folder can be
listed when computing stats. This would avoid a huge listing call over the
whole table directory during stats collection.

Example queries to reproduce:
{noformat}
CREATE TABLE IF NOT EXISTS test (name String, value int);
INSERT INTO test VALUES('K1',1);
INSERT INTO test VALUES('K2',2);
..
..
..
INSERT INTO test VALUES('K20000',2);
{noformat}
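As a rough illustration of the idea of listing only the newly added delta, the sketch below adds the file count and byte size of one new delta folder to stats that are already known, rather than re-listing the whole table directory. It is not Hive code: it uses plain java.nio on a local filesystem, the class and method names (IncrementalStatsSketch, addDeltaStats) are hypothetical, the keys "numFiles" and "totalSize" are only meant to mirror Hive's basic table stats, and the delta and bucket file names are illustrative. It also assumes the existing stats are accurate and that nothing else (for example compaction or deletes) changed the table in between.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Stream;

// Hypothetical sketch, not part of Hive: incremental stats from one new delta folder.
class IncrementalStatsSketch {

  /**
   * Returns a copy of currentStats with the regular files found directly
   * under newDeltaDir added to "numFiles" and "totalSize".
   * Only newDeltaDir is listed; older delta folders are never touched.
   */
  public static Map<String, Long> addDeltaStats(Map<String, Long> currentStats,
                                                Path newDeltaDir) throws IOException {
    long files = 0;
    long bytes = 0;
    try (Stream<Path> entries = Files.list(newDeltaDir)) {
      for (Path p : (Iterable<Path>) entries::iterator) {
        if (Files.isRegularFile(p)) {
          files++;
          bytes += Files.size(p);
        }
      }
    }
    // Copy so the caller's map is left unchanged.
    Map<String, Long> updated = new HashMap<>(currentStats);
    updated.merge("numFiles", files, Long::sum);
    updated.merge("totalSize", bytes, Long::sum);
    return updated;
  }

  public static void main(String[] args) throws IOException {
    // Simulate a table directory that has just received one new delta folder.
    Path table = Files.createTempDirectory("test_table");
    Path delta = Files.createDirectory(table.resolve("delta_0000002_0000002_0000"));
    Files.write(delta.resolve("bucket_00000"), new byte[128]);

    // Stats assumed to be known from before the insert.
    Map<String, Long> before = new HashMap<>();
    before.put("numFiles", 1L);
    before.put("totalSize", 100L);

    System.out.println(addDeltaStats(before, delta));
  }
}
```

With this approach the cost of each insert would depend on the size of the new delta folder only, not on the number of delta folders already present. Whether the alterTable path has enough information to identify the new delta folder is exactly what this issue proposes to explore.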
--
This message was sent by Atlassian Jira
(v8.3.4#803005)