Rajesh Balamohan created HIVE-24805:
---------------------------------------

             Summary: Compactor: Initiator shouldn't fetch table details again 
and again for partitioned tables
                 Key: HIVE-24805
                 URL: https://issues.apache.org/jira/browse/HIVE-24805
             Project: Hive
          Issue Type: Improvement
            Reporter: Rajesh Balamohan


Initiator shouldn't fetch table details separately for every partition of a 
table. When there are a large number of databases/tables, it takes a lot of 
time for the Initiator to complete its initial iteration, and the load on the 
DB also goes up.


https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java#L129

https://github.com/apache/hive/blob/64bb52316f19426ebea0087ee15e282cbde1d852/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java#L456

The table details are the same for all of the following partitions. However, 
the Initiator ends up fetching them from HMS again and again.

{noformat}
2021-02-22 08:13:16,106 INFO  org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2451899
2021-02-22 08:13:16,124 INFO  org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2451830
2021-02-22 08:13:16,140 INFO  org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2452586
2021-02-22 08:13:16,149 INFO  org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2452698
2021-02-22 08:13:16,158 INFO  org.apache.hadoop.hive.ql.txn.compactor.Initiator: [Thread-11]: Checking to see if we should compact tpcds_bin_partitioned_orc_1000.store_returns_tmp2.sr_returned_date_sk=2452063
{noformat}
{noformat}
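
One possible direction, shown only as a rough sketch and not as the actual 
Initiator code: keep a small per-iteration cache keyed by "db.table", so that 
every partition of the same table reuses a single lookup. The class 
TableCache, its loader function and the loadCount() helper are hypothetical 
names invented for this illustration; in Hive the loader would stand in for 
whatever call currently resolves the table from HMS.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical per-iteration cache of table details (illustration only).
 * T stands in for whatever table object the metastore lookup returns.
 */
final class TableCache<T> {
    private final Map<String, T> cache = new HashMap<>();
    private final Function<String, T> loader; // assumed stand-in for the HMS lookup
    private int loads = 0;                    // number of real lookups performed

    TableCache(Function<String, T> loader) {
        this.loader = loader;
    }

    /** Returns the cached details for db.table, loading them at most once. */
    T get(String dbName, String tableName) {
        String key = dbName + "." + tableName;
        T details = cache.get(key);
        if (details == null) {
            loads++;
            details = loader.apply(key);
            if (details != null) {
                cache.put(key, details);
            }
        }
        return details;
    }

    /** Number of times the loader was actually invoked. */
    int loadCount() {
        return loads;
    }

    /** Drop all entries, e.g. at the start of each iteration, to avoid stale data. */
    void clear() {
        cache.clear();
    }
}
```

With such a cache, checking the five store_returns_tmp2 partitions above would 
trigger one table lookup instead of five. Clearing the cache between 
iterations is one simple way to limit staleness; whether that is acceptable 
for the Initiator is an assumption that would need to be verified.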



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
