Gopal V created HIVE-19790: ------------------------------ Summary: Metastore upgrade: 3.1.0 upgrade script is slow and non-idempotent Key: HIVE-19790 URL: https://issues.apache.org/jira/browse/HIVE-19790 Project: Hive Issue Type: Bug Components: Standalone Metastore Reporter: Gopal V
Because of the giant bit-vectors stored on mysql, the update of PART_COL_STATS is very slow and also is not idempotent. {code} -------------- UPDATE `PART_COL_STATS` SET `CAT_NAME` = 'hive' -------------- Query OK, 0 rows affected (4 min 1.57 sec) Rows matched: 778025 Changed: 0 Warnings: 0 {code} Adding a filter speeds it up because it will no longer overwrite {code} mysql> explain UPDATE `PART_COL_STATS` SET `CAT_NAME` = 'hive' where `CAT_NAME` =''; -------------- explain UPDATE `PART_COL_STATS` SET `CAT_NAME` = 'hive' where `CAT_NAME` ='' -------------- +----+-------------+----------------+-------+---------------+---------------+---------+-------+------+------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+----------------+-------+---------------+---------------+---------+-------+------+------------------------------+ | 1 | SIMPLE | PART_COL_STATS | range | PCS_STATS_IDX | PCS_STATS_IDX | 258 | const | 1 | Using where; Using temporary | +----+-------------+----------------+-------+---------------+---------------+---------+-------+------+------------------------------+ 1 row in set (0.00 sec) {code} this would be much faster to re-run and would not accidentally overwrite any existing CAT_NAMEs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)