zhangbutao commented on code in PR #4744: URL: https://github.com/apache/hive/pull/4744#discussion_r1421703321
##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/impl/FindColumnsWithStatsHandler.java:
##########
@@ -37,11 +37,16 @@ public class FindColumnsWithStatsHandler implements QueryHandler<List<String>> {
   //language=SQL
   private static final String TABLE_SELECT = "SELECT \"COLUMN_NAME\" FROM \"TAB_COL_STATS\" " +
-          "WHERE \"DB_NAME\" = :dbName AND \"TABLE_NAME\" = :tableName";
+          "INNER JOIN \"TBLS\" ON \"TAB_COL_STATS\".\"TBL_ID\" = \"TBLS\".\"TBL_ID\" " +

Review Comment:
Chiming in with some thoughts :)

I'm not sure whether an index can be used to avoid a performance regression once this becomes a multi-table join. Hive issues many statistics-related reads (SELECTs), especially during the CBO stage. In MySQL, a multi-table join is sometimes slower than the equivalent single-table query, and the joins also put extra load on the MySQL server.

https://github.com/apache/hive/pull/4744/files#diff-bcca13f6cc251df321e8fe80568ef0334a1d44f7e5e7ff2fcaa06ab4f05bbdf9
`MetaStoreDirectSql.java` also changes its stats-related operations from single-table queries to multi-table joins.

Could we run some performance stress tests to verify that performance does not decline? (This may not be easy to test.)

Thanks.
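
As a rough starting point for the stress test requested above, here is a minimal timing sketch. `QueryLatencyProbe`, `median`, and `medianNanos` are hypothetical names and are not part of Hive. The two `Runnable`s in `main` are empty placeholders: they would need to be replaced with JDBC calls that run the single-table and the joined query against a populated metastore database. It only measures wall-clock latency per call; it does not show whether MySQL actually uses an index, which would still need an `EXPLAIN` on the backing database.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of Hive): times an arbitrary query callback. */
public final class QueryLatencyProbe {

    private QueryLatencyProbe() {}

    /** Returns the median sample (the upper middle value when the count is even). */
    static long median(long[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        return sorted[sorted.length / 2];
    }

    /** Runs the query `warmup` times untimed, then `runs` times timed; returns the median in nanoseconds. */
    static long medianNanos(Runnable query, int warmup, int runs) {
        for (int i = 0; i < warmup; i++) {
            query.run();
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            query.run();
            samples[i] = System.nanoTime() - start;
        }
        return median(samples);
    }

    public static void main(String[] args) {
        // Placeholders (assumption): swap in JDBC calls for the old and new query shapes.
        Runnable singleTable = () -> { };
        Runnable joined = () -> { };
        System.out.println("single-table median ns: " + medianNanos(singleTable, 100, 1000));
        System.out.println("joined median ns:       " + medianNanos(joined, 100, 1000));
    }
}
```

The median is used rather than the mean so that a few slow outliers (GC pauses, connection hiccups) do not dominate the comparison. A real test would also need realistic row counts in `TAB_COL_STATS` and `TBLS` and concurrent callers to reproduce the load the CBO stage generates.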