zabetak commented on code in PR #5764: URL: https://github.com/apache/hive/pull/5764#discussion_r2041704820
########## standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java: ########## @@ -700,6 +700,12 @@ public enum ConfVars { "configured with embedded metastore. To get optimal performance, set config to meet the following condition\n"+ "(2 * pool_size * metastore_instances + 2 * pool_size * HS2_instances_with_embedded_metastore) = \n" + "(2 * physical_core_count + hard_disk_count)."), + CONNECTION_POOLING_MAX_SECONDARY_CONNECTIONS("datanucleus.connectionPool.maxSecondaryPoolSize", + "datanucleus.connectionPool.maxSecondaryPoolSize", 2, + "Specify the maximum number of connections in the objectstore-secondary connection pool.\n" + + "Same as datanucleus.connectionPool.maxPoolSize, but for the objectstore-secondary pool. \n" + + "Note that this is not a valid datanucleus configuration, this is used only in Hive.\n" + Review Comment: The comment it is not a valid configuration is a bit confusing. Note that `datanucleus.connectionPool.maxPoolSize` is not passed directly to datanucleus and in fact it is only used by Hive. It may be better to not mention anything. ########## standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PersistenceManagerProvider.java: ########## @@ -315,9 +315,12 @@ private static PersistenceManagerFactory initPMF(Configuration conf, boolean for databaseProduct = DatabaseProduct.determineDatabaseProduct(ds, conf); // The secondary connection factory is used for schema generation, and for value generation operations. // We should use a different pool for the secondary connection factory to avoid resource starvation. - // Since DataNucleus uses locks for schema generation and value generation, 2 connections should be sufficient. + // DataNucleus uses locks for schema generation and value generation, under normal circumstances 2 connections + // should be sufficient. However, as found in HIVE-28839 in certain situations connection starvation may happen, + // so we need to make this configurable until a final fix is not available. Review Comment: As far as I understand the intention is to keep this config around even after fixing HIVE-28839. If my understanding is correct then it's better to remove the comment "However...is not available". If it is really meant to be a temporary config till we merge HIVE-28839 then it's fine to leave things as is. Maybe just fix the typo "fix is not available" -> "fix is available". ########## standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java: ########## @@ -700,6 +700,12 @@ public enum ConfVars { "configured with embedded metastore. To get optimal performance, set config to meet the following condition\n"+ "(2 * pool_size * metastore_instances + 2 * pool_size * HS2_instances_with_embedded_metastore) = \n" + "(2 * physical_core_count + hard_disk_count)."), + CONNECTION_POOLING_MAX_SECONDARY_CONNECTIONS("datanucleus.connectionPool.maxSecondaryPoolSize", Review Comment: I would prefer slightly more a name like `datanucleus.connectionPool.secondary.maxPoolSize` or `datanucleus.connectionPool.nontx.maxPoolSize` but don't feel very strong about it. The datanucleus prefix is somewhat misleading since we are not propagating the property to datanucleus but given the current situation and the other props don't have a better suggestion so we can leave with it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org