hsnusonic commented on code in PR #3466:
URL: https://github.com/apache/hive/pull/3466#discussion_r927901748


##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PersistenceManagerProvider.java:
##########
@@ -253,8 +253,12 @@ private static PersistenceManagerFactory initPMF(Configuration conf, boolean for
     } else {
       try {
         DataSource ds = (maxPoolSize > 0) ? dsp.create(conf, maxPoolSize) : dsp.create(conf);
+        // The secondary connection factory is used for schema generation, and for value generation operations.
+        // We should use a different pool for the secondary connection factory to avoid resource starvation.
+        // Since DataNucleus uses locks for schema generation and value generation, 2 connections should be sufficient.
+        DataSource ds2 = dsp.create(conf, /* maxPoolSize */ 2);

Review Comment:
   Hi @deniskuzZ,
   The issue is easiest to observe with many concurrent add_partitions 
   requests. On a write operation, DataNucleus needs to use a ValueGenerator. 
   Imagine there are 10 add_partitions requests, each of which has taken 1 
   connection from the pool (assume the pool has 10 connections). When they 
   move to the value generation stage, a monitor lock in DataNucleus lets only 
   one thread proceed while the others are blocked. Even though only that one 
   thread is trying to get a connection from the pool, none is available, 
   because all 10 are already held by the requests themselves. That is the 
   problem with using the same pool for the primary and secondary connection 
   factories. Does that make sense to you?
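
   To make the scenario concrete, here is a minimal, self-contained sketch of 
   the starvation pattern. It is not Hive or DataNucleus code: the class name 
   `PoolStarvationSketch`, the `simulate` method, and the use of `Semaphore` 
   permits as stand-ins for pooled connections are all assumptions made for 
   illustration, and a timeout is used so the shared-pool case reports a 
   failure instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model: semaphore permits stand in for pooled connections.
final class PoolStarvationSketch {

    /**
     * Runs `requests` concurrent simulated writes and returns how many of them
     * obtained a secondary connection before `timeoutMs` elapsed.
     * Assumes requests <= primaryPoolSize, so every request can hold one
     * primary connection at the same time.
     */
    static int simulate(int requests, int primaryPoolSize, boolean sharePool, long timeoutMs) {
        Semaphore primary = new Semaphore(primaryPoolSize);
        // Either reuse the primary pool or use a small dedicated pool of 2.
        Semaphore secondary = sharePool ? primary : new Semaphore(2);
        Object valueGenerationLock = new Object(); // models the monitor lock
        CountDownLatch allHoldPrimary = new CountDownLatch(requests);
        AtomicInteger completed = new AtomicInteger();
        ExecutorService workers = Executors.newFixedThreadPool(requests);

        for (int i = 0; i < requests; i++) {
            workers.submit(() -> {
                primary.acquire(); // each request takes one primary connection
                try {
                    // Wait until every request holds its primary connection.
                    allHoldPrimary.countDown();
                    allHoldPrimary.await();
                    // Only one thread at a time may do value generation.
                    synchronized (valueGenerationLock) {
                        if (secondary.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS)) {
                            secondary.release();
                            completed.incrementAndGet();
                        }
                    }
                } finally {
                    primary.release();
                }
                return null;
            });
        }

        workers.shutdown();
        try {
            workers.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("shared pool:   " + simulate(10, 10, true, 200) + " of 10 completed");
        System.out.println("separate pool: " + simulate(10, 10, false, 200) + " of 10 completed");
    }
}
```

   With a shared pool, the thread that holds the lock cannot get a second 
   connection until its wait times out, so fewer than 10 requests complete. 
   With a dedicated secondary pool of 2, all 10 complete.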



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

