[uportal-dev] uPortal database connection pool size

2014-12-02 Thread James Wennmacher
Answering the question below on the uportal-user list got me thinking (a 
dangerous thing indeed ... :-) ).

Currently all 3 of the uPortal DB connection pools defined in 
datasourceContext.xml are set to the same max value (75 by 
default).  Glancing at the code, I believe the raw events DB 
pool and the aggregation events DB pool are driven by timed threads and 
only use 1 DB connection each (see 
https://github.com/Jasig/uPortal/blob/uportal-4.1.2/uportal-war/src/main/java/org/jasig/portal/events/handlers/QueueingEventHandler.java#L41
and 
https://github.com/Jasig/uPortal/blob/uportal-4.1.2/uportal-war/src/main/resources/properties/contexts/schedulerContext.xml#L73).
They could both have a smaller maxActive value to limit the exposure of an 
error somehow consuming large numbers of DB connections and impacting 
uPortal and portlets (by consuming too many DB connections on the DB 
server).  I was thinking of setting their maxActive value to 5 to allow 
simultaneous threads for saving, purging, and querying.
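
For concreteness, the kind of change I have in mind in datasourceContext.xml 
would look something like this (the bean ids and parent name here are 
illustrative; check the actual definitions in your local file):

    <!-- Sketch only: bean ids and parent assumed; verify against your
         datasourceContext.xml -->
    <bean id="RawEventsDB" parent="basePooledDataSource">
        <property name="maxActive" value="5" />
    </bean>
    <bean id="AggrEventsDB" parent="basePooledDataSource">
        <property name="maxActive" value="5" />
    </bean>

The main portal DB pool would keep its larger maxActive of 75.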

Does anyone see a problem with this strategy?  UW-Madison, or someone with 
an active system: can you glance at the DB MBeans AggrEventsDB and 
RawEventsDB under uPortal/Datasource and see whether NumActive + NumIdle 
ever come close to 5 on your system?  (Unfortunately, without monitoring 
tools I don't see how you'd find out the max # of connections ever made.)

Thanks in advance for your insights and thoughts.

James Wennmacher - Unicon
480.558.2420



-------- Forwarded Message --------
Subject: Re: [uportal-user] Increase database connection pool size
Date:   Tue, 02 Dec 2014 11:02:02 -0700
From:   James Wennmacher jwennmac...@unicon.net
To: uportal-u...@lists.jasig.org



Database connection counts are defined in 
uportal-war/src/main/resources/properties/contexts/datasourceContext.xml.

uPortal uses the DB connections for a fairly brief period of time.  The 
message 'none available[size:75; busy:0; idle:0; lastwait:5000]', plus 
your comment about leaving it overnight, makes me wonder if somehow the 
connections are being lost and not reclaimed.  I suggest:

1. Ensure that the load test is not hitting servers too heavily; e.g. 
that load is distributed evenly.  I could see running out of DB 
connections if a server gets hammered (though the connections should be 
freed up at some point later).  Does it happen primarily to one or two 
servers and not all of them?

2. Try adding the following properties to the basePooledDataSource bean 
in datasourceContext.xml:

    <property name="logAbandoned" value="true" />
    <property name="numTestsPerEvictionRun" value="5" />
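
In context they sit alongside the other pool properties on that bean; 
roughly like this (surrounding properties elided; bean attributes as in 
your local file):

    <!-- Sketch: existing properties elided -->
    <bean id="basePooledDataSource" abstract="true">
        <!-- ... existing pool properties ... -->
        <property name="logAbandoned" value="true" />
        <property name="numTestsPerEvictionRun" value="5" />
    </bean>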

This may not resolve the issue, but perhaps the logging will provide a 
clue to what's going on.  It is likely, however, that the additional 
logging will not trigger.  The minEvictableIdleTimeMillis property is 
supposed to release a connection after it has been idle for the specified 
number of milliseconds, and the abandonWhenPercentageFull, 
removeAbandoned, and removeAbandonedTimeout properties that are specified 
are supposed to clean up abandoned connections (allocated but not used 
within removeAbandonedTimeout seconds, harvested when a new connection is 
requested but none are available).  In a load test scenario, though, 
especially one where a server is taxed very heavily, the 
removeAbandonedTimeout value (300 sec) may be too high: if connections 
are heavily used, none may be considered abandoned and harvested during 
the test.  I wouldn't change removeAbandonedTimeout just yet.  After the 
test completes there may be some useful log messages if some connections 
were consumed and not released.  If nothing else, 5 minutes after the 
test completes you should be able to log onto the server even if all 
connections are consumed, since at least some of the connections should 
be considered abandoned and eligible for harvesting.  However, your 
comment that the issue persists after leaving the system overnight makes 
me think the abandoned connections are not being harvested.  Still worth 
trying logAbandoned to see if it provides more info.
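
For reference, the properties mentioned above all go on the same pooled 
data source bean; a sketch of the shape (removeAbandonedTimeout is 300 
sec per the current config; the other values here are placeholders, not 
the shipped defaults):

    <property name="minEvictableIdleTimeMillis" value="300000" /> <!-- ms idle before eviction; placeholder -->
    <property name="removeAbandoned" value="true" />              <!-- harvest connections held but unused -->
    <property name="removeAbandonedTimeout" value="300" />        <!-- sec held before it counts as abandoned -->
    <property name="abandonWhenPercentageFull" value="50" />      <!-- placeholder: only harvest once pool is this % full -->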

3. Are there other DB connection error messages?  The ones you mentioned 
are for event aggregation (runs periodically to aggregate and purge raw 
portal activity event data) and for jgroups (used for distributed cache 
management to allow uPortal nodes to notify other uPortal nodes about 
cache replication or invalidation).  Were there any for uPortal activity 
not having a database connection?

4. It would be great to get more information so we can try to fix the 
issue of the connections not being released.  Even when fully consumed, 
the connections should release after a period of time (after being idle 
for minEvictableIdleTimeMillis milliseconds, or 5 minutes at the latest 
per the removeAbandonedTimeout property when attempting to get a new 
connection and none are available).  When this situation occurs, are you 
able to look at the DB server and 

Re: [uportal-dev] uPortal database connection pool size

2014-12-02 Thread Tim Levett
Random collection of thoughts on this topic:


  *   Raw events would need more than one connection. There are 3 major 
interactions (might be more) with UP_RAW_EVENTS: insert for new events, 
read/update for aggregation, and purging.
  *   This pool is per node, so if you are clustering (like our 4 nodes) this 
can get you into trouble if you have limits on your concurrent database 
connections.
  *   It looks like we use the default pool size here on MyUW. I don't have 
access to the admin interface in production.
  *   The only time we had database connection issues was when we had a 
database hiccup (single point of failure) and a bunch of requests queued up.
  *   If we do make modifications, I would also suggest upping the 
wait-for-connection timeout (maxWait).


Thanks,


Tim Levett
tim.levettATwisc.edu
MyUW-Infrastructure




Re: [uportal-dev] uPortal database connection pool size

2014-12-02 Thread James Wennmacher
Sounds reasonable on the maxWait increase.  How about 10 sec?  Written 
up as https://issues.jasig.org/browse/UP-4325.  I'll try to get to it 
later this month to give others a chance to comment on it.
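
The change itself would be a one-liner on the pooled data source bean, 
something like this (10000 ms = 10 sec; the lastwait:5000 in the error 
message above reflects the current 5 sec value):

    <!-- Sketch: raise the wait-for-a-free-connection timeout from 5 sec to 10 sec -->
    <property name="maxWait" value="10000" />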

James Wennmacher - Unicon
480.558.2420
