Random collection of thoughts on this topic:

  *   Raw events would need more than one connection. There are 3 major 
interactions (might be more) with UP_RAW_EVENTS: inserts for new events, 
reads/updates for aggregation, and purging.
  *   This pool is per node, so if you are clustering (like our 4 nodes) this 
can get you into trouble if you have limits on your concurrent database 
connections.
  *   It looks like we use the default pool size here on MyUW. I don't have 
access to the admin interface in production.
  *   The only time we had database connection issues is when we had a database 
hiccup (single point of failure) and a bunch of requests queued up.
  *   If we do make modifications, I would also suggest increasing the 
wait-for-connection timeout.


Thanks,


Tim Levett
tim.levettATwisc.edu
MyUW-Infrastructure


________________________________
From: bounce-37985837-70367...@lists.wisc.edu 
<bounce-37985837-70367...@lists.wisc.edu> on behalf of James Wennmacher 
<jwennmac...@unicon.net>
Sent: Tuesday, December 2, 2014 1:02 PM
To: uportal-dev@lists.ja-sig.org
Subject: [uportal-dev] uPortal database connection pool size

Answering the question below on the uportal-user list got me thinking (a 
dangerous thing indeed ... :-) ).

Currently all 3 of the uPortal DB connection pools defined in 
datasourceContext.xml are set to the same max value (75 by default).  From 
glancing at the code, I think the raw events DB pool and the aggregation events 
DB pool run on timed threads and only use 1 DB connection each (see 
https://github.com/Jasig/uPortal/blob/uportal-4.1.2/uportal-war/src/main/java/org/jasig/portal/events/handlers/QueueingEventHandler.java#L41
 and 
https://github.com/Jasig/uPortal/blob/uportal-4.1.2/uportal-war/src/main/resources/properties/contexts/schedulerContext.xml#L73).
  Both could have a smaller maxActive value to limit the exposure if an error 
somehow consumes large numbers of DB connections and impacts uPortal and 
portlets (by using up too many DB connections on the DB server).  I was 
thinking of setting their maxActive value to 5 to allow simultaneous threads 
for saving, purging, and querying.
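
To put rough numbers on the proposal, here is a minimal sketch of the per-node worst case before and after the change. The function and constant names are made up for illustration; only the values 75 and 5 come from the message above.

```python
# Hypothetical helper: upper bound on connections one uPortal node can open
# if every one of its three pools is driven to its configured maximum.
DEFAULT_MAX = 75         # default max for all three pools, per the message above
PROPOSED_EVENT_MAX = 5   # proposed maxActive for the raw and aggregation event pools


def per_node_worst_case(portal, raw_events, aggr_events):
    """Sum of the three pool maximums for a single node."""
    return portal + raw_events + aggr_events


current = per_node_worst_case(DEFAULT_MAX, DEFAULT_MAX, DEFAULT_MAX)
proposed = per_node_worst_case(DEFAULT_MAX, PROPOSED_EVENT_MAX, PROPOSED_EVENT_MAX)
print(current, proposed)  # 225 85
```

This only bounds what the pools would allow; it says nothing about how many connections the event threads actually use.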

Does anyone see a problem with this strategy?  UWMadison, or someone else with 
an active system, could you glance at the DB MBeans AggrEventsDB and 
RawEventsDB in uPortal/Datasource and see if NumActive + NumIdle is even close 
to 5 on your system?  (Unfortunately, without monitoring tools I don't see how 
you'd find out what the max # of connections ever made was.)

Thanks in advance for your insights and thoughts.


James Wennmacher - Unicon
480.558.2420


-------- Forwarded Message --------
Subject:        Re: [uportal-user] Increase database connection pool size
Date:   Tue, 02 Dec 2014 11:02:02 -0700
From:   James Wennmacher <jwennmac...@unicon.net>
To:     uportal-u...@lists.jasig.org


Database connection counts are defined in 
uportal-war/src/main/resources/properties/contexts/datasourceContext.xml.

uPortal uses the DB connections for a fairly brief period of time.  The message 
'none available[size:75; busy:0; idle:0; lastwait:5000]' plus your comment 
about leaving it overnight  makes me wonder if somehow the connections are 
being lost and not reclaimed.  I suggest:

1. Ensure that the load test is not hitting any one server too heavily, i.e. 
that load is distributed evenly.  I could see running out of DB connections 
happening if a server gets hammered (though the connections should be freed up 
at some point later).  Does it happen primarily to one or two servers and not 
all of them?

2. Try adding the following properties to the basePooledDataSource bean in 
datasourceContext.xml:

<property name="logAbandoned" value="true" />
<property name="numTestsPerEvictionRun" value="5" />

This may not resolve the issue, but perhaps the logging will provide a clue to 
what's going on, although it is likely the additional logging will not trigger.  
The property minEvictableIdleTimeMillis is supposed to release a connection 
after it has been idle for the specified number of milliseconds, and the 
properties abandonWhenPercentageFull, removeAbandoned, and 
removeAbandonedTimeout, which are already specified, are supposed to clean up 
abandoned connections (allocated but not used in removeAbandonedTimeout seconds 
when a new connection is requested but none are available).

In a load test scenario, though, especially one where a server is taxed very 
heavily, the removeAbandonedTimeout value (300 sec) may be too high: if 
connections are heavily used, none may be considered abandoned and harvested 
during the test.  I wouldn't change removeAbandonedTimeout just yet.  After the 
test completes there may be some useful log messages if some connections were 
consumed and not released.  If nothing else, 5 minutes after the test completes 
you should be able to log onto the server even if all connections are consumed, 
since it should consider at least some of the connections abandoned and 
eligible for harvesting.  Your comment that the issue still exists after 
leaving the system overnight, however, makes me think the abandoned connections 
will not be harvested.  It is still worth trying logAbandoned to see if it 
provides more info.

3. Are there other DB connection error messages?  The ones you mentioned are 
for event aggregation (runs periodically to aggregate and purge raw portal 
activity event data) and for jgroups (used for distributed cache management to 
allow uPortal nodes to notify other uPortal nodes about cache replication or 
invalidation).  Were there any errors for regular uPortal activity failing to 
get a database connection?

4. It would be great to get more information so we can try to fix the issue of 
the connections not being released.  Even when fully consumed, the connections 
should release after a period of time (after being idle for 
minEvictableIdleTimeMillis milliseconds, or 5 minutes at the latest per 
removeAbandonedTimeout property when attempting to get a new connection and 
none are available).  When this situation occurs are you able to look at the DB 
server and see if the DB sees the 75 connections from the failed uPortal 
server, if the DB thinks the connections are idle, and what the last SQL 
command was on each of the DB connections?  It is also possible that network 
issues between the uPortal server and DB server are causing network socket 
connections to hang.  Finding out if the DB server is aware of the connections, 
their state, and what the last activity was should help determine if that is 
the case and hopefully point us to where the issue is.

5.  Barring additional investigation above (which I'd really like to have 
investigated and addressed), if you decide to try increasing the DB 
connections you'll want to discuss it with your DBA.  Each uPortal server will 
make up to 3 times the specified number of connections (75 for uPortal app 
use, 75 for raw event storage, 75 for event aggregation), plus some portlets 
(newsreader, announcements, simple content portlet, calendar, bookmarks) have 
separate DB connection pools or make DB connections on each request that will 
go to the same DB.  If you have 10 servers, assuming they each make 3 * 75 plus 
another 30 to 50 connections for the portlets (a guess at max portlet 
connections), the max calculation is that your DB server would need the 
resources to handle 10 * (3 * 75 + 30) = 2550 DB connections.  As a note, I 
don't think the raw events or the aggregation events DB pools are likely to use 
75 DB connections each, as I think they are threads that run periodically on a 
timer and would use only 1 or 2 DB connections each (barring some software 
fault) even though their pool sizes are a max of 75.  If I'm right, the real 
calculation would be more like 10 * (75 + 2 + 30) = 1070, though the portlets 
are unlikely to max out their connections unless they are all on the main 
landing page or otherwise close together in the page flow.
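
The sizing arithmetic in item 5 can be sketched as a small calculation to rerun with your own server count. The function name and parameters are hypothetical; the numbers (10 servers, pools of 75, a guess of 30 to 50 portlet connections) are the ones used above.

```python
def cluster_max_connections(servers, pool_maxes, portlet_connections):
    """Worst-case connections the DB server must accept from the whole cluster."""
    return servers * (sum(pool_maxes) + portlet_connections)


SERVERS = 10
PORTLETS_LOW = 30   # low end of the guessed 30 to 50 portlet connections per server
PORTLETS_HIGH = 50  # high end of the same guess

# All three pools (app, raw events, aggregation) at their max of 75.
pessimistic = cluster_max_connections(SERVERS, [75, 75, 75], PORTLETS_LOW)
pessimistic_high = cluster_max_connections(SERVERS, [75, 75, 75], PORTLETS_HIGH)

# App pool at 75; raw events and aggregation assumed to use one connection each.
likely = cluster_max_connections(SERVERS, [75, 1, 1], PORTLETS_LOW)

print(pessimistic, pessimistic_high, likely)  # 2550 2750 1070
```

The "likely" figure rests on the unverified assumption that the two event pools really stay at one connection each.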

In light of the above, it is possible that part of what is going on is that 
uPortal is attempting to request a DB connection, but the DB server is maxed 
out and rejects the open.  I'm not sure if that's what is going on, but it is 
worth investigating.

I hope this helps, and please let us know what you find out.

Thanks,

James Wennmacher - Unicon
480.558.2420

On 12/02/2014 08:46 AM, Ryan Melissari wrote:
I am in the process of load testing uPortal and am running out of database 
connections.  I have looked and don't see where I would increase this.  From 
the log file it is set to 75... does anyone know what a good number to increase 
this to would be?  Also, it seems that once it uses all the connections, it 
never releases them.  I have left it overnight and it never gives them back, 
forcing me to restart Tomcat.  Is there a way to set a max wait as well?  Here 
is the error I am getting in the portal.log:

INFO  [uP-TaskExec-7-aggregateRawEvents] o.h.e.i.DefaultLoadEventListener 
2014-12-02 09:27:54,147 - HHH000327: Error performing load command : 
org.hibernate.exception.GenericJDBCException: Could not open connection
ERROR [Timer-5,uPortal.cacheManager,htst2web1-56682] 
o.j.p.jgroups.protocols.DAO_PING 2014-12-02 09:27:56,126 - failed sending 
discovery request
org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC 
Connection; nested exception is 
org.apache.tomcat.jdbc.pool.PoolExhaustedException: 
[Timer-5,uPortal.cacheManager,htst2web1-56682] Timeout: Pool empty. Unable to 
fetch a connection in 5 seconds, none available[size:75; busy:0; idle:0; 
lastwait:5000].
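
When comparing several servers, it may help to pull the pool counters out of messages like the one above. This is a minimal sketch that assumes the status suffix always has the `[size:..; busy:..; idle:..; lastwait:..]` shape seen in this log. The function name is hypothetical, and treating lastwait as milliseconds is inferred from the "5 seconds" / "lastwait:5000" pairing.

```python
import re

# Matches the status suffix of the PoolExhaustedException message shown above.
STATUS_RE = re.compile(
    r"\[size:(\d+);\s*busy:(\d+);\s*idle:(\d+);\s*lastwait:(\d+)\]"
)


def parse_pool_status(message):
    """Return the pool counters from an exception message, or None if absent."""
    # Log lines may be wrapped, so collapse all whitespace runs before matching.
    match = STATUS_RE.search(" ".join(message.split()))
    if match is None:
        return None
    size, busy, idle, lastwait = (int(group) for group in match.groups())
    # Assumption: lastwait is reported in milliseconds.
    return {"size": size, "busy": busy, "idle": idle, "lastwait_ms": lastwait}


status = parse_pool_status(
    "Timeout: Pool empty. Unable to fetch a connection in 5 seconds, "
    "none available[size:75; busy:0; idle:0; \nlastwait:5000]."
)
print(status)
```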



                --

You are currently subscribed to uportal-u...@lists.ja-sig.org as: 
jwennmac...@unicon.net
To unsubscribe, change settings or access archives, see 
http://www.ja-sig.org/wiki/display/JSG/uportal-user




--

You are currently subscribed to uportal-dev@lists.ja-sig.org as: 
tim.lev...@wisc.edu
To unsubscribe, change settings or access archives, see 
http://www.ja-sig.org/wiki/display/JSG/uportal-dev
