Re: [dspace-tech] Re: Notes on PostgreSQL connection pooling with a Tomcat JNDI resource

2018-02-28 Thread Alan Orth
Thanks, Tom!

Those indexes really seem to help on the XMLUI /submissions page. For
example, on my production server I just tried to clear 22 old unfinished
submissions and it took about five minutes to finish. Then, while the
server was processing those operations, I used this query from the
PostgreSQL 9.5 docs[0] to capture some snapshots of the PostgreSQL locks:

SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid =
psa.pid;

I observed a growing number of locks — 500, 800, and eventually 2,000 —
during the process, most of which were on the resourcepolicy and
metadatavalue tables, i.e.:

SELECT * FROM resourcepolicy WHERE resource_type_id= $1 AND resource_id= $2
AND action_id= $3
SELECT * FROM MetadataValue WHERE resource_id= $1 and resource_type_id = $2
ORDER BY metadata_field_id, place

During normal DSpace usage it seems that the number of locks is usually
around 100 or 200.

I applied the indexes from DS-3636[1] on my test server and tried the same
operation and it finished almost instantly. This pull request is a
no-brainer in my opinion; I would recommend reviving the dialog around it on
Jira and GitHub. For reference, I've uploaded one of the logs of 2,000
locks that I recorded during this process[2].
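
(For anyone who wants to test this by hand, the indexes are presumably
along these lines, matching the two queries above; the index names here are
illustrative and the authoritative definitions are in the pull request:)

CREATE INDEX resourcepolicy_type_id_action_idx
  ON resourcepolicy (resource_type_id, resource_id, action_id);
CREATE INDEX metadatavalue_resource_idx
  ON metadatavalue (resource_id, resource_type_id);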

[0] https://www.postgresql.org/docs/9.5/static/view-pg-locks.html
[1] https://github.com/DSpace/DSpace/pull/1791
[2] https://gist.github.com/alanorth/3c4fea2a277abfa4c816d66d9c795100

Regards,

On Tue, Feb 27, 2018 at 5:34 PM Tom Desair  wrote:

> Some other useful configuration options you can use in combination with
> the "removeAbandoned='true'" setting:
>
>- abandonWhenPercentageFull: Only start cleaning up abandoned
>connections if the pool is used for more than X %.
>- jdbcInterceptors="ResetAbandonedTimer": Make sure the "abandoned"
>timer is reset every time there is activity on a connection
>
>
> https://tomcat.apache.org/tomcat-7.0-doc/jdbc-pool.html#Tomcat_JDBC_Enhanced_Attributes
>
> @Alan, can you check if any of these indexes improves the performance of
> that query:
> https://github.com/DSpace/DSpace/pull/1791/files#diff-5244ef3f861b63643134797c3866e16a
> ?
>
>
>
> Tom Desair
> 250-B Suite 3A, Lucius Gordon Drive, West Henrietta, NY 14586
> Gaston Geenslaan 14, Leuven 3001, Belgium
> www.atmire.com
>
> 2018-02-27 14:48 GMT+01:00 Alan Orth :
>
>> Hello,
>>
>> *The JDBC pool parameter removeAbandoned='true' absolutely solves the
>> idle in transaction issue*.
>>
>> However! In the two weeks since I enabled it, I've noticed that we don't
>> have any "idle in transaction" connections in pg_stat_activity anymore—and
>> only a handful of active connections at any given time. The default timeout
>> is to prune "abandoned" connections after 60 seconds, which doesn't seem to
>> be a problem except on the XMLUI submissions page. One user with many
>> (thousands?) of submissions says this page is very slow and experiences
>> "Socket closed" errors eventually. This is presumably PostgreSQL killing
>> his connection. I've tried increasing this to 90 and even 180 seconds, but
>> it doesn't help. What is happening on the /submissions page?
>>
>> Anyways, as an experiment, this morning I turned off the
>> removeAbandoned='true' setting on the JDBC pool and within a few hours
>> DSpace was down and there were 200 "idle in transaction" connections again.
>> *sigh* For what it's worth, pg_stat_activity shows 100 "idle in
>> transaction" connections running the following query:
>>
>> SELECT * FROM resourcepolicy WHERE resource_type_id= $1 AND resource_id=
>> $2 AND action_id= $3
>>
>> I'm hoping for some insight on what this query is, and about the XMLUI
>> submissions page being slow! Again, we're running DSpace 5.5, PostgreSQL
>> 9.5.11, and XMLUI with ~70,000 items.
>>
>> Thanks,
>>
>> On Sat, Feb 10, 2018 at 5:13 PM Alan Orth  wrote:
>>
>>> Tim,
>>>
>>> Any idea why DSpace thinks those connections are busy when PostgreSQL
>>> itself reports them as idle?
>>>
>>> In order to try to understand the traffic that is coming in I wrote a
>>> Munin plugin[0] that uses JMX to monitor the number of active sessions in
>>> DSpace applications (REST doesn't appear to create sessions). From that
>>> it's clear that we sometimes have quite a lot of traffic, with nearly 1,500
>>> active sessions (see attached)! We aggressively add bot user
>>> agents to Tomcat's Crawler Session Manager Valve so all of these bots like
>>> Google, Bing, and Baidu that browse from fifty IP addresses concurrently
>>> get lumped into one JSESSIONID.
>>>
>>> In any case I've adjusted my JDBC pool parameters to include the
>>> parameters removeAbandoned='true' and 

Re: [dspace-tech] Re: Notes on PostgreSQL connection pooling with a Tomcat JNDI resource

2018-02-27 Thread Tom Desair
Some other useful configuration options you can use in combination with the
"removeAbandoned='true'" setting:

   - abandonWhenPercentageFull: Only start cleaning up abandoned
   connections if the pool is used for more than X %.
   - jdbcInterceptors="ResetAbandonedTimer": Make sure the "abandoned"
   timer is reset every time there is activity on a connection

https://tomcat.apache.org/tomcat-7.0-doc/jdbc-pool.html#Tomcat_JDBC_Enhanced_Attributes
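
Put together on a JNDI Resource, those settings might look something like
this (a sketch only; the sizes and timeouts are illustrative, and the
enhanced attributes require the Tomcat JDBC pool factory):

<Resource name="jdbc/dspace" auth="Container" type="javax.sql.DataSource"
          factory="org.apache.tomcat.jdbc.pool.DataSourceFactory"
          driverClassName="org.postgresql.Driver"
          url="jdbc:postgresql://localhost:5432/dspace"
          username="dspace" password="dspace"
          maxActive="50" removeAbandoned="true" removeAbandonedTimeout="60"
          abandonWhenPercentageFull="50"
          jdbcInterceptors="ResetAbandonedTimer"/>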

@Alan, can you check if any of these indexes improves the performance of
that query:
https://github.com/DSpace/DSpace/pull/1791/files#diff-5244ef3f861b63643134797c3866e16a
?



Tom Desair
250-B Suite 3A, Lucius Gordon Drive, West Henrietta, NY 14586
Gaston Geenslaan 14, Leuven 3001, Belgium
www.atmire.com


2018-02-27 14:48 GMT+01:00 Alan Orth :

> Hello,
>
> *The JDBC pool parameter removeAbandoned='true' absolutely solves the idle
> in transaction issue*.
>
> However! In the two weeks since I enabled it, I've noticed that we don't
> have any "idle in transaction" connections in pg_stat_activity anymore—and
> only a handful of active connections at any given time. The default timeout
> is to prune "abandoned" connections after 60 seconds, which doesn't seem to
> be a problem except on the XMLUI submissions page. One user with many
> (thousands?) of submissions says this page is very slow and experiences
> "Socket closed" errors eventually. This is presumably PostgreSQL killing
> his connection. I've tried increasing this to 90 and even 180 seconds, but
> it doesn't help. What is happening on the /submissions page?
>
> Anyways, as an experiment, this morning I turned off the
> removeAbandoned='true' setting on the JDBC pool and within a few hours
> DSpace was down and there were 200 "idle in transaction" connections again.
> *sigh* For what it's worth, pg_stat_activity shows 100 "idle in
> transaction" connections running the following query:
>
> SELECT * FROM resourcepolicy WHERE resource_type_id= $1 AND resource_id=
> $2 AND action_id= $3
>
> I'm hoping for some insight on what this query is, and about the XMLUI
> submissions page being slow! Again, we're running DSpace 5.5, PostgreSQL
> 9.5.11, and XMLUI with ~70,000 items.
>
> Thanks,
>
> On Sat, Feb 10, 2018 at 5:13 PM Alan Orth  wrote:
>
>> Tim,
>>
>> Any idea why DSpace thinks those connections are busy when PostgreSQL
>> itself reports them as idle?
>>
>> In order to try to understand the traffic that is coming in I wrote a
>> Munin plugin[0] that uses JMX to monitor the number of active sessions in
>> DSpace applications (REST doesn't appear to create sessions). From that
>> it's clear that we sometimes have quite a lot of traffic, with nearly 1,500
>> active sessions (see attached)! We aggressively add bot user
>> agents to Tomcat's Crawler Session Manager Valve so all of these bots like
>> Google, Bing, and Baidu that browse from fifty IP addresses concurrently
>> get lumped into one JSESSIONID.
>>
>> In any case I've adjusted my JDBC pool parameters to include the
>> parameters removeAbandoned='true' and testOnReturn='true' in addition to
>> those Mark and I were using earlier in this thread. Things are calm on the
>> site the last few days but let's see what happens during the week. We're in
>> the process of upgrading from 5.5 to 5.8, which will include some database
>> fixes, but I'm definitely looking forward to moving our repository to
>> DSpace 6.x to take advantage of the Hibernate work.
>>
>> [0] https://github.com/ilri/rmg-ansible-public/blob/master/
>> roles/munin/files/jmx_dspace_sessions
>>
>> Cheers,
>>
>> On Thu, Feb 8, 2018 at 7:26 PM Tim Donohue 
>> wrote:
>>
>>> Hi Alan,
>>>
>>> We do occasionally find areas of DSpace that fail to properly close
>>> database connections or otherwise clean up after themselves.  So, it's not
>>> unheard of for database leaks to be found and then fixed.  Obviously
>>> though, if we were aware of a specific leak in XMLUI, we'd get it fixed.
>>>
>>> I do know that significant work at the database layer went into DSpace 6
>>> (with the move to Hibernate)... and I've heard reports (from a few folks)
>>> that these sorts of issues are less frequent in DSpace 6.x.  (As I have
>>> heard of other large sites that said they had to restart DSpace every once
>>> in a while prior to 6.x, but are not seeing the same issues with 6.x.  I've
>>> not seen this behavior myself, but it's very possible that it is
>>> encountered more with highly active and/or larger sites)
>>>
>>> That unfortunately doesn't solve the issues you are seeing though.  Is
>>> your site extremely active (in terms of concurrent users)?  Are you perhaps
>>> having a lot of badly behaving spiders that are hitting your site and
>>> possibly using up database connections?  250 connections in the pool sounds
>>> like plenty, 

Re: [dspace-tech] Re: Notes on PostgreSQL connection pooling with a Tomcat JNDI resource

2018-02-27 Thread Alan Orth
Hello,

*The JDBC pool parameter removeAbandoned='true' absolutely solves the idle
in transaction issue*.

However! In the two weeks since I enabled it, I've noticed that we don't
have any "idle in transaction" connections in pg_stat_activity anymore—and
only a handful of active connections at any given time. The default timeout
is to prune "abandoned" connections after 60 seconds, which doesn't seem to
be a problem except on the XMLUI submissions page. One user with many
(thousands?) of submissions says this page is very slow and experiences
"Socket closed" errors eventually. This is presumably PostgreSQL killing
his connection. I've tried increasing this to 90 and even 180 seconds, but
it doesn't help. What is happening on the /submissions page?

Anyways, as an experiment, this morning I turned off the
removeAbandoned='true' setting on the JDBC pool and within a few hours
DSpace was down and there were 200 "idle in transaction" connections again.
*sigh* For what it's worth, pg_stat_activity shows 100 "idle in
transaction" connections running the following query:

SELECT * FROM resourcepolicy WHERE resource_type_id= $1 AND resource_id= $2
AND action_id= $3

I'm hoping for some insight on what this query is, and about the XMLUI
submissions page being slow! Again, we're running DSpace 5.5, PostgreSQL
9.5.11, and XMLUI with ~70,000 items.

Thanks,

On Sat, Feb 10, 2018 at 5:13 PM Alan Orth  wrote:

> Tim,
>
> Any idea why DSpace thinks those connections are busy when PostgreSQL
> itself reports them as idle?
>
> In order to try to understand the traffic that is coming in I wrote a
> Munin plugin[0] that uses JMX to monitor the number of active sessions in
> DSpace applications (REST doesn't appear to create sessions). From that
> it's clear that we sometimes have quite a lot of traffic, with nearly 1,500
> active sessions (see attached)! We aggressively add bot user
> agents to Tomcat's Crawler Session Manager Valve so all of these bots like
> Google, Bing, and Baidu that browse from fifty IP addresses concurrently
> get lumped into one JSESSIONID.
>
> In any case I've adjusted my JDBC pool parameters to include the
> parameters removeAbandoned='true' and testOnReturn='true' in addition to
> those Mark and I were using earlier in this thread. Things are calm on the
> site the last few days but let's see what happens during the week. We're in
> the process of upgrading from 5.5 to 5.8, which will include some database
> fixes, but I'm definitely looking forward to moving our repository to
> DSpace 6.x to take advantage of the Hibernate work.
>
> [0]
> https://github.com/ilri/rmg-ansible-public/blob/master/roles/munin/files/jmx_dspace_sessions
>
> Cheers,
>
> On Thu, Feb 8, 2018 at 7:26 PM Tim Donohue  wrote:
>
>> Hi Alan,
>>
>> We do occasionally find areas of DSpace that fail to properly close
>> database connections or otherwise clean up after themselves.  So, it's not
>> unheard of for database leaks to be found and then fixed.  Obviously
>> though, if we were aware of a specific leak in XMLUI, we'd get it fixed.
>>
>> I do know that significant work at the database layer went into DSpace 6
>> (with the move to Hibernate)... and I've heard reports (from a few folks)
>> that these sorts of issues are less frequent in DSpace 6.x.  (As I have
>> heard of other large sites that said they had to restart DSpace every once
>> in a while prior to 6.x, but are not seeing the same issues with 6.x.  I've
>> not seen this behavior myself, but it's very possible that it is
>> encountered more with highly active and/or larger sites)
>>
>> That unfortunately doesn't solve the issues you are seeing though.  Is
>> your site extremely active (in terms of concurrent users)?  Are you perhaps
>> having a lot of badly behaving spiders that are hitting your site and
>> possibly using up database connections?  250 connections in the pool sounds
>> like plenty, but it obviously may be dependent on how many concurrent hits
>> you are seeing.
>>
>> In searching around, I also came across some older (circa 2010) notes on
>> our wiki at
>> https://wiki.duraspace.org/display/DSPACE/Idle+In+Transaction+Problem
>>  It doesn't provide any exact solutions, but it shows that you are
>> not alone (I'm not sure I'd recommend killing "idle in transaction"
>> processes though, as that wiki page suggests).  Also here's more info from
>> Postgres on what "idle in transaction" means:
>> https://www.postgresql.org/docs/9.2/static/monitoring-ps.html (it also
>> suggests that looking at pg_locks system view might provide info on what
>> transactions are still open, etc).
>>
>> So, I don't have any answers here, but maybe these will provide you with
>> clues or inspiration to dig in a bit further.  Please do let us know what
>> you find, and feel free to post back with any clues or breadcrumbs you
>> discover along the way.
>>
>> - Tim
>>
>> On Wed, Feb 7, 2018 at 5:07 PM Alan Orth 

Re: [dspace-tech] Re: Notes on PostgreSQL connection pooling with a Tomcat JNDI resource

2018-02-10 Thread Alan Orth
Tim,

Any idea why DSpace thinks those connections are busy when PostgreSQL
itself reports them as idle?

In order to try to understand the traffic that is coming in I wrote a Munin
plugin[0] that uses JMX to monitor the number of active sessions in DSpace
applications (REST doesn't appear to create sessions). From that it's clear
that we sometimes have quite a lot of traffic, with nearly 1,500 active
sessions (see attached)! We aggressively add bot user agents to
Tomcat's Crawler Session Manager Valve so all of these bots like Google,
Bing, and Baidu that browse from fifty IP addresses concurrently get lumped
into one JSESSIONID.

In any case I've adjusted my JDBC pool parameters to include the parameters
removeAbandoned='true' and testOnReturn='true' in addition to those Mark
and I were using earlier in this thread. Things are calm on the site the
last few days but let's see what happens during the week. We're in the
process of upgrading from 5.5 to 5.8, which will include some database
fixes, but I'm definitely looking forward to moving our repository to
DSpace 6.x to take advantage of the Hibernate work.

[0]
https://github.com/ilri/rmg-ansible-public/blob/master/roles/munin/files/jmx_dspace_sessions

Cheers,

On Thu, Feb 8, 2018 at 7:26 PM Tim Donohue  wrote:

> Hi Alan,
>
> We do occasionally find areas of DSpace that fail to properly close
> database connections or otherwise clean up after themselves.  So, it's not
> unheard of for database leaks to be found and then fixed.  Obviously
> though, if we were aware of a specific leak in XMLUI, we'd get it fixed.
>
> I do know that significant work at the database layer went into DSpace 6
> (with the move to Hibernate)... and I've heard reports (from a few folks)
> that these sorts of issues are less frequent in DSpace 6.x.  (As I have
> heard of other large sites that said they had to restart DSpace every once
> in a while prior to 6.x, but are not seeing the same issues with 6.x.  I've
> not seen this behavior myself, but it's very possible that it is
> encountered more with highly active and/or larger sites)
>
> That unfortunately doesn't solve the issues you are seeing though.  Is
> your site extremely active (in terms of concurrent users)?  Are you perhaps
> having a lot of badly behaving spiders that are hitting your site and
> possibly using up database connections?  250 connections in the pool sounds
> like plenty, but it obviously may be dependent on how many concurrent hits
> you are seeing.
>
> In searching around, I also came across some older (circa 2010) notes on
> our wiki at
> https://wiki.duraspace.org/display/DSPACE/Idle+In+Transaction+Problem
>  It doesn't provide any exact solutions, but it shows that you are
> not alone (I'm not sure I'd recommend killing "idle in transaction"
> processes though, as that wiki page suggests).  Also here's more info from
> Postgres on what "idle in transaction" means:
> https://www.postgresql.org/docs/9.2/static/monitoring-ps.html (it also
> suggests that looking at pg_locks system view might provide info on what
> transactions are still open, etc).
>
> So, I don't have any answers here, but maybe these will provide you with
> clues or inspiration to dig in a bit further.  Please do let us know what
> you find, and feel free to post back with any clues or breadcrumbs you
> discover along the way.
>
> - Tim
>
> On Wed, Feb 7, 2018 at 5:07 PM Alan Orth  wrote:
>
>> Is there some kind of database transaction leaking issue with DSpace,
>> particularly the XMLUI? Basically, I find that it doesn't matter what my
>> pool size is. Eventually it always gets full. Sometimes it takes a few
>> days, sometimes it happens a few times in one day. Today I saw this in
>> dspace.log.2018-02-07:
>>
>> org.apache.tomcat.jdbc.pool.PoolExhaustedException:
>> [http-bio-127.0.0.1-8443-exec-328] Timeout: Pool empty. Unable to fetch a
>> connection in 5 seconds, none available[size:250; busy:250; idle:0;
>> lastwait:5000].
>>
>> The pool size is 250, and yet PostgreSQL activity shows literally all of
>> the 250 connections as either idle or idle in transaction:
>>
>> $ psql -c 'select * from pg_stat_activity' | grep -c "PostgreSQL JDBC"
>> 250
>> $ psql -c 'select * from pg_stat_activity' | grep "PostgreSQL JDBC" |
>> grep -c idle
>> 250
>> $ psql -c 'select * from pg_stat_activity' | grep "PostgreSQL JDBC" |
>> grep -c "idle in transaction"
>> 187
>>
>> What is going on? I'm going crazy restarting the server and staring at
>> log files trying to figure out what is going on. We're using DSpace 5.5
>> with PostgreSQL 9.5.10 and the latest PostgreSQL JDBC driver from
>> jdbc.postgresql.org (42.2.1) on fast hardware with plenty of memory and
>> CPU. But DSpace. Always. Crashes. Every. Day. It seems DS-3551 might help,
>> but we can't move to DSpace 5.7 yet, and our 5.5 branch doesn't build if I
>> cherry-pick the commits for DS-3551[0].
>>
>> [0] 

Re: [dspace-tech] Re: Notes on PostgreSQL connection pooling with a Tomcat JNDI resource

2018-02-08 Thread Tim Donohue
Hi Alan,

We do occasionally find areas of DSpace that fail to properly close
database connections or otherwise clean up after themselves.  So, it's not
unheard of for database leaks to be found and then fixed.  Obviously
though, if we were aware of a specific leak in XMLUI, we'd get it fixed.

I do know that significant work at the database layer went into DSpace 6
(with the move to Hibernate)... and I've heard reports (from a few folks)
that these sorts of issues are less frequent in DSpace 6.x.  (As I have
heard of other large sites that said they had to restart DSpace every once
in a while prior to 6.x, but are not seeing the same issues with 6.x.  I've
not seen this behavior myself, but it's very possible that it is
encountered more with highly active and/or larger sites)

That unfortunately doesn't solve the issues you are seeing though.  Is your
site extremely active (in terms of concurrent users)?  Are you perhaps
having a lot of badly behaving spiders that are hitting your site and
possibly using up database connections?  250 connections in the pool sounds
like plenty, but it obviously may be dependent on how many concurrent hits
you are seeing.

In searching around, I also came across some older (circa 2010) notes on
our wiki at
https://wiki.duraspace.org/display/DSPACE/Idle+In+Transaction+Problem   It
doesn't provide any exact solutions, but it shows that you are not
alone (I'm not sure I'd recommend killing "idle in transaction" processes
though, as that wiki page suggests).  Also here's more info from Postgres
on what "idle in transaction" means:
https://www.postgresql.org/docs/9.2/static/monitoring-ps.html (it also
suggests that looking at pg_locks system view might provide info on what
transactions are still open, etc).
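
For example (assuming PostgreSQL 9.2 or newer, where pg_stat_activity has a
state column), something like this should show which sessions are idle in a
transaction and what they last executed:

SELECT pl.pid, psa.state, psa.query
FROM pg_locks pl JOIN pg_stat_activity psa ON pl.pid = psa.pid
WHERE psa.state = 'idle in transaction';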

So, I don't have any answers here, but maybe these will provide you with
clues or inspiration to dig in a bit further.  Please do let us know what
you find, and feel free to post back with any clues or breadcrumbs you
discover along the way.

- Tim

On Wed, Feb 7, 2018 at 5:07 PM Alan Orth  wrote:

> Is there some kind of database transaction leaking issue with DSpace,
> particularly the XMLUI? Basically, I find that it doesn't matter what my
> pool size is. Eventually it always gets full. Sometimes it takes a few
> days, sometimes it happens a few times in one day. Today I saw this in
> dspace.log.2018-02-07:
>
> org.apache.tomcat.jdbc.pool.PoolExhaustedException:
> [http-bio-127.0.0.1-8443-exec-328] Timeout: Pool empty. Unable to fetch a
> connection in 5 seconds, none available[size:250; busy:250; idle:0;
> lastwait:5000].
>
> The pool size is 250, and yet PostgreSQL activity shows literally all of
> the 250 connections as either idle or idle in transaction:
>
> $ psql -c 'select * from pg_stat_activity' | grep -c "PostgreSQL JDBC"
> 250
> $ psql -c 'select * from pg_stat_activity' | grep "PostgreSQL JDBC" | grep
> -c idle
> 250
> $ psql -c 'select * from pg_stat_activity' | grep "PostgreSQL JDBC" | grep
> -c "idle in transaction"
> 187
>
> What is going on? I'm going crazy restarting the server and staring at log
> files trying to figure out what is going on. We're using DSpace 5.5 with
> PostgreSQL 9.5.10 and the latest PostgreSQL JDBC driver from
> jdbc.postgresql.org (42.2.1) on fast hardware with plenty of memory and
> CPU. But DSpace. Always. Crashes. Every. Day. It seems DS-3551 might help,
> but we can't move to DSpace 5.7 yet, and our 5.5 branch doesn't build if I
> cherry-pick the commits for DS-3551[0].
>
> [0] https://jira.duraspace.org/browse/DS-3551
>
> Let's get to the bottom of this,
>
> On Thu, Feb 1, 2018 at 3:43 PM Mark H. Wood  wrote:
>
>> If you have stuck connections, you may want to try some of the pool
>> provider's connection testing parameters.  I've had good results using
>> 'validationQuery="SELECT 1"' with 'testOnBorrow="true"'.
>>
>> Also I've found that older PostgreSQL drivers seem to see this problem
>> more frequently, so I try to keep Tomcat updated with the latest released
>> DBMS drivers.
>>
> --
>
> Alan Orth
> alan.o...@gmail.com
>
>
> https://picturingjordan.com
> https://englishbulgaria.net
> https://mjanja.ch
>

Re: [dspace-tech] Re: Notes on PostgreSQL connection pooling with a Tomcat JNDI resource

2018-02-07 Thread Alan Orth
Is there some kind of database transaction leaking issue with DSpace,
particularly the XMLUI? Basically, I find that it doesn't matter what my
pool size is. Eventually it always gets full. Sometimes it takes a few
days, sometimes it happens a few times in one day. Today I saw this in
dspace.log.2018-02-07:

org.apache.tomcat.jdbc.pool.PoolExhaustedException:
[http-bio-127.0.0.1-8443-exec-328] Timeout: Pool empty. Unable to fetch a
connection in 5 seconds, none available[size:250; busy:250; idle:0;
lastwait:5000].

The pool size is 250, and yet PostgreSQL activity shows literally all of
the 250 connections as either idle or idle in transaction:

$ psql -c 'select * from pg_stat_activity' | grep -c "PostgreSQL JDBC"
250
$ psql -c 'select * from pg_stat_activity' | grep "PostgreSQL JDBC" | grep
-c idle
250
$ psql -c 'select * from pg_stat_activity' | grep "PostgreSQL JDBC" | grep
-c "idle in transaction"
187
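
The same breakdown is available in one query (a sketch, relying on the JDBC
driver reporting "PostgreSQL JDBC Driver" as its application_name):

$ psql -c "SELECT state, count(*) FROM pg_stat_activity WHERE application_name LIKE 'PostgreSQL JDBC%' GROUP BY state ORDER BY 2 DESC"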

What is going on? I'm going crazy restarting the server and staring at log
files trying to figure out what is going on. We're using DSpace 5.5 with
PostgreSQL 9.5.10 and the latest PostgreSQL JDBC driver from
jdbc.postgresql.org (42.2.1) on fast hardware with plenty of memory and
CPU. But DSpace. Always. Crashes. Every. Day. It seems DS-3551 might help,
but we can't move to DSpace 5.7 yet, and our 5.5 branch doesn't build if I
cherry-pick the commits for DS-3551[0].

[0] https://jira.duraspace.org/browse/DS-3551

Let's get to the bottom of this,

On Thu, Feb 1, 2018 at 3:43 PM Mark H. Wood  wrote:

> If you have stuck connections, you may want to try some of the pool
> provider's connection testing parameters.  I've had good results using
> 'validationQuery="SELECT 1"' with 'testOnBorrow="true"'.
>
> Also I've found that older PostgreSQL drivers seem to see this problem
> more frequently, so I try to keep Tomcat updated with the latest released
> DBMS drivers.
>
-- 

Alan Orth
alan.o...@gmail.com
https://picturingjordan.com
https://englishbulgaria.net
https://mjanja.ch



Re: [dspace-tech] Re: Notes on PostgreSQL connection pooling with a Tomcat JNDI resource

2018-02-01 Thread Mark H. Wood
If you have stuck connections, you may want to try some of the pool 
provider's connection testing parameters.  I've had good results using 
'validationQuery="SELECT 1"' with 'testOnBorrow="true"'.
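
On the JNDI Resource those might look like the following sketch (the other
attributes are whatever you already use):

<Resource name="jdbc/dspace" type="javax.sql.DataSource" ...
          validationQuery="SELECT 1" testOnBorrow="true"/>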

Also I've found that older PostgreSQL drivers seem to see this problem more 
frequently, so I try to keep Tomcat updated with the latest released DBMS 
drivers.



Re: [dspace-tech] Re: Notes on PostgreSQL connection pooling with a Tomcat JNDI resource

2018-01-30 Thread Alan Orth
You're welcome, Hayden. Glad it helped.

Regarding your performance issues, you need to try to understand where the
load is coming from. You might just have legitimate traffic, either from
users or search bots. Start by looking at your web server logs. We use
nginx in front of Tomcat, so I started logging requests to XMLUI, OAI, and
REST separately so I could see exactly what was happening in each. You'll
get much more insight once you start to understand the nature of your
traffic.

For example, Google and Bing respect robots.txt, so if you enable the cron
job for nightly sitemap generation you can submit your sitemap to
their webmaster tools consoles and then alter your robots.txt so that
dynamic search pages like Discovery and Browse are forbidden. This is a big
help because those pages are computationally heavy. See this wiki page for
how to enable sitemaps:

https://wiki.duraspace.org/display/DSDOC5x/Search+Engine+Optimization
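
The relevant robots.txt entries might look something like this (a sketch;
adjust the paths and the sitemap URL for your own site):

Sitemap: https://example.org/sitemap

User-agent: *
Disallow: /discover
Disallow: /search-filter
Disallow: /browse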

On that note, from looking at my logs I saw that Baidu doesn't respect
robots.txt and just crawls all the dynamic pages. I used nginx to identify
Baidu by its user agent and throttle its requests using nginx's limit_req
module. I wrote about it on my blog a few months ago:

https://mjanja.ch/2017/11/rate-limiting-baiduspider-using-nginx/
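
The general shape of that configuration is (a sketch; the zone size and
rate are illustrative, and requests whose mapped key is empty are not rate
limited at all):

# in the http {} block: key on the client IP only for Baidu's user agent
map $http_user_agent $baidu_limit_key {
    default       "";
    ~*Baiduspider $binary_remote_addr;
}
limit_req_zone $baidu_limit_key zone=baidu:10m rate=1r/s;

# in the relevant server {} or location {} block:
limit_req zone=baidu burst=5;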

Another important tweak along those lines is Tomcat's Crawler Session
Manager valve. Basically, Tomcat assigns all users a new session
(JSESSIONID) when they connect. Bots are users too, so they get one as
well. The problem is that search bots like Google, Bing, and Baidu often
crawl from fifty (50) IP addresses concurrently, and each one of those gets
a session, each of which takes up precious CPU time, memory, and database
resources. If you enable the Crawler Session Manager valve Tomcat will
force all matching user agents to use one session. See the Tomcat docs or
look at our server.xml for more information:

https://github.com/ilri/rmg-ansible-public/blob/master/roles/dspace/templates/tomcat/server-tomcat7.xml.j2
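
In server.xml it's a one-line Valve inside the Engine or Host element; the
crawlerUserAgents pattern below is Tomcat's default with Baiduspider
appended as an illustration:

<Valve className="org.apache.catalina.valves.CrawlerSessionManagerValve"
       crawlerUserAgents=".*[bB]ot.*|.*Yahoo! Slurp.*|.*Feedfetcher-Google.*|.*Baiduspider.*"/>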

Sometimes I will get an alert that our server's CPU load is very high so
I'll go look at the logs and do some grep and awk magic to figure out the
IP addresses for the day's top ten users. Often it's illuminating. For
example, one day in December some client in China downloaded 20,000 PDFs
from us. Another time some open access bot started harvesting us and made
400,000 requests in a few hours. I looked at the logs, found the contact
information in their user agent string, and contacted their developers.
They acknowledged that their bot had gone awry due to a bug and fixed it.
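
(The "magic" is roughly the classic one-liner; the log path will vary:)

$ awk '{print $1}' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head -n10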

Now our repository has been using a pool with a maximum of 300 connections
for a month or so and I haven't had DSpace crash due to database issues
since then. We still have high load every few days and users complain that
the site is "down", but when I look at PostgreSQL activity I see we have
211 active database connections—so there's a problem but it's not the
database! This is when I started to look elsewhere in the application stack
to alleviate this issue. For example, Tomcat's default maxThreads is 200,
so it's likely that this is the bottleneck now. I've recently bumped it
and its companion processorCache up to 400 and I'm monitoring the results.
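
On the HTTP Connector in server.xml that change might look like this sketch
(the port, protocol, and other attributes are whatever you already use):

<Connector port="8080" protocol="HTTP/1.1"
           maxThreads="400" processorCache="400" ... />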

So start by looking at the logs! They are very interesting and
enlightening. :)

Cheers,

On Tue, Jan 30, 2018 at 2:11 PM Hayden Young 
wrote:

> Hi Alan
>
> Thanks for the in-depth analysis and possible remedies. I implemented the
> JNDI connections as you had outlined and tested that it did indeed use the
> Tomcat settings and not the DSpace configuration. As soon as I redeployed
> our DSpace it certainly felt as if there was an improvement in performance.
> A couple of changes were required to the JNDI Tomcat configuration in
> server.xml (probably related to running Tomcat 8), which were changing
> maxWait to maxWaitMillis and maxActive to maxTotal.
>
> Unfortunately, we were soon back to where we started with
> org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for
> connection from pool errors when browsing bitstreams. We also noticed the
> problem arose quicker than it has in the past; usually it is 1 to 2 weeks
> before we see this problem; this time it has happened within 3 days. The
> site does not experience much traffic and a check of the Postgres idle
> connections shows only 8, which seems well below what would cause the db
> error.
>
> Are you able to outline possible steps for analyzing the problem from
> DSpace's end?
>
> Our other solution is to put PgBouncer between the database and DSpace,
> although my feeling is that this is only a temporary solution.
>
> Thanks
>
>
> Hayden Young
> https://www.knowledgearc.com
>
>
> On Wednesday, January 17, 2018 at 1:36:16 AM UTC+8, Alan Orth wrote:
>
>> Thanks, Hardy. I've created a new Jira issue[0], updated my git commit
>> message with the DS issue number, and updated my pull request[1] title and
>> 

Re: [dspace-tech] Re: Notes on PostgreSQL connection pooling with a Tomcat JNDI resource

2018-01-30 Thread Hayden Young
Hi Alan

Thanks for the in-depth analysis and possible remedies. I implemented the 
JNDI connections as you had outlined and tested that it did indeed use the
Tomcat settings and not the DSpace configuration. As soon as I redeployed 
our DSpace it certainly felt as if there was an improvement in performance. 
A couple of changes were required to the JNDI Tomcat configuration in
server.xml (probably related to running Tomcat 8), which were changing
maxWait to maxWaitMillis and maxActive to maxTotal.
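
For reference, on Tomcat 8's default Commons DBCP 2 pool the renamed
attributes look like this sketch (the values are illustrative):

<Resource name="jdbc/dspace" auth="Container" type="javax.sql.DataSource"
          driverClassName="org.postgresql.Driver"
          url="jdbc:postgresql://localhost:5432/dspace"
          username="dspace" password="dspace"
          maxTotal="50" maxWaitMillis="5000"/>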

Unfortunately, we were soon back to where we started with 
org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for 
connection from pool errors when browsing bitstreams. We also noticed the 
problem arose quicker than it has in the past; usually it is 1 to 2 weeks 
before we see this problem; this time it has happened within 3 days. The 
site does not experience much traffic and a check of the Postgres idle 
connections shows only 8, which seems well below what would cause the db
error.

Are you able to outline possible steps for analyzing the problem from
DSpace's end?

Our other solution is to put PgBouncer between the database and DSpace,
although my feeling is that this is only a temporary solution.

Thanks


Hayden Young
https://www.knowledgearc.com

On Wednesday, January 17, 2018 at 1:36:16 AM UTC+8, Alan Orth wrote:
>
> Thanks, Hardy. I've created a new Jira issue[0], updated my git commit 
> message with the DS issue number, and updated my pull request[1] title and 
> description to include the fact that this is slightly related to DS-3434. I 
> think that's good to go now.
>
> [0] https://jira.duraspace.org/browse/DS-3803
> [1] https://github.com/DSpace/DSpace/pull/1917
>
> Cheers,
>
> On Tue, Jan 16, 2018 at 5:41 PM Hardy Pottinger  > wrote:
>
>> Hi, Alan, yes, we require a matching Jira issue for each pull request. 
>> When you create a new issue, please mention DS-3434 as being related. 
>> Thanks!
>>
>> --Hardy
>>
>> On Sat, Jan 13, 2018 at 10:00 PM, Alan Orth > > wrote:
>>
>>> I've been testing DSpace 6.2 with Tomcat 8.5.24 and noticed that DSpace 
>>> does not start if you supply a database pool from JNDI(!). It seems to be a 
>>> known issue, as Mark Wood created a Jira ticket for it[0]. Also, DSpace 6.2 
>>> still has the db.jndi setting in dspace.cfg, even though this setting is no 
>>> longer user customizable. I've made a pull request[1] against the dspace-6_x 
>>> branch to remove it, but I did not create a Jira ticket. Is it required?
>>>
>>> [0] https://jira.duraspace.org/browse/DS-3434
>>> [1] https://github.com/DSpace/DSpace/pull/1917
>>>
>>> Regards,
>>>
>>> On Thu, Jan 11, 2018 at 9:37 AM Alan Orth >> > wrote:
>>>
 To continue the discussion on a slightly related note: I've just 
 finished dealing with the fallout caused by some new bot — the only 
 fingerprint of which is its unique-but-normal-looking user agent — hitting 
 our XMLUI with 450,000 requests from six different IPs over just a few 
 hours. This generated a ridiculous amount of load on the server, including 
 160 PostgreSQL connections and 52,000 Tomcat sessions before I was able to 
 mitigate it. Surprisingly, since I had increased our pool size to 300 
 after 
 my last message, we never got pool timeout or database connection errors 
 in 
 dspace.log, but the site was very unresponsive — and this is on a beefy 
 server with SSDs, plenty of RAM, large PostgreSQL buffer cache, etc! I 
 ended up having to rate limit this user agent in our frontend nginx web 
 server using the limit_req_zone module[0].

 So a bit of a mixed success and frustration here. No amount of pool 
 tweaking will fix this type of issue, because there's always another 
 bigger, stupider bot that comes along eventually and doesn't match the 
 "bot" user agent. I will definitely look into implementing separate pools 
 as Tom had suggested, though, to limit the damage caused by high load to 
 certain DSpace web applications. Keep sharing your experiences! This is 
 very valuable and interesting to me.

 [0] 
 https://github.com/ilri/rmg-ansible-public/commit/368faaa99028c8e0c8a99de3f6c253a228d5f63b

 Cheers!

 On Thu, Jan 4, 2018 at 7:31 AM Alan Orth  wrote:

> That's a cool idea to use a separate pool for each web application, 
> Tom! I'd much rather have my OAI fail to establish a database connection 
> than my XMLUI. ;)
>
> Since I wrote the original mailing list message two weeks ago I've had 
> DSpace fail to establish a database connection a few thousand times and 
> I've increased my pool's max active from 50 to 75 and then 125 — our site 
> gets about four million hits per month (from looking at nginx logs), so 
> I'm 
> still trying to find the "sweet spot" for the pool settings. 

Re: [dspace-tech] Re: Notes on PostgreSQL connection pooling with a Tomcat JNDI resource

2018-01-16 Thread Alan Orth
Thanks, Hardy. I've created a new Jira issue[0], updated my git commit
message with the DS issue number, and updated my pull request[1] title and
description to include the fact that this is slightly related to DS-3434. I
think that's good to go now.

[0] https://jira.duraspace.org/browse/DS-3803
[1] https://github.com/DSpace/DSpace/pull/1917

Cheers,

On Tue, Jan 16, 2018 at 5:41 PM Hardy Pottinger 
wrote:

> Hi, Alan, yes, we require a matching Jira issue for each pull request.
> When you create a new issue, please mention DS-3434 as being related.
> Thanks!
>
> --Hardy
>
> On Sat, Jan 13, 2018 at 10:00 PM, Alan Orth  wrote:
>
>> I've been testing DSpace 6.2 with Tomcat 8.5.24 and noticed that DSpace
>> does not start if you supply a database pool from JNDI(!). It seems to be a
>> known issue, as Mark Wood created a Jira ticket for it[0]. Also, DSpace 6.2
>> still has the db.jndi setting in dspace.cfg, even though this setting is no
>> longer user customizable. I've made a pull request[1] against the dspace-6_x
>> branch to remove it, but I did not create a Jira ticket. Is it required?
>>
>> [0] https://jira.duraspace.org/browse/DS-3434
>> [1] https://github.com/DSpace/DSpace/pull/1917
>>
>> Regards,
>>
>> On Thu, Jan 11, 2018 at 9:37 AM Alan Orth  wrote:
>>
>>> To continue the discussion on a slightly related note: I've just
>>> finished dealing with the fallout caused by some new bot — the only
>>> fingerprint of which is its unique-but-normal-looking user agent — hitting
>>> our XMLUI with 450,000 requests from six different IPs over just a few
>>> hours. This generated a ridiculous amount of load on the server, including
>>> 160 PostgreSQL connections and 52,000 Tomcat sessions before I was able to
>>> mitigate it. Surprisingly, since I had increased our pool size to 300 after
>>> my last message, we never got pool timeout or database connection errors in
>>> dspace.log, but the site was very unresponsive — and this is on a beefy
>>> server with SSDs, plenty of RAM, large PostgreSQL buffer cache, etc! I
>>> ended up having to rate limit this user agent in our frontend nginx web
>>> server using the limit_req_zone module[0].
>>>
>>> So a bit of a mixed success and frustration here. No amount of pool
>>> tweaking will fix this type of issue, because there's always another
>>> bigger, stupider bot that comes along eventually and doesn't match the
>>> "bot" user agent. I will definitely look into implementing separate pools
>>> as Tom had suggested, though, to limit the damage caused by high load to
>>> certain DSpace web applications. Keep sharing your experiences! This is
>>> very valuable and interesting to me.
>>>
>>> [0]
>>> https://github.com/ilri/rmg-ansible-public/commit/368faaa99028c8e0c8a99de3f6c253a228d5f63b
>>>
>>> Cheers!
>>>
>>> On Thu, Jan 4, 2018 at 7:31 AM Alan Orth  wrote:
>>>
 That's a cool idea to use a separate pool for each web application,
 Tom! I'd much rather have my OAI fail to establish a database connection
 than my XMLUI. ;)

 Since I wrote the original mailing list message two weeks ago I've had
 DSpace fail to establish a database connection a few thousand times and
 I've increased my pool's max active from 50 to 75 and then 125 — our site
 gets about four million hits per month (from looking at nginx logs), so I'm
 still trying to find the "sweet spot" for the pool settings. Anything's
 better than setting the pool in dspace.cfg, though.

 I wish other people would share their pool settings and experiences.

 On Wed, Jan 3, 2018 at 2:40 PM Hardy Pottinger <
 hardy.pottin...@gmail.com> wrote:

> Hi, please do create this wiki page, I'd love to read it. Thanks!
>
> --Hardy
>
> On Wed, Jan 3, 2018 at 4:10 PM, Tom Desair 
> wrote:
>
>> I just wanted to add a small note that having 1 single DB pool for
>> all Tomcat webapps can lead (and has led) to problems. Your current pool size
>> is 50. This means that if you have (malicious) crawlers hitting your OAI
>> endpoint, this can deplete the database connections available for
>> the web UI (XMLUI or JSPUI). The other way around can also happen.
>>
>> But using JNDI DB pools also gives you more fine-grained control over
>> the connection distribution over the different web apps. For example, a
>> default PostgreSQL installation comes with a max connection limit of 100.
>> This means you can safely use around 70 connections (from experience). You
>> can then divide these connections with individual JNDI pools like this:
>>
>>- OAI: 15 connections
>>- REST: 15 connections
>>- WEB UI: 40 connections
>>
>>
>> Let me know if you've created a JNDI DB pool wiki page. I'll then try
>> to add some useful information on JDBC 

Re: [dspace-tech] Re: Notes on PostgreSQL connection pooling with a Tomcat JNDI resource

2018-01-16 Thread Hardy Pottinger
Hi, Alan, yes, we require a matching Jira issue for each pull request. When
you create a new issue, please mention DS-3434 as being related. Thanks!

--Hardy

On Sat, Jan 13, 2018 at 10:00 PM, Alan Orth  wrote:

> I've been testing DSpace 6.2 with Tomcat 8.5.24 and noticed that DSpace
> does not start if you supply a database pool from JNDI(!). It seems to be a
> known issue, as Mark Wood created a Jira ticket for it[0]. Also, DSpace 6.2
> still has the db.jndi setting in dspace.cfg, even though this setting is no
> longer user customizable. I've made a pull request[1] against the dspace-6_x
> branch to remove it, but I did not create a Jira ticket. Is it required?
>
> [0] https://jira.duraspace.org/browse/DS-3434
> [1] https://github.com/DSpace/DSpace/pull/1917
>
> Regards,
>
> On Thu, Jan 11, 2018 at 9:37 AM Alan Orth  wrote:
>
>> To continue the discussion on a slightly related note: I've just finished
>> dealing with the fallout caused by some new bot — the only fingerprint of
>> which is its unique-but-normal-looking user agent — hitting our XMLUI with
>> 450,000 requests from six different IPs over just a few hours. This
>> generated a ridiculous amount of load on the server, including 160
>> PostgreSQL connections and 52,000 Tomcat sessions before I was able to
>> mitigate it. Surprisingly, since I had increased our pool size to 300 after
>> my last message, we never got pool timeout or database connection errors in
>> dspace.log, but the site was very unresponsive — and this is on a beefy
>> server with SSDs, plenty of RAM, large PostgreSQL buffer cache, etc! I
>> ended up having to rate limit this user agent in our frontend nginx web
>> server using the limit_req_zone module[0].
>>
>> So a bit of a mixed success and frustration here. No amount of pool
>> tweaking will fix this type of issue, because there's always another
>> bigger, stupider bot that comes along eventually and doesn't match the
>> "bot" user agent. I will definitely look into implementing separate pools
>> as Tom had suggested, though, to limit the damage caused by high load to
>> certain DSpace web applications. Keep sharing your experiences! This is
>> very valuable and interesting to me.
>>
>> [0] https://github.com/ilri/rmg-ansible-public/commit/
>> 368faaa99028c8e0c8a99de3f6c253a228d5f63b
>>
>> Cheers!
>>
>> On Thu, Jan 4, 2018 at 7:31 AM Alan Orth  wrote:
>>
>>> That's a cool idea to use a separate pool for each web application, Tom!
>>> I'd much rather have my OAI fail to establish a database connection than my
>>> XMLUI. ;)
>>>
>>> Since I wrote the original mailing list message two weeks ago I've had
>>> DSpace fail to establish a database connection a few thousand times and
>>> I've increased my pool's max active from 50 to 75 and then 125 — our site
>>> gets about four million hits per month (from looking at nginx logs), so I'm
>>> still trying to find the "sweet spot" for the pool settings. Anything's
>>> better than setting the pool in dspace.cfg, though.
>>>
>>> I wish other people would share their pool settings and experiences.
>>>
>>> On Wed, Jan 3, 2018 at 2:40 PM Hardy Pottinger <
>>> hardy.pottin...@gmail.com> wrote:
>>>
 Hi, please do create this wiki page, I'd love to read it. Thanks!

 --Hardy

 On Wed, Jan 3, 2018 at 4:10 PM, Tom Desair 
 wrote:

> I just wanted to add a small note that having 1 single DB pool for all
> Tomcat webapps can lead (and has led) to problems. Your current pool size is
> 50. This means that if you have (malicious) crawlers hitting your OAI
> endpoint, this can deplete the database connections available for
> the web UI (XMLUI or JSPUI). The other way around can also happen.
>
> But using JNDI DB pools also gives you more fine-grained control over
> the connection distribution over the different web apps. For example, a
> default PostgreSQL installation comes with a max connection limit of 100.
> This means you can safely use around 70 connections (from experience). You
> can then divide these connections with individual JNDI pools like this:
>
>- OAI: 15 connections
>- REST: 15 connections
>- WEB UI: 40 connections
>
>
> Let me know if you've created a JNDI DB pool wiki page. I'll then try
> to add some useful information on JDBC interceptors (
> https://tomcat.apache.org/tomcat-7.0-doc/jdbc-pool.html#
> Configuring_JDBC_interceptors).
>
>
> Tom Desair
> 250-B Suite 3A, Lucius Gordon Drive, West Henrietta, NY 14586
> Gaston Geenslaan 14, Leuven 3001, Belgium
> www.atmire.com

Re: [dspace-tech] Re: Notes on PostgreSQL connection pooling with a Tomcat JNDI resource

2018-01-13 Thread Alan Orth
I've been testing DSpace 6.2 with Tomcat 8.5.24 and noticed that DSpace
does not start if you supply a database pool from JNDI(!). It seems to be a
known issue, as Mark Wood created a Jira ticket for it[0]. Also, DSpace 6.2
still has the db.jndi setting in dspace.cfg, even though this setting is no
longer user customizable. I've made a pull request[1] against the dspace-6_x
branch to remove it, but I did not create a Jira ticket. Is it required?

[0] https://jira.duraspace.org/browse/DS-3434
[1] https://github.com/DSpace/DSpace/pull/1917

Regards,

On Thu, Jan 11, 2018 at 9:37 AM Alan Orth  wrote:

> To continue the discussion on a slightly related note: I've just finished
> dealing with the fallout caused by some new bot — the only fingerprint of
> which is its unique-but-normal-looking user agent — hitting our XMLUI with
> 450,000 requests from six different IPs over just a few hours. This
> generated a ridiculous amount of load on the server, including 160
> PostgreSQL connections and 52,000 Tomcat sessions before I was able to
> mitigate it. Surprisingly, since I had increased our pool size to 300 after
> my last message, we never got pool timeout or database connection errors in
> dspace.log, but the site was very unresponsive — and this is on a beefy
> server with SSDs, plenty of RAM, large PostgreSQL buffer cache, etc! I
> ended up having to rate limit this user agent in our frontend nginx web
> server using the limit_req_zone module[0].
>
> So a bit of a mixed success and frustration here. No amount of pool
> tweaking will fix this type of issue, because there's always another
> bigger, stupider bot that comes along eventually and doesn't match the
> "bot" user agent. I will definitely look into implementing separate pools
> as Tom had suggested, though, to limit the damage caused by high load to
> certain DSpace web applications. Keep sharing your experiences! This is
> very valuable and interesting to me.
>
> [0]
> https://github.com/ilri/rmg-ansible-public/commit/368faaa99028c8e0c8a99de3f6c253a228d5f63b
>
> Cheers!
>
> On Thu, Jan 4, 2018 at 7:31 AM Alan Orth  wrote:
>
>> That's a cool idea to use a separate pool for each web application, Tom!
>> I'd much rather have my OAI fail to establish a database connection than my
>> XMLUI. ;)
>>
>> Since I wrote the original mailing list message two weeks ago I've had
>> DSpace fail to establish a database connection a few thousand times and
>> I've increased my pool's max active from 50 to 75 and then 125 — our site
>> gets about four million hits per month (from looking at nginx logs), so I'm
>> still trying to find the "sweet spot" for the pool settings. Anything's
>> better than setting the pool in dspace.cfg, though.
>>
>> I wish other people would share their pool settings and experiences.
>>
>> On Wed, Jan 3, 2018 at 2:40 PM Hardy Pottinger 
>> wrote:
>>
>>> Hi, please do create this wiki page, I'd love to read it. Thanks!
>>>
>>> --Hardy
>>>
>>> On Wed, Jan 3, 2018 at 4:10 PM, Tom Desair 
>>> wrote:
>>>
 I just wanted to add a small note that having 1 single DB pool for all
 Tomcat webapps can lead (and has led) to problems. Your current pool size is
 50. This means that if you have (malicious) crawlers hitting your OAI
 endpoint, this can deplete the database connections available for
 the web UI (XMLUI or JSPUI). The other way around can also happen.

 But using JNDI DB pools also gives you more fine-grained control over
 the connection distribution over the different web apps. For example, a
 default PostgreSQL installation comes with a max connection limit of 100.
 This means you can safely use around 70 connections (from experience). You
 can then divide these connections with individual JNDI pools like this:

- OAI: 15 connections
- REST: 15 connections
- WEB UI: 40 connections


 Let me know if you've created a JNDI DB pool wiki page. I'll then try
 to add some useful information on JDBC interceptors (
 https://tomcat.apache.org/tomcat-7.0-doc/jdbc-pool.html#Configuring_JDBC_interceptors
 ).


 Tom Desair
 250-B Suite 3A, Lucius Gordon Drive, West Henrietta, NY 14586
 Gaston Geenslaan 14, Leuven 3001, Belgium
 www.atmire.com

 2018-01-03 22:36 GMT+01:00 Tim Donohue :

> Hi Alan & Mark,
>
> These notes look like the start to some enhanced documentation around
> setting up DSpace + Tomcat JNDI (hint, hint).
>
> I'm wondering (out loud) if we should take these concepts/ideas and
> turn 

Re: [dspace-tech] Re: Notes on PostgreSQL connection pooling with a Tomcat JNDI resource

2018-01-10 Thread Alan Orth
To continue the discussion on a slightly related note: I've just finished
dealing with the fallout caused by some new bot — the only fingerprint of
which is its unique-but-normal-looking user agent — hitting our XMLUI with
450,000 requests from six different IPs over just a few hours. This
generated a ridiculous amount of load on the server, including 160
PostgreSQL connections and 52,000 Tomcat sessions before I was able to
mitigate it. Surprisingly, since I had increased our pool size to 300 after
my last message, we never got pool timeout or database connection errors in
dspace.log, but the site was very unresponsive — and this is on a beefy
server with SSDs, plenty of RAM, large PostgreSQL buffer cache, etc! I
ended up having to rate limit this user agent in our frontend nginx web
server using the limit_req_zone module[0].

So a bit of a mixed success and frustration here. No amount of pool
tweaking will fix this type of issue, because there's always another
bigger, stupider bot that comes along eventually and doesn't match the
"bot" user agent. I will definitely look into implementing separate pools
as Tom had suggested, though, to limit the damage caused by high load to
certain DSpace web applications. Keep sharing your experiences! This is
very valuable and interesting to me.

[0]
https://github.com/ilri/rmg-ansible-public/commit/368faaa99028c8e0c8a99de3f6c253a228d5f63b

Cheers!

On Thu, Jan 4, 2018 at 7:31 AM Alan Orth  wrote:

> That's a cool idea to use a separate pool for each web application, Tom!
> I'd much rather have my OAI fail to establish a database connection than my
> XMLUI. ;)
>
> Since I wrote the original mailing list message two weeks ago I've had
> DSpace fail to establish a database connection a few thousand times and
> I've increased my pool's max active from 50 to 75 and then 125 — our site
> gets about four million hits per month (from looking at nginx logs), so I'm
> still trying to find the "sweet spot" for the pool settings. Anything's
> better than setting the pool in dspace.cfg, though.
>
> I wish other people would share their pool settings and experiences.
>
> On Wed, Jan 3, 2018 at 2:40 PM Hardy Pottinger 
> wrote:
>
>> Hi, please do create this wiki page, I'd love to read it. Thanks!
>>
>> --Hardy
>>
>> On Wed, Jan 3, 2018 at 4:10 PM, Tom Desair  wrote:
>>
>>> I just wanted to add a small note that having 1 single DB pool for all
>>> Tomcat webapps can lead (and has led) to problems. Your current pool size is
>>> 50. This means that if you have (malicious) crawlers hitting your OAI
>>> endpoint, this can deplete the database connections available for
>>> the web UI (XMLUI or JSPUI). The other way around can also happen.
>>>
>>> But using JNDI DB pools also gives you more fine-grained control over the
>>> connection distribution over the different web apps. For example, a default
>>> PostgreSQL installation comes with a max connection limit of 100. This
>>> means you can safely use around 70 connections (from experience). You can
>>> then divide these connections with individual JNDI pools like this:
>>>
>>>- OAI: 15 connections
>>>- REST: 15 connections
>>>- WEB UI: 40 connections
>>>
>>>
>>> Let me know if you've created a JNDI DB pool wiki page. I'll then try to
>>> add some useful information on JDBC interceptors (
>>> https://tomcat.apache.org/tomcat-7.0-doc/jdbc-pool.html#Configuring_JDBC_interceptors
>>> ).
>>>
>>>
>>> Tom Desair
>>> 250-B Suite 3A, Lucius Gordon Drive, West Henrietta, NY 14586
>>> Gaston Geenslaan 14, Leuven 3001, Belgium
>>> www.atmire.com
>>>
>>> 2018-01-03 22:36 GMT+01:00 Tim Donohue :
>>>
 Hi Alan & Mark,

 These notes look like the start to some enhanced documentation around
 setting up DSpace + Tomcat JNDI (hint, hint).

 I'm wondering (out loud) if we should take these concepts/ideas and
 turn them into official documentation in the "Installing DSpace" section
 (maybe somewhere under "Advanced Installation"?):
 https://wiki.duraspace.org/display/DSDOC6x/Installing+DSpace

 Thanks though for sharing the notes and Q here. I think this will be
 very helpful for others who wish to go this route.

 - Tim


 On Wed, Jan 3, 2018 at 3:17 PM Mark H. Wood 
 wrote:

> Thanks for posting these notes.  I'm sure they will be helpful.
> You've shown some tools that I didn't know about.
>
> A pool instantiated by DSpace is probably effectively invisible to
> other webapp.s even in the same JVM.  The Servlet spec. tries very hard to
> create the illusion that each webapp. is floating in a kind of womb, where
> everything it needs is mysteriously provided for it from somewhere beyond
> its perception.

Re: [dspace-tech] Re: Notes on PostgreSQL connection pooling with a Tomcat JNDI resource

2018-01-03 Thread Alan Orth
That's a cool idea to use a separate pool for each web application, Tom!
I'd much rather have my OAI fail to establish a database connection than my
XMLUI. ;)

Since I wrote the original mailing list message two weeks ago, I've seen
DSpace fail to establish a database connection a few thousand times, and
I've increased my pool's maxActive from 50 to 75 and then to 125 (our site
gets about four million hits per month, judging by the nginx logs), so I'm
still trying to find the "sweet spot" for the pool settings (a sketch of
the relevant attributes follows below). Anything's better than setting the
pool in dspace.cfg, though.
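
For context, the knob in question is the maxActive attribute on the Tomcat
Resource element. A minimal sketch, assuming the Tomcat JDBC pool (attribute
names per the Tomcat 7 docs; the credentials and the idle/wait values are
illustrative only):

<Context>
  <!-- maxActive is the setting being tuned above -->
  <Resource name="jdbc/dspace" auth="Container" type="javax.sql.DataSource"
            factory="org.apache.tomcat.jdbc.pool.DataSourceFactory"
            driverClassName="org.postgresql.Driver"
            url="jdbc:postgresql://localhost:5432/dspace"
            username="dspace" password="CHANGEME"
            maxActive="125" maxIdle="25" maxWait="10000"/>
</Context>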

I wish other people would share their pool settings and experiences.

On Wed, Jan 3, 2018 at 2:40 PM Hardy Pottinger 
wrote:

> Hi, please do create this wiki page, I'd love to read it. Thanks!
>
> --Hardy
>
> On Wed, Jan 3, 2018 at 4:10 PM, Tom Desair  wrote:
>
>> I just wanted to add a small note that having a single DB pool for all
>> Tomcat webapps can lead (and has led) to problems. Your current pool size
>> is 50. This means that if you have (malicious) crawlers hitting your OAI
>> endpoint, they can deplete the database connections available for the web
>> UI (XMLUI or JSPUI). The other way around can also happen.
>>
>> But using JNDI DB pools also gives you more fine-grained control over how
>> connections are distributed across the different web apps. For example, a
>> default PostgreSQL installation comes with a max connection limit of 100.
>> This means you can safely use around 70 connections (from experience). You
>> can then divide these connections among individual JNDI pools like this:
>>
>>- OAI: 15 connections
>>- REST: 15 connections
>>- WEB UI: 40 connections
>>
>>
>> Let me know if you've created a JNDI DB pool wiki page. I'll then try to
>> add some useful information on JDBC interceptors (
>> https://tomcat.apache.org/tomcat-7.0-doc/jdbc-pool.html#Configuring_JDBC_interceptors
>> ).
>>
>>
>> [image: logo] Tom Desair
>> 250-B Suite 3A, Lucius Gordon Drive, West Henrietta, NY 14586
>> 
>> Gaston Geenslaan 14, Leuven 3001, Belgium
>> 
>> www.atmire.com
>> 
>>
>> 2018-01-03 22:36 GMT+01:00 Tim Donohue :
>>
>>> Hi Alan & Mark,
>>>
>>> These notes look like the start to some enhanced documentation around
>>> setting up DSpace + Tomcat JNDI (hint, hint).
>>>
>>> I'm wondering (out loud) if we should take these concepts/ideas and turn
>>> them into official documentation in the "Installing DSpace" section (maybe
>>> somewhere under "Advanced Installation"?):
>>> https://wiki.duraspace.org/display/DSDOC6x/Installing+DSpace
>>>
>>> Thanks though for sharing the notes and Q here. I think this will be
>>> very helpful for others who wish to go this route.
>>>
>>> - Tim
>>>
>>>
>>> On Wed, Jan 3, 2018 at 3:17 PM Mark H. Wood 
>>> wrote:
>>>
 Thanks for posting these notes.  I'm sure they will be helpful.  You've
 shown some tools that I didn't know about.

 A pool instantiated by DSpace is probably effectively invisible to
 other webapp.s even in the same JVM.  The Servlet spec. tries very hard to
 create the illusion that each webapp. is floating in a kind of womb, where
 everything it needs is mysteriously provided for it from somewhere beyond
 its perception.  Each has its own classloader, for example, and so things
 that a webapp. creates for itself tend to be known only in places that are
 not accessible by other webapp.s.  I could wish that DSpace made more
 thorough use of the Servlet environment rather than behaving as if it is
 standalone code.

 You're quite correct that the command-line tools don't share a pool
 with any of the webapp.s, because the launcher runs in a different process
 with its own address space.  This is one reason to continue specifying pool
 settings in dspace.cfg -- IMO this should be the *only* use of those
 settings.  It *is* possible to supply a pool to the command line out of
 JNDI -- I've done it -- but you need to supply a directory service to the
 process.  I can say a little about that if anybody is interested.  You
 could provide in dspace.cfg settings more appropriate to the command line,
 if your webapp.s are set up with pools (tuned for their needs) from JNDI.

 The reason you don't have to tinker with directory services for
 webapp.s is that the <Resource> and <ResourceLink> elements are causing
 your Servlet container (Tomcat) to populate an internal directory service
 with objects such as your connection pool.  This is specified by Java EE,
 but many Servlet containers implement it even when not required by the
 relevant spec.s.


Re: [dspace-tech] Re: Notes on PostgreSQL connection pooling with a Tomcat JNDI resource

2018-01-03 Thread Hardy Pottinger
Hi, please do create this wiki page, I'd love to read it. Thanks!

--Hardy

On Wed, Jan 3, 2018 at 4:10 PM, Tom Desair  wrote:

> I just wanted to add a small note that having a single DB pool for all
> Tomcat webapps can lead (and has led) to problems. Your current pool size
> is 50. This means that if you have (malicious) crawlers hitting your OAI
> endpoint, they can deplete the database connections available for the web
> UI (XMLUI or JSPUI). The other way around can also happen.
>
> But using JNDI DB pools also gives you more fine-grained control over how
> connections are distributed across the different web apps. For example, a
> default PostgreSQL installation comes with a max connection limit of 100.
> This means you can safely use around 70 connections (from experience). You
> can then divide these connections among individual JNDI pools like this:
>
>- OAI: 15 connections
>- REST: 15 connections
>- WEB UI: 40 connections
>
>
> Let me know if you've created a JNDI DB pool wiki page. I'll then try to
> add some useful information on JDBC interceptors (
> https://tomcat.apache.org/tomcat-7.0-doc/jdbc-pool.html#Configuring_JDBC_interceptors).
>
>
> [image: logo] Tom Desair
> 250-B Suite 3A, Lucius Gordon Drive, West Henrietta, NY 14586
> 
> Gaston Geenslaan 14, Leuven 3001, Belgium
> 
> www.atmire.com
> 
>
> 2018-01-03 22:36 GMT+01:00 Tim Donohue :
>
>> Hi Alan & Mark,
>>
>> These notes look like the start to some enhanced documentation around
>> setting up DSpace + Tomcat JNDI (hint, hint).
>>
>> I'm wondering (out loud) if we should take these concepts/ideas and turn
>> them into official documentation in the "Installing DSpace" section (maybe
>> somewhere under "Advanced Installation"?):
>> https://wiki.duraspace.org/display/DSDOC6x/Installing+DSpace
>>
>> Thanks though for sharing the notes and Q here. I think this will be
>> very helpful for others who wish to go this route.
>>
>> - Tim
>>
>>
>> On Wed, Jan 3, 2018 at 3:17 PM Mark H. Wood  wrote:
>>
>>> Thanks for posting these notes.  I'm sure they will be helpful.  You've
>>> shown some tools that I didn't know about.
>>>
>>> A pool instantiated by DSpace is probably effectively invisible to other
>>> webapp.s even in the same JVM.  The Servlet spec. tries very hard to create
>>> the illusion that each webapp. is floating in a kind of womb, where
>>> everything it needs is mysteriously provided for it from somewhere beyond
>>> its perception.  Each has its own classloader, for example, and so things
>>> that a webapp. creates for itself tend to be known only in places that are
>>> not accessible by other webapp.s.  I could wish that DSpace made more
>>> thorough use of the Servlet environment rather than behaving as if it is
>>> standalone code.
>>>
>>> You're quite correct that the command-line tools don't share a pool with
>>> any of the webapp.s, because the launcher runs in a different process with
>>> its own address space.  This is one reason to continue specifying pool
>>> settings in dspace.cfg -- IMO this should be the *only* use of those
>>> settings.  It *is* possible to supply a pool to the command line out of
>>> JNDI -- I've done it -- but you need to supply a directory service to the
>>> process.  I can say a little about that if anybody is interested.  You
>>> could provide in dspace.cfg settings more appropriate to the command line,
>>> if your webapp.s are set up with pools (tuned for their needs) from JNDI.
>>>
>>> The reason you don't have to tinker with directory services for webapp.s
>>> is that the <Resource> and <ResourceLink> elements are causing your Servlet
>>> container (Tomcat) to populate an internal directory service with objects
>>> such as your connection pool.  This is specified by Java EE, but many
>>> Servlet containers implement it even when not required by the relevant
>>> spec.s.
>>>
>>> You *do* need to supply any DBMS drivers to the container itself,
>>> because the pool and connections are created by the container and so must
>>> be visible from *its* classloader(s), which (in Tomcat anyway) are on a
>>> branch of the hierarchy that is parallel to those of the webapp.s.  I also
>>> would use the latest released driver.
>>>
>>> It should be simple to provide a log message when the resolution of
>>> jdbc/dspace *succeeds*, and I think we should.  There's already a Jira
>>> issue about making the other case less scary, and perhaps this should be
>>> included in that work.
>>>

Re: [dspace-tech] Re: Notes on PostgreSQL connection pooling with a Tomcat JNDI resource

2018-01-03 Thread Tom Desair
I just wanted to add a small note that having a single DB pool for all
Tomcat webapps can lead (and has led) to problems. Your current pool size is
50. This means that if you have (malicious) crawlers hitting your OAI
endpoint, they can deplete the database connections available for the web UI
(XMLUI or JSPUI). The other way around can also happen.

But using JNDI DB pools also gives you more fine-grained control over how
connections are distributed across the different web apps. For example, a
default PostgreSQL installation comes with a max connection limit of 100.
This means you can safely use around 70 connections (from experience). You
can then divide these connections among individual JNDI pools like this (a
sketch follows the list):

   - OAI: 15 connections
   - REST: 15 connections
   - WEB UI: 40 connections
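
A minimal sketch of such a split, assuming one Resource per webapp in each
webapp's META-INF/context.xml (the sizes are the example numbers above;
everything else, including the credentials, is illustrative):

<!-- oai webapp: META-INF/context.xml -->
<Context>
  <Resource name="jdbc/dspace" auth="Container" type="javax.sql.DataSource"
            driverClassName="org.postgresql.Driver"
            url="jdbc:postgresql://localhost:5432/dspace"
            username="dspace" password="CHANGEME"
            maxActive="15" maxIdle="5" maxWait="5000"/>
</Context>

<!-- rest webapp: the same element with maxActive="15"; the web UI
     (xmlui or jspui) webapp: the same element with maxActive="40".
     The PostgreSQL driver jar must sit in Tomcat's own lib/ directory,
     since the container itself builds these pools. -->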


Let me know if you've created a JNDI DB pool wiki page. I'll then try to
add some useful information on JDBC interceptors (
https://tomcat.apache.org/tomcat-7.0-doc/jdbc-pool.html#Configuring_JDBC_interceptors
).


[image: logo] Tom Desair
250-B Suite 3A, Lucius Gordon Drive, West Henrietta, NY 14586
Gaston Geenslaan 14, Leuven 3001, Belgium
www.atmire.com


2018-01-03 22:36 GMT+01:00 Tim Donohue :

> Hi Alan & Mark,
>
> These notes look like the start to some enhanced documentation around
> setting up DSpace + Tomcat JNDI (hint, hint).
>
> I'm wondering (out loud) if we should take these concepts/ideas and turn
> them into official documentation in the "Installing DSpace" section (maybe
> somewhere under "Advanced Installation"?):
> https://wiki.duraspace.org/display/DSDOC6x/Installing+DSpace
>
> Thanks though for sharing the notes and Q here. I think this will be
> very helpful for others who wish to go this route.
>
> - Tim
>
>
> On Wed, Jan 3, 2018 at 3:17 PM Mark H. Wood  wrote:
>
>> Thanks for posting these notes.  I'm sure they will be helpful.  You've
>> shown some tools that I didn't know about.
>>
>> A pool instantiated by DSpace is probably effectively invisible to other
>> webapp.s even in the same JVM.  The Servlet spec. tries very hard to create
>> the illusion that each webapp. is floating in a kind of womb, where
>> everything it needs is mysteriously provided for it from somewhere beyond
>> its perception.  Each has its own classloader, for example, and so things
>> that a webapp. creates for itself tend to be known only in places that are
>> not accessible by other webapp.s.  I could wish that DSpace made more
>> thorough use of the Servlet environment rather than behaving as if it is
>> standalone code.
>>
>> You're quite correct that the command-line tools don't share a pool with
>> any of the webapp.s, because the launcher runs in a different process with
>> its own address space.  This is one reason to continue specifying pool
>> settings in dspace.cfg -- IMO this should be the *only* use of those
>> settings.  It *is* possible to supply a pool to the command line out of
>> JNDI -- I've done it -- but you need to supply a directory service to the
>> process.  I can say a little about that if anybody is interested.  You
>> could provide in dspace.cfg settings more appropriate to the command line,
>> if your webapp.s are set up with pools (tuned for their needs) from JNDI.
>>
>> The reason you don't have to tinker with directory services for webapp.s
>> is that the <Resource> and <ResourceLink> elements are causing your Servlet
>> container (Tomcat) to populate an internal directory service with objects
>> such as your connection pool.  This is specified by Java EE, but many
>> Servlet containers implement it even when not required by the relevant
>> spec.s.
>>
>> You *do* need to supply any DBMS drivers to the container itself, because
>> the pool and connections are created by the container and so must be
>> visible from *its* classloader(s), which (in Tomcat anyway) are on a branch
>> of the hierarchy that is parallel to those of the webapp.s.  I also would
>> use the latest released driver.
>>
>> It should be simple to provide a log message when the resolution of
>> jdbc/dspace *succeeds*, and I think we should.  There's already a Jira
>> issue about making the other case less scary, and perhaps this should be
>> included in that work.
>>
>>
>
>
> --
> Tim Donohue
> Technical Lead for DSpace & DSpaceDirect
> DuraSpace.org | DSpace.org | DSpaceDirect.org
>

Re: [dspace-tech] Re: Notes on PostgreSQL connection pooling with a Tomcat JNDI resource

2018-01-03 Thread Tim Donohue
Hi Alan & Mark,

These notes look like the start to some enhanced documentation around
setting up DSpace + Tomcat JNDI (hint, hint).

I'm wondering (out loud) if we should take these concepts/ideas and turn
them into official documentation in the "Installing DSpace" section (maybe
somewhere under "Advanced Installation"?):
https://wiki.duraspace.org/display/DSDOC6x/Installing+DSpace

Thanks though for sharing the notes and Q here. I think this will be very
helpful for others who wish to go this route.

- Tim

On Wed, Jan 3, 2018 at 3:17 PM Mark H. Wood  wrote:

> Thanks for posting these notes.  I'm sure they will be helpful.  You've
> shown some tools that I didn't know about.
>
> A pool instantiated by DSpace is probably effectively invisible to other
> webapp.s even in the same JVM.  The Servlet spec. tries very hard to create
> the illusion that each webapp. is floating in a kind of womb, where
> everything it needs is mysteriously provided for it from somewhere beyond
> its perception.  Each has its own classloader, for example, and so things
> that a webapp. creates for itself tend to be known only in places that are
> not accessible by other webapp.s.  I could wish that DSpace made more
> thorough use of the Servlet environment rather than behaving as if it is
> standalone code.
>
> You're quite correct that the command-line tools don't share a pool with
> any of the webapp.s, because the launcher runs in a different process with
> its own address space.  This is one reason to continue specifying pool
> settings in dspace.cfg -- IMO this should be the *only* use of those
> settings.  It *is* possible to supply a pool to the command line out of
> JNDI -- I've done it -- but you need to supply a directory service to the
> process.  I can say a little about that if anybody is interested.  You
> could provide in dspace.cfg settings more appropriate to the command line,
> if your webapp.s are set up with pools (tuned for their needs) from JNDI.
>
> The reason you don't have to tinker with directory services for webapp.s
> is that the <Resource> and <ResourceLink> elements are causing your Servlet
> container (Tomcat) to populate an internal directory service with objects
> such as your connection pool.  This is specified by Java EE, but many
> Servlet containers implement it even when not required by the relevant
> spec.s.
>
> You *do* need to supply any DBMS drivers to the container itself, because
> the pool and connections are created by the container and so must be
> visible from *its* classloader(s), which (in Tomcat anyway) are on a branch
> of the hierarchy that is parallel to those of the webapp.s.  I also would
> use the latest released driver.
>
> It should be simple to provide a log message when the resolution of
> jdbc/dspace *succeeds*, and I think we should.  There's already a Jira
> issue about making the other case less scary, and perhaps this should be
> included in that work.
>
>


-- 
Tim Donohue
Technical Lead for DSpace & DSpaceDirect
DuraSpace.org | DSpace.org | DSpaceDirect.org



[dspace-tech] Re: Notes on PostgreSQL connection pooling with a Tomcat JNDI resource

2018-01-03 Thread Mark H. Wood
Thanks for posting these notes.  I'm sure they will be helpful.  You've 
shown some tools that I didn't know about.

A pool instantiated by DSpace is probably effectively invisible to other 
webapp.s even in the same JVM.  The Servlet spec. tries very hard to create 
the illusion that each webapp. is floating in a kind of womb, where 
everything it needs is mysteriously provided for it from somewhere beyond 
its perception.  Each has its own classloader, for example, and so things 
that a webapp. creates for itself tend to be known only in places that are 
not accessible by other webapp.s.  I could wish that DSpace made more 
thorough use of the Servlet environment rather than behaving as if it is 
standalone code.

You're quite correct that the command-line tools don't share a pool with 
any of the webapp.s, because the launcher runs in a different process with 
its own address space.  This is one reason to continue specifying pool 
settings in dspace.cfg -- IMO this should be the *only* use of those 
settings.  It *is* possible to supply a pool to the command line out of 
JNDI -- I've done it -- but you need to supply a directory service to the 
process.  I can say a little about that if anybody is interested.  You 
could provide in dspace.cfg settings more appropriate to the command line, 
if your webapp.s are set up with pools (tuned for their needs) from JNDI.
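
For illustration, such a command-line-only pool in dspace.cfg might look
like this (the key names are the stock DSpace ones; the values are just an
example of a deliberately small CLI pool):

# Pool used only by the command-line launcher once the webapps get
# their own pools from JNDI (values illustrative)
db.maxconnections = 5
db.maxwait = 5000
db.maxidle = -1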

The reason you don't have to tinker with directory services for webapp.s is 
that the <Resource> and <ResourceLink> elements are causing your Servlet
container (Tomcat) to populate an internal directory service with objects 
such as your connection pool.  This is specified by Java EE, but many 
Servlet containers implement it even when not required by the relevant 
spec.s.
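
To make that resolution concrete, here is a minimal sketch of the lookup a
webapp performs against that internal directory service (plain javax.naming
and javax.sql, nothing DSpace-specific; jdbc/dspace is the resource name
used throughout this thread):

import java.sql.Connection;
import java.sql.SQLException;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

public class JndiPoolLookup {
    // Works inside a webapp: Tomcat has already bound the pool it built
    // from <Resource>/<ResourceLink> under java:comp/env.
    static DataSource dspacePool() throws NamingException {
        InitialContext ctx = new InitialContext();
        return (DataSource) ctx.lookup("java:comp/env/jdbc/dspace");
    }

    static void ping() throws NamingException, SQLException {
        // Borrows a connection from the container-managed pool; closing
        // it returns it to the pool instead of closing the socket.
        try (Connection conn = dspacePool().getConnection()) {
            conn.isValid(2);
        }
    }
}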

You *do* need to supply any DBMS drivers to the container itself, because 
the pool and connections are created by the container and so must be 
visible from *its* classloader(s), which (in Tomcat anyway) are on a branch 
of the hierarchy that is parallel to those of the webapp.s.  I also would 
use the latest released driver.
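
In practice that just means dropping the jar into the container's shared
lib directory, e.g. (version number illustrative):

cp postgresql-42.2.1.jar $CATALINA_HOME/lib/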

It should be simple to provide a log message when the resolution of 
jdbc/dspace *succeeds*, and I think we should.  There's already a Jira 
issue about making the other case less scary, and perhaps this should be 
included in that work.
