Hello,

I'd rather narrow down the problem a bit more before upgrading (to make
sure that's really the issue).

It happened again; this time I took a thread dump of the instance that went
down (strictly speaking, only the DB connection pool becomes unresponsive,
but I remove the instance from my load balancer when this happens).

The thread dump shows pretty much the same thing the logs showed when I
shut down Tomcat: a very long list of stack traces, all of them stuck
trying to get a connection from the pool. An example from the thread dump:

"http-apr-8080-exec-805" #840 daemon prio=5 os_prio=0
tid=0x00007f50d00c1800 nid=0x5fbc waiting on condition [0x00007f50c62e7000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000000ec44cf38> (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at
org.apache.tomcat.dbcp.pool2.impl.LinkedBlockingDeque.takeFirst(LinkedBlockingDeque.java:582)
        at
org.apache.tomcat.dbcp.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:439)
        at
org.apache.tomcat.dbcp.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:360)
        at
org.apache.tomcat.dbcp.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:118)
        at
org.apache.tomcat.dbcp.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:1412)
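
As far as I can tell from that trace, takeFirst() is the code path the pool
takes when maxWaitMillis is negative (the DBCP2 default), so a thread that
can't get a connection parks indefinitely instead of timing out. One thing
I'm considering is bounding the wait and enabling abandoned-connection
cleanup. My pool is defined as a Tomcat JNDI Resource, so the programmatic
sketch below is only to show which settings I mean (the driver, URL and
numbers are placeholders; the same names are available as attributes on the
Resource element):

    import org.apache.tomcat.dbcp.dbcp2.BasicDataSource;

    public class PoolSettingsSketch {
        public static BasicDataSource create() {
            BasicDataSource ds = new BasicDataSource();
            ds.setDriverClassName("com.mysql.jdbc.Driver"); // placeholder driver
            ds.setUrl("jdbc:mysql://localhost:3306/mydb");  // placeholder URL
            ds.setUsername("app");
            ds.setPassword("secret");
            ds.setMaxTotal(100);            // cap on open connections
            ds.setMaxWaitMillis(10000);     // throw after 10s instead of
                                            // parking forever in takeFirst()
            ds.setRemoveAbandonedOnBorrow(true); // reclaim connections that were
            ds.setRemoveAbandonedTimeout(60);    // borrowed but never closed
            ds.setLogAbandoned(true);            // and log where they were taken
            return ds;
        }
    }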

Questions:
1) In what order are threads printed in a thread dump? I'd like to find the
thread that started waiting first (temporally), since that might be the one
that triggered the actual issue. Is that at the beginning of the dump or at
the end?

2) The thread dump was so long that the beginning of it was cut off. Does
anyone know how to capture a dump in pieces, or to a file, so the beginning
isn't lost when it's this long?
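
In the meantime, one workaround I'm considering for 2) is dumping the
threads programmatically to a file from inside the webapp (for example from
a small admin servlet or a scheduled task), so nothing gets truncated by
the console. A minimal sketch, assuming I can run something like this in
the same JVM:

    import java.io.PrintWriter;
    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadInfo;
    import java.lang.management.ThreadMXBean;

    public class ThreadDumpToFile {
        public static void dump(String path) throws Exception {
            ThreadMXBean mx = ManagementFactory.getThreadMXBean();
            try (PrintWriter out = new PrintWriter(path)) {
                // dumpAllThreads(lockedMonitors, lockedSynchronizers)
                for (ThreadInfo info : mx.dumpAllThreads(true, true)) {
                    out.println("\"" + info.getThreadName() + "\" "
                            + info.getThreadState());
                    for (StackTraceElement frame : info.getStackTrace()) {
                        out.println("        at " + frame);
                    }
                    out.println();
                }
            }
        }
    }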

Possible causes:
Is it possible that someone is acting maliciously and intentionally hitting
certain pages many times in a short period to make the pool lock up?
I suspect this because a thread dump taken while the application is working
normally doesn't show anything suspicious, which makes me think the problem
is not cumulative but happens all at once.
And when I look at the problematic thread dump and see so many stack traces
accessing the database, it suggests that all of those requests hit the pool
at roughly the same moment it locked up.
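
To check that theory, I'm thinking of logging the pool's counters once a
minute, so I can see whether the active count creeps up slowly (a leak) or
jumps to the maximum all at once (a burst of requests). A rough sketch,
where "jdbc/MyDB" is a placeholder for my real resource name and the cast
assumes Tomcat's default DBCP2 factory:

    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;
    import javax.naming.InitialContext;
    import org.apache.tomcat.dbcp.dbcp2.BasicDataSource;

    public class PoolMonitor {
        public static void start() throws Exception {
            // "jdbc/MyDB" is a placeholder for the real JNDI name
            BasicDataSource ds = (BasicDataSource) new InitialContext()
                    .lookup("java:comp/env/jdbc/MyDB");
            Executors.newSingleThreadScheduledExecutor().scheduleAtFixedRate(
                    () -> System.out.println("pool active=" + ds.getNumActive()
                            + " idle=" + ds.getNumIdle()),
                    0, 1, TimeUnit.MINUTES);
        }
    }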

Any thoughts?

Thanks a lot.
_


On Mon, Sep 12, 2016 at 9:37 PM, Mark Thomas <ma...@apache.org> wrote:

> On 12/09/2016 19:02, Yuval Schwartz wrote:
> > Hey Mark, thanks a lot.
> >
> > On Mon, Sep 12, 2016 at 4:42 PM, Mark Thomas <ma...@apache.org> wrote:
> >
> >> On 12/09/2016 11:54, Yuval Schwartz wrote:
>
> <snip/>
>
> >> It might also be a bug in the connection pool that has been fixed.
> >> Upgrading to the latest 8.0.x (or better still the latest 8.5.x) should
> >> address that.
> >>
> >
> > I'll look for a bug report on this (although I haven't found anything as
> of
> > yet).
>
> There was one around XA connections that could result in a leak. The
> others, you'd need to dig into the DBCP change log and Jira for.
>
> > I wouldn't mind upgrading but do you think this could be a bug? I've been
> > running my application with this setup for about 8 months; the problem
> only
> > started in the last week.
>
> That begs the question what changed a week ago?
>
> Mark
>
