On 11/22/17 2:43 PM, Shawn Heisey wrote:
> On 11/22/2017 12:04 PM, Phil Steitz wrote:
>> On 11/22/17 9:43 AM, Shawn Heisey wrote:
>>> I do have results from the isClosed method when the problem happens. 
>>> That method *does* return true. 
>> That points to a Pool or DBCP bug, assuming you are sure that no
>> other thread has a reference to the PoolableConnection and that no
>> other code path called close on it before you tested isClosed.  If
>> you are sure this is not happening, you should open a DBCP JIRA
>> (which may end up reassigned to pool).  Ideally, you would have a
>> test case that makes it happen.
> I would absolutely love to have a test case that can reproduce it, but
> since I haven't got any idea what the root of the problem is, I wouldn't
> know how to write such a test case.
>
> What I'd really like to do is be able to look over dbcp2 and pool2 code
> to see if I can spot a problem, but I'm having a hard time following the
> code.  I expected to find some kind of synchronization in the code
> branching from getConnection() to prevent different threads from being
> able to step on each other, but haven't seen any so far.  I can't tell
> if this means I'm looking in the wrong place or not.  The object
> inheritance is pretty extensive, so I'm having a hard time finding the
> right place to look.
>
> If it turns out that there is zero synchronization happening between the
> idle eviction thread and the depths of the code for things like
> getConnection, then I don't see how any kind of guarantee can be made. 
> So far the synchronization object in the eviction thread only seems to
> pair with other parts of eviction, NOT with anything else in the library.
>
> If the testXXX flags I've enabled do eliminate the problems I'm seeing,
> that's awesome, and it's *A* solution, but I think I'm still running
> into some kind of issue that needs to be fixed.  I just need to figure
> out whether it's dbcp2, pool2, or something in my own environment.  I'm
> willing to entertain the idea that it's my environment, but based on
> everything that I understand about my own code and our database servers
> (and I fully admit it's circumstantial evidence), it points to a problem
> with the idle eviction thread.
>
> I believe you when you say that the *intent* is for idle eviction to
> never close/evict a connection that's been requested from the pool.  I
> would like to verify whether the intent and what's actually implemented
> are the same.  If they're not the same, then I would like to attempt a
> patch.  I'm going to need help in figuring out exactly where I should be
> looking in the code for dbcp2 and pool2.

If the problem is the evictor closing a connection and having that
connection delivered to a client, the problem is almost certainly in
pool.  The thread-safety of the pool in this regard is engineered in
DefaultPooledObject, which is the wrapper that pool manages and
delivers to DBCP.  When the evictor visits a PooledObject (in
GenericObjectPool#evict) it tries to start the eviction test on the
object by calling its startEvictionTest method.  This method is
synchronized on the DefaultPooledObject.  Look at the code in that
method.  It checks to make sure that the object is in fact idle in
the pool.  The other half of the protection here is in
GenericObjectPool#borrowObject, which is what PoolingDataSource
calls to get a connection.  That method tries to get a PooledObject
from the pool and, before handing it out (or validating it), calls
the PooledObject's allocate method.  Look at the code for that in
DefaultPooledObject.  That method (also synchronized on the
PooledObject) checks that the object is not under eviction and sets
its state to allocated.  That is the core sync protection that
*should* make it impossible for the evictor to do anything to an
object that has been handed out to a client.
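
To make the handshake described above concrete, here is a minimal toy
model of it.  It is a sketch, not the pool2 source: the class name
PooledObjectSketch, its three-value State enum and the race helper are
hypothetical, and the real DefaultPooledObject tracks more states than
this.  The only point is that both methods synchronize on the same
object and both require the idle state, so only one of them can win.

```java
/**
 * Toy model of the idle/allocated/eviction handshake. This is NOT the
 * commons-pool2 source; the class and state names are hypothetical.
 */
final class PooledObjectSketch {
    enum State { IDLE, ALLOCATED, EVICTION }

    private State state = State.IDLE;

    /** Evictor side: succeeds only if the object is still idle. */
    synchronized boolean startEvictionTest() {
        if (state != State.IDLE) {
            return false;
        }
        state = State.EVICTION;
        return true;
    }

    /** Borrower side: succeeds only if the object is idle (not under eviction). */
    synchronized boolean allocate() {
        if (state != State.IDLE) {
            return false;
        }
        state = State.ALLOCATED;
        return true;
    }

    /**
     * Races one evictor against one borrower on a fresh object per round
     * and returns the number of rounds that did not have exactly one winner.
     */
    static int roundsWithoutSingleWinner(int rounds) throws InterruptedException {
        int bad = 0;
        for (int i = 0; i < rounds; i++) {
            PooledObjectSketch p = new PooledObjectSketch();
            boolean[] won = new boolean[2];
            Thread evictor = new Thread(() -> won[0] = p.startEvictionTest());
            Thread borrower = new Thread(() -> won[1] = p.allocate());
            evictor.start();
            borrower.start();
            // join() makes the writes to won[] visible to this thread.
            evictor.join();
            borrower.join();
            if (won[0] == won[1]) {
                bad++;
            }
        }
        return bad;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("rounds without a single winner: "
                + roundsWithoutSingleWinner(2000));
    }
}
```

In this model the count is always zero, because whichever thread takes
the monitor first moves the object out of the idle state and the other
one then fails its check.  A bug of the kind under discussion would have
to come from a code path that touches the object without going through
this check.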

The logical place to start to get a test case that shows this
protection failing is to just set up a pool with very aggressive
eviction config (very small idle object timeout), frequent eviction
runs and a lot of concurrent borrowing.  Make sure the factory's
destroy method does something to simulate what PCF
(PoolableConnectionFactory) does to mark the object as dead and see
if you get any corpses handed out to borrowers.  Also make sure that
there are enough idle instances in the pool for the evictor to
visit.  For that, you probably want to vary the borrowing load.  You
can set up JMX to observe the pool stats to see how many are idle at
a given time, or just log it using getNumIdle.  A quick look at the
existing pool2 test cases does not show exactly that scenario
covered, so it would be good to add it in any case.
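
A rough starting point for such a stress test might look like the
sketch below.  It assumes commons-pool2 is on the classpath and uses
the millisecond-based setters on GenericObjectPool (I believe later
2.x releases deprecate these in favor of Duration-based variants).
The names EvictorRaceSketch, Conn and ConnFactory, the thread counts
and the timings are all hypothetical and would need tuning; the "dead"
flag is only a stand-in for whatever the real factory does on destroy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

import org.apache.commons.pool2.BasePooledObjectFactory;
import org.apache.commons.pool2.PooledObject;
import org.apache.commons.pool2.impl.DefaultPooledObject;
import org.apache.commons.pool2.impl.GenericObjectPool;

final class EvictorRaceSketch {

    /** Hypothetical stand-in for a connection; "dead" mimics a closed one. */
    static final class Conn {
        volatile boolean dead;
    }

    static final class ConnFactory extends BasePooledObjectFactory<Conn> {
        @Override public Conn create() { return new Conn(); }
        @Override public PooledObject<Conn> wrap(Conn c) { return new DefaultPooledObject<>(c); }
        // Mark the object dead so a borrower can recognize a corpse.
        @Override public void destroyObject(PooledObject<Conn> p) { p.getObject().dead = true; }
    }

    /** Runs the stress loop and returns how many dead objects borrowers observed. */
    static int run(int threads, long runMillis) throws Exception {
        GenericObjectPool<Conn> pool = new GenericObjectPool<>(new ConnFactory());
        pool.setMaxTotal(threads * 2);
        pool.setMaxIdle(threads * 2);
        pool.setMinEvictableIdleTimeMillis(1);    // very small idle timeout
        pool.setTimeBetweenEvictionRunsMillis(1); // frequent eviction runs
        pool.setNumTestsPerEvictionRun(-1);       // negative: examine all idle objects

        AtomicInteger corpses = new AtomicInteger();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(runMillis);
        ExecutorService exec = Executors.newFixedThreadPool(threads);
        List<Future<Void>> futures = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            final long pauseMillis = i % 4; // vary the load so idle objects exist
            futures.add(exec.submit(() -> {
                while (System.nanoTime() < deadline) {
                    Conn c = pool.borrowObject();
                    try {
                        if (c.dead) corpses.incrementAndGet(); // dead on arrival
                        Thread.sleep(1);
                        if (c.dead) corpses.incrementAndGet(); // killed while held
                    } finally {
                        pool.returnObject(c);
                    }
                    Thread.sleep(pauseMillis);
                }
                return null;
            }));
        }

        int maxIdleSeen = 0;
        while (System.nanoTime() < deadline) {
            maxIdleSeen = Math.max(maxIdleSeen, pool.getNumIdle());
            Thread.sleep(5);
        }
        for (Future<Void> f : futures) {
            f.get(); // surfaces any exception thrown by a borrower
        }
        exec.shutdown();
        pool.close();
        System.out.println("max idle seen: " + maxIdleSeen + ", corpses: " + corpses.get());
        return corpses.get();
    }

    public static void main(String[] args) throws Exception {
        run(8, 5000);
    }
}
```

If the maximum idle count printed at the end stays at zero, the evictor
never had anything to visit and the run proves nothing, so the pauses or
thread count should be adjusted until idle instances show up.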

Phil


>
> Thanks,
> Shawn
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
>

