On 11/22/17 2:43 PM, Shawn Heisey wrote:
> On 11/22/2017 12:04 PM, Phil Steitz wrote:
>> On 11/22/17 9:43 AM, Shawn Heisey wrote:
>>> I do have results from the isClosed method when the problem happens.
>>> That method *does* return true.
>> That points to a Pool or DBCP bug, assuming you are sure that no
>> other thread has a reference to the PoolableConnection or some other
>> code path did not call close on it before you tested isClosed. If
>> you are sure this is not happening, you should open a DBCP JIRA
>> (which may end up reassigned to pool). Ideal would be to have a
>> test case that makes it happen.
> I would absolutely love to have a test case that can reproduce it, but
> since I haven't got any idea what the root of the problem is, I wouldn't
> know how to write such a test case.
>
> What I'd really like to do is be able to look over dbcp2 and pool2 code
> to see if I can spot a problem, but I'm having a hard time following the
> code. I expected to find some kind of synchronization in the code
> branching from getConnection() to prevent different threads from being
> able to step on each other, but haven't seen any so far. I can't tell
> if this means I'm looking in the wrong place or not. The object
> inheritance is pretty extensive, so I'm having a hard time finding the
> right place to look.
>
> If it turns out that there is zero synchronization happening between the
> idle eviction thread and the depths of the code for things like
> getConnection, then I don't see how any kind of guarantee can be made.
> So far the synchronization object in the eviction thread only seems to
> pair with other parts of eviction, NOT with anything else in the library.
>
> If the testXXX flags I've enabled do eliminate the problems I'm seeing,
> that's awesome, and it's *A* solution, but I think I'm still running
> into some kind of issue that needs to be fixed. I just need to figure
> out whether it's dbcp2, pool2, or something in my own environment. I'm
> willing to entertain the idea that it's my environment, but based on
> everything that I understand about my own code and our database servers
> (and I fully admit it's circumstantial evidence), it points to a problem
> with the idle eviction thread.
>
> I believe you when you say that the *intent* is for idle eviction to
> never close/evict a connection that's been requested from the pool. I
> would like to verify whether the intent and what's actually implemented
> are the same. If they're not the same, then I would like to attempt a
> patch. I'm going to need help in figuring out exactly where I should be
> looking in the code for dbcp2 and pool2.

If the problem is the evictor closing a connection and then having that connection delivered to a client, the problem is almost certainly in pool. The thread-safety of the pool in this regard is engineered in DefaultPooledObject, which is the wrapper that pool manages and delivers to DBCP.

When the evictor visits a PooledObject (in GenericObjectPool#evict), it tries to start the eviction test on the object by calling its startEvictionTest method. This method is synchronized on the DefaultPooledObject. Look at the code in that method. It checks to make sure that the object is in fact idle in the pool.

The other half of the protection is in GenericObjectPool#borrowObject, which is what PoolingDataSource calls to get a connection. That method tries to get a PooledObject from the pool and, before handing it out (or validating it), calls the PooledObject's allocate method. Look at the code for that in DefaultPooledObject. That method (also synchronized on the PooledObject) checks that the object is not under eviction and sets its state to allocated. That is the core sync protection that *should* make it impossible for the evictor to do anything to an object that has been handed out to a client.

The logical place to start to get a test case that shows this protection failing is to set up a pool with a very aggressive eviction config (very small idle object timeout), frequent eviction runs and a lot of concurrent borrowing. Make sure the factory's destroy method does something to simulate what PCF does to mark the object as dead, and see if you get any corpses handed out to borrowers. Also make sure that there are enough idle instances in the pool for the evictor to visit. For that, you probably want to vary the borrowing load. You can set up JMX to observe the pool stats to see how many are idle at a given time, or just log it using getNumIdle.
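
To make the handshake described above concrete, here is a minimal sketch of the idea in plain Java. It is a simplified model, not the pool2 source: the Slot class, its three-value State enum and the countDoubleClaims helper are hypothetical names invented for illustration, and only the method names startEvictionTest and allocate are taken from the description above. The point it shows is that when both methods are synchronized on the same object and each one only proceeds from the idle state, at most one of the evictor and the borrower can claim a given object.

```java
import java.util.concurrent.CountDownLatch;

// Simplified model only: Slot, State and countDoubleClaims are hypothetical
// names for illustration, not classes or methods from commons-pool2.
final class EvictionHandshakeSketch {

    enum State { IDLE, ALLOCATED, EVICTION }

    /** Hypothetical stand-in for the wrapper object that the pool manages. */
    static final class Slot {
        private State state = State.IDLE;

        /** Evictor side: claim the slot only if it is still idle. */
        synchronized boolean startEvictionTest() {
            if (state == State.IDLE) {
                state = State.EVICTION;
                return true;
            }
            return false;
        }

        /** Borrower side: claim the slot only if it is idle (so not under eviction). */
        synchronized boolean allocate() {
            if (state == State.IDLE) {
                state = State.ALLOCATED;
                return true;
            }
            return false;
        }
    }

    /** Races one evictor against one borrower per slot; returns how many slots both claimed. */
    static int countDoubleClaims(int rounds) {
        int doubleClaims = 0;
        for (int i = 0; i < rounds; i++) {
            Slot slot = new Slot();
            CountDownLatch start = new CountDownLatch(1);
            boolean[] won = new boolean[2]; // [0] = evictor, [1] = borrower
            Thread evictor = new Thread(() -> { await(start); won[0] = slot.startEvictionTest(); });
            Thread borrower = new Thread(() -> { await(start); won[1] = slot.allocate(); });
            evictor.start();
            borrower.start();
            start.countDown(); // release both threads at roughly the same time
            try {
                evictor.join();
                borrower.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while racing threads", e);
            }
            if (won[0] && won[1]) {
                doubleClaims++;
            }
        }
        return doubleClaims;
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("double claims: " + countDoubleClaims(1000));
    }
}
```

If the real classes behave like this model, a corpse can only reach a borrower through some path that bypasses these two checks, which is what the stress test suggested above would be looking for.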

A quick look at the existing pool2 test cases does not show exactly that scenario covered, so it would be good to add in any case.

Phil
>
> Thanks,
> Shawn
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
