On 11/22/2017 5:00 PM, Phil Steitz wrote:
> If the problem is the evictor closing a connection and having that
> connection delivered to a client, the problem is almost certainly in
> pool. The thread-safety of the pool in this regard is engineered in
> DefaultPooledObject, which is the wrapper that pool manages and
> delivers to DBCP. When the evictor visits a PooledObject (in
> GenericObjectPool#evict) it tries to start the eviction test on the
> object by calling its startEvictionTest method. This method is
> synchronized on the DefaultPooledObject. Look at the code in that
> method. It checks to make sure that the object is in fact idle in
> the pool. The other half of the protection here is in
> GenericObjectPool#borrowObject, which is what PoolingDataSource
> calls to get a connection. That method tries to get a PooledObject
> from the pool and before handing it out (or validating it), it calls
> the PooledObject's allocate method. Look at the code for that in
> DefaultPooledObject. That method (also synchronized on the
> PooledObject) checks that the object is not under eviction and sets
> its state to allocated. That is the core sync protection that
> *should* make it impossible for the evictor to do anything to an
> object that has been handed out to a client.
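
To check that I'm reading that correctly, here is a stripped-down sketch of the handshake as I understand it. This is not the actual commons-pool source: the class name PooledObjectSketch, the three-value State enum, and the no-argument endEvictionTest are my own simplifications for illustration. Only the method names startEvictionTest and allocate come from the description above, and I assume the real class tracks more states than this.

```java
// Simplified illustration only; NOT the real DefaultPooledObject.
// The State enum and class name are hypothetical stand-ins.
final class PooledObjectSketch {

    enum State { IDLE, EVICTION, ALLOCATED }

    private State state = State.IDLE;

    // Evictor side: the test may only start if the object is idle in the pool.
    public synchronized boolean startEvictionTest() {
        if (state == State.IDLE) {
            state = State.EVICTION;
            return true;
        }
        return false; // already borrowed or already under test; evictor must skip it
    }

    // Evictor side (simplified): test finished, object goes back to idle.
    public synchronized void endEvictionTest() {
        if (state == State.EVICTION) {
            state = State.IDLE;
        }
    }

    // Borrow side: succeeds only if the object is idle, i.e. not under eviction.
    public synchronized boolean allocate() {
        if (state == State.IDLE) {
            state = State.ALLOCATED;
            return true;
        }
        return false; // under eviction or already allocated; borrower must try another
    }

    public synchronized State getState() {
        return state;
    }
}
```

Because both methods lock the same object and each one checks the state before changing it, whichever thread gets the monitor first wins and the other sees a state that makes it back off. If the real code follows this pattern, I don't see how the evictor could touch an object that has already been allocated.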

I see the synchronization you're talking about here. It appears that all of the critical methods in DefaultPooledObject are synchronized (on the object). If you're absolutely certain that DefaultPooledObject is involved with all of the implementation my code is using, it all looks pretty complete to me.

So I'm really curious as to why the connection is getting closed. I have seen the problem only minutes after restarting my program, so it seems unlikely that the server side is closing the connection, since the timeout for that is 8 hours.

I did add the code a while back to test on create, borrow, return, and while idle, but it turns out that I hadn't actually pulled it down to the test server and recompiled. That is now done, so we'll see if that makes any difference.

If testing the connection on pool actions does make a difference, then what is your speculation about what was happening when I ran into the closed connection only minutes after restart, and would it be worthy of an issue in Jira?

The only theory I had was a race condition between eviction and borrowing, but unless there's something amiss in how all the object inheritance works out, it looks like that's probably not it. Some kind of issue with the TCP stack in Linux (either on the machines running my code or the MySQL server) is the only other idea I can think of. Or maybe a hardware/firmware issue, since it's likely that at least one of the NICs involved is doing TCP offload. I think that virtually every NIC in our infrastructure has that feature and that Linux enables it.

Thanks,
Shawn