I'm able to replicate a leak at will.
When the client makes a request to a non-responsive server, the ConnOperner
object leaks with a non-zero lock count.

I've traced the problem to the use of 'new Pointer(this)'  in 
ConnOpener::connect().
When the connection is in progress, the SetSelect() call establishes a Write 
handler.
When the write handler is called, it properly deletes the pointer which properly
decrements the lock count.

When a timeout occurs, the write handler is never called.  Since it is never 
called,
the pointer is never deleted.

I've worked around the leak by adding the following code to 
ConnOpener::connect(),
just before the comm_connect_addr() switch statement.

    // wipe out the InProgressConnectRetry write handler
     if (temporaryFd_ >= 0) {
        fde *F = &fd_table[temporaryFd_];
        if (F->write_handler == Comm::ConnOpener::InProgressConnectRetry && 
F->write_data != NULL) {
            Pointer *ptr = static_cast<Pointer*>(F->write_data);
            Comm::SetSelect(temporaryFd_, COMM_SELECT_WRITE, NULL, NULL, 0);
            if (ptr->valid()) {
                delete ptr;
            }
        }

Can anyone come up with a better solution?

Mike Mitchell
________________________________________
From: Mike Mitchell
Sent: Monday, December 31, 2012 12:59 PM
To: [email protected]
Subject: ConnOpener leaks in 3.2.5

I'm seeing ConnOpener leaks in 3.2.5.
I compiled with --enable-debug-cbdata and cachemgr reports most of the 
CommOpener objects as invalid, each with a lock count greater than 0.
The lines look like:
Pointer            Type Locks   Allocated by
!0x20959018     29      2       comm/ConnOpener.h:74
!0x1efba998     29      2       comm/ConnOpener.h:74
!0x21bd4248     29      1       ../../src/comm/ConnOpener.h:74

After 100,000 queries I have 2,000 CommOpener objects allocated with 1,900 in 
use.  Of those, 1850 are invalid.

Could someone that understands cbdata take a look?
As near as I can tell only Comm::ConnOpener::InProgresConnectRetry() and 
Comm::ConnOpener::DelayedConnectRetry() duplicate the CommOpener object (by 
calling ptr->valid()).
It looks like the original is destructed, but I'm not sure if that is correct.
cbdataInternalFree() will decrement the lock count, but before it does it first 
checks and clears the "valid" flag.
If the "valid" flag is not set, cbdataInternalFree asserts.  If it is set, it 
clears the "valid" flag and then decrements
the lock count.

I'm very confused about the way cbdata is working.  I'm guessing that the 
"valid" flag is getting cleared before all the locks are removed, which will 
prevent any more locks from being removed.  The twisty maze of function 
templates is too much for me.  Could someone please take a look?

Mike Mitchell






Reply via email to