Hi All,
We are following up on a memory leak that we reported with 6.2.1 and have now
verified to be in 7.1.3 also. What happens is that when Log Collation is
configured and the log collation host is down that the log buffer will be
orphaned and leaked. At first we thought this was a lab only issue, but this
can also affect production depending upon connectivity issues, etc.
What I did was put debug code into the LogBuffer::destroy() to monitor the
refcnt on each call. The log buffer will be deleted only if refcnt is 1. The
value of refcnt is the value of the log buffer's m_references before
ink_atomic_increment(-1) is run inside this function since
ink_atomic_increment() returns the old value rather than the new value. So if
refcnt is 1 that means the m_references is now 0 and the log buffer can be
deleted. When I looked at the debug trace I saw for a particular log buffer
pointer value that the refcnt would be 2 then 3 then 2 and the pointer would
never be found again in the trace (so likely leaked and forgotten about). So it
turns out that the incrementing from 2 to 3 is due to the following --
diff --git a/proxy/logging/LogHost.cc b/proxy/logging/LogHost.cc
index 981bd1265..860906df9 100644
--- a/proxy/logging/LogHost.cc
+++ b/proxy/logging/LogHost.cc
@@ -427,7 +427,6 @@ LogHostList::preproc_and_try_delete(LogBuffer *lb)
}
if (lb != nullptr && need_orphan && available_host) {
- ink_atomic_increment(&lb->m_references, 1);
available_host->orphan_write_and_try_delete(lb);
} }
I tested with the patch above (removing the increment in the case of orphan
writes) and the memory leak is gone. But appreciate any second opinions on this
issue/solution since I know that some other folks may have better knowledge on
the logging system.
Thanks,
Peter