Was this test/work run/done against trunk, or against 0.8? The 0.8 release (and to much lesser extent 0.6 too) had several extremely nasty memory issues, but whilst undertaking the work for 0.10 that Andrew referenced below I was able to run a couple thousand connections into the broker within around 20MB of heap at the end (on a 64bit JVM, so knock maybe 75% off for a 32bit JVM). I have just performed the indicated test and had the same result.
Having taken and examined the heap dump of the above test I can see that we are indeed leaking connections and associated objects, however not for the reasons indicated so far. The 0-10 connections historically were not added to the ConnectionRegistry but now are being added, and the code for the feature which added them also does remove them when that feature is used, however it fails to remove them when the close is invoked from the client side (ie, the normal case). As a result, the connection is left in the ConnectionRegistry and is held in memory along with its associated Objects. (...and I have jsut noticed Andrew said as much below, doh..always read the full mail :P) I dont think nulling the references between the objects in question is the way to go here, it might help prevent the impact to a certain extent but it doesnt fix the underlying issue and may just introduce more (there is almost certainly scope for NPEs in there). Having crudely put in removal of the connections from the registry upon a standard clsoe, I was able to run 2000 connections into the broker and come out with 4MB of heap used instead of 13MB previously. I will look to properly put this fix into 0.11/0.12 soon, but its too late for 0.10 now; since this is actually not a regression since 0.6 or 0.8 (from which memory usage is actually massively improved) I definitely dont think its a blocker. As Andrew also noted, a specific test for this might be useful. Robbie On 24 March 2011 00:11, Andrew Kennedy <[email protected]>wrote: > Thanks. > > We had looked at the memory usage of 0-10 connections and seen a similar > effect when Robbie was fixing some other memory leaks. The 64K element > Method array in the 0-10 Session objects is needed so that sessions can > synchronise with the client, although perhaps a more efficient structure > could be used to store them, such as a Map that can grow and shrink > dynamically? As you say, overall memory usage is much reduced now, so the > real issue is the failure to deregister connections in the broker. > > I will have to try your code from QPID-3162 to see if the same issue still > exists on the updated 0-10 network stack, since I remember adding explicit > calls to remove connections from the broker registry... > > Looking at the patch, I have a few points: > > * in MultiVersionProtocolEngine you set _delegate to null, same with > _protocolEngine in MINANetworkDriver, and the Mina IoSession attachment is > also set to null. Which of these was *actually* preventing the GC from > reclaiming memory? If we know that, then we can remove only that reference. > This many seems like overkill if the session can be closed properly. > * in Connection, setState(CLOSED) is functionally equivalent to the two > lines you replace it with, since the monitor on lock is held already. > * calling _connection.onOpen(null) from the closed method in > ProtocolEngine_0_10 seems odd. It sets the _onOpenTask of ServerConnection > to null, which you could do in setState(CLOSED) of ServerConnection (see > previous item) instead? > > I will need to investigate a bit further, but adding a test case seems like > a useful thing to try and do. > > > Andrew. > -- > -- andrew d kennedy ? do not fold, bend, spindle, or mutilate ; > -- http://grkvlt.blogspot.com/ ? edinburgh : +44 7582 293 255 ; > > On 23 Mar 2011, at 19:06, Danushka Menikkumbura wrote: > > Just run the sample client that I have mentioned in [1] and see the number >> of ProtocolEngine_0_10 and related objects in heap on JProfiler. You should >> be able to see 50 of them. >> >> [1] - https://issues.apache.org/jira/browse/QPID-3162 >> >> Thanks, >> Danushka >> >> On Thu, Mar 24, 2011 at 12:13 AM, Danushka Menikkumbura < >> [email protected]> wrote: >> Hi Andrew, >> >> I just checked the latest trunk. The issue is still there, the memory >> consumption is very less, though. As I can see its because of the way the >> incomplete Method array is allocated in Assembler now. But still I can see >> the same result w.r.t. instances that are not GC'ed. >> >> Thanks, >> Danushka >> >> >> On Wed, Mar 23, 2011 at 6:07 PM, Andrew Kennedy < >> [email protected]> wrote: >> Also, >> >> I will add a test case to the new network code. I haven't seen this >> behaviour on the branch I'm working on, so it may be fixed. I'll wait for >> your patch before checking again, though. >> >> I should have more time to work on the in-vm networking branch now that >> I've started a new job; I know I said it would be part of 0.10 but >> circumstances meant I didn't have enough time to finish the work. >> >> Andrew. >> -- >> -- andrew d kennedy ? do not fold, bend, spindle, or mutilate ; >> -- http://grkvlt.blogspot.com/ ? edinburgh : +44 7582 293 255 ; >> >> >> On 23 Mar 2011, at 10:47, Andrew Kennedy wrote: >> >> Hi, Danushka. >> >> I'll take a look at this, it looks like a race during session close, >> which we have seen before. >> >> Andrew. >> -- >> -- andrew d kennedy ? edinburgh : +44 7582 293 255 >> >> >> >> >> On 23 March 2011 00:43, Danushka Menikkumbura >> <[email protected]> wrote: >> Hi, >> >> Any idea what is going wrong here?. >> >> Thanks, >> Danushka >> >> On Tue, Mar 22, 2011 at 12:30 PM, Danushka Menikkumbura < >> [email protected]> wrote: >> >> You can run a simple connection create/close loop and check the number of >> times MINANetworkDriver.sessionClosed gets called. >> >> When I did that I observed this particular method got only called 49 time >> when I ran the loop for 50 iterations, 99 times for 100 and 994 for 1000. >> >> Am I missing something here?. I also see that the ProtocolEngine_0_10 >> instance count in heap is equal to the number of times this method did not >> get called. Hence the memory leak. These objects retain in the memory as >> they are not removed from the _typeMap in ConfigStore as >> removeConfiguredObject does not get called. >> >> Thanks, >> Danushka >> >> >> On Tue, Mar 22, 2011 at 5:09 AM, Danushka Menikkumbura < >> [email protected]> wrote: >> >> Hi devs, >> >> I see sometimes (1/50) MINANetworkDriver.sessionClosed does not get called >> causing a memory leak. Is this a known issue?. >> >> Thanks, >> Danushka >> >> >> >> >> >> >> >> > > --------------------------------------------------------------------- > Apache Qpid - AMQP Messaging Implementation > Project: http://qpid.apache.org > Use/Interact: mailto:[email protected] > >
