Serious memory leak in ExtensionFactoryMap
------------------------------------------

                 Key: ABDERA-153
                 URL: https://issues.apache.org/jira/browse/ABDERA-153
             Project: Abdera
          Issue Type: Bug
    Affects Versions: 0.3.0
         Environment: N/A
            Reporter: Chris Berry
            Priority: Critical


Greetings,
Some time ago we reported a very serious memory leak in Abdera 0.3.0.
It was reported to the abdera users list on Nov 7, 2007 as "memory leak".
That message is replayed below...
At the time we could workaround the bug in our client, but over time this 
became impossible, and we had to hack ExtensionFactoryMap.

Attached is a patch to org.apache.abdera.factory.ExtensionFactoryMap
This patch simply removes the problem -- the WeakHashMap.
This change has a negligible effect on performance (as measured in load 
testing) 
But without it, abdera 0.3.0 will get OutOfMemoryExceptions quite quickly under 
load, rendering it unusable.
BTW: this code has been in Production for many weeks...

We were hoping that you could apply this patch to 0.3.0??
Or better;  produce 0.3.1

We are not able to upgrade to 0.4.0 yet, and would rather not have a hacked 
copy of 0.3.0.
Also, others not able to upgrade yet (since the upgrade is not entirely 
transparent) may benefit from the patch

I will also open a JIRA on this.

Thanks,
-- Chris 


        From:   [EMAIL PROTECTED]
        Subject:        memory leak
        Date:   November 5, 2007 3:49:24 PM CST
        To:     [EMAIL PROTECTED]
        Reply-To:       [EMAIL PROTECTED]


                    Hey All -

I'm working on a data service based on Abdera (working with Chris Berry, who's 
a regular on these lists...)  When we were running our first battery of serious 
load testing on our system, we encountered memory-leaky behavior, and a 
profiler showed us that we were indeed leaking hundreds of megabytes a minute, 
all traceable back to the wrappers field on 
org.apache.abdera.factory.ExtensionFactoryMap.  This field is a map from 
elements to their wrappers, if any.  At first, I was puzzled by the memory 
leak, as the field is initialized thusly:

        this.wrappers = Collections.synchronizedMap( new 
WeakHashMap<Element,Element>());

clearly, the implementor took care to make sure that this cache would not leak 
by making it a WeakHashMap, which generally guarantees that the map itself will 
not keep a key and its corresponding entry from being garbage collected.  I dug 
throughout our application code to find if we were actually holding other 
references to these objects, and I googled for anyone having problems with 
esoteric interactions between Collections.synchronizedMap and WeakHashMaps - 
found nothing there.  Then I went back to square one and re-read the 
WeakHashMap javadoc very carefully.  Here's the relevant section:

        Implementation note: The value objects in a WeakHashMap are held by 
ordinary strong references. Thus care should be taken to ensure that value 
objects do not strongly refer to their own keys, either directly or indirectly, 
since that will prevent the keys from being discarded. Note that a value object 
may refer indirectly to its key via the WeakHashMap itself; that is, a value 
object may strongly refer to some other key object whose associated value 
object, in turn, strongly refers to the key of the first value object. One way 
to deal with this is to wrap values themselves within WeakReferences before 
inserting, as in: m.put(key, new WeakReference(value)), and then unwrapping 
upon each get.

This is why there is a memory leak - the map is a mapping from elements to 
their wrappers - by the very nature of the object being a wrapper of the 
element, it will usually have a strong reference to the element itself, which 
is the key! You can verify that Abdera wrappers, in general, will do this by 
looking at org.apache.abdera.model.ElementWrapper, which takes the element 
being wrapped as its constructor argument, and holds a strong reference to it 
as an instance variable.

This map is an optimization to memoize the calls to getElementWrapper() and not 
reconstruct them more than is necessary - it is not needed for abdera to 
function properly, so we have temporarily worked around the problem in our own 
application like so - we used to acquire our FOMFactory by calling 
abdera.getFactory() on our org.apache.abdera.Abdera instance, and re-using that 
singleton throughout our application.  Now we construct a new FOMFactory with 
new FOMFactory(abdera) once per request to the server, and since the only 
appreciable state on the factory is this map itself, this is a valid 
work-around.

I'd initially planned to really fix this issue and submit a patch along with 
this message, but digging a little deeper, I'm not sure that the correct fix is 
crystal clear...  We could do as the javadoc above suggests, and wrap the 
values with WeakReferences to plug the leak, or we could use a LinkedHashMap 
configured as an LRU cache to just bound the cache, so it can't grow out of 
control - but right now, I don't think that either of those solutions would be 
correct, because it seems to me that none of the objects in the hierarchy 
rooted at FOMElement define equals() and/or hashCode() methods, so all of the 
objects are cached based on their actual object identity.  It seems that in the 
all likely use cases, instances of FOMElement and its descendants are re-parsed 
on every request to a server running abdera, and so what we will see is cache 
misses virtually 100% of the time, so even though we'd have plugged the leak, 
strictly speaking, we would have ignored the underlying issue that we're 
caching data on every request that will be fundamentally unable to be retrieved 
on subsequent requests.  This is based only on my reading over the code for a 
few hours, so I could be missing something, and I also might be forgetting 
about a use case that demands and makes proper use of this memoization, but as 
it stands right now, my recommended fix would probably be to just cut out the 
cache altogether, and allow for wrappers to get constructed fresh every time 
they are requested.  One more possibility is that the cache is actually a 
useful optimization, but only during the scope of one request - in which case 
the "work-around" we are using now is actually the best practice, and the fix 
would be to remove the factory instance on the Abdera class...

I'd like to hear from the Abdera developers what their thoughts are on this 
issue, and what the best resolution is likely to be.  This is no longer a 
pressing issue for our team, but it is potentially a time bomb waiting to blow 
up for any project dependent on Abdera.

thanks!  (and thanks for Abdera, generally - we're easily a year ahead of where 
we'd be on this project without it!)

-Bryon (long-time listener, first-time caller)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to