Thanks for responding Dan.
I may be wrong, but (IIRC) I think 0.3.0 was patched a few times in
place (not that its a best practice ;-)
Would 0.3.1 have to go thru the same process as a minor patch release??
If so, it probably is too much work.
Thanks,
-- Chris
On Apr 29, 2008, at 1:37 PM, Dan Diephouse wrote:
Hi Chris, due to the high amount of work it takes to get a release
out of the incubator, I really don't see this happening. I don't
know what you mean by applying this patch to 0.3.0, but we can't
really go back and change the release. It all has to go through the
incubator process. And I'm going to guess that if we're going
through that process we're going to focus on either 1.0-RC or 0.4.1.
Dan
Chris Berry wrote:
Greetings,
Some time ago we reported a very serious memory leak in Abdera 0.3.0.
It was reported to the abdera users list on Nov 7, 2007 as "memory
leak".
That message is replayed below...
At the time we could workaround the bug in our client, but over
time this became impossible, and we had to hack ExtensionFactoryMap.
Attached is a patch to org.apache.abdera.factory.ExtensionFactoryMap
This patch simply removes the problem -- the WeakHashMap.
This change has a negligible effect on performance (as measured in
load testing) But without it, abdera 0.3.0 will get
OutOfMemoryExceptions quite quickly under load, rendering it
unusable.
BTW: this code has been in Production for many weeks...
We were hoping that you could apply this patch to 0.3.0??
Or better; produce 0.3.1
We are not able to upgrade to 0.4.0 yet, and would rather not have
a hacked copy of 0.3.0.
Also, others not able to upgrade yet (since the upgrade is not
entirely transparent) may benefit from the patch
I will also open a JIRA on this.
Thanks,
-- Chris
------------------------------------------------------------------------
*From: * [EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]>
*Subject: * *memory leak*
*Date: * November 5, 2007 3:49:24 PM CST
*To: * [EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]
>
*Reply-To: * [EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]
>
Hey All -
I'm working on a data service based on Abdera (working with Chris
Berry, who's a regular on these lists...) When we were running our
first battery of serious load testing on our system, we encountered
memory-leaky behavior, and a profiler showed us that we were indeed
leaking hundreds of megabytes a minute, all traceable back to the
wrappers field on org.apache.abdera.factory.ExtensionFactoryMap.
This field is a map from elements to their wrappers, if any. At
first, I was puzzled by the memory leak, as the field is
initialized thusly:
this.wrappers = Collections.synchronizedMap( new
WeakHashMap<Element,Element>());
clearly, the implementor took care to make sure that this cache
would not leak by making it a WeakHashMap, which generally
guarantees that the map itself will not keep a key and its
corresponding entry from being garbage collected. I dug throughout
our application code to find if we were actually holding other
references to these objects, and I googled for anyone having
problems with esoteric interactions between
Collections.synchronizedMap and WeakHashMaps - found nothing
there. Then I went back to square one and re-read the WeakHashMap
javadoc very carefully. Here's the relevant section:
Implementation note: The value objects in a WeakHashMap are held by
ordinary strong references. Thus care should be taken to ensure
that value objects do not strongly refer to their own keys, either
directly or indirectly, since that will prevent the keys from being
discarded. Note that a value object may refer indirectly to its key
via the WeakHashMap itself; that is, a value object may strongly
refer to some other key object whose associated value object, in
turn, strongly refers to the key of the first value object. One way
to deal with this is to wrap values themselves within
WeakReferences before inserting, as in: m.put(key, new
WeakReference(value)), and then unwrapping upon each get.
This is why there is a memory leak - the map is a mapping from
elements to their wrappers - by the very nature of the object being
a wrapper of the element, it will usually have a strong reference
to the element itself, which is the key! You can verify that Abdera
wrappers, in general, will do this by looking at
org.apache.abdera.model.ElementWrapper, which takes the element
being wrapped as its constructor argument, and holds a strong
reference to it as an instance variable.
This map is an optimization to memoize the calls to
getElementWrapper() and not reconstruct them more than is necessary
- it is not needed for abdera to function properly, so we have
temporarily worked around the problem in our own application like
so - we used to acquire our FOMFactory by calling
abdera.getFactory() on our org.apache.abdera.Abdera instance, and
re-using that singleton throughout our application. Now we
construct a new FOMFactory with new FOMFactory(abdera) once per
request to the server, and since the only appreciable state on the
factory is this map itself, this is a valid work-around.
I'd initially planned to really fix this issue and submit a patch
along with this message, but digging a little deeper, I'm not sure
that the correct fix is crystal clear... We could do as the
javadoc above suggests, and wrap the values with WeakReferences to
plug the leak, or we could use a LinkedHashMap configured as an LRU
cache to just bound the cache, so it can't grow out of control -
but right now, I don't think that either of those solutions would
be correct, because it seems to me that none of the objects in the
hierarchy rooted at FOMElement define equals() and/or hashCode()
methods, so all of the objects are cached based on their actual
object identity. It seems that in the all likely use cases,
instances of FOMElement and its descendants are re-parsed on every
request to a server running abdera, and so what we will see is
cache misses virtually 100% of the time, so even though we'd have
plugged the leak, strictly speaking, we would have ignored the
underlying issue that we're caching data on every request that will
be fundamentally unable to be retrieved on subsequent requests.
This is based only on my reading over the code for a few hours, so
I could be missing something, and I also might be forgetting about
a use case that demands and makes proper use of this memoization,
but as it stands right now, my recommended fix would probably be to
just cut out the cache altogether, and allow for wrappers to get
constructed fresh every time they are requested. One more
possibility is that the cache is actually a useful optimization,
but only during the scope of one request - in which case the "work-
around" we are using now is actually the best practice, and the fix
would be to remove the factory instance on the Abdera class...
I'd like to hear from the Abdera developers what their thoughts are
on this issue, and what the best resolution is likely to be. This
is no longer a pressing issue for our team, but it is potentially a
time bomb waiting to blow up for any project dependent on Abdera.
thanks! (and thanks for Abdera, generally - we're easily a year
ahead of where we'd be on this project without it!)
-Bryon (long-time listener, first-time caller)
--
Dan Diephouse
MuleSource
http://mulesource.com | http://netzooid.com