Thanks for the detailed writeup and explanations, it's helping me understand
a lot of these issues better... looks like I still have some bedtime reading
(and DSpace profiling) ahead of me! ;-)

Cheers,

Kim

On 5 October 2010 12:55, Graham Triggs <[email protected]> wrote:

> Hello all,
>
> So, we've been debating the performance / resource usage of DSpace 1.6.
> This has just happened to coincide with work that I needed to do to qualify
> DSpace 1.6 (and/or 1.7) for our own use. So, for the past week or so, I've
> been hammering away at profiling DSpace. This isn't a comprehensive analysis
> of the entire codebase, but focusing on the serious issue of stability and
> resource leakage.
>
> This immediately led to finding two issues with the dspace-services code
> and it's dependencies:
>
> 1) The resources in the FinalizableReferenceQueue were leaking (this was
> noted before in a Tomcat stack trace posted here) - this could have been
> avoided if a newer version of reflectutils had been used, along with the
> LifeCycleManager it contains
>
> 2) A ThreadLocal in the CachingServiceImpl was also leaking - the service
> did attempt to clean up in the shutdown, but it would only ever have been
> able to clear up the resources of the thread executing the shutdown, not all
> the other threads that would have processed requests.
>
> Investigating the second issue, it turned out that this was being used to
> keep track of Request scope caches - as well as potentially leaking
> resources (across requests), this highlighted a problem in the service
> contract, and a dangerous coupling between the implementation of the
> services.
>
> The Caching service was written to return a Request scoped cache without
> any valid Request to bind it to - this was explicitly used in the listeners
> / filters to acquire a cache and assign request objects into it before the
> request service was called to 'start' a request and bind to the scoped
> cache. There were also tests that expected this behaviour. Whilst the
> request service will clear a request scoped cache if it's been bound to it,
> you can't guarantee that the cache gets bound to a request.
>
> Worse though, the Request service was depending on the Caching service
> tracking request scoped caches against the current thread in order to be
> able to track the current request. So, the one responsibility that the
> request service should have, it was passing on and hoping that another
> service might do the job for it without it being explicitly clear in the
> contract.
>
> If that wasn't bad enough, getCache doesn't actually properly respect the
> CacheScope when trying to retrieve a particular cache (it could retrieve a
> non-Request scoped cache if one of the same name had already been created,
> regardless of the scope parameter). The Caching service also registered an
> MBean that was never removed. And, to wrap up my examination of the Caching
> service, it didn't close an InputStream that it opened.
>
> So, onwards - wanting to simplify this code a bit (as we don't need to
> support non-Spring IoC containers, at least not yet - and we do need to be
> able to understand and maintain this code) - I quickly noticed an issue in
> the SpringServiceManager where it returned multiple instances of the same
> service name, because it was under the mistaken belief that it needed to
> track some of the service names itself, when it was actually getting them
> returned from Spring.
>
> And the ServiceMixinManager was entirely redundant - as long as we can rely
> on Spring, we can rely on the container to do everything that the Mixin
> manager does, without us having to worry about the complexity of it (and it
> is both relatively complex, and a source of our problems earlier as it uses
> the FinalizableReferenceQueue). I've actually got a real bugbear about us
> using the term 'mixin'. A mixin means something specific - a class that
> provides an actual implementation, and is combined with other mixins to
> produce other classes [through multiple-inheritance]. When you look at what
> is implemented in dspace-services under the term mixin, well we don't
> actually have mixins... we have interfaces that are implemented by service
> classes. Each service class completely and wholly provide implementations of
> the methods defined in all the interfaces they implement. That's a concept
> that Java has had for a long time, and is (should be) well understood by all
> Java programmers. That doesn't make them mixins. We don't/shouldn't want
> 'mixins' (multiple inheritance - there is a reason Java doesn't support it).
> And we shouldn't be using a term incorrectly just because it's cool.
>
> While I'm on a terminology rant, I'm not particularly happy about the
> 'interceptors' (ie. RequestInterceptor). They are looking and feeling a lot
> more like listeners than interceptors - allowing other services to react to
> events taking place (start request / end request), rather than affecting
> it's behaviour. I've not renamed them, for the sake of keeping as much of
> the existing API intact as I can (whilst clearing out [currently] redundant
> or problematic code), but actually it seems that it should really be pushing
> a synchronous and immediately executing event notification, so that the
> listener registration can occur in a single place rather than duplicating
> that functionality anywhere similar concepts may be required.
>
> OK, less ranting, back to actual issues. Related to the RequestService, the
> SessionService lifecycle methods really don't seem to make much sense -
> should there be a request in place before you start a session? Are you going
> to get a clean session or attach to an existing one? Does end terminate the
> session, or simply release it? Add to that, the Threading for expiry of
> sessions was entirely problematic, existing on a different cycle to the
> container session manager. So, I took the brutal decision to kill those
> lifecycle methods - you have to have a request to get a session.
>
> What else? Well, the initial proposal to bring dspace-services into 1.6
> stated that the MBean registration would be removed. That wasn't the case
> for the CachingService, nor was it the case for the Kernel - which actually
> registered a UUID based MBean name. Basically, a bit pointless. I made it
> properly optional (it will register if you specify a name, but won't if
> 'anonymous') - but given the proposal for 1.6, I was considering binning it
> entirely. Especially as the MBean registration and de-registration was split
> over multiple classes.
>
> Other problems include some odd switching of config names in the
> configurationservice when it tries to do variable replacement - such to the
> point I was entirely confused why any of the unit tests would ever have
> worked. There were also 9 threading issues, and a few other problems, thrown
> up by code inspection.
>
> So, we have a prototype of a simplified and [hopefully mostly :)] corrected
> and cleaned up version of dspace-services here:
>
>
> http://scm.dspace.org/svn/repo/modules/dspace-services/branches/simplified-prototype/
>
> To recap the changes:
>
> 1) RequestService is no longer dependent on CachingService, or it's
> internal implentation
> 2) CachingService registers as a listener to RequestService to manage the
> lifecycle of Request scope caches, and only returns request scoped caches
> when there is a request
> 3) Caches returned are always of the correct scope
> 4) The request objects thrown into the cache are no longer there (you can
> retrieve them from the RequestService)
> 5) The Kernel / manager / initializaer classes have been detangled
> 6) MBean registration is completely optional (and should be removed imho)
> 7) The ServiceMixinManager is removed as it's entirely unnecessary whilst
> Spring is our only IoC container (and there is *no* good reason to
> complicate things by supporting combining others at this stage)
> 8) reflectutils removed (reflection code no longer necessary - other
> utility classes provided by JVM or Spring) - this means that the Sakai Maven
> repository can be removed (I've had problems in the past with some artifacts
> being retrieved from there)
> 9) SessionService simplified, and consistent APIs in place across Request
> and Session services.
> 10) Multiple threading and resource issues cleaned up
>
> Aside from having to modify a few tests that relied on the bizarre
> contracts that were in place (ie. being able to retrieve request scoped
> caches without a valid request), and removing tests related to code that is
> now gone, everything passes all of the tests that were already in place.
> This makes it pretty much a drop in replacement for the existing
> dspace-services code - certainly for the core of 1.6, can't say about other
> code that may be depending on dspace-services.
>
> You'll note that because of this, I haven't been as ambitious as I would
> like in ripping this apart. I would love to get rid of the kernel startup,
> and just rely on Spring starting up the kernel and service managers -
> injecting the bean factory into the managers it instantiates as appropriate.
> And pull out things like the interceptors / listeners into events reducing
> even more complex code potentially duplicated in services. But I had to
> start somewhere, and this seems to be the right sort of area for looking at
> DSpace 1.7. This shouldn't stop us being more aggressive at simplifying and
> reducing this code later.
>
> And to take us back to the earlier issues about leakage inside a container
> - dspace-services alone doesn't make the applications cleanup correctly.
> There needs to be some adjustments to the context listeners, and their
> ordering in the web.xml. You also have to take care with the JDBC drivers (I
> now have a reworked DatabaseManager that allows you to retrieve a DataSource
> from JNDI if available, and if not creates a DataSource using the entries in
> dspace.cfg, the structure of which means that you do not need the JDBC
> driver, DBCP or commons-pool in the webapp if you have a JNDI datasource.
>
> But with a bit of care and attention taken in the application assembly and
> deployment, the JSPUI does unload cleanly.
>
> XMLUI is still proving a bit more troublesome. There are two instances of
> resource leakage within Cocoon itself - in the sitemap code, it puts a stack
> into a ThreadLocal, but never resets the contents (even when it's empty).
> The flowscript code also doesn't exit the Rhino contexts correctly. Both of
> the issues will prevent cleaning up of the resources.
>
> With patched versions of those jars in place, as well as the same JDBC and
> listener issues from jspui - well, it still doesn't quite work reliably
> (although the profiler can no longer detect any GC roots). If I disable all
> of the sitemap and config reloading, *and* wait 10 minutes after undeploying
> the application, it will clean up and unload correctly. All I know right now
> is that there isn't a Thread showing up that would prevent this. It looks
> like it's some system cache (like in URL keep alives), or a session holding
> data.
>
> But it neither will unload either way with the current version of
> dspace-services - it needs the cleaned up/simplified version.
>
> One more thing about Tomcat - there is often a lot of bad press for Tomcat,
> blaming it for not unloading classes. Not only is this blatantly untrue (if
> your application can be unloaded, it will be - it's only the application
> leaking resources that stops it), they've made a lot of strides in the most
> recent releases (6.0.28 / 6.0.29) to help combat potential resources leaks
> in your applications. Take a look at the configuration options on Context:
>
> http://tomcat.apache.org/tomcat-6.0-doc/config/context.html   (clear
> ThreadLocals / stop threads and timers)
>
> and the listeners:
>
> http://tomcat.apache.org/tomcat-6.0-doc/config/listeners.html (JRE leak
> protection, that protects against a number of things, including the URL
> caching that is problematic on Windows)
>
> I wouldn't rely on these being able to prevent your application from
> causing a problem, and it's much better to ensure that applications are
> properly behaved even without Tomcat attempting to prevent problems - but it
> can still be useful in some circumstances.
>
> G
>
>
> ------------------------------------------------------------------------------
> Virtualization is moving to the mainstream and overtaking non-virtualized
> environment for deploying applications. Does it make network security
> easier or more difficult to achieve? Read this whitepaper to separate the
> two and get a better understanding.
> http://p.sf.net/sfu/hp-phase2-d2d
> _______________________________________________
> DSpace-tech mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dspace-tech
>
>
------------------------------------------------------------------------------
Beautiful is writing same markup. Internet Explorer 9 supports
standards for HTML5, CSS3, SVG 1.1,  ECMAScript5, and DOM L2 & L3.
Spend less time writing and  rewriting code and more time creating great
experiences on the web. Be a part of the beta today.
http://p.sf.net/sfu/beautyoftheweb
_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech

Reply via email to