[ 
https://jira.duraspace.org/browse/DS-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=25843#comment-25843
 ] 

Richard Rodgers commented on DS-1205:
-------------------------------------

Hi Lyncode:

I'm probably not being very clear, so let me elaborate a little. It is 
reasonably well understood that 'long-lived' programming uses of Context 
objects (where 'long-lived' means that the context visits a lot of 
DSpaceObjects) require that the Context's cache be managed. This isn't a bug - 
just a feature of the system like the fact that one can easily fill the 
temporary upload directory and cause I/O failures if not managed, etc. There 
are fairly simple ways to manage the cache in the DSpace API, and if you look 
at applications like 'ItemImport' (that can iterate over all Items), you can 
see examples of such management.

For a use like the one you describe, I'd say a single line of code could 
prevent any OOM/NPE exceptions. In the processing loop, add the following 
(where 'c' is the Context object):

if (c.getCacheSize() > 10000) c.clearCache();

I would be very surprised if there is a measurable/significant performance 
difference introduced by this code.
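
To make the placement of that check concrete, here is a minimal, self-contained sketch of such a processing loop. It does not use the real DSpace classes: 'FakeContext' is a hypothetical stand-in that mimics only the HashMap-backed cache and the getCacheSize()/clearCache() calls named above, and the 'cache' method, 'processAll', the type constant 2 and the item count are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class ContextCacheSketch {

    /** Hypothetical stand-in for org.dspace.core.Context; models only its object cache. */
    static class FakeContext {
        private final Map<String, Object> cache = new HashMap<>();

        /** Assumed helper: stores an object under a type/id key, as a lookup would. */
        void cache(int type, int id, Object o) { cache.put(type + "/" + id, o); }

        int getCacheSize() { return cache.size(); }

        void clearCache() { cache.clear(); }
    }

    /**
     * Visits itemCount objects with one long-lived context and returns the
     * largest cache size observed after the per-iteration check.
     */
    static int processAll(FakeContext c, int itemCount, int threshold) {
        int peak = 0;
        for (int id = 0; id < itemCount; id++) {
            // Loading an object through the context populates its cache.
            c.cache(2, id, new Object());

            // ... per-item work would go here ...

            // The one-line guard from the comment above.
            if (c.getCacheSize() > threshold) c.clearCache();

            peak = Math.max(peak, c.getCacheSize());
        }
        return peak;
    }

    public static void main(String[] args) {
        FakeContext c = new FakeContext();
        int peak = processAll(c, 25000, 10000);
        System.out.println("peak cache size: " + peak);
    }
}
```

With the guard in place, the cache in this sketch never stays above the threshold, however many objects the loop visits; without it, the map would hold all 25000 entries at the end.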

Could one imagine an automatically-managed cache that avoids this requirement 
(using EhCache or other)? Maybe, but it is unlikely to be 'more performant 
(faster or less resource-intensive)' than the simple HashMap used in the 
current implementation. In fact an EhCache instance is much more heavy-weight 
than a simple HashMap, and when one reflects on the fact that a new cache is 
created *with every request* to the DSpace system (i.e. http request), then you 
are slowing the system by many orders of magnitude to solve a problem that (1) 
occurs only in particular, foreseeable circumstances and (2) as above, has a 
simple, effective solution (just clearing the cache when the programming 
situation requires it).

I appreciate that it is difficult to determine the minimum memory requirements, 
but constraining the context cache doesn't help much in this effort: factors 
such as system load (number of simultaneous users), number of different 
services deployed (is SOLR running, etc.), collection size, etc. all contribute 
to memory requirements. We certainly could do a better job of advising users of 
recommended amounts (maybe build a memory 'calculator' where one could enter 
these factors and get a number), but this won't make a big difference. In fact, 
most 'long-lived' contexts appear in command-line applications, where there is 
a different JVM running just for that application, with the result that memory 
tunings used on the interactive (Tomcat) side don't apply.

Does this all make sense? Sorry I didn't explain myself more clearly.

Richard
                
> DSpace org.dspace.core.Context caching problem
> ----------------------------------------------
>
>                 Key: DS-1205
>                 URL: https://jira.duraspace.org/browse/DS-1205
>             Project: DSpace
>          Issue Type: Bug
>          Components: DSpace API
>    Affects Versions: 1.8.2
>            Reporter: DSpace @ Lyncode
>            Priority: Blocker
>              Labels: Core
>             Fix For: 3.0
>
>
> bin/dspace script is crashing with OutOfMemory problems related with the 
> org.dspace.core.Context HashMap.
> Proof: https://github.com/lyncode/DSpace/issues/5
> I think this is a critical problem, mainly because it does not make sense 
> (for me) to have crashing applications because of caching mechanism memory 
> overusage. As a good practice for implementing caches, this cache mechanism 
> could be aware of its memory usage threshold and, once it is reached, start 
> discarding old and unused objects.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://jira.duraspace.org/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

_______________________________________________
Dspace-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-devel
