[ 
https://issues.apache.org/jira/browse/OFBIZ-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12680105#action_12680105
 ] 

Philippe Mouawad commented on OFBIZ-2186:
-----------------------------------------

Hello Mr. Jones,
First, thank you for your quick response.
Second, here is the status of the two previous patches:
- The first one (which you proposed) caused production to crash every day for 
5 days; an OOM was generated at least once a day on each server.
- With the second one (my proposal), production has been up for 10 days (we 
ran a jmap to verify the leak was fixed, as I said); furthermore, we did not 
notice any slow response times (in case you are worried about the synchronized 
block). The ecommerce site we use OFBiz for is a large one, with 3 
load-balanced JVM servers.

Now, about your patch: I am not sure it cannot introduce a Java deadlock 
risk, since there are now 2 synchronizations occurring in the case of 
put/delete (one on the CacheLineTable through synchronized, and one on the 
LRUMap through the Collections.synchronizedMap wrapping).
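To make the nesting concrete, here is a rough sketch of the structure I am describing (illustrative names only, not the actual OFBiz code): a synchronized CacheLineTable method calling into an LRUMap wrapped with Collections.synchronizedMap, so two monitors are held on every put/remove.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch of the two-lock nesting, not the real OFBiz classes.
class CacheLineTableSketch<K, V> {
    // Monitor #2: the wrapper returned by Collections.synchronizedMap
    // locks on itself for every map operation.
    private final Map<K, V> lruMap =
            Collections.synchronizedMap(new LinkedHashMap<K, V>(16, 0.75f, true));

    // Monitor #1: the CacheLineTableSketch instance itself.
    public synchronized V put(K key, V value) {
        // Here the thread already holds this object's monitor, then
        // acquires the synchronizedMap wrapper's monitor inside put().
        return lruMap.put(key, value);
    }

    public synchronized V remove(K key) {
        return lruMap.remove(key);
    }
}
```

Two nested monitors alone do not deadlock as long as every code path acquires them in the same order; the risk appears if any path takes the map's monitor first (for example by synchronizing on the wrapped map directly) and then calls back into a synchronized CacheLineTable method.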

As you may guess, we cannot afford to test this new patch with the risk of 
production crashing again; it is an eCommerce site.
I am wondering why you don't want to take my patch, since it works on a large 
production site and does not incur a significant performance loss.

Thanks for your answer.

> OutOfMemory provoked by Cache overflow
> --------------------------------------
>
>                 Key: OFBIZ-2186
>                 URL: https://issues.apache.org/jira/browse/OFBIZ-2186
>             Project: OFBiz
>          Issue Type: Bug
>          Components: framework
>    Affects Versions: SVN trunk, Release Branch 4.0
>         Environment: Linux, JDK 1.5_15, Xmx set to 1156MB
>            Reporter: Philippe Mouawad
>            Priority: Critical
>         Attachments: CacheLineTable-patch.txt, OfbizBug.zip
>
>
> In our production system, we had an OutOfMemoryError.
> I analyzed the generated Heap Dump and found the following:
> The cache entitycache.entity-list.default.ProductCategoryMember retains a 
> heap of 369314128 bytes.
> The LRUMap held by this object occupies this space (369314128 bytes), and 
> the object is in a strange state:
> - maxSize is set to 5000 (as set in cache.properties)
> - size is 128930 => PROBLEM
> IN cache.properties:
> entitycache.entity-list.default.ProductCategoryMember.expireTime=3600000
> entitycache.entity-list.default.ProductCategoryMember.useSoftReference=true
> entitycache.entity-list.default.ProductCategoryMember.maxInMemory=5000
> entitycache.entity-list.default.ProductCategoryMember.maxSize=7500
> I analyzed the code of LRUMap and its usage in CacheLineTable, and IMHO the 
> bug is a missing synchronized in get():
> public CacheLine<V> get(Object key) {
>     if (key == null) {
>         if (Debug.verboseOn()) Debug.logVerbose("In CacheLineTable tried to get with null key, using NullObject" + this.cacheName, module);
>     }
>     return getNoCheck(key);
> }
> Since LRUMap extends LinkedHashMap: if you look at the get method, it 
> changes the state of the Map by calling e.recordAccess(this):
>     public V get(Object key) {
>         Entry<K,V> e = (Entry<K,V>)getEntry(key);
>         if (e == null)
>             return null;
>         e.recordAccess(this);
>         return e.value;
>     }
> So the lack of synchronization corrupts the state of the LRUMap, which 
> grows indefinitely.
> I will submit a patch for this on the trunk.
> Philippe
> www.ubik-ingenierie.com
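For reference, the access-order mutation described in the quoted report can be seen directly with a plain LinkedHashMap (a standalone illustration, not OFBiz code): with accessOrder = true, get() moves the accessed entry to the end of the iteration order, i.e. a read is a structural mutation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Standalone illustration: with accessOrder = true, get() mutates the map --
// it relinks the accessed entry to the end of the internal linked list.
public class AccessOrderDemo {
    public static List<String> keysAfterGet() {
        Map<String, Integer> m = new LinkedHashMap<>(16, 0.75f, true);
        m.put("a", 1);
        m.put("b", 2);
        m.put("c", 3);
        m.get("a"); // a read, but it reorders the internal linked list
        return new ArrayList<>(m.keySet()); // least recently used first
    }

    public static void main(String[] args) {
        System.out.println(keysAfterGet()); // [b, c, a]
    }
}
```

Because two unsynchronized get() calls can interleave while relinking the same entries, the internal list can become corrupted, which is consistent with the unbounded growth past maxSize reported above.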

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
