[jira] Commented: (JCR-2524) Reduce memory usage of DocIds

2010-03-19 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12847282#action_12847282
 ] 

Marcel Reutegger commented on JCR-2524:
---

Removed System.out debug calls in test class.

svn revision: 925141

> Reduce memory usage of DocIds
> -
>
> Key: JCR-2524
> URL: https://issues.apache.org/jira/browse/JCR-2524
> Project: Jackrabbit Content Repository
> Issue Type: Improvement
> Components: jackrabbit-core
> Reporter: Marcel Reutegger
> Priority: Minor
> Fix For: 2.1.0
>
> Attachments: JCR-2524.patch, JCR-2524.patch
>
>
> Implementations of DocIds are used to cache parent-child relations of nodes 
> in the index. Usually there are a lot of duplicate objects because a DocId 
> instance is used to identify the parent of a node in the index. That is, 
> sibling nodes all have DocIds with the same value. Currently a new DocId 
> instance is created for each node. Caching the most recently used DocIds and 
> reusing them might help to reduce memory usage. Furthermore, some DocIds 
> could be represented with a short instead of an int.
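
The caching idea in the description could look roughly like the following. This is only a minimal sketch and not the attached patch: PlainDocIdSketch and DocIdCacheSketch are hypothetical names, the cache size of 1024 is an arbitrary assumption, and the code is not thread-safe.

```java
// Hypothetical stand-in for a DocId that wraps a plain document number.
// It is NOT the Jackrabbit class, just enough to illustrate instance reuse.
final class PlainDocIdSketch {
    final int docNumber;

    PlainDocIdSketch(int docNumber) {
        this.docNumber = docNumber;
    }
}

// Small direct-mapped cache of recently used ids (assumed design).
final class DocIdCacheSketch {
    // Power of two, so that (docNumber & MASK) selects a slot.
    private static final int SIZE = 1024;
    private static final int MASK = SIZE - 1;

    private final PlainDocIdSketch[] recent = new PlainDocIdSketch[SIZE];

    // Returns a shared instance when the slot already holds this doc number,
    // otherwise creates a new instance and remembers it in the slot.
    PlainDocIdSketch get(int docNumber) {
        int slot = docNumber & MASK;
        PlainDocIdSketch cached = recent[slot];
        if (cached == null || cached.docNumber != docNumber) {
            cached = new PlainDocIdSketch(docNumber);
            recent[slot] = cached;
        }
        return cached;
    }

    public static void main(String[] args) {
        DocIdCacheSketch cache = new DocIdCacheSketch();
        // Siblings share a parent, so repeated lookups return one instance.
        System.out.println(cache.get(7) == cache.get(7));
    }
}
```

Because sibling nodes are typically indexed close together, even a small cache like this should hit often; how much it saves in practice would have to be measured.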

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (JCR-2524) Reduce memory usage of DocIds

2010-03-04 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841737#action_12841737
 ] 

Marcel Reutegger commented on JCR-2524:
---

Hmm, you are right. I should have looked more closely at what the memory 
analyzer reported.

Here's another idea:

- use int arrays and create PlainDocIds on the fly (possibly using cached 
instances)
- a special value in the int array marks the existence of a UUIDDocId; these 
are held in a separate map
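
The two points above could be sketched as follows. This is an illustration under assumptions, not the actual patch: ParentIndexSketch, the sentinel values -1 and -2, and the use of a String for the UUID are all hypothetical. Negative sentinels are chosen on the assumption that real document numbers are never negative.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical parent lookup: one int per document instead of one object.
final class ParentIndexSketch {
    static final int UNKNOWN = -1;     // assumed marker: parent not cached yet
    static final int UUID_MARKER = -2; // assumed marker: parent known by UUID only

    private final int[] parents;
    // Only the (presumably rare) UUID-based parents pay for an object.
    private final Map<Integer, String> uuidParents = new HashMap<Integer, String>();

    ParentIndexSketch(int maxDoc) {
        parents = new int[maxDoc];
        Arrays.fill(parents, UNKNOWN);
    }

    void setPlainParent(int doc, int parentDoc) {
        parents[doc] = parentDoc;
        uuidParents.remove(Integer.valueOf(doc));
    }

    void setUuidParent(int doc, String uuid) {
        parents[doc] = UUID_MARKER;
        uuidParents.put(Integer.valueOf(doc), uuid);
    }

    boolean hasUuidParent(int doc) {
        return parents[doc] == UUID_MARKER;
    }

    // Returns the parent's doc number, or UNKNOWN / UUID_MARKER.
    int plainParent(int doc) {
        return parents[doc];
    }

    // Returns the parent's UUID, or null if the parent is not UUID-based.
    String uuidParent(int doc) {
        return uuidParents.get(Integer.valueOf(doc));
    }

    public static void main(String[] args) {
        ParentIndexSketch index = new ParentIndexSketch(4);
        index.setPlainParent(1, 0);
        index.setUuidParent(2, "hypothetical-uuid");
        System.out.println(index.plainParent(1));
        System.out.println(index.hasUuidParent(2));
    }
}
```

A caller would wrap the int from plainParent in a PlainDocId only when one is actually needed, so the long-lived cache holds primitives rather than objects.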





[jira] Commented: (JCR-2524) Reduce memory usage of DocIds

2010-03-02 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840074#action_12840074
 ] 

Thomas Mueller commented on JCR-2524:
-

> Caching the most recently used DocIds and reuse them might help to reduce the 
> memory usage

+1

> DocIds that could be represented with a short instead of an int

According to my test, this will not reduce memory usage: 
http://h2database.com/p.html#da4c6a321d0dc84a2b7b96cdbf468a47

For the Sun JVM (JDK 1.5, 32 bit), an object with one field of type boolean, 
byte, short, char, int, or long needs 16 bytes in each case. A small BigInteger 
uses 56 bytes, a small BigDecimal uses 32 bytes (probably re-uses the same 
BigInteger internally), and a String uses 24 bytes. A plain Object uses 8 bytes.

For JDK 1.6, 32 bit and 64 bit, it's a bit different: 20 bytes for a plain 
object, 24 bytes for an object with one boolean to long field.

For JDK 1.5, 64 bit, it's again different: 16 bytes for a plain object, 24 
bytes for an object with one boolean to long field.
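
One layout that would be consistent with the JDK 1.5, 32 bit figures above is an 8-byte object header plus instance sizes rounded up to a multiple of 8 bytes. Both values are assumptions used for illustration, not something measured here, and they do not explain the JDK 1.6 numbers. Under that assumption, a short field and an int field end up the same size:

```java
// Sketch of the size arithmetic under an ASSUMED layout:
// 8-byte header, instance size rounded up to a multiple of 8 bytes.
final class ObjectSizeSketch {
    static final int HEADER_BYTES = 8; // assumption
    static final int ALIGNMENT = 8;    // assumption

    // Size of an object whose fields occupy fieldBytes in total.
    static int instanceSize(int fieldBytes) {
        int raw = HEADER_BYTES + fieldBytes;
        return (raw + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    public static void main(String[] args) {
        System.out.println("no field:    " + instanceSize(0));
        System.out.println("short field: " + instanceSize(2));
        System.out.println("int field:   " + instanceSize(4));
        System.out.println("long field:  " + instanceSize(8));
    }
}
```

With these assumptions the short and int cases both round up to 16 bytes, which matches the observation that replacing an int with a short does not save memory per object.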





[jira] Commented: (JCR-2524) Reduce memory usage of DocIds

2010-03-01 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839855#action_12839855
 ] 

Marcel Reutegger commented on JCR-2524:
---

Forgot to mention that the proposed patch reduces the memory usage to about a 
third.




[jira] Commented: (JCR-2524) Reduce memory usage of DocIds

2010-03-01 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/JCR-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839717#action_12839717
 ] 

Marcel Reutegger commented on JCR-2524:
---

Some memory stats from a real life system: a fully populated DocId cache for 
300'000 nodes consumes about 6MB of heap.
