Re: Jackrabbit & Performance

Enrique Medina Montenegro Thu, 21 Nov 2013 02:35:38 -0800

First, thanks to everyone for such helpful hints.

It now looks like I'm making progress with the segmentation of nodes; I can
see that my bottleneck is actually the session.save() call when writing
against a DB, because writing against the file system is quite fast.
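
For reference, here is a minimal sketch of the batched import loop that
Bertrand describes in the quoted message below. It is only an illustration:
Repo and CountingRepo are hypothetical stand-ins for the handful of
javax.jcr.Session operations involved, so it runs without a repository and
says nothing about real save() timings. The /marks/XXX/YYY layout with 1000
marks per parent is likewise an assumption.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class BatchedImportSketch {

    /** Hypothetical stand-in for the few Session operations the loop needs. */
    interface Repo {
        boolean nodeExists(String absPath);
        void addNode(String absPath);
        void save();
    }

    /** In-memory Repo that only tracks paths and counts save() calls. */
    static class CountingRepo implements Repo {
        final Set<String> saved = new HashSet<>();
        final Set<String> pending = new HashSet<>();
        int saveCalls = 0;

        public boolean nodeExists(String absPath) {
            return saved.contains(absPath) || pending.contains(absPath);
        }

        public void addNode(String absPath) {
            pending.add(absPath);
        }

        public void save() {
            saved.addAll(pending);
            pending.clear();
            saveCalls++;
        }
    }

    /** Creates every missing level of each path and saves every batchSize records. */
    static void importAll(Repo repo, List<String> paths, int batchSize) {
        int sinceSave = 0;
        for (String path : paths) {
            StringBuilder current = new StringBuilder();
            // Walk the path level by level, creating nodes that do not exist yet.
            for (String segment : path.substring(1).split("/")) {
                current.append('/').append(segment);
                if (!repo.nodeExists(current.toString())) {
                    repo.addNode(current.toString());
                }
            }
            if (++sinceSave == batchSize) {
                repo.save();
                sinceSave = 0;
            }
        }
        if (sinceSave > 0) {
            repo.save(); // flush the final partial batch
        }
    }

    public static void main(String[] args) {
        List<String> paths = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            // Assumed layout: 1000 marks per parent, already sorted by parent.
            paths.add("/marks/" + (i / 1000) + "/" + i);
        }
        CountingRepo repo = new CountingRepo();
        importAll(repo, paths, 1000);
        System.out.println("nodes=" + repo.saved.size() + " saves=" + repo.saveCalls);
    }
}
```

With a real repository, the three Repo methods would presumably map onto
Session.nodeExists(), Node.addNode() and Session.save(), and the time spent
in save() is the part to measure per batch.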


And in theory the bundle cache doesn't get exhausted even with the default
8 MB (maxmemorykb=8192):

11:24:46,695  INFO cachename=iptoolBundleCache[ConcurrentCache@70911adf],
elements=93, usedmemorykb=552, maxmemorykb=8192, access=254, miss=93
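
The figures in that log line can be checked with a few lines of Java. This
is just a sketch that parses the key=value pairs from the line above; the
class and method names are hypothetical and not part of Jackrabbit.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class BundleCacheStats {

    /** Collects the numeric key=value pairs of a log line; other tokens are skipped. */
    static Map<String, Long> parse(String line) {
        Map<String, Long> stats = new HashMap<>();
        for (String token : line.split("[,\\s]+")) {
            int eq = token.indexOf('=');
            if (eq < 0) {
                continue; // timestamp, log level, etc.
            }
            String value = token.substring(eq + 1);
            if (value.matches("\\d+")) {
                stats.put(token.substring(0, eq), Long.parseLong(value));
            }
        }
        return stats;
    }

    /** Fraction of cache accesses that did not miss. */
    static double hitRatio(Map<String, Long> stats) {
        long access = stats.get("access");
        long miss = stats.get("miss");
        return access == 0 ? 0.0 : (double) (access - miss) / access;
    }

    /** Fraction of the configured cache memory currently in use. */
    static double memoryUse(Map<String, Long> stats) {
        long used = stats.get("usedmemorykb");
        long max = stats.get("maxmemorykb");
        return (double) used / max;
    }

    public static void main(String[] args) {
        String line = "cachename=iptoolBundleCache[ConcurrentCache@70911adf], "
                + "elements=93, usedmemorykb=552, maxmemorykb=8192, access=254, miss=93";
        Map<String, Long> stats = parse(line);
        System.out.println(String.format(Locale.ROOT,
                "hit ratio %.1f%%, memory in use %.1f%%",
                100 * hitRatio(stats), 100 * memoryUse(stats)));
    }
}
```

For this line that works out to roughly a 63% hit ratio with about 7% of the
cache memory in use. The miss count equals the element count (93), which
would be consistent with each bundle being loaded once and never evicted.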

I'll continue testing and share my final results with you.

On Thu, Nov 21, 2013 at 12:16 AM, Ron Wheeler wrote:

> Have you sorted the marks?
> This way you should only be switching top nodes every 1000 records and
> sitting at /marks/XXX and adding a thousand nodes here before moving to
> /marks/XXX+1 and adding a thousand there.
>
> Ron
>
>
>
>
>
>  On 20/11/2013 2:39 PM, Enrique Medina Montenegro wrote:
>
>> Bertrand,
>>
>> Your algorithm is exactly the approach I followed, but I noticed a
>> decrease in performance as the import progressed, with response times
>> above 2 seconds just to look up the exact path (i.e.
>> session.getNode("/marks/XXX/YYY")), even when calling Session.save()
>> every 1000 or 500 or 100 records...
>>
>> Using Jackrabbit 2.7.0 btw, because it's the only one working with Spring
>> Modules for JCR 0.8b
>>
>> Salu2,
>> Quique.
>>
>>
>> On Wed, Nov 20, 2013 at 8:34 PM, Bertrand Delacretaz wrote:
>>
>>> Hi,
>>>
>>> On Wed, Nov 20, 2013 at 7:39 PM, Enrique Medina Montenegro wrote:
>>>
>>>> ...at the practical level,
>>>> when I dump the 1M marks from the DB into JCR, for each and every "mark"
>>>> it has to look up the path in the tree where to ultimately store the
>>>> "mark", and this lookup starts to take on the order of seconds as the
>>>> tree structure grows, making the full extraction process from the DB too
>>>> slow for our requirements....
>>>>
>>> If you import according to the following scenario, the performance
>>> should be linear:
>>>
>>> for each DB record
>>>    compute path of JCR node
>>>    for each level of that path (below storage root)
>>>      create node if not created yet
>>>      set properties if on the data node at the end of the path
>>>
>>> and you probably want to call Session.save() every N records (N=1000
>>> maybe)
>>>
>>> -Bertrand
>>>
>>>
>
> --
> Ron Wheeler
> President
> Artifact Software Inc
> email: [email protected]
> skype: ronaldmwheeler
> phone: 866-970-2435, ext 102
>
>
