First, thanks to everyone for such helpful hints. It now looks like I'm making progress with the segmenting of nodes: I can see that my bottleneck is actually in session.save() when writing against a DB, because writing against the file system is quite fast.
And in theory the Bundle Cache is nowhere near exhausted, even with the default 8 MB (maxmemorykb=8192):

11:24:46,695 INFO cachename=iptoolBundleCache[ConcurrentCache@70911adf], elements=93, usedmemorykb=552, maxmemorykb=8192, access=254, miss=93

I'll continue testing and share my final results with you.

On Thu, Nov 21, 2013 at 12:16 AM, Ron Wheeler <[email protected]> wrote:

> Have you sorted the marks?
> This way you should only be switching top nodes every 1000 records and
> sitting at /marks/XXX and adding a thousand nodes here before moving to
> /marks/XXX+1 and adding a thousand there.
>
> Ron
>
> On 20/11/2013 2:39 PM, Enrique Medina Montenegro wrote:
>
>> Bertrand,
>>
>> Your algorithm is exactly the approach I followed, but I noticed a decrease
>> in performance as the import was progressing, with response times to just
>> look up the exact path (i.e. session.getNode("/marks/XXX/YYY")) above 2
>> seconds, even when calling Session.save() every 1000 or 500 or 100
>> records...
>>
>> Using Jackrabbit 2.7.0 btw, because it's the only one working with Spring
>> Modules for JCR 0.8b
>>
>> Salu2,
>> Quique.
>>
>> On Wed, Nov 20, 2013 at 8:34 PM, Bertrand Delacretaz <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> On Wed, Nov 20, 2013 at 7:39 PM, Enrique Medina Montenegro
>>> <[email protected]> wrote:
>>>> ...at the practical level,
>>>> when I dump the 1M marks from the DB into JCR, for each and every "mark" it
>>>> has to look up the path in the tree where to ultimately store the "mark",
>>>> and this lookup starts to take orders of seconds as the tree structure
>>>> grows, making the full extraction process from the DB too slow for our
>>>> requirements....
>>>
>>> If you import according to the following scenario the performance should be
>>> linear:
>>>
>>> for each DB record
>>>   compute path of JCR node
>>>   for each level of that path (below storage root)
>>>     create node if not created yet
>>>     set properties if on the data node at the end of the path
>>>
>>> and you probably want to call Session.save() every N records (N=1000
>>> maybe)
>>>
>>> -Bertrand
>
> --
> Ron Wheeler
> President
> Artifact Software Inc
> email: [email protected]
> skype: ronaldmwheeler
> phone: 866-970-2435, ext 102
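
For reference, here is a rough sketch of the import loop discussed in this thread: sort the records by path (Ron's hint), create each path level only if it is missing (Bertrand's scenario), and flush every N records. It is only an illustration under stated assumptions. TreeNode, Stats and importRecords are hypothetical names, and the tree is a plain in-memory stand-in, not Jackrabbit. In a real JCR import the corresponding steps would presumably be Node.hasNode(), Node.addNode() and Session.save().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SortedBatchImport {

    /** Hypothetical in-memory stand-in for a JCR node (not the javax.jcr API). */
    static final class TreeNode {
        final Map<String, TreeNode> children = new HashMap<>();
        final Map<String, String> properties = new HashMap<>();
    }

    /** Counters so the behaviour of the loop can be inspected. */
    static final class Stats {
        int nodesCreated;
        int saves;
    }

    static Stats importRecords(TreeNode root, List<String> paths, int batchSize) {
        // Sort first so records sharing a parent (e.g. /marks/001) are adjacent.
        List<String> sorted = new ArrayList<>(paths);
        Collections.sort(sorted);

        Stats stats = new Stats();
        int pending = 0;
        for (String path : sorted) {
            TreeNode current = root;
            for (String segment : path.split("/")) {
                if (segment.isEmpty()) {
                    continue; // skip the empty segment before the leading slash
                }
                TreeNode child = current.children.get(segment);
                if (child == null) {
                    // Create the level only if it does not exist yet.
                    child = new TreeNode();
                    current.children.put(segment, child);
                    stats.nodesCreated++;
                }
                current = child;
            }
            // "current" is now the data node at the end of the path.
            current.properties.put("imported", "true");

            if (++pending == batchSize) {
                stats.saves++; // stand-in for a batched save of pending changes
                pending = 0;
            }
        }
        if (pending > 0) {
            stats.saves++; // flush the last partial batch
        }
        return stats;
    }

    public static void main(String[] args) {
        TreeNode root = new TreeNode();
        Stats stats = importRecords(
                root, Arrays.asList("/marks/002/b", "/marks/001/a", "/marks/001/c"), 2);
        System.out.println("created=" + stats.nodesCreated + " saves=" + stats.saves);
    }
}
```

The sketch only shows the shape of the loop. It says nothing about why session.save() is slow against a DB, which is still what I need to measure.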
