On 26.09.2011, at 19:56, Marcel Bruch <[email protected]> wrote:

> Hi Stefan,
> 
> On 26.09.2011, at 18:13, Stefan Guggisberg wrote:
> 
>>> I wrote a fairly ad-hoc dump of the 5900 data files into Jackrabbit.
>>> Storing ~240 MB took roughly 3 minutes. Is this the expected time such
>>> an operation takes? Is it possible to improve the performance somehow?
>> 
>> the performance seems rather poor. it's hard to tell what's wrong
>> without having the test data. i noticed that you're storing the
>> content of the .json files as string properties. why aren't you
>> storing the json data as nodes & properties?
> 
> I had no code available for serializing the data as JCR nodes. Is there any 
> simple snippet available somewhere?
> However, I thought as a first baseline this would work. 
> 
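
One way to picture "json data as nodes & properties", independent of the JCR API: objects and arrays of objects become child nodes, scalars become properties. The following is only a sketch in Python. The function name `json_to_nodes`, the `/units/Foo` path, and the sample document are made up for illustration, and turning the resulting pairs into real `addNode`/`setProperty` calls is left out.

```python
import json


def json_to_nodes(value, path):
    """Flatten parsed JSON (a dict or list) into (node_path, properties) pairs.

    Assumption for this sketch: nested objects and arrays of objects become
    child nodes, scalars become single-valued properties, and arrays made up
    only of scalars become multi-valued properties.
    """
    props = {}
    nodes = [(path, props)]
    items = value.items() if isinstance(value, dict) else enumerate(value)
    for key, child in items:
        if isinstance(child, dict):
            nodes.extend(json_to_nodes(child, f"{path}/{key}"))
        elif isinstance(child, list):
            if all(not isinstance(c, (dict, list)) for c in child):
                props[str(key)] = child  # multi-valued property
            else:
                nodes.extend(json_to_nodes(child, f"{path}/{key}"))
        else:
            props[str(key)] = child  # single-valued property
    return nodes


# Hypothetical sample document, not taken from the real data files.
doc = json.loads(
    '{"name": "Foo", "calls": [{"method": "bar", "count": 3}], "tags": ["a", "b"]}'
)
for node_path, properties in json_to_nodes(doc, "/units/Foo"):
    print(node_path, properties)
```

Each pair corresponds to one node to create and the properties to set on it, so a single file turns into a handful of small nodes instead of one large string property.
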
> 
>> anyway, i quickly ran an adapted ad hoc test on my machine
>> (macbook pro 2.66 ghz, standard harddisk). the test imports
>> an 'svn export' of jackrabbit/trunk.
>> 
>> importing ~6500 files takes ~30s which is IMO decent.
> 
> Thanks for writing your test against your local files!
> 
> I ran your code and compared the execution times. Unfortunately, it's not 
> performing faster :( 
> The minute delta might be caused by some file traversal differences or by the 
> additional nodes/properties created in your code.
> 
> However, the overall performance is still a bit low (2:24-3:05 minutes in a 
> clean repository). Any idea how the performance could be improved? Am I doing 
> something conceptually wrong?

did you run my test with the same test data (local svn export of jackrabbit 
trunk)?

cheers
stefan

> I'm assuming that there is no big delta between creating hundreds of nodes 
> and properties compared to dumping a file's content into Jackrabbit. Is this 
> correct?
> 
> Thanks,
> Marcel
> 
> === Experiments performance results ===
> 
> 
> Jackrabbit First Hops code adapted:
> 
> 0:00:08.522: 500 units persisted.  data 17 MB 
> 0:00:17.057: 1000 units persisted.  data 33 MB 
> 0:00:31.763: 1500 units persisted.  data 53 MB 
> 0:00:41.404: 2000 units persisted.  data 72 MB 
> 0:00:53.140: 2500 units persisted.  data 97 MB 
> 0:01:02.988: 3000 units persisted.  data 113 MB 
> 0:01:16.314: 3500 units persisted.  data 133 MB 
> 0:01:35.171: 4000 units persisted.  data 143 MB 
> 0:01:49.414: 4500 units persisted.  data 173 MB 
> 0:02:04.617: 5000 units persisted.  data 204 MB 
> 0:02:12.593: 5500 units persisted.  data 221 MB 
> Mon Sep 26 19:54:58 CEST 2011: 5927 units persisted
> Run took 0:02:24.505
> 
> 
> Mailing List proposal:
> 
> 0:00:14.853: 500 units persisted. data  17 MB
> 0:00:26.353: 1000 units persisted. data  33 MB
> 0:00:36.114: 1500 units persisted. data  53 MB
> 0:00:53.274: 2000 units persisted. data  72 MB
> 0:01:06.643: 2500 units persisted. data  97 MB
> 0:01:18.230: 3000 units persisted. data  113 MB
> 0:01:36.765: 3500 units persisted. data  133 MB
> 0:01:44.245: 4000 units persisted. data  143 MB
> 0:02:04.026: 4500 units persisted. data  173 MB
> 0:02:37.533: 5000 units persisted. data  204 MB
> 0:02:48.089: 5500 units persisted. data  221 MB
> Run took 0:03:08.458
> 
> 
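
To compare the two runs on equal terms, the progress lines can be reduced to throughput figures. A small sketch in Python follows. The names `checkpoints` and `throughput` are made up for illustration, and only the last progress line of each run is used as input here.

```python
import re

# Matches progress lines such as
# "0:00:08.522: 500 units persisted.  data 17 MB"
LINE = re.compile(
    r"(\d+):(\d+):(\d+\.\d+): (\d+) units persisted\.\s+data\s+(\d+) MB"
)


def checkpoints(log):
    """Return (elapsed_seconds, units, megabytes) for every progress line."""
    result = []
    for match in LINE.finditer(log):
        hours, minutes, seconds, units, megabytes = match.groups()
        elapsed = int(hours) * 3600 + int(minutes) * 60 + float(seconds)
        result.append((elapsed, int(units), int(megabytes)))
    return result


def throughput(log):
    """Return (MB per second, units per second) at the last checkpoint."""
    elapsed, units, megabytes = checkpoints(log)[-1]
    return megabytes / elapsed, units / elapsed


# Last progress line of each run, copied from the results above.
first_hops = "> 0:02:12.593: 5500 units persisted.  data 221 MB"
mailing_list = "> 0:02:48.089: 5500 units persisted. data  221 MB"

for name, log in (("first hops", first_hops), ("mailing list", mailing_list)):
    mb_per_s, units_per_s = throughput(log)
    print(f"{name}: {mb_per_s:.2f} MB/s, {units_per_s:.1f} units/s")
```

At the 5500-unit checkpoint this works out to roughly 1.7 MB/s for the First Hops variant and roughly 1.3 MB/s for the mailing list variant.
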
