On 26.09.2011, at 19:56, Marcel Bruch <[email protected]> wrote:
> Hi Stefan,
>
> On 26.09.2011, at 18:13, Stefan Guggisberg wrote:
>
>>> I wrote a fairly ad-hoc dump of the 5900 data files into Jackrabbit.
>>> Storing ~240 MB took roughly 3 minutes. Is this the expected time such
>>> an operation takes? Is it possible to improve the performance somehow?
>>
>> the performance seems rather poor. it's hard to tell what's wrong
>> without having the test data. i noticed that you're storing the
>> content of the .json files as string properties. why aren't you
>> storing the json data as nodes & properties?
>
> I had no code available for serializing the data as JCR nodes. Is there any
> simple snippet available somewhere?
> However, I thought as a first baseline this would work.
>
>
>> anyway, i quickly ran an adapted ad hoc test on my machine
>> (macbook pro 2.66 ghz, standard harddisk). the test imports
>> an 'svn export' of jackrabbit/trunk.
>>
>> importing ~6500 files takes ~30s which is IMO decent.
>
> Thanks for writing your test against your local files!
>
> I ran your code and compared the execution times. Unfortunately, it's not
> performing faster :(
> The minute delta might be caused by some file traversing differences or by the
> additional nodes/properties created in your code.
>
> However, the overall performance is still a bit low (2:24-3:05 minutes in a
> clean repository). Any idea how the performance could be improved? Am I doing
> something conceptually wrong?

did you run my test with the same test data (local svn export
of jackrabbit trunk)?

cheers
stefan

> I'm assuming that there is no big delta between creating hundreds of nodes
> and properties compared to dumping a file's content into Jackrabbit. Is this
> correct?
>
> Thanks,
> Marcel
>
> === Experiments performance results ===
>
>
> Jackrabbit First Hops code adapted:
>
> 0:00:08.522: 500 units persisted. data 17 MB
> 0:00:17.057: 1000 units persisted. data 33 MB
> 0:00:31.763: 1500 units persisted. data 53 MB
> 0:00:41.404: 2000 units persisted. data 72 MB
> 0:00:53.140: 2500 units persisted. data 97 MB
> 0:01:02.988: 3000 units persisted. data 113 MB
> 0:01:16.314: 3500 units persisted. data 133 MB
> 0:01:35.171: 4000 units persisted. data 143 MB
> 0:01:49.414: 4500 units persisted. data 173 MB
> 0:02:04.617: 5000 units persisted. data 204 MB
> 0:02:12.593: 5500 units persisted. data 221 MB
> Mon Sep 26 19:54:58 CEST 2011: 5927 units persisted
> Run took 0:02:24.505
>
>
> Mailing List proposal:
>
> 0:00:14.853: 500 units persisted. data 17 MB
> 0:00:26.353: 1000 units persisted. data 33 MB
> 0:00:36.114: 1500 units persisted. data 53 MB
> 0:00:53.274: 2000 units persisted. data 72 MB
> 0:01:06.643: 2500 units persisted. data 97 MB
> 0:01:18.230: 3000 units persisted. data 113 MB
> 0:01:36.765: 3500 units persisted. data 133 MB
> 0:01:44.245: 4000 units persisted. data 143 MB
> 0:02:04.026: 4500 units persisted. data 173 MB
> 0:02:37.533: 5000 units persisted. data 204 MB
> 0:02:48.089: 5500 units persisted. data 221 MB
> Run took 0:03:08.458
>
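To compare the two runs quoted above on equal terms, the log lines can be reduced to throughput figures (units/s and MB/s). The following is only a minimal sketch, assuming the elapsed-time format H:MM:SS.mmm seen in the logs. The class and method names (ImportThroughput, parseElapsedSeconds, perSecond) are hypothetical and not part of either test program. It uses the last checkpoint both runs share (5500 units, 221 MB).

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not part of Jackrabbit or of
// either test program discussed in this thread.
class ImportThroughput {

    /** Parses an elapsed time such as "0:02:12.593" (H:MM:SS.mmm) into seconds. */
    public static double parseElapsedSeconds(String elapsed) {
        String[] parts = elapsed.trim().split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected H:MM:SS.mmm but got: " + elapsed);
        }
        int hours = Integer.parseInt(parts[0]);
        int minutes = Integer.parseInt(parts[1]);
        double seconds = Double.parseDouble(parts[2]);
        return hours * 3600 + minutes * 60 + seconds;
    }

    /** Returns amount (units or MB) divided by the elapsed time in seconds. */
    public static double perSecond(double amount, String elapsed) {
        return amount / parseElapsedSeconds(elapsed);
    }

    private static void report(String label, int units, int megabytes, String elapsed) {
        System.out.println(String.format(Locale.ROOT,
                "%-24s %6.1f units/s  %5.2f MB/s",
                label, perSecond(units, elapsed), perSecond(megabytes, elapsed)));
    }

    public static void main(String[] args) {
        // Figures copied from the 5500-unit checkpoint of each quoted run.
        report("First Hops adapted:", 5500, 221, "0:02:12.593");
        report("Mailing list proposal:", 5500, 221, "0:02:48.089");
    }
}
```

The same helper can be pointed at any other checkpoint line to see whether throughput degrades as the repository grows.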
