Hi Stefan,

I tried using the Derby database to upload 375,000 documents.
When I then tried to add a document to this setup, it took more than 30 minutes to do a checkin. The system CPU utilization was around 90% to 100%, and the JVM heap size was around 1.5 GB. Is there some way to handle this?

I am now using the following hierarchical structure:

Folder1
 ------ Folder
 -------- File1
 -------- File2

Also, in this setup I am not calling fileNode.checkin().

Thanks
Ajai G

Stefan Guggisberg wrote:
>
> hi ajai
>
> On Thu, Jul 23, 2009 at 9:31 AM, Ajai<ajaik...@gmail.com> wrote:
>>
>> Hi Stefan,
>>
>> Thanks for the quick response.
>>
>> We are running the tests on a "Core 2 Duo 2.3 GHz, 4 GB RAM running
>> Windows Server 2003" machine.
>>
>> Please find attached the
>>
>> 1. repository.xml
>> 2. indexingconfiguration.xml
>> 3. source java file for upload (ThreadFeeder.java)
>>
>> http://www.nabble.com/file/p24620741/ThreadFeeder.java ThreadFeeder.java
>> http://www.nabble.com/file/p24620741/repository.xml repository.xml
>> http://www.nabble.com/file/p24620741/indexingconfiguration.xml indexingconfiguration.xml
>
> thanks!
>
> as far as i can tell, you're not doing anything unreasonable.
> however, you have to be aware that some features come at
> a certain cost.
>
> 1. fulltext search/text extractors do impact write performance
>    significantly.
> 2. versioning: same here, checkin() is a pretty expensive operation on
>    nt:file nodes
> 3. mssql server is not known to be terribly fast (at least not when used
>    as jackrabbit backend).
>
> in order to identify what's causing the appallingly bad results please
> do the following:
> 1. disable search index or text extractors and compare results
> 2. remove checkin() call and compare results
> 3. use embedded derby and compare results
>
> if you could provide GenRandom.java, i'll run the test on my own machine.
>
> cheers
> stefan
>
>>
>> Kindly let me know your suggestions.
>>
>> Thanks,
>> Ajai G
>>
>> Stefan Guggisberg wrote:
>>>
>>> On Thu, Jul 23, 2009 at 8:10 AM, Ajai<ajaik...@gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I am in the process of evaluating Jackrabbit. We are running a few
>>>> performance tests.
>>>> Here we are adding 25,000 folder nodes, each containing 15 documents.
>>>>
>>>> It is taking around 37 hours to complete this process. We also tried
>>>> using threads to achieve this, but the time still hasn't come down.
>>>>
>>>> It also seems that adding 500 folders with 15 docs each takes ~20 mins
>>>> on an empty repository.
>>>>
>>>> After uploading 25,000 folders, adding the same 500 folders with
>>>> 15 docs each takes ~5 hrs.
>>>>
>>>
>>> all figures are way too high. please provide more information on your
>>> setup/configuration and environment. if possible, please also provide
>>> some code of your tests.
>>>
>>> cheers
>>> stefan
>>>
>>>> So is there a way to improve the performance of the above-mentioned
>>>> functions?
>>>>
>>>> Also, kindly suggest an alternate solution to perform bulk upload.
>>>>
>>>> Thanks
>>>> Ajai G
>>>>
>>>> --
>>>> View this message in context:
>>>> http://www.nabble.com/Performance-of-Jackrabbit-tp24619853p24619853.html
>>>> Sent from the Jackrabbit - Dev mailing list archive at Nabble.com.
>>>>
>>>
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Performance-of-Jackrabbit-tp24619853p24620741.html
>> Sent from the Jackrabbit - Dev mailing list archive at Nabble.com.
>>
>

--
View this message in context: http://www.nabble.com/Performance-of-Jackrabbit-tp24619853p24680489.html
Sent from the Jackrabbit - Dev mailing list archive at Nabble.com.
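
For reference, the figures quoted in this thread can be turned into rough throughput numbers, which makes the slowdown easier to compare across test runs. The following is only a minimal sketch in Java, using nothing but the counts and durations stated in the messages; the class and method names (UploadThroughput, totalDocuments, docsPerSecond) are hypothetical and are not part of Jackrabbit or of the attached ThreadFeeder.java.

```java
import java.time.Duration;

/** Back-of-the-envelope throughput figures for the numbers quoted in this thread. */
final class UploadThroughput {

    /** Total documents for a given folder count and documents per folder. */
    static long totalDocuments(long folders, long docsPerFolder) {
        return folders * docsPerFolder;
    }

    /** Average documents written per second over the elapsed time. */
    static double docsPerSecond(long documents, Duration elapsed) {
        return (double) documents / elapsed.getSeconds();
    }

    public static void main(String[] args) {
        // Counts and durations are taken from the messages; nothing here is measured.
        long fullRun = totalDocuments(25_000, 15); // the complete upload
        long batch = totalDocuments(500, 15);      // one 500-folder batch

        double overall = docsPerSecond(fullRun, Duration.ofHours(37));
        double emptyRepo = docsPerSecond(batch, Duration.ofMinutes(20));
        double loadedRepo = docsPerSecond(batch, Duration.ofHours(5));

        System.out.printf("full run:          %d docs, %.2f docs/s%n", fullRun, overall);
        System.out.printf("batch, empty repo: %d docs, %.2f docs/s%n", batch, emptyRepo);
        System.out.printf("batch, after load: %d docs, %.2f docs/s%n", batch, loadedRepo);
        System.out.printf("slowdown factor:   %.1f%n", emptyRepo / loadedRepo);
    }
}
```

These are averages derived from the stated durations only. They show that the same 500-folder batch runs about 15 times slower on the loaded repository than on the empty one, but they say nothing about which component (search index, versioning, or the database backend) is responsible.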