Re: Performance of Jackrabbit

Ajai Mon, 27 Jul 2009 06:14:46 -0700

Hi Stefan,

I tried uploading 375000 documents using the Derby database.


When I then tried to add a document to this setup, it took more than 30 mins to
do a checkin.
The system CPU utilization was around 90% to 100%, and the JVM heap size was
around 1.5 GB.
Is there some way to handle this?
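
To narrow this down, each variant suggested in the quoted reply below (search
index on/off, checkin() on/off, embedded Derby) can be timed the same way. A
minimal sketch, assuming a hypothetical UploadTimer helper that is not part of
Jackrabbit or ThreadFeeder.java; the Runnable bodies are placeholders for the
real code that adds and saves one batch of nodes:

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for comparing upload variants; not a Jackrabbit class.
final class UploadTimer {
    // Keeps variants in the order they were run.
    private final Map<String, Long> elapsedMillis = new LinkedHashMap<>();

    /** Runs one variant and records its wall-clock time in milliseconds. */
    long time(String label, Runnable variant) {
        long start = System.nanoTime();
        variant.run();
        long millis = (System.nanoTime() - start) / 1_000_000;
        elapsedMillis.put(label, millis);
        return millis;
    }

    /** Read-only view of the recorded timings. */
    Map<String, Long> results() {
        return Collections.unmodifiableMap(elapsedMillis);
    }

    public static void main(String[] args) {
        UploadTimer timer = new UploadTimer();
        // Placeholder workloads: replace each body with the real upload
        // of the same batch under the configuration named in the label.
        timer.time("derby, index on, no checkin", () -> pause(20));
        timer.time("derby, index off, no checkin", () -> pause(10));
        timer.results().forEach(
                (label, ms) -> System.out.println(label + ": " + ms + " ms"));
    }

    // Stand-in for real work so the sketch runs on its own.
    private static void pause(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Changing one setting per run and keeping the batch identical makes it easier
to see which feature accounts for most of the time.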

I am now using the following hierarchical structure:
Folder1
 ------ Folder
    -------- File1
    -------- File2
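
For scale, 375000 documents matches the 25,000 folders x 15 documents of the
first test quoted below. A small sketch of the throughput arithmetic for the
figures reported in that message; the class and method names are illustrative
only:

```java
import java.util.Locale;

// Illustrative arithmetic only; the inputs are the figures quoted in this thread.
final class UploadThroughput {

    /** Documents written per second for a run of the given size and duration. */
    static double docsPerSecond(long documents, long seconds) {
        if (seconds <= 0) {
            throw new IllegalArgumentException("seconds must be positive");
        }
        return (double) documents / seconds;
    }

    static void report(String label, long documents, long seconds) {
        System.out.printf(Locale.ROOT, "%-42s %7d docs  %6.2f docs/s%n",
                label, documents, docsPerSecond(documents, seconds));
    }

    public static void main(String[] args) {
        long docsPerFolder = 15;
        long fullRun = 25_000 * docsPerFolder; // 375,000 documents
        long batch = 500 * docsPerFolder;      // 7,500 documents

        report("25,000 folders (37 h)", fullRun, 37 * 3600);
        report("500 folders, empty repository (20 min)", batch, 20 * 60);
        report("500 folders, after 25,000 folders (5 h)", batch, 5 * 3600);
    }
}
```

This works out to about 2.8, 6.25 and 0.42 documents per second respectively,
so the same 500-folder batch is roughly 15 times slower once the repository
already holds 25,000 folders.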


Also, in this setup I am not performing the fileNode.checkin() operation.

Thanks
Ajai G







Stefan Guggisberg wrote:
> 
> hi ajai
> 
> On Thu, Jul 23, 2009 at 9:31 AM, Ajai<ajaik...@gmail.com> wrote:
>>
>> Hi Stefan,
>>
>> Thanks for the quick response.
>>
>> We are running the tests on a "Core 2 Duo 2.3 GHz, 4 GB RAM running
>> Windows Server 2003" machine.
>>
>> Please find attached the
>>
>> 1. repository.xml
>> 2. indexconfiguration.xml.
>> 3. source java file for upload (ThreadFeeder.java)
>>
>> http://www.nabble.com/file/p24620741/ThreadFeeder.java ThreadFeeder.java
>> http://www.nabble.com/file/p24620741/repository.xml repository.xml
>> http://www.nabble.com/file/p24620741/indexingconfiguration.xml
>> indexingconfiguration.xml
> 
> thanks!
> 
> as far as i can tell, you're not doing anything unreasonable.
> however, you have to be aware that some features come at
> a certain cost.
> 
> 1. fulltext search/text extractors do impact write performance
> significantly.
> 2. versioning: same here, checkin() is a pretty expensive operation on
>     nt:file nodes
> 3. mssql server is not known to be terribly fast (at least not when used
>     as jackrabbit backend).
> 
> in order to identify what's causing the appallingly bad results please
> do the following:
> 1. disable search index or text extractors and compare results
> 2. remove checkin() call and compare results
> 3. use embedded derby and compare results
> 
> if you could provide GenRandom.java, i'll run the test on my own machine.
> 
> cheers
> stefan
> 
>>
>> Kindly let me know your suggestions.
>>
>> Thanks,
>> Ajai G
>>
>>
>>
>>
>> Stefan Guggisberg wrote:
>>>
>>> On Thu, Jul 23, 2009 at 8:10 AM, Ajai<ajaik...@gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I am in the process of evaluating Jackrabbit. We are running a few
>>>> performance tests.
>>>> Here we are adding 25,000 folder nodes, each containing 15
>>>> documents.
>>>>
>>>> It is taking around 37 hours to complete this process. We also tried
>>>> using threads to achieve this,
>>>> but the time still hasn't come down.
>>>>
>>>> It also seems that adding 500 folders with 15 docs each takes ~20
>>>> mins on an empty repository.
>>>>
>>>> After uploading 25,000 folders, adding the same 500 folders with
>>>> 15 docs each takes ~5 hrs.
>>>>
>>>
>>> all figures are way too high. please provide more information on your
>>> setup/configuration and environment. if possible, please also provide
>>> some code of your tests.
>>>
>>> cheers
>>> stefan
>>>
>>>> So is there a way to improve the performance of the above-mentioned
>>>> functions?
>>>>
>>>> Also, kindly suggest an alternative solution for performing bulk uploads.
>>>>
>>>> Thanks
>>>> Ajai G
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> View this message in context:
>>>> http://www.nabble.com/Performance-of-Jackrabbit-tp24619853p24619853.html
>>>> Sent from the Jackrabbit - Dev mailing list archive at Nabble.com.
>>>>
>>>>
>>>
>>>
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Performance-of-Jackrabbit-tp24619853p24620741.html
>> Sent from the Jackrabbit - Dev mailing list archive at Nabble.com.
>>
>>
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Performance-of-Jackrabbit-tp24619853p24680489.html
Sent from the Jackrabbit - Dev mailing list archive at Nabble.com.
