Hi Claus,

I think we have found the reason for the low performance. We created the following stress test:
1. Uploading 10 identical PDF files with different names in 10 threads. Each PDF file is 100 MB in size.
2. Deleting the PDF files.
3. Repeating steps 1 and 2 in an infinite loop.
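
To make the test easier to reproduce, here is a minimal sketch of the loop described above. It is not our actual test code: the FileStore interface, the InMemoryStore stub, and the file names are hypothetical placeholders so that the sketch runs on its own. A real run would put the repository upload and delete calls behind FileStore and use 100 MB files.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class UploadDeleteStressTest {

    /** Hypothetical stand-in for the repository; a real test would wrap its upload/delete calls here. */
    interface FileStore {
        void upload(String name, byte[] content) throws Exception;
        void delete(String name) throws Exception;
    }

    /** In-memory stub (assumption) so the sketch runs without a repository. */
    static class InMemoryStore implements FileStore {
        final Map<String, byte[]> files = new ConcurrentHashMap<>();
        public void upload(String name, byte[] content) { files.put(name, content); }
        public void delete(String name) { files.remove(name); }
    }

    /**
     * Runs the given number of cycles. Each cycle uploads one file per thread
     * (identical content, different names) in parallel, then deletes them all.
     * Returns the total number of completed uploads.
     */
    static int run(FileStore store, int cycles, int threads, int fileSize) throws Exception {
        byte[] content = new byte[fileSize]; // same bytes for every file
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        int uploads = 0;
        try {
            for (int c = 0; c < cycles; c++) {
                long start = System.nanoTime();

                // Step 1: upload identical files under different names, one per thread.
                List<Future<String>> pending = new ArrayList<>();
                for (int t = 0; t < threads; t++) {
                    String name = "test-" + c + "-" + t + ".pdf";
                    pending.add(pool.submit(() -> {
                        store.upload(name, content);
                        return name;
                    }));
                }
                List<String> uploaded = new ArrayList<>();
                for (Future<String> f : pending) {
                    uploaded.add(f.get()); // wait for the upload; rethrows failures
                    uploads++;
                }

                // Step 2: delete the files that were just uploaded.
                for (String name : uploaded) {
                    store.delete(name);
                }

                long millis = (System.nanoTime() - start) / 1_000_000;
                System.out.println("cycle " + c + " took " + millis + " ms");
            }
        } finally {
            pool.shutdown();
        }
        return uploads;
    }

    public static void main(String[] args) throws Exception {
        // Step 3: the real test loops forever; three cycles are enough for a demonstration.
        run(new InMemoryStore(), 3, 10, 1024 * 1024);
    }
}
```

Printing the time per cycle makes it easy to see whether later cycles get slower than the first one.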

Analysis of the thread dumps showed that Jackrabbit intensively "merges" PDF files with each other during every upload or save to the repository. As far as we understand, Jackrabbit merges files that have different names but similar binary content in order to save disk space. We think it stores the original PDF file plus a difference (some delta) for the second, similar PDF file. I may be wrong, but this is our impression. Moreover, when a file is deleted, Jackrabbit does not delete it physically but only marks it as 'deleted'. The actual deletion is performed later by the Jackrabbit garbage collector. So the situation could be the following:
1. The test uploads the PDF files.
2. The test deletes the PDF files.
3. The test uploads the PDF files again in the second loop.
4. Jackrabbit merges the new PDF files with the already "deleted" ones.

When we ran the same stress test with small files (100 KB), the performance was much better, because there was no intensive merging.

Is there a way to "tune" the merging or even switch it off, even if we lose some of the disk space savings?

Regards,
Anton
Hi Anton,

> In our case we have a 400 GB repository, and the average number of
> people using Jackrabbit simultaneously is 25. And the configuration of server is the
I think with that configuration you should be able to handle 25 users in any case :-)

> Do you think it is enough? If not, maybe that is one of the possible reasons
> why sometimes the application does not respond for long periods of time?
Hmm, it's extremely hard to say what's going on in your application ...
If it hangs, you could create a thread dump to analyse where your application
is waiting.
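
From the outside, the JDK's jstack tool (jstack <pid>) prints such a dump, and on Unix a kill -3 <pid> makes a HotSpot JVM write one to its standard output. If you would rather trigger it from inside the application, something like the following rough sketch should work. It uses only the standard java.lang.management API; the class name ThreadDumper and the output format are just my own choices for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDumper {

    /** Returns a text dump of all live threads with their state, awaited lock and stack. */
    static String dump() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        // Only request lock details if this JVM can provide them.
        ThreadInfo[] infos = bean.dumpAllThreads(
                bean.isObjectMonitorUsageSupported(),
                bean.isSynchronizerUsageSupported());

        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : infos) {
            out.append('"').append(info.getThreadName()).append('"')
               .append(" state=").append(info.getThreadState());
            if (info.getLockName() != null) {
                // The lock or monitor this thread is blocked or waiting on.
                out.append(" waiting on ").append(info.getLockName());
            }
            out.append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Taking two or three dumps a few seconds apart while the application hangs shows which threads stay stuck in the same place.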

greets
claus
