Hi Claus,
I think we have found the reason for the low performance. We created the
following stress test:
1. Upload 10 identical PDF files with different names from 10 threads.
Each PDF file is 100 MB.
2. Delete the PDF files.
3. Repeat steps 1 and 2 in an infinite loop.
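The loop above can be modelled with a toy sketch. The real test goes through the JCR API; here a ConcurrentHashMap stands in for the repository's binary store, keyed by a content hash, which is the kind of content-addressed storage we suspect is behind the "merging". All names are ours, not Jackrabbit's:

```java
import java.security.MessageDigest;
import java.util.Map;
import java.util.concurrent.*;

// Toy model of the upload phase: identical bytes hash to the same key,
// so ten "files" with different names collapse into one stored blob.
public class StressSketch {
    static final Map<String, byte[]> store = new ConcurrentHashMap<>();

    static String digest(byte[] data) throws Exception {
        StringBuilder sb = new StringBuilder();
        for (byte b : MessageDigest.getInstance("SHA-1").digest(data)) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // Step 1: upload the same payload from `threads` workers under different names.
    static int uploadRound(byte[] payload, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.submit(() -> {
                try {
                    store.put(digest(payload), payload); // identical bytes -> same key
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return store.size(); // number of distinct blobs actually stored
    }

    public static void main(String[] args) throws Exception {
        byte[] pdf = new byte[1024]; // stand-in for the 100 MB PDF
        System.out.println(uploadRound(pdf, 10)); // prints 1
    }
}
```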
Analysis of a thread dump showed that Jackrabbit intensively "merges" PDF
files with each other during every upload or save to the repository. As
far as we understand, Jackrabbit merges files that have different names
but similar binary content in order to save disk space. We think it
stores the original PDF file plus a difference (some delta) for the
second, similar PDF file. I may be wrong, but this is our impression.
Moreover, when a file is deleted, Jackrabbit does not delete it
physically but only marks it as 'deleted'. The real delete operation is
performed later by the Jackrabbit garbage collector. So the situation
could be the following:
1. The test uploads the PDF files.
2. The test deletes the PDF files.
3. The test uploads the PDF files again in the next loop iteration.
4. Jackrabbit merges the PDF files with the already "deleted" ones.
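The delete/GC behaviour we hypothesise can be sketched the same way: "delete" only drops the reference from a file name to a blob, and the blob itself survives until a separate garbage-collection pass sweeps everything that is no longer referenced. Again, the names are invented for illustration, not Jackrabbit internals:

```java
import java.util.*;

// Toy model of mark-for-delete plus deferred garbage collection.
public class GcSketch {
    final Map<String, String> references = new HashMap<>(); // file name -> blob id
    final Map<String, byte[]> blobs = new HashMap<>();      // blob id -> content

    void upload(String name, String blobId, byte[] data) {
        references.put(name, blobId);
        blobs.put(blobId, data);
    }

    void delete(String name) {
        references.remove(name); // the blob stays on disk, merely unreferenced
    }

    // A later GC pass sweeps blobs that no live file points to.
    int garbageCollect() {
        Set<String> live = new HashSet<>(references.values());
        int before = blobs.size();
        blobs.keySet().retainAll(live);
        return before - blobs.size(); // number of blobs actually reclaimed
    }

    public static void main(String[] args) {
        GcSketch g = new GcSketch();
        g.upload("a.pdf", "blob1", new byte[0]);
        g.delete("a.pdf");
        System.out.println(g.blobs.size() + " blob(s) before GC, reclaimed "
                + g.garbageCollect()); // prints 1 blob(s) before GC, reclaimed 1
    }
}
```

In this model a re-upload between delete and GC finds the old blob still present, which would match step 4 above.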
When we ran the same stress test with small files (100 KB), the
performance was much better, because there was no intensive merging.
Is there a way to "tune" the merging, or even switch it off, even if we
lose some disk-space savings?
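For reference, we assume the binaries in our setup are handled by the DataStore configured in repository.xml, along these lines (paths are ours; the class and the minRecordLength parameter are from the Jackrabbit FileDataStore, where minRecordLength controls which small binaries bypass the DataStore entirely — we have not found a documented switch that disables the content-based storage itself):

```xml
<DataStore class="org.apache.jackrabbit.core.data.FileDataStore">
  <param name="path" value="${rep.home}/repository/datastore"/>
  <param name="minRecordLength" value="100"/>
</DataStore>
```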
Regards,
Anton
Hi Anton,
> In our case we have a 400 GB repository; the average number of people
> using Jackrabbit simultaneously is 25. And the configuration of the server is the

I think with that configuration you should be able to handle 25 users in any case :-)

> Do you think it is enough? If not, maybe that is one of the possible reasons
> why the application sometimes does not respond for long periods of time?

Hmm, it's extremely hard to say what's going on in your application ...
If it hangs, you could create a thread dump to analyse where your application is
waiting.
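A thread dump can be taken with `jstack <pid>`, or with `kill -3 <pid>` on Unix, which writes the dump to the JVM's stdout. If you would rather capture one from inside the application, a minimal self-contained sketch using the standard ThreadMXBean API:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Programmatic thread dump: name, state, and stack of every live thread.
public class DumpThreads {
    public static String dump() {
        StringBuilder sb = new StringBuilder();
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (ThreadInfo info : mx.dumpAllThreads(true, true)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("\tat ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Taking two or three dumps a few seconds apart and comparing where the same threads sit is usually the quickest way to spot what the application is waiting on.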
greets
claus