Hi,

We have to store a large amount of data in our repository, using this kind of tree:
Project1
 |_Stream1
 |  |__Record1
 |  |__Record2
 |  ...
 |  |__Record120000
 ...
 |_Stream2
 |  |__Record1
 |  |__Record2
 |  ...
 |  |__Record120000
etc.

It takes some time to add those records, which was expected, but removing them is even more time-consuming (sometimes it even crashes the VM). I understand this has to do with Jackrabbit keeping everything in memory to check for referential integrity violations.

While searching the mailing list for answers I saw two ways of dealing with this:

1. Deactivate referential integrity checking. I tried that, and it did not seem to speed up the process, so I may be doing it wrong. (And I suspect it's quite wrong to even do it.)
2. Recursively remove the nodes in batches ("packs").

I noticed that with the second method, the more children a node has, the longer it takes to remove any of them. So I suspect it would be best to split the records across multiple subtrees.

So I'd like to know:

- Is there a better way of organizing my data to speed up the add and remove operations?
- Is deactivating referential integrity checking really risky, and how am I supposed to do it? (I tried subclassing RepositoryImpl and calling setReferentialIntegrityChecking, but it didn't seem to change anything.)

Thank you for your help.

A. Mariette
DOCXA

--
View this message in context: http://jackrabbit.510166.n4.nabble.com/Performance-issue-when-removing-high-amount-of-nodes-tp3175050p3175050.html
Sent from the Jackrabbit - Users mailing list archive at Nabble.com.
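To illustrate the subtree-splitting idea: instead of keeping 120000 flat children under one Stream node, each record could be placed in a fixed-size bucket folder so that no node ever has more than a bounded number of children. A minimal sketch of the path computation (plain Java; the bucket size of 1000 and the naming scheme are my own assumptions, not anything Jackrabbit requires):

```java
public class RecordPaths {
    // Assumed bucket size; tune so each node's child list stays small.
    static final int BUCKET_SIZE = 1000;

    /**
     * Maps a record index to a hierarchical relative path so that no
     * parent holds more than BUCKET_SIZE children,
     * e.g. index 120000 -> "bucket0120/record120000".
     */
    static String pathFor(int recordIndex) {
        int bucket = recordIndex / BUCKET_SIZE;
        return String.format("bucket%04d/record%d", bucket, recordIndex);
    }

    public static void main(String[] args) {
        System.out.println(pathFor(7));      // bucket0000/record7
        System.out.println(pathFor(120000)); // bucket0120/record120000
    }
}
```

With a layout like this, removing a whole bucket touches only ~1000 siblings at a time instead of the full 120000.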
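For clarity, here is the "removing by packs" pattern I mean, stripped of JCR specifics: stage a bounded number of removals, persist, and repeat, so the pending change set stays small. This is only a sketch; the Store interface below is my own stand-in for a JCR session, not a real Jackrabbit API:

```java
import java.util.List;

public class BatchRemove {
    /** Stand-in for a JCR session: stage removals, then persist them. */
    interface Store {
        List<String> childPaths(String parent); // children of a node
        void stageRemove(String path);          // analogous to Node.remove()
        void save();                            // analogous to Session.save()
    }

    /** Removes all children of parent, persisting at most batchSize removals per save. */
    static void removeInBatches(Store store, String parent, int batchSize) {
        int staged = 0;
        for (String child : store.childPaths(parent)) {
            store.stageRemove(child);
            if (++staged % batchSize == 0) {
                store.save(); // keep the transient change set bounded
            }
        }
        store.save(); // persist the final partial batch
    }
}
```

The point of the intermediate saves is that the memory held for pending changes (including any integrity bookkeeping) is proportional to the batch size rather than to the total number of records.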
