[ https://issues.apache.org/jira/browse/JCR-926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12506853 ]
Jukka Zitting commented on JCR-926:
-----------------------------------

The numbers in the above comment are probably not comparable with the previous ones; the setProperty() time should not change considerably with the DataStore patch (in fact it should take a bit longer due to the SHA-1 calculation).

To keep things like disk caches from interfering with the test, I increased the size of the test file to 3GB (I only have 1GB RAM).

With DataStore2.patch and FineGrainedISMLocking the result is:

    Thu Jun 21 13:51:09 EEST 2007 - setProperty() - 1
    Thu Jun 21 13:55:17 EEST 2007 - begin save() - 2338
    Thu Jun 21 13:55:18 EEST 2007 - end save() - 2352
    numReads: 2353

    setProperty() = 248 seconds, save() = 1 second

Without DataStore2.patch but with FineGrainedISMLocking the result is:

    Thu Jun 21 14:08:33 EEST 2007 - setProperty() - 0
    Thu Jun 21 14:12:58 EEST 2007 - begin save() - 2419
    Thu Jun 21 14:17:03 EEST 2007 - end save() - 4766
    numReads: 4816

    setProperty() = 265 seconds, save() = 245 seconds

I guess the stream copy algorithm in FileDataStore is slightly faster than the one in BLOBFileValue; otherwise the numbers are pretty much as expected.

> Global data store for binaries
> ------------------------------
>
>                 Key: JCR-926
>                 URL: https://issues.apache.org/jira/browse/JCR-926
>             Project: Jackrabbit
>          Issue Type: New Feature
>          Components: core
>            Reporter: Jukka Zitting
>         Attachments: DataStore.patch, DataStore2.patch, ReadWhileSaveTest.patch
>
>
> There are three main problems with the way Jackrabbit currently handles large binary values:
> 1) Persisting a large binary value blocks access to the persistence layer for extended amounts of time (see JCR-314)
> 2) At least two copies of binary streams are made when saving them through the JCR API: one in the transient space, and one when persisting the value
> 3) Versioning and copy operations on nodes or subtrees that contain large binary values can quickly end up consuming excessive amounts of storage space.
> To solve these issues (and to get other nice benefits), I propose that we implement a global "data store" concept in the repository. A data store is an append-only set of binary values that uses short identifiers to identify and access the stored binary values. The data store would trivially fit the requirements of transient space and transaction handling due to the append-only nature. An explicit mark-and-sweep garbage collection process could be added to avoid concerns about storing garbage values.
> See the recent NGP value record discussion, especially [1], for more background on this idea.
> [1] http://mail-archives.apache.org/mod_mbox/jackrabbit-dev/200705.mbox/[EMAIL PROTECTED]
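
The append-only, identifier-addressed store proposed in the issue can be illustrated roughly as follows. This is only a minimal sketch and not the code in DataStore.patch or DataStore2.patch: the class name SketchDataStore and the methods addRecord() and getRecord() are hypothetical, and it assumes that the identifier is the hex SHA-1 digest of the content and that each value lives in one file named after its identifier.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical sketch of an append-only, content-addressed binary store. */
class SketchDataStore {

    private final Path directory;

    SketchDataStore(Path directory) throws IOException {
        this.directory = Files.createDirectories(directory);
    }

    /** Copies the stream once and returns its hex SHA-1 digest as identifier. */
    String addRecord(InputStream input) throws IOException {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            throw new IOException("SHA-1 is not available", e);
        }

        // Write to a temporary file while the digest is computed on the fly,
        // so the stream is read only once.
        Path temporary = Files.createTempFile(directory, "tmp", null);
        DigestInputStream in = new DigestInputStream(input, digest);
        try (OutputStream out = Files.newOutputStream(temporary)) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }

        StringBuilder identifier = new StringBuilder();
        for (byte b : digest.digest()) {
            identifier.append(String.format("%02x", b & 0xff));
        }

        // Append-only: an existing record is never overwritten. Identical
        // content maps to the same identifier and is stored only once.
        Path target = directory.resolve(identifier.toString());
        if (Files.exists(target)) {
            Files.delete(temporary);
        } else {
            Files.move(temporary, target);
        }
        return identifier.toString();
    }

    /** Opens the stored value for the given identifier. */
    InputStream getRecord(String identifier) throws IOException {
        return Files.newInputStream(directory.resolve(identifier));
    }
}
```

In such a scheme, a version or a copy of a node only needs to keep the short identifier, so storing the same binary twice costs no extra space. Concurrent writers and the mark-and-sweep garbage collection mentioned in the issue are left out of the sketch.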