[jira] Commented: (JCR-926) Global data store for binaries

Jukka Zitting (JIRA) Thu, 21 Jun 2007 04:28:55 -0700

    [ https://issues.apache.org/jira/browse/JCR-926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12506853 ]


Jukka Zitting commented on JCR-926:
-----------------------------------

The numbers in the above comment are probably not comparable with the previous 
ones: the setProperty() time should not change considerably with the DataStore 
patch (in fact it should take a bit longer due to the SHA-1 calculation). To 
keep things like disk caches from interfering with the test, I increased the 
size of the test file to 3GB (I only have 1GB RAM).

With DataStore2.patch and FineGrainedISMLocking the result is:

    Thu Jun 21 13:51:09 EEST 2007 - setProperty() - 1
    Thu Jun 21 13:55:17 EEST 2007 - begin save() - 2338
    Thu Jun 21 13:55:18 EEST 2007 - end save() - 2352
    numReads: 2353

setProperty() = 248 seconds, save() = 1 second

Without DataStore2.patch but with FineGrainedISMLocking the result is:

    Thu Jun 21 14:08:33 EEST 2007 - setProperty() - 0
    Thu Jun 21 14:12:58 EEST 2007 - begin save() - 2419
    Thu Jun 21 14:17:03 EEST 2007 - end save() - 4766
    numReads: 4816

setProperty() = 265 seconds, save() = 245 seconds

I guess the stream copy algorithm in FileDataStore is slightly faster than the 
one in BLOBFileValue; otherwise the numbers are pretty much as expected.
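
For reference, a minimal sketch of what "stream copy plus SHA-1 calculation" 
can look like using only the JDK. This is not the code from DataStore2.patch, 
FileDataStore or BLOBFileValue; the class name DigestCopySketch, the method 
copyWithSha1() and the 8 KB buffer size are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.security.DigestOutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Jackrabbit.
final class DigestCopySketch {

    /**
     * Copies everything from in to out through a fixed-size buffer and
     * returns the SHA-1 digest of the copied bytes as lowercase hex.
     */
    static String copyWithSha1(InputStream in, OutputStream out)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
        // Every byte written through this wrapper also updates the digest,
        // so the data is read only once.
        DigestOutputStream digestOut = new DigestOutputStream(out, sha1);
        byte[] buffer = new byte[8192]; // assumed buffer size
        int n;
        while ((n = in.read(buffer)) != -1) {
            digestOut.write(buffer, 0, n);
        }
        digestOut.flush();

        StringBuilder hex = new StringBuilder();
        for (byte b : sha1.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

Because the digest is updated during the same pass that writes the data, the 
extra cost over a plain copy is the hashing itself, which fits the expectation 
that setProperty() should be only slightly slower with the patch.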

> Global data store for binaries
> ------------------------------
>
>                 Key: JCR-926
>                 URL: https://issues.apache.org/jira/browse/JCR-926
>             Project: Jackrabbit
>          Issue Type: New Feature
>          Components: core
>            Reporter: Jukka Zitting
>         Attachments: DataStore.patch, DataStore2.patch, 
> ReadWhileSaveTest.patch
>
>
> There are three main problems with the way Jackrabbit currently handles large 
> binary values:
>
> 1) Persisting a large binary value blocks access to the persistence layer for 
> extended amounts of time (see JCR-314).
> 2) At least two copies of binary streams are made when saving them through 
> the JCR API: one in the transient space, and one when persisting the value.
> 3) Versioning and copy operations on nodes or subtrees that contain large 
> binary values can quickly end up consuming excessive amounts of storage space.
>
> To solve these issues (and to get other nice benefits), I propose that we 
> implement a global "data store" concept in the repository. A data store is an 
> append-only set of binary values that uses short identifiers to identify and 
> access the stored binary values. The data store would trivially fit the 
> requirements of transient space and transaction handling due to the 
> append-only nature. An explicit mark-and-sweep garbage collection process 
> could be added to avoid concerns about storing garbage values.
>
> See the recent NGP value record discussion, especially [1], for more 
> background on this idea.
> [1] 
> http://mail-archives.apache.org/mod_mbox/jackrabbit-dev/200705.mbox/[EMAIL 
> PROTECTED]
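
The data store idea described in the issue can be sketched roughly as follows. 
This is an in-memory illustration only, not the API from the attached patches: 
the class InMemoryDataStoreSketch and its methods addRecord(), getRecord(), 
size() and sweep() are hypothetical names, and using the SHA-1 of the content 
as the short identifier is an assumption.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model of an append-only data store.
final class InMemoryDataStoreSketch {

    private final Map<String, byte[]> records = new HashMap<>();

    /**
     * Adds a binary value and returns its identifier. Existing records are
     * never modified, so adding the same content twice stores it only once.
     */
    String addRecord(byte[] value) throws NoSuchAlgorithmException {
        // Assumption: the identifier is the SHA-1 of the content.
        String id = toHex(MessageDigest.getInstance("SHA-1").digest(value));
        records.putIfAbsent(id, value.clone());
        return id;
    }

    /** Returns a copy of the stored value, or null if the identifier is unknown. */
    byte[] getRecord(String id) {
        byte[] value = records.get(id);
        return value == null ? null : value.clone();
    }

    /** Number of distinct values currently stored. */
    int size() {
        return records.size();
    }

    /**
     * Sweep phase of a mark-and-sweep collection: the caller passes the set
     * of identifiers that are still referenced (the "marked" set), and every
     * other record is dropped.
     */
    void sweep(Set<String> marked) {
        records.keySet().retainAll(marked);
    }

    private static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

With content-derived identifiers, versioning or copying a node only has to 
copy the identifier, which is how such a store could address problem 3 above. 
Removal happens only in the explicit sweep, so normal writes stay append-only.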

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
