[jira] Commented: (JCR-926) Global data store for binaries

Thomas Mueller (JIRA) Thu, 30 Aug 2007 07:06:55 -0700

    [ https://issues.apache.org/jira/browse/JCR-926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523853 ]


Thomas Mueller commented on JCR-926:
------------------------------------

As far as I understand, one important use case is to use one workspace for 
'authoring' and another for 'production'. The two workspaces contain mostly the 
same data (maybe 90% is the same). With a separate data store for each 
workspace, all large files would have to be copied. With one shared data store, 
each large file is stored only once, which saves almost 50% of the space for 
large objects. You can also move data from one workspace to the other very 
quickly, because only the identifiers have to be copied, not the files. Cloning 
a workspace is very fast for the same reason.
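
To illustrate why copying and cloning get cheap, here is a minimal sketch of 
two workspaces sharing one store. It is not Jackrabbit code: the class 
SharedStoreSketch, its methods, and the "id-N" identifier scheme are all 
hypothetical and only meant to show that a workspace holds identifiers while 
the bytes live in one place.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names are Jackrabbit API.
class SharedStoreSketch {
    // One global store: identifier -> binary content.
    private final Map<String, byte[]> dataStore = new HashMap<>();
    // Each workspace only maps a property path to an identifier.
    private final Map<String, Map<String, String>> workspaces = new HashMap<>();

    // Stores the content once and records its identifier in the workspace.
    String store(String workspace, String path, byte[] content) {
        String id = "id-" + dataStore.size(); // placeholder identifier scheme
        dataStore.put(id, content);
        workspaces.computeIfAbsent(workspace, w -> new HashMap<>()).put(path, id);
        return id;
    }

    // Copying between workspaces copies the identifier, never the bytes.
    void copy(String from, String to, String path) {
        String id = workspaces.get(from).get(path);
        workspaces.computeIfAbsent(to, w -> new HashMap<>()).put(path, id);
    }

    // Cloning a workspace copies only the path -> identifier map.
    void cloneWorkspace(String from, String to) {
        workspaces.put(to, new HashMap<>(workspaces.get(from)));
    }

    byte[] read(String workspace, String path) {
        return dataStore.get(workspaces.get(workspace).get(path));
    }

    int storedObjects() {
        return dataStore.size();
    }
}
```

After store("authoring", ...) followed by copy("authoring", "production", ...), 
both workspaces resolve to the same stored object and the store still holds a 
single copy.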

> i think to use different workspaces is for us the better way .. 
Do you know about the blob store? If not, you should try it out, because it 
sounds like exactly what you need. The blob store is already available.


> Global data store for binaries
> ------------------------------
>
>                 Key: JCR-926
>                 URL: https://issues.apache.org/jira/browse/JCR-926
>             Project: Jackrabbit
>          Issue Type: New Feature
>          Components: core
>            Reporter: Jukka Zitting
>         Attachments: dataStore.patch, DataStore.patch, DataStore2.patch, 
> dataStore3.patch, dataStore4.zip, dataStore5-garbageCollector.patch, 
> internalValue.patch, ReadWhileSaveTest.patch
>
>
> There are three main problems with the way Jackrabbit currently handles large 
> binary values:
> 1) Persisting a large binary value blocks access to the persistence layer for 
> extended amounts of time (see JCR-314)
> 2) At least two copies of binary streams are made when saving them through 
> the JCR API: one in the transient space, and one when persisting the value
> 3) Versioning and copy operations on nodes or subtrees that contain large 
> binary values can quickly end up consuming excessive amounts of storage space.
> To solve these issues (and to get other nice benefits), I propose that we 
> implement a global "data store" concept in the repository. A data store is an 
> append-only set of binary values that uses short identifiers to identify and 
> access the stored binary values. The data store would trivially fit the 
> requirements of transient space and transaction handling due to the 
> append-only nature. An explicit mark-and-sweep garbage collection process 
> could be added to avoid concerns about storing garbage values.
> See the recent NGP value record discussion, especially [1], for more 
> background on this idea.
> [1] 
> http://mail-archives.apache.org/mod_mbox/jackrabbit-dev/200705.mbox/[EMAIL 
> PROTECTED]
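
The append-only store and the mark-and-sweep collection described in the issue 
can be sketched roughly as follows. This is an assumption-laden illustration, 
not the attached patches: the class AppendOnlyStoreSketch and its methods are 
hypothetical, and deriving the identifier from a SHA-256 digest of the content 
is my own choice (the issue only asks for "short identifiers"). With 
content-derived identifiers, versioning or copying a value adds no new record.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not the DataStore from the patches.
class AppendOnlyStoreSketch {
    private final Map<String, byte[]> records = new HashMap<>();
    private final Set<String> marked = new HashSet<>();

    // Append-only: a record is added once and never modified afterwards.
    String add(byte[] content) {
        String id = digest(content);
        records.putIfAbsent(id, content.clone());
        return id;
    }

    byte[] get(String id) {
        byte[] content = records.get(id);
        return content == null ? null : content.clone();
    }

    // Mark phase: the caller reports every identifier still referenced.
    void mark(String id) {
        marked.add(id);
    }

    // Sweep phase: drop unmarked records and return how many were removed.
    int sweep() {
        int before = records.size();
        records.keySet().retainAll(marked);
        marked.clear();
        return before - records.size();
    }

    int size() {
        return records.size();
    }

    // Assumed identifier scheme: hex-encoded SHA-256 of the content.
    private static String digest(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A real collector would have to walk all workspaces and the version storage 
during the mark phase, and would have to cope with values added while the 
collection runs; the sketch ignores both.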

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
