[ https://issues.apache.org/jira/browse/JCR-926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12524012 ]
Claus Köll commented on JCR-926:
--------------------------------

OK, if you have a use case as you described, I think a global data store is the best way to make cross-workspace operations easier: you have only one file and no copies of the same file. If the road leads to a centralized repository, a global data store of course makes sense.

In my case (and I think others have a similar use case) a per-workspace data store makes things easier. I work for a government, and the office employees get a lot of paper every day. They scan it and put it into Jackrabbit. By law we must keep the documents for up to 5-7 years with fast read access in Jackrabbit. After that time we can archive them (slow access), and therefore we do not want to keep these documents on SAN storage (because it is expensive) but rather move them to a cheaper storage system (a tape drive system). We have planned to do this by moving the data from one workspace (SAN) to another one (tape drive system). I think this is not possible with the global data store. How would you solve such scenarios?
greets
claus

> Global data store for binaries
> ------------------------------
>
>                 Key: JCR-926
>                 URL: https://issues.apache.org/jira/browse/JCR-926
>             Project: Jackrabbit
>          Issue Type: New Feature
>          Components: core
>            Reporter: Jukka Zitting
>         Attachments: dataStore.patch, DataStore.patch, DataStore2.patch, dataStore3.patch, dataStore4.zip, dataStore5-garbageCollector.patch, internalValue.patch, ReadWhileSaveTest.patch
>
> There are three main problems with the way Jackrabbit currently handles large binary values:
>
> 1) Persisting a large binary value blocks access to the persistence layer for extended amounts of time (see JCR-314)
> 2) At least two copies of binary streams are made when saving them through the JCR API: one in the transient space, and one when persisting the value
> 3) Versioning and copy operations on nodes or subtrees that contain large binary values can quickly end up consuming excessive amounts of storage space.
>
> To solve these issues (and to get other nice benefits), I propose that we implement a global "data store" concept in the repository. A data store is an append-only set of binary values that uses short identifiers to identify and access the stored binary values. The data store would trivially fit the requirements of transient space and transaction handling due to the append-only nature. An explicit mark-and-sweep garbage collection process could be added to avoid concerns about storing garbage values.
>
> See the recent NGP value record discussion, especially [1], for more background on this idea.
>
> [1] http://mail-archives.apache.org/mod_mbox/jackrabbit-dev/200705.mbox/[EMAIL PROTECTED]

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
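The quoted proposal describes an append-only set of binary values addressed by short identifiers, where identical content is stored only once. A minimal sketch of that idea, assuming content-hash (SHA-256) identifiers and a flat directory layout (hypothetical class and method names, not the actual Jackrabbit DataStore API):

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.util.HexFormat;

// Minimal append-only, content-addressed data store sketch.
// The identifier of a binary value is the hex SHA-256 of its content,
// so saving the same stream twice stores only one copy.
public class SimpleDataStore {
    private final Path root;

    public SimpleDataStore(Path root) throws IOException {
        this.root = Files.createDirectories(root);
    }

    /** Stores the bytes and returns their content-hash identifier. */
    public String addRecord(byte[] data) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        String id = HexFormat.of().formatHex(md.digest(data));
        Path file = root.resolve(id);
        if (!Files.exists(file)) {        // append-only: never overwrite
            Files.write(file, data);
        }
        return id;
    }

    /** Opens a stream on a previously stored record. */
    public InputStream getRecord(String id) throws IOException {
        return Files.newInputStream(root.resolve(id));
    }

    public static void main(String[] args) throws Exception {
        SimpleDataStore store = new SimpleDataStore(
                Files.createTempDirectory("datastore"));
        String id1 = store.addRecord("hello".getBytes());
        String id2 = store.addRecord("hello".getBytes());
        // Identical content yields the same identifier -> one stored copy.
        System.out.println(id1.equals(id2));   // prints "true"
    }
}
```

Because records are immutable and never deleted on save, copy and versioning operations only need to duplicate the short identifier, and unreferenced records can later be reclaimed by the mark-and-sweep garbage collection the proposal mentions.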