[jira] [Commented] (JCR-3534) Efficient copying of binaries across repositories with the same data store

Jukka Zitting (JIRA) Fri, 10 May 2013 07:55:20 -0700

    [ https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13654527#comment-13654527 ]


Jukka Zitting commented on JCR-3534:
------------------------------------

Note that an explicitly configured or generated "secret" is not necessary in 
all cases, as all we need is some non-public and un-guessable key that is 
different for each distinct backend store. For example, the database and S3 data 
stores could combine the connection parameters, including the secret password 
or access key, to generate such a reference key without the need for extra 
configuration or key storage.
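
To make that concrete, here is a minimal sketch of deriving a key from 
connection parameters. The class name, method name and the choice of SHA-256 
are assumptions for illustration only, not part of any attached patch. Each 
parameter is length-prefixed so that different parameter splits cannot 
collide.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: derives a reference key from existing connection settings.
final class DerivedReferenceKey {

    // Hashes all parameters (for example URL, user, password) into one 32-byte key.
    static byte[] derive(String... params) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            for (String param : params) {
                byte[] bytes = param.getBytes(StandardCharsets.UTF_8);
                // Length prefix keeps ("ab", "c") distinct from ("a", "bc").
                digest.update(ByteBuffer.allocate(4).putInt(bytes.length).array());
                digest.update(bytes);
            }
            return digest.digest();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }
}
```

One consequence of this approach is that changing the password or access key 
also changes the derived key, so previously issued references would stop 
validating.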

So we have at least three ways in which a data store could get such a reference 
key:

1. As an explicit "secret" (perhaps "referenceKey" would be a better name) 
configuration parameter
2. As a random string that's automatically generated at first startup and 
stored along with the rest of the data store contents
3. As a combination of existing configuration parameters
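
As an illustration of option 2, a file-based data store could do something 
like the following. This is only a sketch: the "reference.key" file name, the 
class name and the 32-byte key length are assumptions, and it does not guard 
against two instances racing on first startup.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.SecureRandom;

// Hypothetical helper: keeps a random reference key next to the data store contents.
final class StoredReferenceKey {

    // Assumed file name; nothing in the issue prescribes one.
    static final String FILE_NAME = "reference.key";

    // Returns the stored key, generating and persisting one on first use.
    static byte[] loadOrCreate(Path dataStoreDir) {
        try {
            Files.createDirectories(dataStoreDir);
            Path keyFile = dataStoreDir.resolve(FILE_NAME);
            if (Files.exists(keyFile)) {
                return Files.readAllBytes(keyFile);
            }
            byte[] key = new byte[32];
            new SecureRandom().nextBytes(key);
            Files.write(keyFile, key);
            return key;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the key lives with the data, every repository that mounts the same 
data store directory sees the same key without extra configuration.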

An implementation could even use a combination of the above. I don't have much 
preference for any specific implementation, as they all cover the same basic 
functionality (each with its own minor benefits and downsides compared to the 
others) and there's no inherent difference in security, as anyone with access 
to the <DataStore/> configuration entry will in any case have full access to 
the entire data store.

In any case I agree with Angela that whatever we do, the default repository.xml 
file should *not* contain a pre-defined reference key, as that'll make all 
default deployments share the same key even if their data stores are distinct.

Another change that occurs to me is that the getIdentifierFromReference() 
method should also verify the existence of the referenced record before 
returning the identifier, as otherwise we could end up with dangling references 
to garbage-collected binaries.
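
Roughly, the check could look like the sketch below. It assumes a reference of 
the form "identifier:signature", where the signature is an HMAC of the 
identifier under the reference key. That format, the HmacSHA256 algorithm and 
the in-memory record set are assumptions made for this example, not 
necessarily what the patches do.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.HashSet;
import java.util.Set;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical in-memory stand-in for a data store, for illustration only.
final class ReferenceSketch {
    private final byte[] referenceKey;
    private final Set<String> records = new HashSet<>();

    ReferenceSketch(byte[] referenceKey) {
        this.referenceKey = referenceKey.clone();
    }

    void addRecord(String identifier) { records.add(identifier); }

    // Simulates garbage collection of a record.
    void removeRecord(String identifier) { records.remove(identifier); }

    String getReferenceFromIdentifier(String identifier) {
        return identifier + ':' + sign(identifier);
    }

    // Returns the identifier only if the signature matches AND the record still exists.
    String getIdentifierFromReference(String reference) {
        int colon = reference.lastIndexOf(':');
        if (colon < 0) {
            return null;
        }
        String identifier = reference.substring(0, colon);
        byte[] given = reference.substring(colon + 1).getBytes(StandardCharsets.UTF_8);
        byte[] expected = sign(identifier).getBytes(StandardCharsets.UTF_8);
        // Constant-time comparison, then the existence check discussed above.
        if (MessageDigest.isEqual(expected, given) && records.contains(identifier)) {
            return identifier;
        }
        return null;
    }

    private String sign(String identifier) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(referenceKey, "HmacSHA256"));
            StringBuilder hex = new StringBuilder();
            for (byte b : mac.doFinal(identifier.getBytes(StandardCharsets.UTF_8))) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this shape, a reference to a record that has since been collected is 
rejected in the same way as a reference signed with a different key.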
                
> Efficient copying of binaries across repositories with the same data store
> --------------------------------------------------------------------------
>
>                 Key: JCR-3534
>                 URL: https://issues.apache.org/jira/browse/JCR-3534
>             Project: Jackrabbit Content Repository
>          Issue Type: New Feature
>          Components: jackrabbit-api, jackrabbit-core
>    Affects Versions: 2.6
>            Reporter: Felix Meschberger
>            Assignee: Tommaso Teofili
>         Attachments: JCR-3534.2.patch, JCR-3534.3.patch, JCR-3534.4.patch, 
> JCR-3534.patch, JCR-3534.patch
>
>
> We have a couple of use cases where we would like to leverage the global 
> data store to prevent sending around and copying around large binary data 
> unnecessarily: We have two separate Jackrabbit instances configured to use 
> the same DataStore (for the sake of this discussion assume we have the 
> problems of concurrent access and garbage collection under control). When 
> sending content from one instance to the other instance we don't want to send 
> potentially large binary data (e.g. video files) if not needed.
> The idea is for the sender to just send the content identity from 
> JackrabbitValue.getContentIdentity(). The receiver would then check whether 
> such content already exists and would reuse it if so:
> String ci = contentIdentity_from_sender;
> try {
>     Value v = session.getValueByContentIdentity(ci);
>     Property p = targetNode.setProperty(propName, v);
> } catch (ItemNotFoundException ie) {
>     // unknown or invalid content Identity
> } catch (RepositoryException re) {
>     // some other exception
> }
> Thus the proposed JackrabbitSession.getValueByContentIdentity(String) method 
> would allow for round tripping the JackrabbitValue.getContentIdentity() 
> preventing superfluous binary data copying and moving. 
> See also the dev@ thread 
> http://jackrabbit.markmail.org/thread/gedk5jsrp6offkhi

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
