Hi Angela,

On 10 May 2016 at 17:19, Angela Schreiber <anch...@adobe.com> wrote:

> Hi Ian
>
> >Fair enough, provided there is a solution that addresses the issue Chetan
> >is trying to address.
>
> That's what we are all looking for :)
>
> >The alternative, for some applications, seems to store the binary data
> >outside Oak, which defeats the purpose completely.
>
> You mean with the current setup, right?
>

yes.


>
> That might well be... while I haven't been involved with a concrete
> case I wouldn't categorically reject that this might in some cases
> even be the right solution.
> But maybe I am biased due to the fact that we also have a big
> community that effectively stores and manages their user/group
> accounts outside the repository and where I am seeing plenty of
> trouble with the conception that those accounts _must_ be synced
> (i.e. copied) into the repo.
>
> So, I'd definitely like to understand why you think that this
> "completely defeats the purpose". I agree that it's not always
> desirable but nevertheless there might be valid use-cases.
>


If the purpose of Oak is to provide a content repository for storing metadata
and assets, then an application built on top of Oak that has to store its
asset data (blobs) outside Oak in order to meet its scalability targets
defeats that purpose. Oak should support storing assets within Oak in a way
that meets the scalability requirements of the application. Since those
requirements are non-trivial and hard to quantify, that means horizontal
scalability limited only by the budget available to purchase VMs or hardware.

You could argue that horizontal scalability is not really required.
I can share use cases, not exactly the same ones Chetan is working on, where
it is.
Sorry, I can't share them on list.



>
> >I don't have a perfect handle on the issue he is trying to address or what
> >would be an acceptable solution, but I suspect the only solution that is
> >not vulnerable by design will be a solution that abstracts all the required
> >functionality behind an Oak API (i.e. no S3Object, File object or anything
> >that could leak) and then provide all the required functionality with an
> >acceptable level of performance in the implementation. That is doable, but
> >a lot more work.
>
> Not sure about that :-)
> Quite frankly, I would very much appreciate it if we took the time to
> collect and write down the required (i.e. currently known and expected)
> functionality.
>

In the context of what I said above, for an AWS deployment that means
wrapping [1] so nothing can leak, and supporting almost everything expressed
by [2] via an Oak API/jar in a way that enables horizontal scalability.
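To make that a bit more concrete, here is a rough, hypothetical sketch (all names invented by me, not a proposal for the final API) of the direction I mean: an Oak-level interface offering the optimized operations without exposing any backend type, plus a trivial in-memory stand-in showing that calling code stays backend-agnostic.

```java
// Hypothetical sketch only: OakBlobOperations and InMemoryBlobOperations
// are invented names for illustration, not part of any existing Oak API.
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Optimized blob operations with no backend type (S3Object, File) leaking out. */
interface OakBlobOperations {
    /**
     * Server-side copy of a binary. An S3 backend could implement this with
     * the native copy operation from [2]; others fall back to streaming.
     */
    boolean copy(String sourceBlobId, String targetBlobId);

    /** A read URI for direct download, if the backend supports one. */
    Optional<String> readUri(String blobId);
}

/** Trivial in-memory stand-in showing that callers never see the backend. */
class InMemoryBlobOperations implements OakBlobOperations {
    private final Map<String, byte[]> store = new HashMap<>();

    void put(String id, byte[] data) { store.put(id, data); }
    byte[] get(String id) { return store.get(id); }

    @Override
    public boolean copy(String sourceBlobId, String targetBlobId) {
        byte[] data = store.get(sourceBlobId);
        if (data == null) {
            return false; // unknown source blob
        }
        store.put(targetBlobId, data.clone());
        return true;
    }

    @Override
    public Optional<String> readUri(String blobId) {
        return Optional.empty(); // an in-memory backend has no direct URI
    }
}

public class BlobOpsSketch {
    public static void main(String[] args) {
        InMemoryBlobOperations ops = new InMemoryBlobOperations();
        ops.put("a", new byte[]{1, 2, 3});
        if (!ops.copy("a", "b")) throw new AssertionError("copy should succeed");
        if (!Arrays.equals(ops.get("a"), ops.get("b"))) throw new AssertionError("copied bytes differ");
        if (ops.copy("missing", "c")) throw new AssertionError("copy of unknown blob must fail");
        System.out.println("ok");
    }
}
```

The point is the shape: the horizontal scalability comes from what the backend does underneath (e.g. a native S3 copy), while the API surface stays free of S3Object or File.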


>
> Then look at the requirements and look what is wrong with the current
> API that we can't meet those requirements:
> - is it just missing API extensions that can be added with moderate effort?
> - are there fundamental problems with the current API that we needed to
> address?
> - maybe we even have intrinsic issues with the way we think about the role
> of the repo?
>
> IMHO, sticking to kludges might look promising on a short term but
> I am convinced that we are better off with a fundamental analysis of
> the problems... after all the Binary topic comes up on a regular basis.
> That leaves me with the impression that yet another tiny extra and
> adaptables won't really address the core issues.
>

I agree.
It comes up time and again because applications are being asked to do
something Oak does not currently support, so developers look for a
workaround.
It should be done properly, once and for all.
imvho, that is a lot of work up front, but since I am not the one doing the
work, it's not right for me to estimate or suggest anyone do it.

Best Regards
Ian

[1]
http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/model/S3Object.html
[2] http://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectOps.html



> Kind regards
> Angela
>
>
>
> >
> >
> >Best Regards
> >Ian
> >
> >
> >>
> >> Kind regards
> >> Angela
> >>
> >> >
> >> >Best Regards
> >> >Ian
> >> >
> >> >
> >> >On 3 May 2016 at 15:36, Chetan Mehrotra <chetan.mehro...@gmail.com>
> >> wrote:
> >> >
> >> >> Hi Team,
> >> >>
> >> >> For OAK-1963 we need to allow access to the actual Blob location, say
> >> >> in the form of a File instance or an S3 object id etc. This access is
> >> >> needed to perform optimized IO operations around binary objects e.g.
> >> >>
> >> >> 1. The File object can be used to spool the file content with zero
> >> >> copy using NIO by accessing the File Channel directly [1]
> >> >>
> >> >> 2. Client code can efficiently replicate a binary stored in S3 by
> >> >> having direct access to the S3 object using a copy operation
> >> >>
> >> >> To allow such access we would need a new API in the form of
> >> >> AdaptableBinary.
> >> >>
> >> >> API
> >> >> ===
> >> >>
> >> >> public interface AdaptableBinary {
> >> >>
> >> >>     /**
> >> >>      * Adapts the binary to another type like File, URL etc
> >> >>      *
> >> >>      * @param <AdapterType> The generic type to which this binary is
> >> >>      *            adapted to
> >> >>      * @param type The Class object of the target type, such as
> >> >>      *            <code>File.class</code>
> >> >>      * @return The adapter target or <code>null</code> if the binary
> >> >>      *         cannot adapt to the requested type
> >> >>      */
> >> >>     <AdapterType> AdapterType adaptTo(Class<AdapterType> type);
> >> >> }
> >> >>
> >> >> Usage
> >> >> =====
> >> >>
> >> >> Binary binProp = node.getProperty("jcr:data").getBinary();
> >> >>
> >> >> //Check if Binary is of type AdaptableBinary
> >> >> if (binProp instanceof AdaptableBinary) {
> >> >>     AdaptableBinary adaptableBinary = (AdaptableBinary) binProp;
> >> >>
> >> >>     // Adapt it to a File instance
> >> >>     File file = adaptableBinary.adaptTo(File.class);
> >> >> }
> >> >>
> >> >>
> >> >>
> >> >> The Binary instance returned by Oak,
> >> >> i.e. org.apache.jackrabbit.oak.plugins.value.BinaryImpl, would then
> >> >> implement this interface; calling code can then check the type, cast
> >> >> it, and adapt it
> >> >>
> >> >> Key Points
> >> >> ========
> >> >>
> >> >> 1. Depending on the backing BlobStore the binary can be adapted to
> >> >> various types. For FileDataStore it can be adapted to File. For
> >> >> S3DataStore it can be adapted either to a URL or to some
> >> >> S3DataStore-specific type.
> >> >>
> >> >> 2. Security - Thomas suggested that for better security the ability
> >> >> to adapt should be restricted based on session permissions. So
> >> >> adaptation would work only if the user has the required permission;
> >> >> otherwise null would be returned.
> >> >>
> >> >> 3. Adaptation proposal is based on Sling Adaptable [2]
> >> >>
> >> >> 4. This API is for now exposed only at the JCR level. Not sure
> >> >> whether we should do it at the Oak level, as Blob instances are
> >> >> currently not bound to any session. So the proposal is to place this
> >> >> in the 'org.apache.jackrabbit.oak.api' package
> >> >>
> >> >> Kindly provide your feedback! Also any suggestions/guidance around
> >> >> how the access control could be implemented
> >> >>
> >> >> Chetan Mehrotra
> >> >> [1] http://www.ibm.com/developerworks/library/j-zerocopy/
> >> >> [2]
> >> >>
> >> >>
> >> >>
> >>
> >>
> https://sling.apache.org/apidocs/sling5/org/apache/sling/api/adapter/Adaptable.html
> >> >>
> >>
> >>
>
>
