Hi Ian,

Thank you very much for the explanation. Before replying to this mail, I revisited the facts you mentioned and tried to come up with an end-to-end big picture, including the challenges we will face when implementing this project.

Reading and writing Cassandra data seems pretty straightforward: a few lines of code using a client API. The tricky part is building a clean bridge between the JCR-wrapped Sling resource API and Cassandra's column family data storage. So I took some time to go through the JCR spec and tried to understand how it deals with resources (I assumed a Sling resource is directly based on the JCR node concept), and I now have a good understanding of how JCR models resources. This matters because we need to think about the mapping between the Sling wrapper interface for resources, org.apache.sling.api.resource.Resource (which is a JCR Node, as I understand it), and the Cassandra data layer.

For instance, the Cassandra provider will return a Sling resource, and it should be enriched with the properties/attributes that help the Sling resource keep its state, such as resource metadata, resource type (which should be the JCR node type), etc. The provider should return a resource that carries only this very basic metadata. For instance, we should not keep in memory what org.apache.sling.api.resource.Resource #getChild() and #getChildren() return; we should fetch those on the fly from Cassandra.

I think we should write a separate Sling Cassandra Adapter layer, and the provider should talk to Cassandra through that adapter. I hope this will make it cleaner.

I would appreciate your valuable feedback. Based on it, I can provide a patch that reflects the basic architecture and keep patching with future additions.

On Fri, Mar 29, 2013 at 3:48 AM, Ian Boston <[email protected]> wrote:
> Hi and welcome,
> Some comments inline below.
>
> On 29 March 2013 06:02, Dishara Wijewardana <[email protected]> wrote:
> > Hi all,
> > I am Dishara Wijewardana, a student who is willing to take part in this
> > GSoC 2013.
> >
> > I have successfully completed GSoC 2012 in Apache Velocity and there I have
> > implemented JSR 223 support for Velocity.
> > I found myself really interested
> > in this project since it covers very useful and interesting topics. So I
> > thought of getting into this project idea and providing a good proposal for
> > this project.
> >
> > So I did some research around Sling which might be useful for me to get into
> > this project. I like Sling as it sticks to community standards where it
> > uses a standard JCR2 repository to store resources, which is a really good
> > thing to have.
> >
> > I went through the information provided in the JIRA[1] and according to
> > that, at the end of this project what is expected to be implemented is a
> > ResourceProvider for Sling which tunnels with a Cassandra (standalone
> > one/cluster).
>
> yes, correct.
>
> >
> > As far as I got to know, Sling directly calls Apache Jackrabbit APIs
> > (JCR APIs) to store resources. So I found this project idea a bit complicated
> > in that sense. Because if we are to implement a Cassandra backend for
> > Sling (as per this proposal), and Sling storage is on top of Jackrabbit,
> > ideally what should happen is to make Jackrabbit capable of using Cassandra
> > as its resource persistence layer, and configure it through Sling? Please
> > correct me if I am wrong.
>
> You're right.
> The idea is this: Sling resolves paths into Resources,
> i.e. /content/mywebsite/page1.html is resolved to a Resource with a path
> of /content/mywebsite/page1. See [1]. Normally a JCR repository
> takes ownership of everything under /, so all Resources are JCR
> Resources.
>
> However, with a ResourceProvider it's possible to "mount" an alternative
> source of Resources at any location in the tree, e.g.:
> If I create a ResourceProvider and configure it to respond to all
> resource resolution operations at
> /content/cassandra
>
> then
> /content/cassandra/columnFamilyA/cassandraRowIDB
>
> will generate a Cassandra Resource instead of a JCR Resource.
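
To check that I understood the mounting idea, here is a minimal sketch in plain Java of how a provider might split such a path into a column family and a row key. It does not use the Sling API; the class CassandraPath, its fields, and the fixed mount point are my assumptions for illustration, taken from your example path.

```java
import java.util.Optional;

/**
 * Hypothetical helper (not part of Sling): maps a resource path under an
 * assumed mount point to a Cassandra column family and row key.
 */
final class CassandraPath {
    /** Assumed mount point, taken from the example in this thread. */
    static final String MOUNT = "/content/cassandra";

    final String columnFamily;
    final String rowKey; // null when the path names only a column family

    private CassandraPath(String columnFamily, String rowKey) {
        this.columnFamily = columnFamily;
        this.rowKey = rowKey;
    }

    /** Returns empty for paths outside the mount or with an unexpected shape. */
    static Optional<CassandraPath> parse(String path) {
        if (path == null || !path.startsWith(MOUNT + "/")) {
            return Optional.empty(); // not under the mount; another provider owns it
        }
        String[] parts = path.substring(MOUNT.length() + 1).split("/");
        if (parts.length < 1 || parts.length > 2) {
            return Optional.empty(); // only <columnFamily>[/<rowKey>] is handled here
        }
        for (String part : parts) {
            if (part.isEmpty()) {
                return Optional.empty(); // reject empty segments such as "//"
            }
        }
        return Optional.of(new CassandraPath(parts[0], parts.length == 2 ? parts[1] : null));
    }
}
```

With this, /content/cassandra/columnFamilyA/cassandraRowIDB maps to column family columnFamilyA and row key cassandraRowIDB, while /content/mywebsite/page1 yields an empty result and would be left to the JCR side.
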
>
> Initially the aim is to write a ResourceProvider that will allow
> read-only access to a Cassandra cluster (cluster of one is ok for
> testing), but ultimately we would like to be able to write to that
> cluster as well.
>
> Why do it?
> Every storage platform has different characteristics: some are ideal
> for extreme-volume writes of throwaway data, some are ideal for
> extreme-volume reads of precious audited transactional data. Being
> able to "mount" multiple stores in Sling enables Sling to integrate
> data from all types of sources, using best of breed to address each use
> case. (That's the theory, anyway :))
>

+1, and this is a wonderful architecture in terms of extensibility. It is something even a repository vendor like Jackrabbit would want to follow, because they only have a JCR-interfaced tree.

>
> I hope that makes things clearer.
>
> 1 http://sling.apache.org/site/resources.html
>
> >
> > But if it is only to READ resources, this project is relatively less
> > complex (not quite sure though ;-) ) since what is required is to have a
> > JCR/Sling Resource compatible wrapper layer interface on top of Cassandra
> > to read Cassandra data.
>
> Initially, just read. Then read with access control. Then read/write
> with access control.
>

Read/write complexity will be more or less the same, I feel. But read/write with access control is something we have to discuss separately. Does Sling maintain access control directly with Jackrabbit's javax.jcr.security module, or through an in-house access control layer?

>
> >
> > Appreciate any feedback and guidance on how to proceed.
>
> If you haven't already, you need to check out the information at:
> * http://www.google-melange.com/gsoc/homepage/google/gsoc2013
> * http://community.apache.org/gsoc.html
>
> especially the timeline and dates.
>
> There is no guarantee that Apache will be a GSoC organisation
> (although it's highly likely), and there are currently 129 project
> proposals so there is no guarantee that you will get accepted as a
> Student on this project, but the quality of your submission and your
> enthusiasm will go a long way to making that happen.
>
> Good luck and I look forward to seeing you on these lists over the
> summer. If you do make it through, I and everyone in this community
> will try and make it fun and rewarding for you.
>
> Best Regards
> Ian
>
> >
> >
> > [1] - https://issues.apache.org/jira/browse/SLING-2798
> >
> > --
> > Thanks
> > /Dishara
>

--
Thanks
/Dishara
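
P.S. To make the adapter idea from the top of this mail a bit more concrete, here is a rough sketch in plain Java. Everything in it is hypothetical and only for discussion: CassandraAdapter, InMemoryAdapter, ColumnFamilyResource and the resource type string are names I made up, none of them is the Sling or Cassandra API, and the in-memory adapter only stands in for a real client so the sketch runs without a cluster. The point is that the resource holds just its path and type, and children are fetched through the adapter on every call instead of being kept in memory.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical adapter layer: the only place that would talk to a Cassandra client. */
interface CassandraAdapter {
    /** Row keys of one column family; a real implementation would page through the cluster. */
    Iterator<String> rowKeys(String columnFamily);
}

/** In-memory stand-in for a real adapter, used only to make the sketch runnable. */
final class InMemoryAdapter implements CassandraAdapter {
    private final Map<String, List<String>> data = new HashMap<>();
    int reads = 0; // counts how often the "store" is consulted

    void put(String columnFamily, String rowKey) {
        data.computeIfAbsent(columnFamily, k -> new ArrayList<>()).add(rowKey);
    }

    @Override
    public Iterator<String> rowKeys(String columnFamily) {
        reads++;
        return data.getOrDefault(columnFamily, Collections.<String>emptyList()).iterator();
    }
}

/** Hypothetical resource for a column family; holds only basic metadata. */
final class ColumnFamilyResource {
    final String path;
    final String resourceType = "cassandra/columnFamily"; // assumed type name
    private final String columnFamily;
    private final CassandraAdapter adapter;

    ColumnFamilyResource(String mount, String columnFamily, CassandraAdapter adapter) {
        this.path = mount + "/" + columnFamily;
        this.columnFamily = columnFamily;
        this.adapter = adapter;
    }

    /** Child paths are computed on each call; nothing is cached in the resource. */
    Iterator<String> getChildren() {
        final Iterator<String> keys = adapter.rowKeys(columnFamily);
        return new Iterator<String>() {
            @Override public boolean hasNext() { return keys.hasNext(); }
            @Override public String next() { return path + "/" + keys.next(); }
        };
    }
}
```

Because nothing is cached, a row added to the store after the resource was created still shows up on the next getChildren() call. The cost is one read through the adapter per call, which is something we could revisit later if it turns out to be too slow.
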
