Re: [Status Update] Apache Cassandra backend for Sling
Hi Ian,

What should the data mapping between Cassandra and a Sling resource be? Does a Sling Resource map to a Cassandra column, or to a column family? We need to finalize this to get the Cassandra and Sling story right.

For example, what we eventually return is a Sling resource, so everything needed to create a Sling resource should be stored in Cassandra. In a Sling resource:

- Path - the direct Sling resource path
- ResourceType - nt:cassandra
- ResourceSuperType - ?
- ResourceMetadata - we can create this on the fly from the data in the corresponding column. Those values need to be stored at insertion. The following are the ones I thought might be useful to set by default for any node; please add any others we need:
  - ContentType
  - ContentLength
  - CreationTime
  - ModificationTime
- ResourceResolver - do we need a resolver in this case?

So I believe that, in a CQL context, one row should represent a Sling resource. If so, we might need a separate column to store ResourceMetadata, since it has multiple values. I am not sure whether we can do that with CQL, but it should be possible with the Hector APIs.

I would appreciate your thoughts.

On Wed, Jun 19, 2013 at 1:19 AM, Dishara Wijewardana ddwijeward...@gmail.com wrote:

Hi Ian,

I am starting this thread to keep track of milestone status updates and discussions related to the GSoC project. As per the GSoC proposal, the first task is as follows:

1. Implementing a CassandraResourceProvider to READ from Cassandra. Implementation details [1]

[1] Implementation details:
1.A) Write a CassanrdaResourceProviderUtil, which is basically a Cassandra client that will facilitate all Cassandra-related operations required by the other modules (CassandraResourceProvider and CassandraResourceResolver).
1.B) Implementation of CassandraResourceProvider
1.C) Implementation of CassandraResourceResolver
1.D) Implementation of CassandraResource

I will start by writing the CassanrdaResourceProviderUtil class, which will do basic add and get using the Hector API. Please provide any feedback that would be useful for accomplishing this task.

How should path mapping be done for this? For example, the path of the Cassandra node will not be the same as the JCR node path: the provider will ask for a node path such as /system/myapps/test/foo, and where in Cassandra should we return it from? Don't we have to consider the WRITE aspect to Cassandra first?

--
Thanks
/Dishara
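To make the "one row represents one resource" idea concrete, here is a minimal sketch in plain Java. It uses neither the Sling nor the Hector API: the class RowBackedResource, the in-memory Map standing in for a Cassandra row, and the use of ContentType, ContentLength, CreationTime and ModificationTime as literal column names are all assumptions made for illustration. It also assumes each metadata value lives in its own column rather than in one multi-valued column.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a resource backed by a single Cassandra row.
final class RowBackedResource {

    // Assumed column names that make up the default metadata subset.
    static final String[] METADATA_COLUMNS = {
        "ContentType", "ContentLength", "CreationTime", "ModificationTime"
    };

    private final String path;
    private final Map<String, Object> columns;

    RowBackedResource(String path, Map<String, Object> columns) {
        this.path = path;
        // Defensive copy so later changes to the caller's map do not leak in.
        this.columns = new LinkedHashMap<>(columns);
    }

    String getPath() {
        return path;
    }

    // All columns of the row, exposed as read-only properties.
    Map<String, Object> getProperties() {
        return Collections.unmodifiableMap(columns);
    }

    // Builds the metadata on the fly from whichever metadata columns are present.
    Map<String, Object> buildMetadata() {
        Map<String, Object> metadata = new LinkedHashMap<>();
        for (String name : METADATA_COLUMNS) {
            if (columns.containsKey(name)) {
                metadata.put(name, columns.get(name));
            }
        }
        return metadata;
    }
}
```

With this shape, nothing extra has to be stored for the metadata: it is derived from the row each time it is needed, and columns outside the assumed subset stay available through getProperties().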
Re: [Status Update] Apache Cassandra backend for Sling
Hi Dishara,

Yes. 1 resource == 1 row. The columns within that row represent the properties of the resource. I suggest that you use standard property names where appropriate (e.g. sling:resourceType is the Resource.resourceType, etc.).

The Resource itself should be adaptable to a generic CassandraResource (which will probably implement Resource), which will have a map of properties containing all the columns of the Cassandra row (optimise later). A CassandraResource might look and feel like a Map<String, Object>, or it might have a Map<String, Object> getProperties() method, or better still be adaptable to a Map. The essential thing is: don't hard-code the property names in the interface of CassandraResource for the moment, i.e. no getContentType() and no getMimeType(), as we don't really know what a CassandraResource will store.

ResourceMetadata should be built from a subset of the CassandraResource properties.

You won't need to implement a ResourceResolver, only a ResourceProvider (and Factory).

I would use CQL in preference to other API methods.

There is one thing that hasn't been mentioned, and that's the URL to Cassandra row mapping. There are several ways of doing this, e.g.:

URL = /content/cassandra/columnFamily/rowID
Cassandra Column Family = columnFamily
Cassandra RowID = rowID

or

URL = /content/cassandra/columnFamilySelector/remainder/of/the/path
Cassandra Column Family = mapOfColumnFamilies.get(columnFamilySelector)
Cassandra RowID = function(/remainder/of/the/path)

or, to take that one stage further:

    public interface CassandraMapper {
        String getCQL(String columnFamilySelector, String path);
    }

URL = /content/cassandra/columnFamilySelector/remainderOfPath

    String cqlQuery = mapOfCassandraMappers.get(columnFamilySelector).getCQL(columnFamilySelector, remainderOfPath);

which would allow us to provide one or more implementations of CassandraMapper to map between URL and CQL.
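A minimal sketch of the mapper approach described above, under assumptions that are not part of the thread: UrlToCqlRouter and KeyLookupMapper are hypothetical names, the selector is used directly as the column family name, the remainder of the path is used as the row key, and the key column is assumed to be called KEY (the real name depends on the schema). Only the CassandraMapper interface is taken from the message.

```java
import java.util.HashMap;
import java.util.Map;

// Interface as proposed in the message.
interface CassandraMapper {
    String getCQL(String columnFamilySelector, String path);
}

// Hypothetical mapper: column family = selector, row key = remaining path.
final class KeyLookupMapper implements CassandraMapper {
    @Override
    public String getCQL(String columnFamilySelector, String path) {
        // String concatenation is for readability only; the key is escaped by
        // doubling single quotes. "KEY" as the key column name is an assumption.
        String escapedKey = path.replace("'", "''");
        return "SELECT * FROM " + columnFamilySelector
                + " WHERE KEY = '" + escapedKey + "'";
    }
}

// Hypothetical router that splits the URL and delegates to a registered mapper.
final class UrlToCqlRouter {
    static final String PREFIX = "/content/cassandra/";

    private final Map<String, CassandraMapper> mapOfCassandraMappers = new HashMap<>();

    void register(String columnFamilySelector, CassandraMapper mapper) {
        mapOfCassandraMappers.put(columnFamilySelector, mapper);
    }

    // Returns a CQL string, or null when the URL is outside the prefix,
    // has no remainder, or names a selector with no registered mapper.
    String toCql(String url) {
        if (url == null || !url.startsWith(PREFIX)) {
            return null;
        }
        String rest = url.substring(PREFIX.length());
        int slash = rest.indexOf('/');
        if (slash <= 0) {
            return null;
        }
        String selector = rest.substring(0, slash);
        String remainderOfPath = rest.substring(slash); // keeps the leading '/'
        CassandraMapper mapper = mapOfCassandraMappers.get(selector);
        return mapper == null ? null : mapper.getCQL(selector, remainderOfPath);
    }
}
```

Because only registered selectors reach a mapper, the column family name placed in the query always comes from a known set rather than straight from the request URL.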
HTH
Ian

On 23 June 2013 19:29, Dishara Wijewardana ddwijeward...@gmail.com wrote:
> [...]
Build failed in Jenkins: sling-healthcheck-1.6 #25
See https://builds.apache.org/job/sling-healthcheck-1.6/25/

--
Started by timer
Building remotely on ubuntu2 in workspace https://builds.apache.org/job/sling-healthcheck-1.6/ws/
Updating http://svn.apache.org/repos/asf/sling/trunk/contrib/extensions/healthcheck
At revision 1495899
no change for http://svn.apache.org/repos/asf/sling/trunk/contrib/extensions/healthcheck since the previous build
Parsing POMs
ERROR: Failed to parse POMs
hudson.util.IOException2: remote file operation failed: https://builds.apache.org/job/sling-healthcheck-1.6/ws/healthcheck at hudson.remoting.Channel@167210ac:ubuntu2
    at hudson.FilePath.act(FilePath.java:861)
    at hudson.FilePath.act(FilePath.java:838)
    at hudson.maven.MavenModuleSetBuild$MavenModuleSetBuildExecution.parsePoms(MavenModuleSetBuild.java:862)
    at hudson.maven.MavenModuleSetBuild$MavenModuleSetBuildExecution.doRun(MavenModuleSetBuild.java:620)
    at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:592)
    at hudson.model.Run.execute(Run.java:1568)
    at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:477)
    at hudson.model.ResourceController.execute(ResourceController.java:88)
    at hudson.model.Executor.run(Executor.java:236)
Caused by: java.io.FileNotFoundException: /tmp/hudson-remoting5498224047807891324/META-INF/plexus/components.xml (No such file or directory)
    at java.io.FileOutputStream.open(Native Method)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:209)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:160)
    at hudson.remoting.RemoteClassLoader.makeResource(RemoteClassLoader.java:309)
    at hudson.remoting.RemoteClassLoader.findResources(RemoteClassLoader.java:276)
    at java.lang.ClassLoader.getResources(ClassLoader.java:1015)
    at java.lang.ClassLoader.getResources(ClassLoader.java:1011)
    at hudson.maven.MavenUtil$MaskingClassLoader.getResources(MavenUtil.java:295)
    at hudson.maven.MavenUtil.createEmbedder(MavenUtil.java:203)
    at hudson.maven.MavenModuleSetBuild$PomParser.invoke(MavenModuleSetBuild.java:1173)
    at hudson.maven.MavenModuleSetBuild$PomParser.invoke(MavenModuleSetBuild.java:997)
    at hudson.FilePath$FileCallableWrapper.call(FilePath.java:2348)
    at hudson.remoting.UserRequest.perform(UserRequest.java:118)
    at hudson.remoting.UserRequest.perform(UserRequest.java:48)
    at hudson.remoting.Request$2.run(Request.java:326)
    at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:679)