In my opinion, triggers/stored procedures are an absolute requirement for any distributed database.
We've been using stored procedures in Cassandra for a while now. We've made modifications such that we don't really write directly anymore, but pass everything through either a default stored procedure (which is just what was there before) or a dynamically loaded piece of Java. These stored procedures can call other dynamically loaded pieces of Java as well. We don't have any plans to implement any scripting capabilities. We can also 'select' from procedures.

The idea of downloading data from a distributed database for processing flies in the face of what NoSQL and big data are all about: you've got to do it in the db.

On Apr 22, 2012, at 11:35 AM, Brian O'Neill wrote:

> Praveen,
>
> We are certainly interested. To get things moving we implemented an add-on
> for Cassandra to demonstrate the viability (using AOP):
> https://github.com/hmsonline/cassandra-triggers
>
> Right now the implementation executes triggers asynchronously, allowing you
> to implement a java interface and plugin your own java class that will get
> called for every insert.
>
> Per the discussion on 1311, we intend to extend our proof of concept to be
> able to invoke scripts as well. (minimally we'll enable javascript, but
> we'll probably allow for ruby and groovy as well)
>
> -brian
>
> On Apr 22, 2012, at 12:23 PM, Praveen Baratam wrote:
>
>> I found that Triggers are coming in Cassandra 1.2
>> (https://issues.apache.org/jira/browse/CASSANDRA-1311) but no mention of any
>> StoreProc like pattern.
>>
>> I know this has been discussed so many times but never met with any
>> initiative. Even Groovy was staged out of the trunk.
>>
>> Cassandra is great for logging and as such will be infinitely more useful if
>> some logic can be pushed into the Cassandra cluster nearer to the location
>> of Data to generate a materialized view useful for applications.
>>
>> Server Side Scripts/Routines in Distributed Databases could soon prove to be
>> the differentiating factor.
>>
>> Let me reiterate things with a use case.
>>
>> In our application we store time series data in wide rows with TTL set on
>> each point to prevent data from growing beyond acceptable limits. Still the
>> data size can be a limiting factor to move all of it from the cluster node
>> to the querying node and then to the application via thrift for processing
>> and presentation.
>>
>> Ideally we should process the data on the residing node and pass only the
>> materialized view of the data upstream. This should be trivial if Cassandra
>> implements some sort of server side scripting and CQL semantics to call it.
>>
>> Is anybody else interested in a similar feature? Is it being worked on? Are
>> there any alternative strategies to this problem?
>>
>> Praveen
>>
>>
>
> --
> Brian ONeill
> Lead Architect, Health Market Science (http://healthmarketscience.com)
> mobile:215.588.6024
> blog: http://weblogs.java.net/blog/boneill42/
> blog: http://brianoneill.blogspot.com/
>
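
As a rough illustration of the use case quoted above (process the time series on the node that holds it and pass only the reduced view upstream), here is a minimal Java sketch. Everything in it is hypothetical: `StoredProcedure`, `ProcedureRegistry`, `BucketAverage`, and `Point` are names made up for this example and are not part of Cassandra or of cassandra-triggers. It also assumes procedures are registered in memory by name, whereas a real setup would load them dynamically, and it only mimics the "default procedure is a pass-through" idea on the read side.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

public class ServerSideProcedures {

    /** One time-series sample: a timestamp in milliseconds and a value. */
    static final class Point {
        final long timestampMillis;
        final double value;

        Point(long timestampMillis, double value) {
            this.timestampMillis = timestampMillis;
            this.value = value;
        }
    }

    /** Hypothetical contract: runs next to the data and returns what is sent upstream. */
    interface StoredProcedure {
        List<Point> apply(List<Point> row);
    }

    /** Looks procedures up by name; unknown names fall back to a pass-through default. */
    static final class ProcedureRegistry {
        private static final StoredProcedure PASS_THROUGH = row -> row;
        private final Map<String, StoredProcedure> procedures = new ConcurrentHashMap<>();

        void register(String name, StoredProcedure procedure) {
            procedures.put(name, procedure);
        }

        List<Point> select(String name, List<Point> row) {
            return procedures.getOrDefault(name, PASS_THROUGH).apply(row);
        }
    }

    /** Reduces a wide row to one averaged point per fixed-width time bucket. */
    static final class BucketAverage implements StoredProcedure {
        private final long bucketMillis;

        BucketAverage(long bucketMillis) {
            this.bucketMillis = bucketMillis;
        }

        @Override
        public List<Point> apply(List<Point> row) {
            // bucket start -> {sum, count}; TreeMap keeps buckets in time order
            Map<Long, double[]> buckets = new TreeMap<>();
            for (Point p : row) {
                long start = Math.floorDiv(p.timestampMillis, bucketMillis) * bucketMillis;
                double[] acc = buckets.computeIfAbsent(start, k -> new double[2]);
                acc[0] += p.value;
                acc[1] += 1;
            }
            List<Point> view = new ArrayList<>();
            for (Map.Entry<Long, double[]> e : buckets.entrySet()) {
                view.add(new Point(e.getKey(), e.getValue()[0] / e.getValue()[1]));
            }
            return view;
        }
    }

    public static void main(String[] args) {
        ProcedureRegistry registry = new ProcedureRegistry();
        registry.register("avg_per_minute", new BucketAverage(60_000L));

        List<Point> row = List.of(
                new Point(0L, 1.0), new Point(30_000L, 3.0), new Point(60_000L, 10.0));

        // Three stored points go in; only two averaged points would travel upstream.
        for (Point p : registry.select("avg_per_minute", row)) {
            System.out.println(p.timestampMillis + " -> " + p.value);
        }
    }
}
```

The point of the sketch is only the shape of the call: the client names a procedure, the node that owns the row runs it, and the response carries the reduced view instead of every stored point.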