Thanks for the pointer, Alain. At a quick glance, it looks like people there are looking for query-time filtering/aggregation, which will suffice for small data sets. Hopefully we can extend that to perform pre-computations as well, which would support much larger data sets / volumes.
I'll continue the discussion on the issue.

thanks again,
brian

---
Brian O'Neill
Chief Architect
Health Market Science
The Science of Better Results
2700 Horizon Drive
King of Prussia, PA 19406
M: 215.588.6024
@boneill42 <http://www.twitter.com/boneill42>
healthmarketscience.com

From: Alain RODRIGUEZ <arodr...@gmail.com>
Reply-To: <u...@cassandra.apache.org>
Date: Wednesday, December 18, 2013 at 5:13 AM
To: <u...@cassandra.apache.org>
Cc: "dev@cassandra.apache.org" <dev@cassandra.apache.org>
Subject: Re: Dimensional SUM, COUNT, & DISTINCT in C* (replacing Acunu)

Hi, this would indeed be much appreciated by a lot of people.

There is an existing issue on this subject:
https://issues.apache.org/jira/browse/CASSANDRA-4914

Maybe you could help the committers there.

Hope this will be useful to you.

Please let us know when you find a way to do these operations.

Cheers.

2013/12/18 Brian O'Neill <b...@alumni.brown.edu>
> We are seeking to replace Acunu in our technology stack / platform. It is
> the only component in our stack that is not open source.
>
> In preparation, over the last few weeks I've migrated Virgil to CQL. The
> vision is that Virgil could receive a REST request to upsert/delete data
> (hierarchical JSON to support collections). Virgil would look up the
> dimensions/aggregations for that table, add the key to the pertinent
> dimensional tables (e.g. DISTINCT), incorporate values into aggregations
> (e.g. SUMs) and increment/decrement relevant counters (COUNT), using
> additional CFs.
>
> This seems straightforward, but appears to require a read-before-write
> (e.g. read the current value of a SUM, incorporate the new value, then use
> the lightweight transactions of C* 2.0 to conditionally update the value).
>
> Before I go down this path, I was wondering if anyone is designing/working
> on the same, perhaps at a lower level? (CQL?)
>
> Is there any intent to support aggregations/filters (COUNT, SUM, DISTINCT,
> etc.) at the CQL level? If so, is there a preliminary design?
>
> I can see a lower-level approach, which would leverage the commit logs (and
> mem/sstables) and perform the aggregation during read operations (and
> flush/compaction).
>
> thoughts? i'm open to all ideas.
>
> -brian
> --
> Brian O'Neill
> Chief Architect, Health Market Science (http://healthmarketscience.com)
> mobile: 215.588.6024 <tel:215.588.6024>
> blog: http://brianoneill.blogspot.com/
> twitter: @boneill42
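
The read-before-write flow described in the quoted message (read the current SUM, incorporate the new value, then conditionally update it) can be sketched as a compare-and-set retry loop. This is only an illustration under stated assumptions: `InMemoryTable`, `add_to_sum`, and the table/column names in the comment are hypothetical, and the in-memory class merely stands in for C* 2.0 lightweight transactions; it is not a driver API or Virgil's implementation.

```python
import threading


class InMemoryTable:
    """Hypothetical stand-in for a table of pre-computed SUMs (not a real driver)."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()

    def read(self, key):
        """Return the stored value for key, or None if the row does not exist."""
        with self._lock:
            return self._rows.get(key)

    def compare_and_set(self, key, expected, new):
        """Write `new` only if the stored value still equals `expected`.

        Plays the role of a conditional write, roughly like
            UPDATE sums SET total = ? WHERE dim = ? IF total = ?
        (table and column names are assumptions for illustration).
        """
        with self._lock:
            if self._rows.get(key) != expected:
                return False  # another writer got there first
            self._rows[key] = new
            return True


def add_to_sum(table, key, delta, max_retries=10):
    """Read the current SUM, add delta, and conditionally write it back."""
    for _ in range(max_retries):
        current = table.read(key)          # the read-before-write
        new = (current or 0) + delta       # a missing row counts as 0
        if table.compare_and_set(key, current, new):
            return new
        # The condition failed: the value changed under us, so re-read and retry.
    raise RuntimeError("gave up after %d conflicting updates" % max_retries)
```

A delete or a changed value would call `add_to_sum` with a negative or net `delta`. The retry loop is the cost of this approach: each conflicting writer has to repeat its read, which is why contention on hot aggregates matters.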