Re: Node side processing
Hi David,

Check out the ongoing discussion in https://issues.apache.org/jira/browse/CASSANDRA-6704, as well as some related tickets linked from that one. There is no consensus at this point, but I'm personally hoping to see something along the general lines of Hive's UDFs.

-Tupshin

On Thu, Feb 27, 2014 at 8:50 AM, David Semeria da...@lmframework.com wrote:

Hi List,

I was wondering whether there have been any past proposals for implementing node side processing (NSP) in C*.

By NSP, I mean passing a reference to a Java class which would then process the result set before it is returned to the client.

In our particular use case our clients typically loop through result sets of a million or more rows to produce a tiny amount of output (sums, means, variance, etc). The bottleneck -- quite obviously -- is the need to transfer a million rows to the client before processing can take place. It would be extremely useful to execute this processing on the coordinator node and only transfer the results to the client.

I mention this here because I can imagine other C* users having similar requirements.

Thanks
D.
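For readers following the thread, here is a rough sketch of the kind of per-row reduction David describes (sums, means, variance). It is only an illustration: RowSummary and RowSummaryDemo are hypothetical names, nothing here is a Cassandra API, and how such a class would be handed to the coordinator is exactly what the linked tickets leave open. The sketch uses Welford's single-pass update, so only a few numbers would need to travel back to the client instead of a million rows.

```java
/** Hypothetical single-pass aggregator; not part of any Cassandra API. */
final class RowSummary {
    private long count;
    private double sum;
    private double mean;
    private double m2; // running sum of squared deviations from the mean

    /** Folds one column value into the running statistics (Welford's update). */
    void accept(double value) {
        count++;
        sum += value;
        double delta = value - mean;
        mean += delta / count;
        m2 += delta * (value - mean);
    }

    long count() { return count; }

    double sum() { return sum; }

    double mean() { return count == 0 ? Double.NaN : mean; }

    /** Population variance (divides by n); NaN when no rows were seen. */
    double populationVariance() { return count == 0 ? Double.NaN : m2 / count; }
}

/** Stand-in for a result set: feeds a million synthetic values through the aggregator. */
final class RowSummaryDemo {
    public static void main(String[] args) {
        RowSummary summary = new RowSummary();
        for (int row = 0; row < 1_000_000; row++) {
            summary.accept(row % 100); // assumed numeric column value
        }
        // Only these four numbers would be returned to the client.
        System.out.println(summary.count() + " rows, sum=" + summary.sum()
                + ", mean=" + summary.mean()
                + ", variance=" + summary.populationVariance());
    }
}
```

The state is four fields regardless of how many rows are scanned, which is the property that makes this sort of reduction attractive to run next to the data.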
Re: Node side processing
A few:

https://issues.apache.org/jira/browse/CASSANDRA-4914
https://issues.apache.org/jira/browse/CASSANDRA-5184
https://issues.apache.org/jira/browse/CASSANDRA-6704
https://issues.apache.org/jira/browse/CASSANDRA-6167
Re: Node side processing
Check out intravert on GitHub. I am working to get many of those features into Cassandra.

On Thursday, February 27, 2014, Brandon Williams dri...@gmail.com wrote:

A few:

https://issues.apache.org/jira/browse/CASSANDRA-4914
https://issues.apache.org/jira/browse/CASSANDRA-5184
https://issues.apache.org/jira/browse/CASSANDRA-6704
https://issues.apache.org/jira/browse/CASSANDRA-6167