Re: Thoughts on adding complex queries to Cassandra

2010-05-28 Thread Jeremy Davis
I wonder if any of the main project committers would like to weigh in on
what a desired API would look like, or perhaps we should start an
unscheduled Jira ticket?

On Thu, May 27, 2010 at 5:39 PM, Jake Luciani  wrote:

> I had this:
>
>
> string slice_dice_reduce(1:required list key,
>                          2:required ColumnParent column_parent,
>                          3:required SlicePredicate predicate,
>                          4:required ConsistencyLevel consistency_level=ONE,
>                          5:required string dice_js,
>                          6:required string reduce_js)
>     throws (1:InvalidRequestException ire,
>             2:UnavailableException ue,
>             3:TimedOutException te),
>
> I guess it could use a union of sorts and return either.
>
>
>
> On Thu, May 27, 2010 at 8:36 PM, Jeremy Davis <
> jerdavis.cassan...@gmail.com> wrote:
>
>>
>> I agree; I had more than filtering results in mind,
>> though I had envisioned the results continuing to use the
>> List (and not JSON). You could still create new result
>> columns that do not in any way exist in Cassandra, and you could still stuff
>> JSON into any of the result columns.
>>
>> I had envisioned:
>> list get_slice(keyspace, key, column_parent, predicate,
>>                consistency_level, javascript_blob)
>>
>> -JD
>>
>>
>>
>>
>>
>> On Thu, May 27, 2010 at 5:01 PM, Jake Luciani  wrote:
>>
>>> I've secretly started working on this, but there's nothing to show yet :( I'm
>>> calling it SliceDiceReduce or SliceReduce.
>>>
>>> The plan is to use the js Thrift bindings I've added for the 0.3 release of
>>> Thrift (out very soon?).
>>>
>>> This will allow the supplied js to access the results like any other
>>> Thrift client.
>>>
>>> I'm adding a new verb handler and SEDA stage that will execute on a local
>>> node and pass this node's slice data into the supplied js "dice" function via
>>> the Thrift js bindings.
>>>
>>> The result from each node would then be passed into another
>>> supplied js reduce function on the starting node.
>>>
>>> This would then return a single JSON or string result.
>>> The reason I'm keeping the results in JSON is that you can do more than filter.
>>> You can do things like word count, etc.
>>>
>>> Anyway, this is little more than an idea now. But if people like this
>>> approach, maybe I'll get motivated!
>>>
>>> Jake
>>>
>>>
>>>
>>>
>>>
>>> On May 27, 2010, at 7:36 PM, Steve Lihn  wrote:
>>>
>>> Mongo has it too. It could save a lot of development time if one could
>>> figure out how to port Mongo's query API and stored JavaScript to Cassandra.
>>> It would be great if Scala's list comprehensions could be used to
>>> write query-like code against a Cassandra schema.
>>>
>>> On Thu, May 27, 2010 at 11:05 AM, Vick Khera < 
>>> vi...@khera.org> wrote:
>>>
>>>> On Thu, May 27, 2010 at 9:50 AM, Jonathan Ellis <
>>>> jbel...@gmail.com> wrote:
>>>> > There definitely seems to be demand for something like this.  Maybe
>>>> for 0.8?
>>>> >
>>>>
>>>> The Riak data store has something like this: you can submit queries
>>>> (and map reduce jobs) written in javascript that run on the data nodes
>>>> using data local to that node.  It is a very compelling feature.
>>>>
>>>
>>>
>>
>


Re: Thoughts on adding complex queries to Cassandra

2010-05-27 Thread Jake Luciani
I had this:


string slice_dice_reduce(1:required list key,
                         2:required ColumnParent column_parent,
                         3:required SlicePredicate predicate,
                         4:required ConsistencyLevel consistency_level=ONE,
                         5:required string dice_js,
                         6:required string reduce_js)
    throws (1:InvalidRequestException ire,
            2:UnavailableException ue,
            3:TimedOutException te),

I guess it could use a union of sorts and return either.



On Thu, May 27, 2010 at 8:36 PM, Jeremy Davis
wrote:

>
> I agree; I had more than filtering results in mind,
> though I had envisioned the results continuing to use the
> List (and not JSON). You could still create new result
> columns that do not in any way exist in Cassandra, and you could still stuff
> JSON into any of the result columns.
>
> I had envisioned:
> list get_slice(keyspace, key, column_parent, predicate,
>                consistency_level, javascript_blob)
>
> -JD
>
>
>
>
>
> On Thu, May 27, 2010 at 5:01 PM, Jake Luciani  wrote:
>
>> I've secretly started working on this, but there's nothing to show yet :( I'm
>> calling it SliceDiceReduce or SliceReduce.
>>
>> The plan is to use the js Thrift bindings I've added for the 0.3 release of
>> Thrift (out very soon?).
>>
>> This will allow the supplied js to access the results like any other
>> Thrift client.
>>
>> I'm adding a new verb handler and SEDA stage that will execute on a local node
>> and pass this node's slice data into the supplied js "dice" function via the
>> Thrift js bindings.
>>
>> The result from each node would then be passed into another supplied
>> js reduce function on the starting node.
>>
>> This would then return a single JSON or string result.  The
>> reason I'm keeping the results in JSON is that you can do more than filter. You
>> can do things like word count, etc.
>>
>> Anyway, this is little more than an idea now. But if people like this
>> approach, maybe I'll get motivated!
>>
>> Jake
>>
>>
>>
>>
>>
>> On May 27, 2010, at 7:36 PM, Steve Lihn  wrote:
>>
>> Mongo has it too. It could save a lot of development time if one could
>> figure out how to port Mongo's query API and stored JavaScript to Cassandra.
>> It would be great if Scala's list comprehensions could be used to
>> write query-like code against a Cassandra schema.
>>
>> On Thu, May 27, 2010 at 11:05 AM, Vick Khera < 
>> vi...@khera.org> wrote:
>>
>>> On Thu, May 27, 2010 at 9:50 AM, Jonathan Ellis < 
>>> jbel...@gmail.com> wrote:
>>> > There definitely seems to be demand for something like this.  Maybe for
>>> 0.8?
>>> >
>>>
>>> The Riak data store has something like this: you can submit queries
>>> (and map reduce jobs) written in javascript that run on the data nodes
>>> using data local to that node.  It is a very compelling feature.
>>>
>>
>>
>


Re: Thoughts on adding complex queries to Cassandra

2010-05-27 Thread Jeremy Davis
I agree; I had more than filtering results in mind,
though I had envisioned the results continuing to use the
List (and not JSON). You could still create new result
columns that do not in any way exist in Cassandra, and you could still stuff
JSON into any of the result columns.

I had envisioned:
list get_slice(keyspace, key, column_parent, predicate,
               consistency_level, javascript_blob)

-JD
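
A minimal sketch of the idea in this message, written in Python rather than the JavaScript the thread proposes, and using made-up stand-ins throughout: `Column`, `get_slice_with_blob`, and `add_total` are hypothetical names, not part of Cassandra's Thrift API. It only illustrates how a caller-supplied function could keep returning a list of columns while adding columns that are not stored anywhere, one of them carrying JSON.

```python
import json
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Column:
    """Hypothetical stand-in for a Thrift column: a name and a string value."""
    name: str
    value: str


def get_slice_with_blob(stored: List[Column],
                        blob: Callable[[List[Column]], List[Column]]) -> List[Column]:
    """Run the caller-supplied function over the slice and return its columns.

    In the proposal this would happen on the Cassandra side; here it is
    an ordinary in-process call.
    """
    return blob(list(stored))


def add_total(columns: List[Column]) -> List[Column]:
    """Example blob: append two computed columns that are not stored anywhere."""
    total = sum(int(c.value) for c in columns)
    return columns + [
        Column("total", str(total)),
        # JSON stuffed into an ordinary result column.
        Column("summary", json.dumps({"count": len(columns)})),
    ]


stored = [Column("jan", "3"), Column("feb", "4")]
result = get_slice_with_blob(stored, add_total)
for column in result:
    print(column.name, column.value)
```

The return type stays a plain list of columns, so a client that already handles get_slice results would not need a JSON-specific code path.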




On Thu, May 27, 2010 at 5:01 PM, Jake Luciani  wrote:

> I've secretly started working on this, but there's nothing to show yet :( I'm
> calling it SliceDiceReduce or SliceReduce.
>
> The plan is to use the js Thrift bindings I've added for the 0.3 release of
> Thrift (out very soon?).
>
> This will allow the supplied js to access the results like any other Thrift
> client.
>
> I'm adding a new verb handler and SEDA stage that will execute on a local node
> and pass this node's slice data into the supplied js "dice" function via the
> Thrift js bindings.
>
> The result from each node would then be passed into another supplied
> js reduce function on the starting node.
>
> This would then return a single JSON or string result.  The
> reason I'm keeping the results in JSON is that you can do more than filter. You
> can do things like word count, etc.
>
> Anyway, this is little more than an idea now. But if people like this
> approach, maybe I'll get motivated!
>
> Jake
>
>
>
>
>
> On May 27, 2010, at 7:36 PM, Steve Lihn  wrote:
>
> Mongo has it too. It could save a lot of development time if one could figure
> out how to port Mongo's query API and stored JavaScript to Cassandra.
> It would be great if Scala's list comprehensions could be used to write
> query-like code against a Cassandra schema.
>
> On Thu, May 27, 2010 at 11:05 AM, Vick Khera < 
> vi...@khera.org> wrote:
>
>> On Thu, May 27, 2010 at 9:50 AM, Jonathan Ellis < 
>> jbel...@gmail.com> wrote:
>> > There definitely seems to be demand for something like this.  Maybe for
>> 0.8?
>> >
>>
>> The Riak data store has something like this: you can submit queries
>> (and map reduce jobs) written in javascript that run on the data nodes
>> using data local to that node.  It is a very compelling feature.
>>
>
>


Re: Thoughts on adding complex queries to Cassandra

2010-05-27 Thread Jake Luciani
I've secretly started working on this, but there's nothing to show yet :( I'm
calling it SliceDiceReduce or SliceReduce.

The plan is to use the js Thrift bindings I've added for the 0.3 release
of Thrift (out very soon?).

This will allow the supplied js to access the results like any other
Thrift client.

I'm adding a new verb handler and SEDA stage that will execute on a local
node and pass this node's slice data into the supplied js "dice"
function via the Thrift js bindings.

The result from each node would then be passed into another
supplied js reduce function on the starting node.

This would then return a single JSON or string result.
The reason I'm keeping the results in JSON is that you can do more than
filter. You can do things like word count, etc.

Anyway, this is little more than an idea now. But if people like this
approach, maybe I'll get motivated!


Jake
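
The dice/reduce flow described in this message can be sketched roughly as follows. This is an illustration only: it is written in Python rather than JavaScript, it runs in one process instead of across nodes, and the names `dice`, `reduce_partials`, and `node_slices` are assumptions, not anything from the actual SliceDiceReduce work. It uses the word count case mentioned above, with JSON strings passed between the two stages.

```python
import json
from collections import Counter
from typing import List


def dice(slice_values: List[str]) -> str:
    """Per-node step: count words in this node's slice, return JSON."""
    counts = Counter(word for value in slice_values for word in value.split())
    return json.dumps(counts)


def reduce_partials(partials: List[str]) -> str:
    """Coordinator step: merge the per-node JSON results into one JSON string."""
    total = Counter()
    for partial in partials:
        # Counter.update adds counts from a mapping instead of replacing them.
        total.update(json.loads(partial))
    return json.dumps(dict(total), sort_keys=True)


# Two hypothetical nodes, each holding the column values of its local slice.
node_slices = [
    ["the quick fox", "the dog"],
    ["quick quick fox"],
]
partials = [dice(values) for values in node_slices]
result = reduce_partials(partials)
print(result)
```

Only the small per-node JSON summaries travel to the starting node, not the column values themselves, which is where the bandwidth saving would come from.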





On May 27, 2010, at 7:36 PM, Steve Lihn  wrote:

Mongo has it too. It could save a lot of development time if one could
figure out how to port Mongo's query API and stored JavaScript to
Cassandra.
It would be great if Scala's list comprehensions could be used
to write query-like code against a Cassandra schema.


On Thu, May 27, 2010 at 11:05 AM, Vick Khera  wrote:
> On Thu, May 27, 2010 at 9:50 AM, Jonathan Ellis  wrote:
> > There definitely seems to be demand for something like this.  Maybe for 0.8?
> >
>
> The Riak data store has something like this: you can submit queries
> (and map reduce jobs) written in javascript that run on the data nodes
> using data local to that node.  It is a very compelling feature.



Re: Thoughts on adding complex queries to Cassandra

2010-05-27 Thread Steve Lihn
Mongo has it too. It could save a lot of development time if one could figure
out how to port Mongo's query API and stored JavaScript to Cassandra.
It would be great if Scala's list comprehensions could be used to write
query-like code against a Cassandra schema.

On Thu, May 27, 2010 at 11:05 AM, Vick Khera  wrote:

> On Thu, May 27, 2010 at 9:50 AM, Jonathan Ellis  wrote:
> > There definitely seems to be demand for something like this.  Maybe for
> 0.8?
> >
>
> The Riak data store has something like this: you can submit queries
> (and map reduce jobs) written in javascript that run on the data nodes
> using data local to that node.  It is a very compelling feature.
>


Re: Thoughts on adding complex queries to Cassandra

2010-05-27 Thread Vick Khera
On Thu, May 27, 2010 at 9:50 AM, Jonathan Ellis  wrote:
> There definitely seems to be demand for something like this.  Maybe for 0.8?
>

The Riak data store has something like this: you can submit queries
(and map reduce jobs) written in javascript that run on the data nodes
using data local to that node.  It is a very compelling feature.


Re: Thoughts on adding complex queries to Cassandra

2010-05-27 Thread Jonathan Ellis
There definitely seems to be demand for something like this.  Maybe for 0.8?

On Wed, May 26, 2010 at 4:31 PM, Jeremy Davis
 wrote:
>
> Are there any thoughts on adding a more complex query to Cassandra?
>
> At a high level what I'm wondering is: Would it be possible/desirable/in
> keeping with the Cassandra plan, to add something like a javascript blob on
> to a get range slice etc, that does some further filtering on the results
> before returning them. The goal being to trade off some CPU on Cassandra for
> network bandwidth.
>
> -JD
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


RE: Thoughts on adding complex queries to Cassandra

2010-05-26 Thread Nicholas Sun
I'm very curious about this topic as well.  Mainly, I'd like to know: is this
functionality handled through Hadoop Map/Reduce operations?

 

Nick 

 

From: Jeremy Davis [mailto:jerdavis.cassan...@gmail.com] 
Sent: Wednesday, May 26, 2010 3:31 PM
To: user@cassandra.apache.org
Subject: Thoughts on adding complex queries to Cassandra

 


Are there any thoughts on adding a more complex query to Cassandra?

At a high level what I'm wondering is: Would it be possible/desirable/in
keeping with the Cassandra plan, to add something like a javascript blob on
to a get range slice etc, that does some further filtering on the results
before returning them. The goal being to trade off some CPU on Cassandra for
network bandwidth. 

-JD



Thoughts on adding complex queries to Cassandra

2010-05-26 Thread Jeremy Davis
Are there any thoughts on adding a more complex query to Cassandra?

At a high level what I'm wondering is: Would it be possible/desirable/in
keeping with the Cassandra plan, to add something like a javascript blob on
to a get range slice etc, that does some further filtering on the results
before returning them. The goal being to trade off some CPU on Cassandra for
network bandwidth.

-JD
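
A rough sketch of the trade-off raised in this message, under loud assumptions: it is Python rather than a JavaScript blob, the data is an in-memory dict rather than a real range slice, and `range_slice`, `payload_size`, and the `keep` predicate are hypothetical names invented for illustration. The point is only that running a caller-supplied filter where the data lives costs some CPU per column but shrinks what has to cross the network.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a column: (name, value).
Column = Tuple[str, bytes]


def range_slice(rows: Dict[str, List[Column]],
                keep: Callable[[str, Column], bool]) -> Dict[str, List[Column]]:
    """Apply the caller-supplied predicate before anything is returned."""
    out: Dict[str, List[Column]] = {}
    for key, columns in rows.items():
        kept = [column for column in columns if keep(key, column)]
        if kept:  # rows with no surviving columns are dropped entirely
            out[key] = kept
    return out


def payload_size(rows: Dict[str, List[Column]]) -> int:
    """Crude proxy for bytes on the wire: sum of name and value lengths."""
    return sum(len(name) + len(value)
               for columns in rows.values()
               for name, value in columns)


rows = {
    "user1": [("status", b"active"), ("bio", b"x" * 1000)],
    "user2": [("status", b"inactive"), ("bio", b"y" * 1000)],
}

# Keep only the status column of active users.
filtered = range_slice(
    rows, lambda key, column: column[0] == "status" and column[1] == b"active")

print(payload_size(rows), "bytes unfiltered")
print(payload_size(filtered), "bytes filtered")
```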