Calcite’s adapter framework makes it easy to push down filters and aggregations to 
third-party sources, and to express more powerful, data-source-specific 
optimizations.

Is Drill building on Calcite’s support or doing it its own way?

Calcite doesn’t have a Cassandra adapter, but the same approach taken in the 
MongoDB, Splunk, and Phoenix adapters could be used.
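
To make the push-down idea concrete, here is a rough sketch of how a planner rule might split a WHERE clause into a part that is sent to Cassandra and a residual part that the engine evaluates itself on the fetched superset. This is not Calcite's or Drill's actual API: the class CassandraFilterSplit, the Eq type, and the split method are hypothetical names made up for illustration. It also assumes a single-column partition key and only AND-ed equality predicates; an OR condition would simply stay on the engine side in this scheme.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper; not part of Calcite, Drill, or any Cassandra driver. */
class CassandraFilterSplit {

    /** An equality conjunct such as rank = 4 (hypothetical type, for illustration only). */
    static final class Eq {
        final String column;
        final Object value;

        Eq(String column, Object value) {
            this.column = column;
            this.value = value;
        }

        @Override
        public String toString() {
            return column + "=" + value;
        }
    }

    /** Conjuncts that would go into the CQL WHERE clause. */
    final List<Eq> pushed = new ArrayList<>();

    /** Conjuncts the engine must apply itself to the rows it fetched. */
    final List<Eq> residual = new ArrayList<>();

    /**
     * Splits AND-ed equality conjuncts. Simplifying assumption: equalities that
     * cover a leading prefix of the primary key can be pushed down; everything
     * else stays residual.
     */
    static CassandraFilterSplit split(List<String> primaryKey, List<Eq> conjuncts) {
        CassandraFilterSplit result = new CassandraFilterSplit();

        // Find the longest leading run of key columns that are all constrained.
        Set<String> prefix = new LinkedHashSet<>();
        for (String keyColumn : primaryKey) {
            boolean constrained = false;
            for (Eq c : conjuncts) {
                if (c.column.equals(keyColumn)) {
                    constrained = true;
                }
            }
            if (!constrained) {
                break;
            }
            prefix.add(keyColumn);
        }

        // Route each conjunct to the pushed or the residual list.
        for (Eq c : conjuncts) {
            (prefix.contains(c.column) ? result.pushed : result.residual).add(c);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> key = Arrays.asList("id", "rank");

        // where id='id0004' and rank=4: both conjuncts can be pushed down.
        CassandraFilterSplit valid =
            split(key, Arrays.asList(new Eq("id", "id0004"), new Eq("rank", 4)));
        System.out.println("pushed=" + valid.pushed + " residual=" + valid.residual);

        // where rank=4: nothing can be pushed, so the engine filters a full scan.
        CassandraFilterSplit subset = split(key, Arrays.asList(new Eq("rank", 4)));
        System.out.println("pushed=" + subset.pushed + " residual=" + subset.residual);
    }
}
```

With this split, a query that Cassandra would reject outright never reaches Cassandra in that form: an empty pushed list means a full table read, and the residual list is what the engine filters afterwards.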

On Jan 8, 2015, at 9:11 AM, Tomer Shiran <tshi...@gmail.com> wrote:

> I think that any valid SQL statement should work with any data source.
> Drill should:
> 
>   - Push down as much processing as possible into the data source
>   (Cassandra in this case)
>   - Maintain as much data locality as possible (i.e., spread the work so
>   that each drillbit is handling local data)
>   - In the worst case, Drill should pull the entire table from the data
>   source if that's what's needed to satisfy the query.
> 
> 
> On Thu, Jan 8, 2015 at 8:29 AM, Yash Sharma <yash...@gmail.com> wrote:
> 
>> Hi Folks,
>> This thread is to discuss a few scenarios in how Cassandra works, and how we
>> think they should be supported in Drill.
>> 
>> Although Cassandra does not support these cases natively, they are doable on
>> Drill's end once we fetch a superset of the data with these conditions left out.
>> 
>> 1. Filtering on a non-indexed column in Cassandra
>> 2. Filtering by a subset of the primary key
>> 3. OR condition in the WHERE clause
>> 
>> Should we apply the filters at Drill's end and support these features, or
>> should we propagate an error back to the user asking for a valid
>> Cassandra-based query?
>> 
>> -----
>> Examples:
>> Here 'trending_now' is a dummy table with columns (id, rank, pog_id), where
>> (id, rank) is the primary key.
>> 1.
>> cqlsh:recsys> select * from trending_now where pog_id=10004 ;
>> Bad Request: No indexed columns present in by-columns clause with Equal
>> operator
>> 
>> 2.
>> cqlsh:recsys> select * from trending_now where rank=4;
>> Bad Request: Cannot execute this query as it might involve data filtering
>> and thus may have unpredictable performance. If you want to execute this
>> query despite the performance unpredictability, use ALLOW FILTERING
>> P.S. ALLOW FILTERING is not permitted in the Cassandra Java driver as of now.
>> 
>> 3.
>> cqlsh:recsys> select * from trending_now where rank=4 or id='id0004';
>> Bad Request: line 1:40 missing EOF at 'or'
>> 
>> 4. Valid Query:
>> cqlsh:recsys> select * from trending_now where id='id0004' and rank=4;
>> 
>> id     | rank | pog_id
>> --------+------+--------
>> id0004 |    4 |  10002
>> 
>> (1 rows)
>> 
