[DISCUSS] Cassandra storage for Drill

Yash Sharma Thu, 08 Jan 2015 08:33:13 -0800

Hi Folks,
This thread is to discuss a few scenarios in how Cassandra works, and how we
think they should be supported in Drill.


The following cases are not supported natively in Cassandra, but they are
doable on Drill's end once we fetch a superset of the data with these
conditions left out of the Cassandra query:

1. Filtering on a non-indexed column in Cassandra
2. Filtering by a subset of the primary key
3. OR condition in the WHERE clause

Should we apply these filters at Drill's end and support these features, or
should we propagate an error back to the user asking for a valid Cassandra
query?
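
One way to picture the first option is to split the WHERE clause into a part
that Cassandra can evaluate and a residual part that Drill applies to the
fetched rows. Below is a minimal sketch in Python, not Drill code. It assumes
only AND-ed equality conditions, ignores secondary indexes, and assumes the
primary key (id, rank) means 'id' is the partition key and 'rank' a clustering
column. The function and parameter names are hypothetical.

```python
def split_conjuncts(conjuncts, partition_keys, clustering_keys):
    """Split AND-ed equality predicates into (pushed, residual).

    conjuncts: dict of column -> value, all combined with AND.
    pushed is what could be sent to Cassandra; residual is what the
    caller (Drill, in this discussion) would still have to evaluate.
    """
    pushed, residual = {}, dict(conjuncts)

    # Partition key columns are pushed only if every one is constrained.
    if all(col in residual for col in partition_keys):
        for col in partition_keys:
            pushed[col] = residual.pop(col)
        # Clustering columns are usable only as a leading prefix, in order.
        for col in clustering_keys:
            if col not in residual:
                break
            pushed[col] = residual.pop(col)

    return pushed, residual


# The valid query: everything is pushed down.
print(split_conjuncts({"id": "id0004", "rank": 4}, ["id"], ["rank"]))
# Filtering by a subset of the primary key: nothing can be pushed.
print(split_conjuncts({"rank": 4}, ["id"], ["rank"]))
```

An empty 'pushed' part means a full table scan on the Cassandra side, which is
the cost to weigh against returning an error.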

-----
Examples:
Here 'trending_now' is a dummy table with columns (id, rank, pog_id), where
(id, rank) is the primary key.
1.
cqlsh:recsys> select * from trending_now where pog_id=10004 ;
Bad Request: No indexed columns present in by-columns clause with Equal
operator

2.
cqlsh:recsys> select * from trending_now where rank=4;
Bad Request: Cannot execute this query as it might involve data filtering
and thus may have unpredictable performance. If you want to execute this
query despite the performance unpredictability, use ALLOW FILTERING
P.S. ALLOW FILTERING is not permitted in the Cassandra Java driver as of now.

3.
cqlsh:recsys> select * from trending_now where rank=4 or id='id0004';
Bad Request: line 1:40 missing EOF at 'or'

4. Valid Query:
cqlsh:recsys> select * from trending_now where id='id0004' and rank=4;

 id     | rank | pog_id
--------+------+--------
 id0004 |    4 |  10002

(1 rows)
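
For comparison, here is roughly what "filtering at Drill's end" would mean for
the three rejected queries above, sketched in Python rather than Drill code.
The rows stand in for an unfiltered scan of trending_now. Only the first row
appears in the output above; the other two are made up for illustration, and
'filter_rows' is a hypothetical helper.

```python
# Rows as they might come back from an unfiltered scan of trending_now.
rows = [
    {"id": "id0004", "rank": 4, "pog_id": 10002},  # from the valid query above
    {"id": "id0001", "rank": 4, "pog_id": 10004},  # hypothetical
    {"id": "id0002", "rank": 7, "pog_id": 10009},  # hypothetical
]


def filter_rows(rows, predicate):
    """Apply a filter on the client side, after the rows have been fetched."""
    return [row for row in rows if predicate(row)]


# Case 1: filter on a non-indexed column.
by_pog = filter_rows(rows, lambda r: r["pog_id"] == 10004)
# Case 2: filter by a subset of the primary key.
by_rank = filter_rows(rows, lambda r: r["rank"] == 4)
# Case 3: OR condition.
by_or = filter_rows(rows, lambda r: r["rank"] == 4 or r["id"] == "id0004")

for name, result in (("pog_id", by_pog), ("rank", by_rank), ("or", by_or)):
    print(name, [r["id"] for r in result])
```

All three cases become ordinary row filters once the superset is in hand. The
open question is whether the cost of fetching that superset is acceptable.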
