Re: Cannot query secondary index

2014-06-13 Thread Mohit Anchlia
Some other ways to track old records are: 1) Use external queues - one queue per week or month, for instance, and pile up data on the queue cluster 2) Create one more table in C* to track the keys per week or month that you can scan to read the keys of the audit table. Make sure you delete the entir

Re: Cannot query secondary index

2014-06-13 Thread Jonathan Lacefield
Hello, What you are attempting to do reminds me of the old "sliding window" partitioning trick in RDBMS systems. You're right, there is no system-provided tool that allows you to perform a similar operation. You could always leverage option 3, and then create a service that helps manage the

Re: Cannot query secondary index

2014-06-10 Thread Paulo Ricardo Motta Gomes
Our approach for this scenario is to run a Hadoop job that periodically cleans old entries, but I admit it's far from ideal. It would be nice to have a more native way to perform these kinds of tasks. There's a legend about a compaction strategy that keeps only the N first entries of a partition key,

Re: Cannot query secondary index

2014-06-10 Thread Redmumba
Honestly, this has been by far my single biggest obstacle with Cassandra for time-based data--cleaning up the old data when the deletion criteria (i.e., date) isn't the primary key. I've asked about a few different approaches, but I haven't really seen any feasible options that can be implemented

Re: Cannot query secondary index

2014-06-09 Thread Redmumba
Of course, Jonathan, I'll do my best! It's an auditing table that, right now, uses a primary key consisting of a combination of a combined partition id of the region and the object id, the date, and the process ID. Each event in our system will create anywhere from 1-20 rows, for example, and mul

Re: Cannot query secondary index

2014-06-09 Thread Jonathan Lacefield
Hello, Will you please describe the use case and what you are trying to model? What are some questions/queries that you would like to serve via Cassandra? This will help the community help you a little better. Jonathan Lacefield Solutions Architect, DataStax (404) 822 3487

Re: Cannot query secondary index

2014-06-09 Thread Redmumba
I've been trying to work around using "date-based tables" because I'd like to avoid the overhead. It seems, however, that this is just not going to work. So here's a question--for these date-based tables (i.e., a table per day/week/month/whatever), how are they queried? If I keep 60 days' worth o
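
One possible answer to the question above is that the client fans a query out over every table overlapping the requested date range, and retention becomes a matter of dropping whole tables. The following is only a minimal Python sketch of that bookkeeping, assuming one table per day and a hypothetical naming scheme `audit_YYYYMMDD`; neither the naming scheme nor the helper functions come from the thread.

```python
from datetime import date, timedelta


def tables_for_range(start: date, end: date, prefix: str = "audit_") -> list[str]:
    """Return the per-day table names covering start..end inclusive.

    The prefix + YYYYMMDD naming scheme is a hypothetical convention.
    """
    if end < start:
        raise ValueError("end must not be before start")
    days = (end - start).days + 1
    return [f"{prefix}{(start + timedelta(days=i)):%Y%m%d}" for i in range(days)]


def expired_tables(today: date, existing: list[str],
                   keep_days: int = 60, prefix: str = "audit_") -> list[str]:
    """Return the tables whose day is older than the retention window.

    YYYYMMDD suffixes sort lexicographically in date order, so a plain
    string comparison against the cutoff name is enough.
    """
    cutoff_name = f"{prefix}{(today - timedelta(days=keep_days)):%Y%m%d}"
    return sorted(name for name in existing if name < cutoff_name)
```

A client would run the same query once per name returned by `tables_for_range` and merge the results, while a periodic job would drop whatever `expired_tables` reports instead of deleting individual rows.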

Re: Cannot query secondary index

2014-06-09 Thread Redmumba
Ah, so the secondary indices are really secondary against the primary key. That makes sense. I'm beginning to see why the whole "date-based table" approach is the only one I've been able to find... thanks for the quick responses, guys! On Mon, Jun 9, 2014 at 2:45 PM, Michal Michalski < michal.mi

Re: Cannot query secondary index

2014-06-09 Thread Michal Michalski
Secondary indexes internally are just CFs that map the indexed value to the row key that the value belongs to, so you can only query these indexes using "=", not ">", ">=" etc. However, your query does not require an index *IF* you provide a row key - you can use "<" or ">" like you did for the date
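
To illustrate why a value-to-row-key mapping supports "=" but not range operators, here is a toy Python model. It is only a sketch of the idea described above, not Cassandra's actual implementation; the class and method names are made up for illustration.

```python
from collections import defaultdict


class ToySecondaryIndex:
    """Toy model: an unordered map from indexed value to the row keys holding it."""

    def __init__(self):
        self._by_value = defaultdict(set)

    def add(self, row_key, value):
        self._by_value[value].add(row_key)

    def lookup_eq(self, value):
        # Equality is a single lookup on the indexed value.
        return set(self._by_value.get(value, ()))

    def lookup_gt(self, value):
        # The map has no ordering, so a range predicate has to inspect
        # every indexed value: effectively a full scan.
        return {key
                for indexed, keys in self._by_value.items() if indexed > value
                for key in keys}


index = ToySecondaryIndex()
index.add("row-a", 10)
index.add("row-b", 20)
index.add("row-c", 20)
```

With this data, `index.lookup_eq(20)` touches one entry, whereas `index.lookup_gt(10)` has to walk every entry to find the same two rows.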

Re: Cannot query secondary index

2014-06-09 Thread Jonathan Lacefield
Hello, You are receiving this error because you are not passing in the Partition Key as part of your query. Cassandra is telling you it doesn't know on which node to find the data and you haven't explicitly told it to search across all your nodes for the data. The ALLOW FILTERING clause bypasses t

Cannot query secondary index

2014-06-09 Thread Redmumba
I have a table with a timestamp column on it; however, when I try to query based on it, it fails saying that I must use ALLOW FILTERING--which, to me, means it's not using the secondary index. Table definition is (snipping out irrelevant parts)... CREATE TABLE audit ( > id bigint, > date ti