Hello, Will you please describe the use case and what you are trying to model. What are some questions/queries that you would like to serve via Cassandra. This will help the community help you a little better.
Jonathan Lacefield Solutions Architect, DataStax (404) 822 3487 <http://www.linkedin.com/in/jlacefield> <http://www.datastax.com/cassandrasummit14> On Mon, Jun 9, 2014 at 7:51 PM, Redmumba <redmu...@gmail.com> wrote: > I've been trying to work around using "date-based tables" because I'd like > to avoid the overhead. It seems, however, that this is just not going to > work. > > So here's a question--for these date-based tables (i.e., a table per > day/week/month/whatever), how are they queried? If I keep 60 days worth of > auditing data, for example, I'd need to query all 60 tables--can I do that > smoothly? Or do I have to have 60 different select statements? Is there a > way for me to run the same query against all the tables? > > > On Mon, Jun 9, 2014 at 3:42 PM, Redmumba <redmu...@gmail.com> wrote: > >> Ah, so the secondary indices are really secondary against the primary >> key. That makes sense. >> >> I'm beginning to see why the whole "date-based table" approach is the >> only one I've been able to find... thanks for the quick responses, guys! >> >> >> On Mon, Jun 9, 2014 at 2:45 PM, Michal Michalski < >> michal.michal...@boxever.com> wrote: >> >>> Secondary indexes internally are just CFs that map the indexed value to >>> a row key which that value belongs to, so you can only query these indexes >>> using "=", not ">", ">=" etc. >>> >>> However, your query does not require index *IF* you provide a row key - >>> you can use "<" or ">" like you did for the date column, as long as you >>> refer to a single row. However, if you don't provide it, it's not going to >>> work. >>> >>> M. >>> >>> Kind regards, >>> MichaĆ Michalski, >>> michal.michal...@boxever.com >>> >>> >>> On 9 June 2014 21:18, Redmumba <redmu...@gmail.com> wrote: >>> >>>> I have a table with a timestamp column on it; however, when I try to >>>> query based on it, it fails saying that I must use ALLOW FILTERING--which >>>> to me, means its not using the secondary index. Table definition is >>>> (snipping out irrelevant parts)... >>>> >>>> CREATE TABLE audit ( >>>>> id bigint, >>>>> date timestamp, >>>>> ... >>>>> PRIMARY KEY (id, date) >>>>> ); >>>>> CREATE INDEX date_idx ON audit (date); >>>>> >>>> >>>> There are other fields, but they are not relevant to this example. The >>>> date is part of the primary key, and I have a secondary index on it. When >>>> I run a SELECT against it, I get an error: >>>> >>>> cqlsh> SELECT * FROM asinauditing.asinaudit WHERE date < '2014-05-01'; >>>>> Bad Request: Cannot execute this query as it might involve data >>>>> filtering and thus may have unpredictable performance. If you want to >>>>> execute this query despite the performance unpredictability, use ALLOW >>>>> FILTERING >>>>> cqlsh> SELECT * FROM asinauditing.asinaudit WHERE date < '2014-05-01' >>>>> ALLOW FILTERING; >>>>> Request did not complete within rpc_timeout. >>>>> >>>> >>>> How can I force it to use the index? I've seen rebuild_index tasks >>>> running, but can I verify the "health" of the index? >>>> >>> >>> >> >