The short answer is that CQL isn't SQL. It looks a bit like it, but the structure of the data is totally different. Essentially (ignoring secondary indexes, which have some issues in practice and I think are generally not recommended) the only way to look the data up is by the partition key. Anything else is a full-table scan and if you need more querying flexibility Cassandra is probably not your best option. With only 260GB, I think I'd lean towards suggesting PostgreSQL or MySQL.
On Tue, Apr 21, 2020 at 7:20 AM Marc Richter <m...@marc-richter.info> wrote: > Hi everyone, > > I'm very new to Cassandra. I have, however, some experience with SQL. > > I need to extract some information from a Cassandra database that has > the following table definition: > > CREATE TABLE tagdata.central ( > signalid int, > monthyear int, > fromtime bigint, > totime bigint, > avg decimal, > insertdate bigint, > max decimal, > min decimal, > readings text, > PRIMARY KEY (( signalid, monthyear ), fromtime, totime) > ) > > The database is already of round about 260 GB in size. > I now need to know what is the most recent entry in it; the correct > column to learn this would be "insertdate". > > In SQL I would do something like this: > > SELECT insertdate FROM tagdata.central > ORDER BY insertdate DESC LIMIT 1; > > In CQL, however, I just can't get it to work. > > What I have tried already is this: > > SELECT insertdate FROM "tagdata.central" > ORDER BY insertdate DESC LIMIT 1; > > But this gives me an error: > ERROR: ORDER BY is only supported when the partition key is restricted > by an EQ or an IN. > > So, after some trial and error and a lot of Googling, I learned that I > must include all rows from the PRIMARY KEY from left to right in my > query. Thus, this is the "best" I can get to work: > > > SELECT > * > FROM > "tagdata.central" > WHERE > "signalid" = 4002 > AND "monthyear" = 201908 > ORDER BY > "fromtime" DESC > LIMIT 10; > > > The "monthyear" column, I crafted like a fool by incrementing the date > one month after another until no results could be found anymore. > The "signalid" I grabbed from one of the unrestricted "SELECT * FROM" - > query results. But these can't be as easily guessed as the "monthyear" > values could. > > This is where I'm stuck! > > 1. This does not really feel like the ideal way to go. I think there is > something more mature in modern IT systems. Can anyone tell me what is a > better way to get these informations? > > 2. I need a way to learn all values that are in the "monthyear" and > "signalid" columns in order to be able to craft that query. > How can I achieve that in a reasonable way? As I said: The DB is round > about 260 GB which makes it next to impossible to just "have a look" at > the output of "SELECT *".. > > Thanks for your help! > > Best regards, > Marc Richter > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org > For additional commands, e-mail: user-h...@cassandra.apache.org > >