Marc, have you had any exposure to DynamoDB at all? The API approach is different, but the fundamental concepts are similar. That’s actually a better reference point to have than an RDBMS, because really it’s a small subset of usage patterns that would overlap with CQL. If you were, for example, dealing with databases that did a lot of table partitions and supported apps that focused bulk loads and analytics on a partition level, then you would be in a space somewhat similar to C*.
C* is at its best when your common usage pattern, at least on reads, is effectively “I want a bunch of stuff, so you may as well give it to me by the bunch… what I do with the bunch after is my problem”. That’s very different from an RDBMS, which historically has always tried to find some balance between minimizing disk I/O and network I/O… but if it takes developers a lot more head scratching to get there, it was considered an acceptable investment to help scale the usage of an expensive resource. As a result, language features for the two cases are quite different. From: Elliott Sims <elli...@backblaze.com> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org> Date: Tuesday, April 21, 2020 at 12:13 PM To: "user@cassandra.apache.org" <user@cassandra.apache.org> Subject: Re: Issues, understanding how CQL works Message from External Sender The short answer is that CQL isn't SQL. It looks a bit like it, but the structure of the data is totally different. Essentially (ignoring secondary indexes, which have some issues in practice and I think are generally not recommended) the only way to look the data up is by the partition key. Anything else is a full-table scan and if you need more querying flexibility Cassandra is probably not your best option. With only 260GB, I think I'd lean towards suggesting PostgreSQL or MySQL. On Tue, Apr 21, 2020 at 7:20 AM Marc Richter <m...@marc-richter.info<mailto:m...@marc-richter.info>> wrote: Hi everyone, I'm very new to Cassandra. I have, however, some experience with SQL. I need to extract some information from a Cassandra database that has the following table definition: CREATE TABLE tagdata.central ( signalid int, monthyear int, fromtime bigint, totime bigint, avg decimal, insertdate bigint, max decimal, min decimal, readings text, PRIMARY KEY (( signalid, monthyear ), fromtime, totime) ) The database is already of round about 260 GB in size. I now need to know what is the most recent entry in it; the correct column to learn this would be "insertdate". In SQL I would do something like this: SELECT insertdate FROM tagdata.central ORDER BY insertdate DESC LIMIT 1; In CQL, however, I just can't get it to work. What I have tried already is this: SELECT insertdate FROM "tagdata.central" ORDER BY insertdate DESC LIMIT 1; But this gives me an error: ERROR: ORDER BY is only supported when the partition key is restricted by an EQ or an IN. So, after some trial and error and a lot of Googling, I learned that I must include all rows from the PRIMARY KEY from left to right in my query. Thus, this is the "best" I can get to work: SELECT * FROM "tagdata.central" WHERE "signalid" = 4002 AND "monthyear" = 201908 ORDER BY "fromtime" DESC LIMIT 10; The "monthyear" column, I crafted like a fool by incrementing the date one month after another until no results could be found anymore. The "signalid" I grabbed from one of the unrestricted "SELECT * FROM" - query results. But these can't be as easily guessed as the "monthyear" values could. This is where I'm stuck! 1. This does not really feel like the ideal way to go. I think there is something more mature in modern IT systems. Can anyone tell me what is a better way to get these informations? 2. I need a way to learn all values that are in the "monthyear" and "signalid" columns in order to be able to craft that query. How can I achieve that in a reasonable way? As I said: The DB is round about 260 GB which makes it next to impossible to just "have a look" at the output of "SELECT *".. Thanks for your help! Best regards, Marc Richter --------------------------------------------------------------------- To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org<mailto:user-unsubscr...@cassandra.apache.org> For additional commands, e-mail: user-h...@cassandra.apache.org<mailto:user-h...@cassandra.apache.org>