Just throwing my 2 cents in. I'm probably in the unpopular camp of wanting to move in the other direction, towards a gRPC endpoint that is even more restrictive than CQL. This comes more from the standpoint of needing to clean up after mistakes (application, modeling, etc., not Cassandra itself) than from the standpoint of trying to sell people on using the database. I would prefer to see all the features and endpoints we provide work well without breaking than make cool demos and feature bullet points. That said, I know that for a database to be successful we need the cool feature sets as well. CQL works for now, and deprecating it would be an absolute nightmare for people *already* using it (i.e., the Thrift migration was not fun for anyone). I say create a new entrypoint or layer, mark it experimental, and allow operators to disable it, but leave the existing CQL interface alone.
Chris

On Tue, Nov 4, 2025 at 10:53 AM Isaac Reath <[email protected]> wrote:

> I share Joey's opinions on this. Many features that resemble SQL (e.g., indexes, materialized views) come with caveats that stem from their implementation details rather than the query language itself. If we expose these same features through SQL as they are today, I think we'd risk setting users up for disappointment, since they will come in with implicit expectations about how a given SQL feature should work based on their previous experience, and more often than not we won't meet that expectation. At least with CQL we set the expectation that this is a different database, where familiar concepts might behave differently than you would expect.
>
> That said, in terms of a long term direction, I think having SQL support is a good guiding light, and implementing it as a stateless component as Jeff suggests would help make this easier to realize.
>
> On Tue, Nov 4, 2025 at 10:23 AM Joseph Lynch <[email protected]> wrote:
>
>> Removing CQL is, in my opinion, completely off the table. When we deprecated Thrift and gave CQL as the new query language, we imposed significant pain on our existing functional Thrift applications to migrate to it - I feel we should not hurt our users like that again.
>>
>> I worry that we already struggle to implement the current surface area of CQL correctly and in a way that scales safely. For example, CQL allows us to create arbitrarily large partitions, but large partitions and large columns continue to be something our storage engine can't currently handle well. CQL allows us to create secondary indices for improved filter support, but few can safely use them in production (or at least we struggle to). We still struggle with how page timeouts, hedges and retries work in an idempotent and reliable way in our current protocol - although CQL at least gives us a path to implementing those.
>>
>> I wonder if we should focus on being excellent at the basic write and read operations we already support before adding more complexity at the API layer. I am excited by the recent proposals around unbounded partitions, byte ordered partitioner with safe data movement, ability to execute analytics queries efficiently via a separate columnar representation etc ... and *all* of those and more would likely be *required* to tackle SQL in any meaningful way.
>>
>> The surface area of SQL is much much wider, requiring functional implementation of all of that plus joins, interactive transactions and more. The SQL protocol itself is also quite poor for reliable communication and rarely has performant async clients with size based pagination, per page timeouts, per page hedging, incremental progress over a streaming async interface, pagination resumption, etc ... A lot of this difficulty stems from the protocol often being tied to TCP connections and the inherently unbounded complexity of the read interface.
>>
>> I guess I'm saying, I think we should prioritize succeeding at the API scope we already have before adding more. Deferring to standard SQL syntax or naming when we can just seems like a good idea (why reinvent concepts), but I don't think the friction with CQL is because it's not SQL, I think it's because users can't tell what works and what doesn't work.
>>
>> -Joey
>>
>> On Tue, Nov 4, 2025 at 8:42 AM Josh McKenzie <[email protected]> wrote:
>>
>>> +1 to Mick and Aleksey. I think the key for me was this:
>>>
>>> One is Cassandra’s wide-partition model with flexible clustering columns, which supports very large, ordered partitions (e.g. time-series and efficient range scans), rather than a strictly normalised, join-centric model.
>>> These patterns don’t always map cleanly to SQL semantics, and CQL’s query-driven, table-per-query modelling helps move users toward designs that scale predictably.
>>>
>>> We'd need really robust EXPLAIN / EXPLAIN ANALYZE support (see here <https://www.postgresql.org/docs/current/sql-explain.html>) for users to be able to make sense of how their SQL queries translate into underlying disk access patterns. Having a wide-open field of full SQL compliance they then need to understand how to constrain to get horizontal scale out of it would be *much more challenging* than the already somewhat "new" cognitive muscle our users have to build to realize that horizontal scaling of data access doesn't come free.
>>>
>>> I think that would give us a future state of "Use SQL when you need / want a lot of expressivity, use CQL when you need to be constrained to language primitives that keep your data access scalable". The part that gets me wary here is how we've run into pain in the past trying to be both a database that allows more query expressivity (ALLOW FILTERING, legacy 2i come to mind) and a database that also wants horizontal scale.
>>>
>>> I'd love us to be able to have our cake and eat it too but I don't know if that's possible. So at the very least I'd advocate for SQL + CQL going forward, or SQL + a constrained "CQL-like" mode that gives the same constraints CQL does today on modeling that guide people towards that very partitionable path.
>>>
>>> On Tue, Nov 4, 2025, at 8:12 AM, Aleksey Yeshchenko wrote:
>>>
>>> I don’t mind us implementing some Postgres syntax support in some capacity, but I do not like the idea of limiting what Cassandra is allowed to do, or expose via CQL, to what is expressible by Postgres’s SQL.
>>>
>>> Many moons ago, before we started work on native protocol and CQL, I could perhaps see a bigger benefit to going the Postgres route - for the client protocol and the language. We could piggyback on existing client infrastructure and SQL familiarity. But at this stage, when we have already made the effort to develop decent drivers, and CQL is fleshed out, and C* is quite mature overall, how much would we gain from this transition?
>>>
>>> I’m broadly with Mick here. And I support using Postgres’ SQL as inspiration for implementing new CQL features wherever it makes sense - it’s something we’ve been doing for a decade already. But I don’t believe that deprecating CQL is the way to go at this point.
>>>
>>> > On 4 Nov 2025, at 06:38, Mick <[email protected]> wrote:
>>> >
>>> >> On 3 Nov 2025, at 20:32, Joel Shepherd <[email protected]> wrote:
>>> >>
>>> >> At the same time, my personal opinion is that if SQL compatibility is pursued, then the end game should be to deprecate CQL. That will probably take years, but at the limit I don't see a lot of benefit to supporting both.
>>> >
>>> > We want SQL, but _why_ (in all its nuances) do we want SQL? A lot is obvious, but it is a very broad question.
>>> >
>>> > The adoption and standardisation benefits are obvious, but CQL has strengths relative to SQL in Cassandra’s context.
>>> >
>>> > One is Cassandra’s wide-partition model with flexible clustering columns, which supports very large, ordered partitions (e.g. time-series and efficient range scans), rather than a strictly normalised, join-centric model. These patterns don’t always map cleanly to SQL semantics, and CQL’s query-driven, table-per-query modelling helps move users toward designs that scale predictably.
>>> >
>>> > I can see CQL continuing as Cassandra’s high-throughput, query-driven DSL, while we pursue SQL compatibility.
>>> > I appreciate Dinesh’s ‘lanes’ framing, e.g. eventually default to a SQL interface (with Accord) for the broadest UX, while CQL remains a high-throughput path.
>>> >
>>> > Should we also be discussing storage-engine implications? Cassandra’s LSMT/SSTable design optimises write paths, while SQL presents a logical view without constraining physical layout, so data on disk stays optimised for dominant access patterns. I can also see the need to discuss transport vs query language differences.
>>> >
>>> > Are we after both SQL's DML and DDL abilities? Beyond accessibility and exploration, SQL often comes with mature tooling for schema change management. Cassandra supports online schema changes (e.g., ALTER TABLE), but cross-table/primary-key changes remain constrained. A SQL interface alone won’t ‘solve’ this: it’s about migration tooling and engine capabilities; changing data models at-scale faces separate challenges.
>>> >
>>> > Especially outside of early-stage apps and ad-hoc exploration I find SQL less interesting and its ergonomics less aligned with Cassandra’s runtime performance model. That doesn't make me opposed to the endeavour of SQL compatibility; it pushes me on the why question a bit more, for clarity on how it aligns with our strengths.
