Re: [DISCUSS] SQL support in Cassandra

mapyourown Tue, 04 Nov 2025 07:13:15 -0800

I remember we had an in-depth discussion with Patrick after his talk in
Minneapolis, where I raised my concern about pursuing full PostgreSQL
compatibility, particularly considering the mindset shift it requires and
the fact that Cassandra doesn’t support joins.


While I understand the team’s direction toward adopting a PostgreSQL-style
syntax and compatibility, I believe it’s equally important to continue
maintaining strong CQL support. Many companies and developers are deeply
invested in CQL, and as other contributors mentioned, it takes time for
them to adapt to major changes.

From past experience, when users feel key functionality is being taken
away, they often hesitate to upgrade to newer versions or even consider
forking the project to maintain their own version.

Just wanted to share my thoughts on this.


On Tue, Nov 4, 2025 at 8:57 AM Jeff Jirsa <[email protected]> wrote:

> I’m sorta confused. You can do single-table design in SQL if you don’t
> have a join-centric workload. You still get to tell the database how to
> order your data on disk.
>
> BOP gives you efficient range scans without the partition size
> problems that trap users when they cross into mega-partition territory.
> I don’t think you have to say clustering data together is Cassandra’s key
> benefit; virtually every database is doing that, we just happen to do it
> with chunks of the user’s set of data instead of all of it.
>
> Similarly, suggesting that LSM / SSTables somehow benefit write-heavy CQL
> but not SQL is sorta weird, since the explosion of RocksDB-backed SQL makes
> it clear you can use LSM + SSTables for that too.
>
>
> On Nov 4, 2025, at 5:43 AM, Josh McKenzie <[email protected]> wrote:
>
> 
>
> +1 to Mick and Aleksey. I think the key for me was this:
>
> One is Cassandra’s wide-partition model with flexible clustering columns,
> which supports very large, ordered partitions (e.g. time-series and
> efficient range scans), rather than a strictly normalised, join-centric
> model. These patterns don’t always map cleanly to SQL semantics, and CQL’s
> query-driven, table-per-query modelling helps move users toward designs
> that scale predictably.
>
>
> We'd need really robust EXPLAIN / EXPLAIN ANALYZE support (see here
> <https://www.postgresql.org/docs/current/sql-explain.html>) for users to
> be able to make sense of how their SQL queries translate into underlying
> disk access patterns. Having a wide-open field of full SQL compliance they
> then need to understand how to constrain to get horizontal scale out of it
> would be *much more challenging* than the already somewhat "new"
> cognitive muscle our users have to build to realize that horizontal scaling
> of data access doesn't come free.
>
> I think that would give us a future state of "Use SQL when you need / want
> a lot of expressivity, use CQL when you need to be constrained to language
> primitives that keep your data access scalable". The part that gets me wary
> here is how we've run into pain in the past trying to be both a database
> that allows more query expressivity (ALLOW FILTERING, legacy 2i come to
> mind) and a database that also wants horizontal scale.
>
> I'd love us to be able to have our cake and eat it too but I don't know if
> that's possible. So at the very least I'd advocate for SQL + CQL going
> forward, or SQL + a constrained "CQL-like" mode that gives the same
> constraints CQL does today on modeling that guide people towards that very
> partitionable path.
>
> On Tue, Nov 4, 2025, at 8:12 AM, Aleksey Yeshchenko wrote:
>
> I don’t mind us implementing some Postgres syntax support in some
> capacity, but I do not like the idea of limiting what Cassandra is allowed
> to do, or expose via CQL, to what is expressible by Postgres’s SQL.
>
> Many moons ago, before we started work on native protocol and CQL, I could
> perhaps see a bigger benefit to going the Postgres route - for the client
> protocol
> and the language. We could piggyback on existing client infrastructure and
> SQL familiarity. But at this stage, when we have already made the effort to
> develop decent drivers, and CQL is fleshed out, and C* is quite mature
> overall, how much would we gain from this transition?
>
> I’m broadly with Mick here. And I support using Postgres’ SQL as
> inspiration for implementing new CQL features wherever it makes sense -
> it’s something we’ve been doing for a decade already. But I don’t believe
> that deprecating CQL is the way to go at this point.
>
> > On 4 Nov 2025, at 06:38, Mick <[email protected]> wrote:
> >
> >
> >
> >> On 3 Nov 2025, at 20:32, Joel Shepherd <[email protected]> wrote:
> >>
> >> At the same time, my personal opinion is that if SQL compatibility is
> >> pursued, then the end game should be to deprecate CQL. That will probably
> >> take years, but at the limit I don't see a lot of benefit to supporting
> >> both.
> >
> >
> >
> > We want SQL, but _why_ (in all its nuances) do we want SQL?  A lot is
> > obvious, but it is a very broad question.
> >
> > The adoption and standardisation benefits are obvious, but CQL has
> > strengths relative to SQL in Cassandra’s context.
> >
> > One is Cassandra’s wide-partition model with flexible clustering
> > columns, which supports very large, ordered partitions (e.g. time-series
> > and efficient range scans), rather than a strictly normalised, join-centric
> > model. These patterns don’t always map cleanly to SQL semantics, and CQL’s
> > query-driven, table-per-query modelling helps move users toward designs
> > that scale predictably.
> >
> > I can see CQL continuing as Cassandra’s high-throughput, query-driven
> > DSL, while we pursue SQL compatibility.  I appreciate Dinesh’s ‘lanes’
> > framing, e.g. eventually default to a SQL interface (with Accord) for the
> > broadest UX, while CQL remains a high-throughput path.
> >
> > Should we also be discussing storage-engine implications?  Cassandra’s
> > LSMT/SSTable design optimises write paths, while SQL presents a logical
> > view without constraining physical layout, so data on disk stays optimised
> > for dominant access patterns.  I can also see the need to discuss transport
> > vs query-language differences.
> >
> > Are we after both SQL's DML and DDL abilities?  Beyond accessibility
> > and exploration, SQL often comes with mature tooling for schema change
> > management. Cassandra supports online schema changes (e.g., ALTER TABLE),
> > but cross-table/primary-key changes remain constrained. A SQL interface
> > alone won’t ‘solve’ this: it’s about migration tooling and engine
> > capabilities; changing data models at-scale faces separate challenges.
> >
> > Especially outside of early-stage apps and ad-hoc exploration, I find SQL
> > less interesting and its ergonomics less aligned with Cassandra’s runtime
> > performance model.  That doesn't make me opposed to the endeavour of SQL
> > compatibility; it pushes me on the why question a bit more, for clarity
> > on how it aligns with our strengths.
>
>
>
>
