Re: Tradeoffs for Cassandra transaction management

Jordan West Thu, 14 Oct 2021 21:33:43 -0700

Hi All,

First off, thank you for the very interesting technical discussions on this
topic. It's been great to see some back and forth on it. I haven't been
involved mainly because my research on this topic is relatively stale. I
did however want to chime in to encourage us to step back and take a look
at the topic of whether SQL support is the direction we want to be going
with Cassandra. For some context, I now work on and operate both Cassandra
and CockroachDB at a relatively large scale. In this case, CockroachDB is
not positioned as a potential replacement for Cassandra but as an
additional choice to meet different needs. Meeting those needs necessitates
different tradeoffs. Tradeoffs that have concrete impacts on how the
database performs, how production support works, how the user can break the
database, and what can be accomplished successfully by the user. When I
look at what my users need from Cassandra, it's not to have a competing
solution to CockroachDB -- a solution that exists and is becoming more and
more production proven every day. They do however need things like
scalable, consistent secondary indexing -- a feature I envision Accord
could unlock with its multi-partition CAS/transactions -- or better
performing single-partition LWTs -- ones that take significantly fewer round
trips and work over the WAN. I would encourage those pushing for SQL
support to consider that and to start a discussion first with the community
on whether SQL support is the direction we should be heading in the best
interest of the project.


The technical understanding I do have of both Accord and CockroachDB leads
me to believe that holding up CEP-15 for that decision, regardless of
whether we decide SQL support is the direction to go or not, is not
necessary. I believe it was stated earlier in the thread, but if Accord
provides similar or better guarantees than Raft, then a similar distributed
transaction protocol can be built on top of it to support interactive SQL.

Jordan


On Tue, Oct 12, 2021 at 8:21 PM Jonathan Ellis <jbel...@gmail.com> wrote:

> Blake (and Benedict), I’ll ask for your patience here.  We don’t have a
> precedent of pushing through major initiatives in this project in a matter
> of weeks.  We [members of the PMC that weren’t involved in creating Accord]
> need time to do thorough research and make sure both that we understand
> what is being proposed and that we have evaluated reasonable alternatives.
>
> One of the difficulties in evaluating Accord is that it combines a
> state-of-the-art consensus/ordering protocol with a fairly limited
> transaction manager.  So it may be useful to decouple the consensus and
> transaction processing components, which would both allow non-Cassandra
> usage of the consensus piece, and also make explicit the boundaries with
> transaction processing with the consequence of making it easier to evolve
> independently.
>
> In the meantime, it’s very important to me to understand on which
> dimensions the transaction manager can be improved easily, and which
> dimensions resist such improvement.  I get that Accord is your [plural]
> baby and it’s awkward for me to come along and start pointing at its
> limitations, but that’s part of creating a complete understanding of any
> system.
>
> If I keep coming back to the subject of SQL support and interactive
> transactions, that’s because it’s becoming table stakes in the distributed
> database space. People are using Cockroach or Yugabyte or Cloud Spanner for
> use cases where a couple years ago they would have used Cassandra. We can
> expect this trend to continue and strengthen.
>
> On Mon, Oct 11, 2021 at 11:39 PM Blake Eggleston
> <beggles...@apple.com.invalid> wrote:
>
> > Let’s get back on topic.
> >
> > Jonathan, in your opening email you stated that, in your view, the 2 main
> > areas of tradeoff were:
> >
> > > 1. Is it worth giving up local latencies to get full global
> > > consistency?
> >
> > Now we’ve established that we don’t need to give up local latencies with
> > Accord, which leaves:
> >
> > > 2. Is it worth giving up the possibility of SQL support, to get the
> > > benefits of deterministic transaction design?
> >
> > I pointed out that this was a false dilemma and that, in the worst case, a
> > hypothetical SQL feature could have its own consensus system. I hope that
> > won’t be necessary, but as I later pointed out (and you did not address,
> > although maybe I should have phrased it as a question), if we’re going to
> > weigh Accord against a hypothetical SQL feature that lacks design goals, or
> > any clear ideas about how it might be implemented, how can we rule that out?
> >
> > So Jonathan, how can we rule that out? How can we have a productive
> > discussion about a feature you yourself are unable to describe in any
> > meaningful detail?
> >
> > > On Oct 11, 2021, at 6:34 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> > >
> > > On Mon, Oct 11, 2021 at 5:11 PM bened...@apache.org <bened...@apache.org>
> > > wrote:
> > >
> > >> If we want to fully unpack this particular point, as far as I can tell
> > >> claiming ANSI SQL would indeed require interactive transactions in which
> > >> arbitrary conditional work may be performed by a client within a
> > >> transaction in response to other actions within that transaction.
> > >>
> > >> However:
> > >>
> > >>  1.  The ANSI SQL standard permits these transactions to fail and
> > >> roll back (e.g. in the event that your optimistic transaction fails). So if
> > >> you want to be pedantic, you may modify my statement to “SQL does not
> > >> necessitate support for abort-free interactive transactions” and we can
> > >> leave it there.
> > >>
> > >>  2.  I would personally consider “SQL support” to include the capability
> > >> of defining arbitrary SQL stored procedures that may be executed by clients
> > >> in an interactive session
> > >
> > >
> > > I note your personal preference and I further note that this is not the
> > > common understanding of "SQL support" in the industry.  If you tell 100
> > > developers that your database supports SQL, then at least 99 of them are
> > > going to assume that you can work with APIs like JDBC that expose
> > > interactive transactions as a central feature, and hence that you will be
> > > reasonably compatible with the vast array of SQL-based applications out
> > > there.
> > >
> > > Historical side note: VoltDB tried to convince people that stored
> > > procedures were good enough.  It didn't work, and VoltDB had to add
> > > interactive transactions as fast as they could.
> > >
> > >>  3.  Most importantly, as I pointed out in the previous email, Accord is
> > >> compatible with a YugaByte/Cockroach-like approach, and indeed makes this
> > >> approach both easier to accomplish and enables stronger isolation than the
> > >> equivalent Raft-based approach. These approaches are able to reduce the
> > >> number of conflicts, at a cost of significantly higher transaction
> > >> management burden.
> > >>
> > >
> > > If you're saying that you could use Accord instead of Raft or Paxos, and
> > > layer 2PC on top of that as in Spanner, then I agree, but I don't think
> > > that is a very good design, as you would no longer get any of the benefits
> > > of the deterministic approach you started with.  If you mean something
> > > else, then perhaps an example would help clarify.
> > >
> > > --
> > > Jonathan Ellis
> > > co-founder, http://www.datastax.com
> > > @spyced
> >
>
> --
> Jonathan Ellis
> co-founder, http://www.datastax.com
> @spyced
>
