[Moving followup here from the other thread]

I think there is in fact a difference here.

Consider a workload consisting of two clients.  One of them is submitting a
stream of TPC-C new order transactions (new order client = NOC), and the
other is performing a simple increment of district next order ids
(increment district client = IDC).

If we run these two workloads in postgresql under READ COMMITTED, both
clients will proceed happily (although we will get serialization anomalies).

If we run them in pg under SERIALIZABLE, then the NOC client will get the
"could not serialize access" error whenever the IDC client updates the
district concurrently, which will be effectively every time since the IDC
transaction is much simpler.  But, SQL gives you a tool to allow NOC to
make progress, which is SELECT FOR UPDATE.  If the NOC performs its first
read with FOR UPDATE then it will (1) block until the current IDC
transaction completes and then (2) grab a lock that prevents further
updates from happening concurrently, allowing NOC to make progress.
Neither NOC nor IDC will ever get a "could not serialize access" error.

It looks to me like the proposed design here would (1) not allow NOC to
make progress at READ COMMITTED, but also (2) does not provide the tools to
achieve progress with SERIALIZABLE either since locking outside of the
global consensus does not make sense.

On Wed, Oct 13, 2021 at 1:59 PM Henrik Ingo <henrik.i...@datastax.com>
wrote:

> Sorry Jonathan, didn't see this reply earlier today.
>
> That would be common behaviour for many MVCC databases, including MongoDB,
> MySQL Galera Cluster, PostgreSQL...
>
> https://www.postgresql.org/docs/9.5/transaction-iso.html
>
> *"Applications using this level must be prepared to retry transactions due
> to serialization failures."*
>
> On Wed, Oct 13, 2021 at 3:19 AM Jonathan Ellis <jbel...@gmail.com> wrote:
>
> > Hi Henrik,
> >
> > I don't see how this resolves the fundamental problem that I outlined to
> > start with, namely, that without having the entire logic of the
> transaction
> > available to it, the server cannot retry the transaction when concurrent
> > changes are found to have been applied after the reconnaissance reads
> (what
> > you call the conversational phase).


On Wed, Oct 13, 2021 at 5:00 AM Henrik Ingo <henrik.i...@datastax.com>
wrote:

> On Wed, Oct 13, 2021 at 1:26 AM Blake Eggleston
> <beggles...@apple.com.invalid> wrote:
>
> > Hi Henrik,
> >
> > I would agree that the local serial experience for valid use cases should
> > be supported in some form before legacy LWT is replaced by Accord.
> >
> >
> Great! It seems there's a seed of consensus on this point.
>
>
> > Regarding your read committed proposal, I think this CEP discussion has
> > already spent too much time talking about hypothetical SQL
> implementations,
> > and I’d like to avoid veering off course again. However, since you’ve
> asked
> > a well thought out question with concrete goals and implementation ideas,
> > I’m happy to answer it. I just ask that if you want to discuss it beyond
> my
> > reply, you start a separate ‘[IDEA] Read committed transaction with
> Accord’
> > thread where we could talk about it a bit more without it feeling like we
> > need to delay a vote.
> >
> >
> This is a reasonable request. We were already in a side thread I guess, but
> I like organizing discussions into separate threads...
>
> Let's see if I manage to break the thread correctly simply by editing the
> subject...
>
> FWIW, my hope for this discussion was that by providing a simple yet
> concrete example, it would facilitate the discussion toward a CEP-15 vote,
> not distract from it. As it happened, Alex Miller was writing a hugely
> helpful email concurrently with mine, which improves details in CEP-15, so
> I don't know if expecting the discussion to die out just yet is ignoring
> people who maybe working off list to still understand this rather advanced
> reading material.
>
>
>
> > So I think it could work with some modifications.
> >
> > First you’d need to perform your select statements as accord reads, not
> > quorum reads. Otherwise you may not see writes that have been (or could
> > have been) committed. A multi-partition write could also appear to become
> > undone, if a write commit has not reached one of the keys or needs to be
> > recovered.
> >
>
> Ah right. I think we established early on that tables should be either
> Accord-only, or legacy C* only. I was too fixated on the "no other changes"
> and forgot this.
>
> This is then a very interesting detail you point out! It seems like
> potentially every statement now needs to go through the Accord consensus
> protocol, and this could become expensive, where my goal was to design the
> simplest and most lightweight example thinkable. BUT for read-only Accord
> transactions, where I specifically also don't care about serializability,
> wouldn't this be precisely the case where I can simply pick my own
> timestamp and do a stale read from a  nearby replica?
>
>
> >
> > Second, when you talk about transforming mutations, I’m assuming you’re
> > talking about confirming primary keys do or do not exist,
>
>
> No, I was thinking more broadly of operations like `UPDATE table1 SET
> column1=x WHERE pk >= 10 and pk <= 20`
>
> My thinking was that I need to know the exact primary keys touched both
> during the conversational phase and the commit phase. In essence, this is
> an interactive reconnaisance phase.
>
> You make a great point that for statements where the PK is explicit, they
> can just be directly added to the write set and transaction state. Ex:
> `UPDATE table1 SET column1=x WHERE pk IN (1,2,3)`
>
>
>
> > and supporting auto-incrementing primary keys. To confirm primary keys do
> > or do not exist, you’d also need to perform an accord read also.
>
>
> For sure.
>
>
> > For auto-incrementing primary keys, you’d need to do an accord read/write
> > operation to increment a counter somewhere (or just use uuids).
> >
> >
> I had not considered auto-increment at all, but if that would be a
> requirement, then I tend to translate "auto-increment" into "any service
> that can hand out unique integers". (In practice, no database can force me
> to commit the integers in the order that they're actually monotonically
> increasing, so "auto-increment" is an illusion, I realized at some point in
> my career.)
>
>
> > Finally, read committed does lock rows, so you’d still need to perform a
> > read on commit to confirm that the rows being written to haven’t been
> > modified since the transaction began.
> >
>
> Hmm...
>
> As we see in a separate discussion is already diving into this, it seems
> like at least the SQL 1992 standard only says read committed must protect
> against P1 and that's it. My suspicion is that since most modern databases
> start from MVCC, they essentially "over deliver" when providing read
> committed, since the implementation naturally provides snapshot reads and
> in fact it would be complicated to do something less consistent.
>
> For this discussion it's not really important which interpretation is
> correct, since either is a reasonable semantic. For my purposes I'll just
> note that needing to re-execute all reads during the Accord phase (commit
> phase) would make the design more expensive, since the transaction is now
> executed twice. The goal of a simplistic light weight semantic is achieved
> by not doing so and claiming the weaker interpretation of read committed is
> "correct".
>
> henrik
>
> --
>
> Henrik Ingo
>
> +358 40 569 7354 <358405697354>
>
> [image: Visit us online.] <https://www.datastax.com/>  [image: Visit us on
> Twitter.] <https://twitter.com/DataStaxEng>  [image: Visit us on YouTube.]
> <
> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.youtube.com_channel_UCqA6zOSMpQ55vvguq4Y0jAg&d=DwMFaQ&c=adz96Xi0w1RHqtPMowiL2g&r=IFj3MdIKYLLXIUhYdUGB0cTzTlxyCb7_VUmICBaYilU&m=bmIfaie9O3fWJAu6lESvWj3HajV4VFwgwgVuKmxKZmE&s=16sY48_kvIb7sRQORknZrr3V8iLTfemFKbMVNZhdwgw&e=
> >
>   [image: Visit my LinkedIn profile.] <https://www.linkedin.com/in/heingo/
> >
>


-- 
Jonathan Ellis
co-founder, http://www.datastax.com
@spyced

Reply via email to