Re: [GENERAL] Fwd: How does PostgreSQL serializable transaction works? (SELECT/UPDATE vs INSERT conflict)

Alexandru Lazarev Thu, 10 Mar 2016 11:51:35 -0800

Ok,
Now it's more clear for me. Thanks to all, especially on @Kevin deep
explanation and (tx3) example. Question is closed.


I intuited that it might be as all of You explained, but was not sure, I
was confused by "predicate lock" - I thought it's related to `SELECT+WHERE`
and not to data (a kind of "subject lock").
Now I understood that key-words are "serial execution in any order will
lead to conflict" - here I also was a little bit confused by chronological
order of commit.

P.S. One more "offtop" question - What kind of frameworks do automatically
retries for failed transactions? Are Hibernate/Spring in that list?


Best Regards,
AlexL

On Thu, Mar 10, 2016 at 12:41 AM, Kevin Grittner <[email protected]> wrote:

> On Wed, Mar 9, 2016 at 1:39 PM, Alexandru Lazarev
> <[email protected]> wrote:
>
> Jeff's answer is entirely correct; I'm just going to go into more
> detail -- just in case you're interested enough to work through it.
>
> > `CREATE TABLE public.foo (id serial PRIMARY KEY, mynum integer);`
> >
> > and following data
> >
> >      id | mynum
> >     ----+-------
> >       1 |    10
> >       2 |    10
> >       3 |    10
> >       4 |    10
> >     (4 rows)
> >
> > I run 2 serialize transactions in parallel (2 `psql` consoles):
> >
> >     -- both transactions
> >     mydb=# begin;
> >     BEGIN
> >     mydb=# set transaction isolation level serializable;
> >     SET
> >
> >     -- tx1
> >     mydb=# select * from foo where mynum < 100;
> >     id | mynum
> >     ----+-------
> >       1 |    10
> >       2 |    10
> >       3 |    10
> >       4 |    10
> >     (4 rows)
> >     --tx1: Shouldn't freeze data visible for tx1 select?
>
> Yes, tx1 does have a snapshot which will guarantee that it sees a
> repeatable set view of the data for this predicate.
>
> >         --tx2
> >         mydb=# insert into foo (mynum) values (10);
> >         INSERT 0 1
> >         -- tx2 will insert next row with id 5 in foo table
> >         -- Shouldn't insert of tx2 broke data snapshot visible for tx1?
>
> The snapshot tx1 has guarantees that overlapping changes won't
> change it's view of things, and there is no reason for anything to
> be blocked or canceled here.  The insert creates what is called a
> read-write dependency (or rw-conflict for short) that establishes
> that in any serial ordering of a set of transactions which includes
> tx1 and tx2, tx1 must precede tx2 in the apparent order of
> execution.
>
> >     --tx1
> >     mydb=# update foo set mynum = 20 where id < 100;
> >     UPDATE 4
> >     -- Shouldn't here appear serialization fail or at least on tx1
> commit?
>
> No, there is no cycle in the apparent order of execution.  The
> snapshot for tx1 still limits it to the same set of rows, and there
> is nothing visible that is inconsistent with tx1 running before
> tx2.
>
> >         --tx2
> >         mydb=# commit;
> >         COMMIT
> >
> >     --tx1
> >     mydb=# commit;
> >     COMMIT
> >     -- tx1 Commit is OK - no any error
>
> According to the SQL standard, and in the PostgreSQL implementation
> of SERIALIZABLE transactions, commit order does not, by itself,
> establish apparent  order of execution.
>
> >     -- implicit tx
> >     mydb=# select * from foo;
> >     id | mynum
> >     ----+-------
> >       1 |    20
> >       2 |    20
> >       3 |    20
> >       4 |    20
> >       5 |    10
> >     (5 rows)
>
> As Jeff said, this is consistent with the implicit transaction
> running last, so tx1 -> tx2 -> implicit_tx.
>
> Now, you are pretty close to a situation which does need to trigger
> a serialization failure -- just switch the commit of tx1 and the
> implicit transaction.  If tx2 has committed but tx1 has not yet
> committed:
>
> mydb=# select * from foo;
>  id | mynum
> ----+-------
>   1 |    10
>   2 |    10
>   3 |    10
>   4 |    10
>   5 |    10
> (5 rows)
>
> *Now* we have a problem -- this only makes sense if the implicit tx
> was run after tx2 and before tx1.  So apparent order of execution
> is tx1 -> tx2 -> implicit_tx -> tx1.  There is a cycle in the
> apparent order of execution, which causes anomalies which can ruin
> data integrity.  Now, if the implicit transaction is not
> serializable, it is allowed to see such things, but if you make it
> serializable (and let's call it tx3 now) it sees a state where only
> tx2 ran; tx1 could not have run:
>
> -- tx3
> mydb=# begin;
> BEGIN
> mydb=# set transaction isolation level serializable;
> SET
> mydb=# select * from foo;
>  id | mynum
> ----+-------
>   1 |    10
>   2 |    10
>   3 |    10
>   4 |    10
>   5 |    10
> (5 rows)
>
> mydb=# commit;
> COMMIT
>
> So now, tx1 is not allowed to commit, or for that matter do
> anything else -- it has been "doomed" by tx3:
>
> mydb=# select * from foo;
> ERROR:  could not serialize access due to read/write dependencies
> among transactions
> DETAIL:  Reason code: Canceled on identification as a pivot, during
> conflict out checking.
> HINT:  The transaction might succeed if retried.
>
> Hopefully you are using some framework to automatically detect this
> SQLSTATE and retry the transaction from the start.  So on retry,
> tx1 does this:
>
> -- tx1 (retry)
> mydb=# begin;
> BEGIN
> mydb=# set transaction isolation level serializable;
> SET
> mydb=# select * from foo where mynum < 100;
>  id | mynum
> ----+-------
>   1 |    10
>   2 |    10
>   3 |    10
>   4 |    10
>   5 |    10
> (5 rows)
>
> mydb=# update foo set mynum = 20 where id < 100;
> UPDATE 5
> mydb=# select * from foo;
>  id | mynum
> ----+-------
>   1 |    20
>   2 |    20
>   3 |    20
>   4 |    20
>   5 |    20
> (5 rows)
>
> mydb=# commit;
> COMMIT
>
> Now the result of all successfully committed serializiable
> transactions is consistent with the order tx2 -> tx3 -> tx1.  All
> is good.
>
> Kevin Grittner
>

Re: [GENERAL] Fwd: How does PostgreSQL serializable transaction works? (SELECT/UPDATE vs INSERT conflict)

Reply via email to