Re: [HACKERS] Re: How to reproduce serialization failure for a read only transaction.

2014-01-07 Thread Dan Ports
On Mon, Jan 06, 2014 at 05:14:12PM -0800, AK wrote:
 Also I cannot reproduce a scenario when applications must not depend on
 results read during a transaction that later aborted;. In this example the
 SELECT itself has failed.
 Can you show an example where a SELECT completes, but the COMMIT blows up?

Actually, no, not for a read-only transaction. It happens that the
final serialization failure check executed on COMMIT only affects
read/write transactions, not read-only ones. That's a pretty specific
implementation detail, though, so I wouldn't necessarily rely on it...

Here's an example of why applications must not depend on results read
during a transaction that later aborted:

   W2: BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE
   W2: UPDATE t SET count=1 WHERE id=1;
   W1: BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE
   W1: SELECT * FROM t WHERE id=1;
   W2: COMMIT;
   R : BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY
   R : SELECT * FROM t;
   R : COMMIT;
 ! W1: UPDATE t SET count=1 WHERE id=2;
   W1: COMMIT;

If you try this, it'll cause a serialization failure on the line marked
with a '!'. W1 saw (1,0) in the table, so W1 appears to have executed
before W2. But R saw both (1,1) and (2,0) in the table, and that has to
be a consistent snapshot of the database state, meaning W2 appears to
have executed before W1. That's an inconsistency, so something has to
be rolled back. This particular anomaly requires all three of the
transactions, and so it can't be detected until W1 does its UPDATE.
Postgres detects the conflict at that point and rolls back W1.

So what does this have to do with relying on the results of read-only
transactions that abort? Well, what if you had instead had R ROLLBACK
instead of COMMIT -- maybe because you expected ROLLBACK and COMMIT to
be equivalent for transactions that don't modify the database, or maybe
because something else caused the transaction to abort? When W1 does
its update, it will be checked for serialization failures, but aborted
transactions are (intentionally) not included in those checks. W1 is
therefore allowed to commit; the apparent serial order of execution is
W1 followed by W2, and the results of the aborted transaction R aren't
consistent with that.

Dan

-- 
Dan R. K. PortsUW CSEhttp://drkp.net/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: How to reproduce serialization failure for a read only transaction.

2014-01-07 Thread Kevin Grittner
Dan Ports d...@csail.mit.edu wrote:
 On Mon, Jan 06, 2014 at 05:14:12PM -0800, AK wrote:

 If you try this, it'll cause a serialization failure on the line
 marked with a '!'. W1 saw (1,0) in the table, so W1 appears to
 have executed before W2. But R saw both (1,1) and (2,0) in the
 table, and that has to be a consistent snapshot of the database
 state, meaning W2 appears to have executed before W1. That's an
 inconsistency, so something has to be rolled back. This
 particular anomaly requires all three of the transactions, and so
 it can't be detected until W1 does its UPDATE. Postgres detects
 the conflict at that point and rolls back W1.

Yeah, neither of the provided examples rolled back the read only
transaction itself; the read only transaction caused a situation
where something needed to be rolled back, but since we try to roll
back a transaction which has a good chance of succeeding on retry,
the read only transaction is not usually a good candidate.  I
created a new example on the Wiki page where the read only
transaction itself must be rolled back because both of the other
transactions involved have already committed:

https://wiki.postgresql.org/wiki/SSI#Rollover

Regarding other questions on the thread:

I have no objections to moving the Wiki examples into the docs, but
it seemed like a lot to include, and I'm not sure where it belongs.
Ideas?

Regarding the different results AK got, I set
default_transaction_isolation = 'serializable' on my connections
before running these for two reasons.
(1)  It keeps the examples more concise.
(2)  I think most people using serializable transactions in
PostgreSQL set the default and don't set the transaction isolation
level on each transaction, since (unlike strategies which rely on
blocking, like S2PL) all transactions must be participating in the
stricter isolation level for it to be reliable.  In fact, given the
performance benefits of declaring transactions READ ONLY when
possible, I have seen shops that make *that* a default, too, and
override it for transactions which need to write.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: How to reproduce serialization failure for a read only transaction.

2014-01-07 Thread Kevin Grittner
AK alk...@gmail.com wrote:

 Session 1. Setting up:

 CREATE TABLE cars(
   license_plate VARCHAR NOT NULL,
   reserved_by VARCHAR NULL
 );
 INSERT INTO cars(license_plate)
 VALUES ('SUPRUSR'),('MIDLYPH');

 Session 2: W1

 BEGIN ISOLATION LEVEL SERIALIZABLE;

 UPDATE cars SET reserved_by = 'Julia'
   WHERE license_plate = 'SUPRUSR'
   AND reserved_by IS NULL;

 SELECT * FROM Cars
 WHERE license_plate IN('SUPRUSR','MIDLYPH');

 Session 3: W2

 BEGIN ISOLATION LEVEL SERIALIZABLE;

 UPDATE cars SET reserved_by = 'Ryan'
   WHERE license_plate = 'MIDLYPH'
   AND reserved_by IS NULL;

 COMMIT;

 Session 4: R

 BEGIN ISOLATION LEVEL SERIALIZABLE READ ONLY;

 SELECT * FROM Cars
 WHERE license_plate IN('SUPRUSR','MIDLYPH');

 Session 2: W1

 COMMIT;

 ERROR:  could not serialize access due to read/write dependencies
 among transactions

 What am I doing wrong?

Even without the read only transaction the W1 and W2 transactions
are a classic case of write skew.  It looks like it might actually
be benign, since neither transaction is updating license_plate, but
serializable logic works at the row level, not the column level.
After both transactions update the table there is write skew which
must be resolved by cancelling one of the transactions.  The first
to commit wins and the other one will be cancelled when it
attempts to run its next statement, which may or may not be a
COMMIT.

If, for purposes of demonstration, you add a unique index on
license_plate and set enable_seqscan = off, you eliminate the
simple write skew and get into more complex ways of breaking
things.  With that tweak you can run all of those transactions if
W1 skips the SELECT.  You can let W1 do the SELECT as long as you
don't run R.  The problem is that the SELECT in W1 sees the work of
W1 but not W2 and the SELECT in R sees the work of W2 but not W1. 
We can't allow that.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: How to reproduce serialization failure for a read only transaction.

2014-01-07 Thread AK
Regarding this:  So what does this have to do with relying on the results
of read-only 
transactions that abort? Well, what if you had instead had R ROLLBACK 
instead of COMMIT -- maybe because you expected ROLLBACK and COMMIT to 
be equivalent for transactions that don't modify the database, or maybe 
because something else caused the transaction to abort? When W1 does 
its update, it will be checked for serialization failures, but aborted 
transactions are (intentionally) not included in those checks. W1 is 
therefore allowed to commit; the apparent serial order of execution is 
W1 followed by W2, and the results of the aborted transaction R aren't 
consistent with that. 

So if I am reading the data and then commit, I should be always fine,
correct?



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/How-to-reproduce-serialization-failure-for-a-read-only-transaction-tp5785569p5785757.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: How to reproduce serialization failure for a read only transaction.

2014-01-07 Thread Kevin Grittner
AK alk...@gmail.com wrote:

 So if I am reading the data and then commit, I should be always
 fine, correct?

If a serializable transaction successfully commits, that means that
all data read within that transaction can be trusted.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: How to reproduce serialization failure for a read only transaction.

2014-01-06 Thread AK
Hi Florian,

can you explain why do you state that default isolation level is assumed to
be
serializable, of course, when you explicitly specify isolation level for
every session - why should he default matter at all?

When I am trying to reproduce the scenario which you have posted, I am
observing different results. Here is my full scenario:

Session 1. Setting up:

CREATE TABLE cars(
  license_plate VARCHAR NOT NULL,
  reserved_by VARCHAR NULL
);
INSERT INTO cars(license_plate)
VALUES ('SUPRUSR'),('MIDLYPH');

Session 2: W1

BEGIN ISOLATION LEVEL SERIALIZABLE;

UPDATE cars SET reserved_by = 'Julia'
  WHERE license_plate = 'SUPRUSR'
  AND reserved_by IS NULL;

SELECT * FROM Cars
WHERE license_plate IN('SUPRUSR','MIDLYPH');

Session 3: W2

BEGIN ISOLATION LEVEL SERIALIZABLE;

UPDATE cars SET reserved_by = 'Ryan'
  WHERE license_plate = 'MIDLYPH'
  AND reserved_by IS NULL;

COMMIT;

Session 4: R

BEGIN ISOLATION LEVEL SERIALIZABLE READ ONLY;

SELECT * FROM Cars 
WHERE license_plate IN('SUPRUSR','MIDLYPH');

Session 2: W1

COMMIT;

ERROR:  could not serialize access due to read/write dependencies among
transactions

What am I doing wrong?

Thank you for your help!



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/How-to-reproduce-serialization-failure-for-a-read-only-transaction-tp5785569p5785597.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: How to reproduce serialization failure for a read only transaction.

2014-01-06 Thread Florian Pflug
On Jan6, 2014, at 23:28 , AK alk...@gmail.com wrote:
 can you explain why do you state that default isolation level is assumed to
 be
 serializable, of course, when you explicitly specify isolation level for
 every session - why should he default matter at all?

Sorry, that was a leftover - I initially wrote just START TRANSACTION with
specifying an isolation level.

 
 When I am trying to reproduce the scenario which you have posted, I am
 observing different results.

Hm, yeah, I missed two things.

First, dependency tracking can produce false positives, i.e. assume that
dependencies exist between transactions which are actually independent.
In my example, postgres fails to realize that W2 can be executed after W1,
unless it uses an index scan for the UPDATE in W2. You can avoid that either
by creating an index on the id column, and forcing W2 to use that by setting
enable_seqscan to off, or by creating two tables t1 and t2 instead of one
table t with two records (You'll have to modify the SELECT to scan both tables
too).

Second, since R executes it's SELECT before W1 commits, postgres is already
aware that R poses a problem when W1 commits, and it chooses to cancel W1
instead of R. To avoid that, R needs to do the SELECT after W1 committed.
Yet still force R to acquire a snapshot *before* that commit (without that,
there's no serialization failure since R than simply executes after W1 and
W2), you'll need to do e.g. SELECT 1 after R's START TRANSACTION command.

I think the following should work (or, rather, fail)

CREATE TABLE t (id INT PRIMARY KEY, count INT);
INSERT INTO t (id, count) SELECT i, 0 FROM generate_series(1,2);

W1: START TRANSACTION ISOLATION LEVEL SERIALIZABLE;
W1: UPDATE t SET count=count+1 WHERE id=1;
W1: SELECT count FROM t WHERE id=2;
W2: SET enable_seqscan=off;
W2: START TRANSACTION ISOLATION LEVEL SERIALIZABLE;
W2: UPDATE t SET count=count+1 WHERE id=2;
W2: COMMIT;
R : START TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY;
R : SELECT 1;
W1: COMMIT;
R : SELECT data FROM t WHERE id IN (1,2);  -- Should fail

best regards,
Florian Pflug



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: How to reproduce serialization failure for a read only transaction.

2014-01-06 Thread Jim Nasby

On 1/6/14, 5:27 PM, Florian Pflug wrote: On Jan6, 2014, at 23:28 , AK 
alk...@gmail.com wrote:
 First, dependency tracking can produce false positives, i.e. assume that
 dependencies exist between transactions which are actually independent.
 In my example, postgres fails to realize that W2 can be executed after W1,
 unless it uses an index scan for the UPDATE in W2. You can avoid that either
 by creating an index on the id column, and forcing W2 to use that by setting
 enable_seqscan to off, or by creating two tables t1 and t2 instead of one
 table t with two records (You'll have to modify the SELECT to scan both tables
 too).

 Second, since R executes it's SELECT before W1 commits, postgres is already
 aware that R poses a problem when W1 commits, and it chooses to cancel W1
 instead of R. To avoid that, R needs to do the SELECT after W1 committed.
 Yet still force R to acquire a snapshot *before* that commit (without that,
 there's no serialization failure since R than simply executes after W1 and
 W2), you'll need to do e.g. SELECT 1 after R's START TRANSACTION command.

 I think the following should work (or, rather, fail)

This email and the previous one are an awesome bit of information, can we add 
it to the docs somehow? Even if it's just dumping the emails into a wiki page 
and referencing it?
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: How to reproduce serialization failure for a read only transaction.

2014-01-06 Thread AK
This worked for me - thank you so much! The SELECT did fail.

Also I cannot reproduce a scenario when applications must not depend on
results read during a transaction that later aborted;. In this example the
SELECT itself has failed.
Can you show an example where a SELECT completes, but the COMMIT blows up?



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/How-to-reproduce-serialization-failure-for-a-read-only-transaction-tp5785569p5785618.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers