Re: [HACKERS] Re: How to reproduce serialization failure for a read only transaction.
On Mon, Jan 06, 2014 at 05:14:12PM -0800, AK wrote: Also I cannot reproduce a scenario when applications must not depend on results read during a transaction that later aborted;. In this example the SELECT itself has failed. Can you show an example where a SELECT completes, but the COMMIT blows up? Actually, no, not for a read-only transaction. It happens that the final serialization failure check executed on COMMIT only affects read/write transactions, not read-only ones. That's a pretty specific implementation detail, though, so I wouldn't necessarily rely on it... Here's an example of why applications must not depend on results read during a transaction that later aborted: W2: BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE W2: UPDATE t SET count=1 WHERE id=1; W1: BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE W1: SELECT * FROM t WHERE id=1; W2: COMMIT; R : BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY R : SELECT * FROM t; R : COMMIT; ! W1: UPDATE t SET count=1 WHERE id=2; W1: COMMIT; If you try this, it'll cause a serialization failure on the line marked with a '!'. W1 saw (1,0) in the table, so W1 appears to have executed before W2. But R saw both (1,1) and (2,0) in the table, and that has to be a consistent snapshot of the database state, meaning W2 appears to have executed before W1. That's an inconsistency, so something has to be rolled back. This particular anomaly requires all three of the transactions, and so it can't be detected until W1 does its UPDATE. Postgres detects the conflict at that point and rolls back W1. So what does this have to do with relying on the results of read-only transactions that abort? Well, what if you had instead had R ROLLBACK instead of COMMIT -- maybe because you expected ROLLBACK and COMMIT to be equivalent for transactions that don't modify the database, or maybe because something else caused the transaction to abort? When W1 does its update, it will be checked for serialization failures, but aborted transactions are (intentionally) not included in those checks. W1 is therefore allowed to commit; the apparent serial order of execution is W1 followed by W2, and the results of the aborted transaction R aren't consistent with that. Dan -- Dan R. K. PortsUW CSEhttp://drkp.net/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: How to reproduce serialization failure for a read only transaction.
Dan Ports d...@csail.mit.edu wrote: On Mon, Jan 06, 2014 at 05:14:12PM -0800, AK wrote: If you try this, it'll cause a serialization failure on the line marked with a '!'. W1 saw (1,0) in the table, so W1 appears to have executed before W2. But R saw both (1,1) and (2,0) in the table, and that has to be a consistent snapshot of the database state, meaning W2 appears to have executed before W1. That's an inconsistency, so something has to be rolled back. This particular anomaly requires all three of the transactions, and so it can't be detected until W1 does its UPDATE. Postgres detects the conflict at that point and rolls back W1. Yeah, neither of the provided examples rolled back the read only transaction itself; the read only transaction caused a situation where something needed to be rolled back, but since we try to roll back a transaction which has a good chance of succeeding on retry, the read only transaction is not usually a good candidate. I created a new example on the Wiki page where the read only transaction itself must be rolled back because both of the other transactions involved have already committed: https://wiki.postgresql.org/wiki/SSI#Rollover Regarding other questions on the thread: I have no objections to moving the Wiki examples into the docs, but it seemed like a lot to include, and I'm not sure where it belongs. Ideas? Regarding the different results AK got, I set default_transaction_isolation = 'serializable' on my connections before running these for two reasons. (1) It keeps the examples more concise. (2) I think most people using serializable transactions in PostgreSQL set the default and don't set the transaction isolation level on each transaction, since (unlike strategies which rely on blocking, like S2PL) all transactions must be participating in the stricter isolation level for it to be reliable. In fact, given the performance benefits of declaring transactions READ ONLY when possible, I have seen shops that make *that* a default, too, and override it for transactions which need to write. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: How to reproduce serialization failure for a read only transaction.
AK alk...@gmail.com wrote: Session 1. Setting up: CREATE TABLE cars( license_plate VARCHAR NOT NULL, reserved_by VARCHAR NULL ); INSERT INTO cars(license_plate) VALUES ('SUPRUSR'),('MIDLYPH'); Session 2: W1 BEGIN ISOLATION LEVEL SERIALIZABLE; UPDATE cars SET reserved_by = 'Julia' WHERE license_plate = 'SUPRUSR' AND reserved_by IS NULL; SELECT * FROM Cars WHERE license_plate IN('SUPRUSR','MIDLYPH'); Session 3: W2 BEGIN ISOLATION LEVEL SERIALIZABLE; UPDATE cars SET reserved_by = 'Ryan' WHERE license_plate = 'MIDLYPH' AND reserved_by IS NULL; COMMIT; Session 4: R BEGIN ISOLATION LEVEL SERIALIZABLE READ ONLY; SELECT * FROM Cars WHERE license_plate IN('SUPRUSR','MIDLYPH'); Session 2: W1 COMMIT; ERROR: could not serialize access due to read/write dependencies among transactions What am I doing wrong? Even without the read only transaction the W1 and W2 transactions are a classic case of write skew. It looks like it might actually be benign, since neither transaction is updating license_plate, but serializable logic works at the row level, not the column level. After both transactions update the table there is write skew which must be resolved by cancelling one of the transactions. The first to commit wins and the other one will be cancelled when it attempts to run its next statement, which may or may not be a COMMIT. If, for purposes of demonstration, you add a unique index on license_plate and set enable_seqscan = off, you eliminate the simple write skew and get into more complex ways of breaking things. With that tweak you can run all of those transactions if W1 skips the SELECT. You can let W1 do the SELECT as long as you don't run R. The problem is that the SELECT in W1 sees the work of W1 but not W2 and the SELECT in R sees the work of W2 but not W1. We can't allow that. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: How to reproduce serialization failure for a read only transaction.
Regarding this: So what does this have to do with relying on the results of read-only transactions that abort? Well, what if you had instead had R ROLLBACK instead of COMMIT -- maybe because you expected ROLLBACK and COMMIT to be equivalent for transactions that don't modify the database, or maybe because something else caused the transaction to abort? When W1 does its update, it will be checked for serialization failures, but aborted transactions are (intentionally) not included in those checks. W1 is therefore allowed to commit; the apparent serial order of execution is W1 followed by W2, and the results of the aborted transaction R aren't consistent with that. So if I am reading the data and then commit, I should be always fine, correct? -- View this message in context: http://postgresql.1045698.n5.nabble.com/How-to-reproduce-serialization-failure-for-a-read-only-transaction-tp5785569p5785757.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: How to reproduce serialization failure for a read only transaction.
AK alk...@gmail.com wrote: So if I am reading the data and then commit, I should be always fine, correct? If a serializable transaction successfully commits, that means that all data read within that transaction can be trusted. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: How to reproduce serialization failure for a read only transaction.
Hi Florian, can you explain why do you state that default isolation level is assumed to be serializable, of course, when you explicitly specify isolation level for every session - why should he default matter at all? When I am trying to reproduce the scenario which you have posted, I am observing different results. Here is my full scenario: Session 1. Setting up: CREATE TABLE cars( license_plate VARCHAR NOT NULL, reserved_by VARCHAR NULL ); INSERT INTO cars(license_plate) VALUES ('SUPRUSR'),('MIDLYPH'); Session 2: W1 BEGIN ISOLATION LEVEL SERIALIZABLE; UPDATE cars SET reserved_by = 'Julia' WHERE license_plate = 'SUPRUSR' AND reserved_by IS NULL; SELECT * FROM Cars WHERE license_plate IN('SUPRUSR','MIDLYPH'); Session 3: W2 BEGIN ISOLATION LEVEL SERIALIZABLE; UPDATE cars SET reserved_by = 'Ryan' WHERE license_plate = 'MIDLYPH' AND reserved_by IS NULL; COMMIT; Session 4: R BEGIN ISOLATION LEVEL SERIALIZABLE READ ONLY; SELECT * FROM Cars WHERE license_plate IN('SUPRUSR','MIDLYPH'); Session 2: W1 COMMIT; ERROR: could not serialize access due to read/write dependencies among transactions What am I doing wrong? Thank you for your help! -- View this message in context: http://postgresql.1045698.n5.nabble.com/How-to-reproduce-serialization-failure-for-a-read-only-transaction-tp5785569p5785597.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: How to reproduce serialization failure for a read only transaction.
On Jan6, 2014, at 23:28 , AK alk...@gmail.com wrote: can you explain why do you state that default isolation level is assumed to be serializable, of course, when you explicitly specify isolation level for every session - why should he default matter at all? Sorry, that was a leftover - I initially wrote just START TRANSACTION with specifying an isolation level. When I am trying to reproduce the scenario which you have posted, I am observing different results. Hm, yeah, I missed two things. First, dependency tracking can produce false positives, i.e. assume that dependencies exist between transactions which are actually independent. In my example, postgres fails to realize that W2 can be executed after W1, unless it uses an index scan for the UPDATE in W2. You can avoid that either by creating an index on the id column, and forcing W2 to use that by setting enable_seqscan to off, or by creating two tables t1 and t2 instead of one table t with two records (You'll have to modify the SELECT to scan both tables too). Second, since R executes it's SELECT before W1 commits, postgres is already aware that R poses a problem when W1 commits, and it chooses to cancel W1 instead of R. To avoid that, R needs to do the SELECT after W1 committed. Yet still force R to acquire a snapshot *before* that commit (without that, there's no serialization failure since R than simply executes after W1 and W2), you'll need to do e.g. SELECT 1 after R's START TRANSACTION command. I think the following should work (or, rather, fail) CREATE TABLE t (id INT PRIMARY KEY, count INT); INSERT INTO t (id, count) SELECT i, 0 FROM generate_series(1,2); W1: START TRANSACTION ISOLATION LEVEL SERIALIZABLE; W1: UPDATE t SET count=count+1 WHERE id=1; W1: SELECT count FROM t WHERE id=2; W2: SET enable_seqscan=off; W2: START TRANSACTION ISOLATION LEVEL SERIALIZABLE; W2: UPDATE t SET count=count+1 WHERE id=2; W2: COMMIT; R : START TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY; R : SELECT 1; W1: COMMIT; R : SELECT data FROM t WHERE id IN (1,2); -- Should fail best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: How to reproduce serialization failure for a read only transaction.
On 1/6/14, 5:27 PM, Florian Pflug wrote: On Jan6, 2014, at 23:28 , AK alk...@gmail.com wrote: First, dependency tracking can produce false positives, i.e. assume that dependencies exist between transactions which are actually independent. In my example, postgres fails to realize that W2 can be executed after W1, unless it uses an index scan for the UPDATE in W2. You can avoid that either by creating an index on the id column, and forcing W2 to use that by setting enable_seqscan to off, or by creating two tables t1 and t2 instead of one table t with two records (You'll have to modify the SELECT to scan both tables too). Second, since R executes it's SELECT before W1 commits, postgres is already aware that R poses a problem when W1 commits, and it chooses to cancel W1 instead of R. To avoid that, R needs to do the SELECT after W1 committed. Yet still force R to acquire a snapshot *before* that commit (without that, there's no serialization failure since R than simply executes after W1 and W2), you'll need to do e.g. SELECT 1 after R's START TRANSACTION command. I think the following should work (or, rather, fail) This email and the previous one are an awesome bit of information, can we add it to the docs somehow? Even if it's just dumping the emails into a wiki page and referencing it? -- Jim C. Nasby, Data Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: How to reproduce serialization failure for a read only transaction.
This worked for me - thank you so much! The SELECT did fail. Also I cannot reproduce a scenario when applications must not depend on results read during a transaction that later aborted;. In this example the SELECT itself has failed. Can you show an example where a SELECT completes, but the COMMIT blows up? -- View this message in context: http://postgresql.1045698.n5.nabble.com/How-to-reproduce-serialization-failure-for-a-read-only-transaction-tp5785569p5785618.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers