Re: [HACKERS] Promise index tuples for UPSERT

2014-10-08 Thread Anssi Kääriäinen
On Tue, 2014-10-07 at 13:33 +0100, Simon Riggs wrote:
 Is there a way of detecting that we are updating a unique constraint
 column and then applying the HW locking only in that case? Or can we
 only apply locking when we have multiple unique constraints on a
 table?

What is the use case of doing an UPSERT into a table with multiple
unique constraints?

Consider table user with unique columns name and email and a non-unique
column age. If it has data

Jack | jack@example.com | 33
Tom  | tom@example.com  | 35

And the user does UPSERT VALUES ('Jack', 'tom@example.com', 34). The
proposed specification will pick a random unique index (either the name
or the email index) and do an update of that row.

First, this will cause a unique violation; second, doing the UPSERT on a
random index seems confusing.
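
For concreteness, here is that scenario as SQL, written against the
proposed (not final) syntax, with conflicting() as the proposal's way of
referring to the values of the row we tried to insert; a sketch only:

    CREATE TABLE "user" (name text UNIQUE, email text UNIQUE, age integer);
    INSERT INTO "user" VALUES
        ('Jack', 'jack@example.com', 33),
        ('Tom',  'tom@example.com',  35);

    -- Conflicts with Jack on name AND with Tom on email. With no index
    -- named, one of the two unique indexes is picked arbitrarily, and
    -- whichever row gets updated, the UPDATE then collides with the
    -- other row's unique value.
    INSERT INTO "user" VALUES ('Jack', 'tom@example.com', 34)
    ON CONFLICT UPDATE
    SET name = conflicting(name), email = conflicting(email),
        age = conflicting(age);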

The MySQL documentation says that you should try to avoid using an ON
DUPLICATE KEY UPDATE clause on tables with multiple unique indexes[1].
The proposed feature's documentation has the same suggestion[2]. Still,
the feature defaults to this behavior. Why is the default something the
documentation says you shouldn't do?

Going a bit further, I am wondering what the use case is for doing an
UPSERT against multiple unique indexes. If UPSERT against multiple
unique indexes could be dropped, that might allow for faster or cleaner
index locking techniques.

 - Anssi

1: http://dev.mysql.com/doc/refman/5.6/en/insert-on-duplicate.html
2: http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-insert.html





Re: [HACKERS] Promise index tuples for UPSERT

2014-10-08 Thread Peter Geoghegan
On Wed, Oct 8, 2014 at 12:41 AM, Anssi Kääriäinen
anssi.kaariai...@thl.fi wrote:
 The MySQL documentation says that you should try to avoid using an ON
 DUPLICATE KEY UPDATE clause on tables with multiple unique indexes[1].
 The proposed feature's documentation has the same suggestion[2]. Still,
 the feature defaults to this behavior. Why is the default something the
 documentation says you shouldn't do?

MySQL started saying that when they realized it broke their
statement-based replication. They have their own weird reasons for
disliking it. Now, that's not to minimize the risks.

The reasoning behind making the unique index specification optional is:

We cannot easily cover corner cases with another syntax - unique
indexes must be named directly to cover every case, and make the
user's intent absolutely clear. That's not obviously the case, but
consider partial unique indexes, for example. Or consider uniquely
constrained columns, with an almost equivalent uniquely constrained
expression on those same columns. On the one hand I am not comfortable
failing to support those, but on the other hand it could get very
messy to do it another way.

As we all know, naming a unique index in DML is ugly, and has poor
support in ORMs. It seems likely that we're better off making it
optional - it hasn't been much of a problem with the existing subxact
looping pattern. A lot of the time, there will only be a single unique
index anyway, or there will be a trivial serial PK unique index plus the
one unique index whose would-be conflicts we always want to treat as
triggering the alternative UPDATE/IGNORE path.
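
For illustration, the schema shape described there might look like this
(hypothetical table, a sketch):

    CREATE TABLE account (
        id    serial PRIMARY KEY,  -- trivial serial PK; conflicts here
                                   -- are not what we care about
        email text UNIQUE,         -- the index whose would-be conflicts
                                   -- should trigger UPDATE/IGNORE
        note  text
    );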

 Going a bit further, I am wondering what the use case is for doing an
 UPSERT against multiple unique indexes. If UPSERT against multiple
 unique indexes could be dropped, that might allow for faster or cleaner
 index locking techniques.

I see no reason to think that. I don't think it buys us much at all.

-- 
Peter Geoghegan




Re: [HACKERS] Promise index tuples for UPSERT

2014-10-08 Thread Heikki Linnakangas

On 10/08/2014 11:10 AM, Peter Geoghegan wrote:

The reasoning behind making the unique index specification optional is:

We cannot easily cover corner cases with another syntax - unique
indexes must be named directly to cover every case, and make the
user's intent absolutely clear. That's not obviously the case, but
consider partial unique indexes, for example. Or consider uniquely
constrained columns, with an almost equivalent uniquely constrained
expression on those same columns. On the one hand I am not comfortable
failing to support those, but on the other hand it could get very
messy to do it another way.

As we all know, naming a unique index in DML is ugly, and has poor
support in ORMs.


I vehemently object to naming indexes in the UPSERT statement. That 
mixes logical and physical database design, which is a bad idea. This is 
not ISAM.


Instead of naming the index, you should name the columns, and the system 
can look up the index or indexes that match those columns.
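
A sketch of what that could look like, assuming a table author(id int
primary key, email text unique, age int) and the conflicting() notation
used elsewhere in this thread (illustrative syntax only):

    -- Name the arbitrating column(s); the system resolves them to a
    -- matching unique index, or errors if none exists:
    INSERT INTO author VALUES (1, 'tom@example.com', 35)
    ON CONFLICT (id) UPDATE
    SET email = conflicting(email), age = conflicting(age);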


(Remind me again, where did this need to name an index come from in the 
first place?)


- Heikki





Re: [HACKERS] Promise index tuples for UPSERT

2014-10-08 Thread Peter Geoghegan
On Wed, Oct 8, 2014 at 1:25 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 Instead of naming the index, you should name the columns, and the system can
 look up the index or indexes that match those columns.

It's not totally clear that we need *any* WITHIN clause, BTW. I'm not
dead set on it. It was something I mainly added at Kevin's request. I
do see the risks, though.

 (Remind me again, where did this need to name an index come from in the
 first place?)

I agree that naming indexes is ugly, and best avoided. It's tricky,
though. The first thing you might try is to look up the index during
parse analysis and/or planning. That could work in simple cases (which
are probably the vast majority), but I'm worried about stuff like
before row triggers that change the values being inserted out from
under us, in a way that interferes with partial unique indexes. Maybe
you could choose the wrong one if you didn't leave it until the last
minute. But it's not very convenient to leave it until the last
minute.

If you're willing to live with the command conservatively failing when
there isn't a clean specification (although not in a way that can make
the command fail when the user innocently adds a unique index later),
then I think I can do it. Otherwise, it could be surprisingly hard to
cover all the cases non-surprisingly. I freely admit that putting a
unique index in a DML statement is weird, but it is sort of the most
direct way of expressing what we want. Oracle actually has an
INSERT-IGNORE-like hint that names a unique index (yes, really - see
the UPSERT wiki page). That's really bizarre, but the point is that
they may have felt there was no reasonable alternative.

-- 
Peter Geoghegan




Re: [HACKERS] Promise index tuples for UPSERT

2014-10-08 Thread Anssi Kääriäinen
On Wed, 2014-10-08 at 02:22 -0700, Peter Geoghegan wrote:
 On Wed, Oct 8, 2014 at 1:25 AM, Heikki Linnakangas
 hlinnakan...@vmware.com wrote:
  Instead of naming the index, you should name the columns, and the system can
  look up the index or indexes that match those columns.
 
 It's not totally clear that we need *any* WITHIN clause, BTW. I'm not
 dead set on it. It was something I mainly added at Kevin's request. I
 do see the risks, though.

To be usable in the Django ORM's .save() method there must be an option
to use the primary key index, and only the primary key index, for
conflict resolution.

Assume an author table with id SERIAL PRIMARY KEY, email TEXT UNIQUE,
age INTEGER. Then, when saving an object, Django must update the row with
the matching primary key value, or otherwise do an insert. Updating a row
by matching the email column isn't allowed. So, Django can't use this query:

   INSERT INTO author VALUES (1, 'tom@example.com', 35)
   ON CONFLICT UPDATE
   SET email = conflicting(email), age = conflicting(age);

If the table contains the row (id=2, email='tom@example.com', age=34),
the query would update Tom's age to 35 when it should have resulted in a
unique constraint violation.

Other ORMs have similar save/persist implementations; that is, objects
are persisted on primary key identity alone.
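
A hedged sketch of how the WITHIN clause under discussion could meet
that requirement (author_pkey is an assumed default index name; the
syntax was not settled):

    INSERT INTO author VALUES (1, 'tom@example.com', 35)
    ON CONFLICT WITHIN author_pkey UPDATE
    SET email = conflicting(email), age = conflicting(age);
    -- With the row (id=2, email='tom@example.com', age=34) present, only
    -- the primary key arbitrates: the statement takes the INSERT path
    -- and the email unique index raises the violation Django expects,
    -- instead of silently updating Tom's row.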

 - Anssi





Re: [HACKERS] Promise index tuples for UPSERT

2014-10-08 Thread Anssi Kääriäinen
On Wed, 2014-10-08 at 01:10 -0700, Peter Geoghegan wrote:
 On Wed, Oct 8, 2014 at 12:41 AM, Anssi Kääriäinen
 anssi.kaariai...@thl.fi wrote:
  The MySQL documentation says that you should try to avoid using an ON
  DUPLICATE KEY UPDATE clause on tables with multiple unique indexes[1].
  The proposed feature's documentation has the same suggestion[2]. Still,
  the feature defaults to this behavior. Why is the default something the
  documentation says you shouldn't do?

 As we all know, naming a unique index in DML is ugly, and has poor
 support in ORMs. It seems likely that we're better off making it
 optional - it hasn't been much of a problem with the existing subxact
 looping pattern.

The subxact approach is a bit different from the proposed UPSERT
command. It loops:

   try:
       INSERT INTO author VALUES ('Jack', 'tom@example.com', 34)
   except UniqueConstraintViolation:
       UPDATE author SET ... WHERE name = 'Jack'

while the UPSERT command does something like:

   try:
       INSERT INTO author VALUES ('Jack', 'tom@example.com', 34)
   except UniqueConstraintViolation:
       UPDATE author SET ... WHERE name = 'Jack' OR email = 'tom@example.com'
       LIMIT 1;
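
For comparison, a minimal PL/pgSQL version of that subxact looping
pattern (essentially the merge_db example from the PostgreSQL
documentation, adapted to the author table assumed above):

    CREATE FUNCTION upsert_author(n text, e text, a integer)
    RETURNS void AS
    $$
    BEGIN
        LOOP
            -- first try to update an existing row, keyed on name only
            UPDATE author SET email = e, age = a WHERE name = n;
            IF found THEN
                RETURN;
            END IF;
            -- not there: try to insert; a concurrent insert may beat us
            BEGIN
                INSERT INTO author VALUES (n, e, a);
                RETURN;
            EXCEPTION WHEN unique_violation THEN
                -- the subxact rolls back; loop and try the UPDATE again
            END;
        END LOOP;
    END;
    $$ LANGUAGE plpgsql;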

 - Anssi





Re: [HACKERS] Promise index tuples for UPSERT

2014-10-08 Thread Simon Riggs
On 8 October 2014 00:34, Peter Geoghegan p...@heroku.com wrote:

 INSERTs see #2 win, and by a wider margin than #1 beat #2 with
 UPDATEs. However, insert.sh is by design very unsympathetic towards
 #1. It uses a serial primary key, so every INSERT uselessly obtains a
 HW lock on the same leaf page for the duration of heap insertion.
 Anyway, the median INSERT TPS number is 17,759.53 for #1, and
 20,441.57 TPS for #2. So you're pretty much seeing the full brunt of
 page heavyweight locking, and it isn't all that bad.

Let's see the results of running a COPY, please.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Promise index tuples for UPSERT

2014-10-08 Thread Kevin Grittner
Peter Geoghegan p...@heroku.com wrote:
On Wed, Oct 8, 2014 at 1:25 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote:

 Instead of naming the index, you should name the columns, and
 the system can look up the index or indexes that match those
 columns.

+1

That is what I have been consistently requesting instead of index
names, every time I notice it mentioned.

 It's not totally clear that we need *any* WITHIN clause, BTW.
 I'm not dead set on it. It was something I mainly added at
 Kevin's request. I do see the risks, though.

What I said was in response to your assertion that your technique
would *never* generate a duplicate key error.  As others have again
been pointing out, when there is more than one unique index you can
have rows to apply which cannot be applied accurately without
causing such an error.  What I requested was that the behavior in
those cases be clear and documented.  I didn't take a position on
whether you pick an index or ignore the input row (with a
warning?), but we need to decide how it is handled.  I have made
the same point Heikki is making, though -- we have no business
referencing an index name in the statement.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Promise index tuples for UPSERT

2014-10-08 Thread Peter Geoghegan
On Wed, Oct 8, 2014 at 2:50 PM, Kevin Grittner kgri...@ymail.com wrote:
 What I said was in response to your assertion that your technique
 would *never* generate a duplicate key error.

That is strictly true: the INSERT cannot raise a unique violation
error, because to do so would cause it to take the UPDATE path
(assuming there is no WITHIN clause). The UPDATE may get one, though.
And it doesn't have to get one for your statement to have surprising
effects.
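
Concretely, reusing Anssi's Jack/Tom example from upthread (proposed
syntax; a sketch):

    -- Rows: ('Jack', 'jack@example.com', 33), ('Tom', 'tom@example.com', 35)
    INSERT INTO "user" VALUES ('Jack', 'tom@example.com', 34)
    ON CONFLICT UPDATE SET email = conflicting(email), age = conflicting(age);
    -- Say the conflict is detected on name: the statement is diverted to
    -- the UPDATE path for Jack's row, and SET email = 'tom@example.com'
    -- then hits Tom's row: a unique violation raised by the UPDATE, not
    -- the INSERT.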

 As others have again
 been pointing out, when there is more than one unique index you can
 have rows to apply which cannot be applied accurately without
 causing such an error.

It's simpler than that. You want to make sure that the right condition
- the right unique index having a would-be duplicate violation - is
the one taken *in practice*, the condition on which the UPDATE path is
taken. What happens after that is not that interesting (e.g. whether
or not there is an UPDATE-path duplicate violation), because either
way it's broken.

 What I requested was that the behavior in
 those cases be clear and documented.  I didn't take a position on
 whether you pick an index or ignore the input row (with a
 warning?), but we need to decide how it is handled.  I have made
 the same point Heikki is making, though -- we have no business
 referencing an index name in the statement.

I think that's fair enough. That's all fine - but actually doing that
is quite tricky. Look at what I've said in relation to doing that.
Consider the corner-cases I name. They're certainly only a small
minority of cases in practice, but as an implementer I still need to
worry about them (maybe even just as much). Yes, I would like to just
name columns, but it's hard to make that work fully generally.

If you want me to do what you say, fine. But in order to do that, I
need support for having the corner cases make parse analysis throw up
its hands and error. Is that a price you're willing to pay? If so,
then I'll implement it. Or, alternatively, we could do WITHIN PRIMARY
KEY/NOT WITHIN PRIMARY KEY.
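
That alternative might read like this (purely illustrative syntax):

    INSERT INTO author VALUES (1, 'tom@example.com', 35)
    ON CONFLICT WITHIN PRIMARY KEY UPDATE
    SET email = conflicting(email), age = conflicting(age);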

-- 
Peter Geoghegan




Re: [HACKERS] Promise index tuples for UPSERT

2014-10-07 Thread Simon Riggs
On 7 October 2014 03:31, Peter Geoghegan p...@heroku.com wrote:

 It may be that people on reading this now believe Peter's HW locking
 approach is the best. I'm happy to go with consensus.

 I bet you didn't think that you'd say that a week ago.  :-)

You're right, because last week I thought heavyweight locking sucks
and I still think that; I haven't said it is the best.

What we've just discovered is that we're locking 100% of the time, but
it's not needed in 99.9% of cases and is arguable in the 0.1% case -
not required at all.

The price of avoiding that rare and possibly erroneous condition seems
high to me.

Is there a way of detecting that we are updating a unique constraint
column and then applying the HW locking only in that case? Or can we
only apply locking when we have multiple unique constraints on a
table?
If so, I would withdraw any objection to the HW locking technique.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Promise index tuples for UPSERT

2014-10-07 Thread Robert Haas
On Tue, Oct 7, 2014 at 8:33 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 7 October 2014 03:31, Peter Geoghegan p...@heroku.com wrote:
 It may be that people on reading this now believe Peter's HW locking
 approach is the best. I'm happy to go with consensus.

 I bet you didn't think that you'd say that a week ago.  :-)

 You're right, because last week I thought heavyweight locking sucks
 and I still think that; I haven't said it is the best.

 What we've just discovered is that we're locking 100% of the time, but
 it's not needed in 99.9% of cases and is arguable in the 0.1% case -
 not required at all.

 The price of avoiding that rare and possibly erroneous condition seems
 high to me.

 Is there a way of detecting that we are updating a unique constraint
 column and then applying the HW locking only in that case? Or can we
 only apply locking when we have multiple unique constraints on a
 table?
 If so, I would withdraw any objection to the HW locking technique.

I'm not up on the details of what Peter's patch does with heavyweight
locking, but I'd say it this way: if the patch uses heavyweight
locking routinely, that's probably not going to scale well[1].   If
the patch detects possible conflicts and uses heavyweight locking only
in those cases and for the specific purpose of untangling those
conflicts, then that might well be OK.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] As evidence, I offer the fact that removing 1 of 2 places where
heavyweight locking is used in hash indexes resulted in a large
performance gain at high client counts.  See commit
76837c1507cb5a5a0048046233568447729e66dd.




Re: [HACKERS] Promise index tuples for UPSERT

2014-10-07 Thread Simon Riggs
On 7 October 2014 14:06, Robert Haas robertmh...@gmail.com wrote:

 Is there a way of detecting that we are updating a unique constraint
 column and then applying the HW locking only in that case? Or can we
 only apply locking when we have multiple unique constraints on a
 table?
 If so, I would withdraw any objection to the HW locking technique.

 I'm not up on the details of what Peter's patch does with heavyweight
 locking, but I'd say it this way: if the patch uses heavyweight
 locking routinely, that's probably not going to scale well[1].   If
 the patch detects possible conflicts and uses heavyweight locking only
 in those cases and for the specific purpose of untangling those
 conflicts, then that might well be OK.

Yes, that's what I meant, just more clearly phrased. Thanks for the link.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Promise index tuples for UPSERT

2014-10-07 Thread Peter Geoghegan
On Tue, Oct 7, 2014 at 6:06 AM, Robert Haas robertmh...@gmail.com wrote:
 I'm not up on the details of what Peter's patch does with heavyweight
 locking, but I'd say it this way: if the patch uses heavyweight
 locking routinely, that's probably not going to scale well[1].   If
 the patch detects possible conflicts and uses heavyweight locking only
 in those cases and for the specific purpose of untangling those
 conflicts, then that might well be OK.

The patch opportunistically tries to use shared buffer locks when a
conflict is expected, when we restart (but only on the unique index
where a conflict was detected). So in the event of a lot of
near-conflicts, the hwlock traffic is quite modest. That, combined
with the fact that it uses what I've called an index scan with an
identity crisis (could be a near-insertion + hwlock in advance of
insertion proper, or just something akin to a regular index scan)
makes it perform best (at least with one or two unique indexes, which
is what I tested a few months back). It does not have a pre-check that
is always wasted with insertion-heavy workloads.

Now, we're not talking about a huge advantage here (I should re-test
that). And, in case I wasn't clear: I have misgivings about all 3
designs. Like Simon, I think it is appropriate that we figure out our
exact requirements using the two working prototype patches. Although,
right now #1 and #2 (the prototypes) seem quite comparable, that might
just be down to a failure of imagination. It's hard to be completely
confident about something like that.

-- 
Peter Geoghegan




Re: [HACKERS] Promise index tuples for UPSERT

2014-10-06 Thread Simon Riggs
On 3 October 2014 11:54, Heikki Linnakangas hlinnakan...@vmware.com wrote:

 Simon's approach would actually pass that test case just fine. It inserts
 the (promise) index tuple first, and heap tuple only after that. It will
 fail the test case with more than one unique index, however.

Please explain what you mean by "fail" here?

My understanding of what you're saying is that if

* we have a table with >1 unique index
* and we update the values of the uniquely indexed columns (e.g. PK update)
* on both of the uniquely indexed column sets
then we get occasional deadlocks, just as we would do using current
UPDATE/INSERT.

Is there a business use case that requires that? (Or exactly what you
meant, if that isn't it?)

My view is if we are going to base the whole design on this point,
then we need to have it very clearly accessible for all to understand.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Promise index tuples for UPSERT

2014-10-06 Thread Heikki Linnakangas

On 10/06/2014 03:05 PM, Simon Riggs wrote:

On 3 October 2014 11:54, Heikki Linnakangas hlinnakan...@vmware.com wrote:


Simon's approach would actually pass that test case just fine. It inserts
the (promise) index tuple first, and heap tuple only after that. It will
fail the test case with more than one unique index, however.


Please explain what you mean by fail here?


I meant that the test case will sometimes deadlock, and some 
transactions will therefore be rolled back.



My understanding of what you're saying is that if

* we have a table with >1 unique index
* and we update the values of the uniquely indexed columns (e.g. PK update)
* on both of the uniquely indexed column sets
then we get occasional deadlocks, just as we would do using current
UPDATE/INSERT.


Right. To be precise: you don't need to update both of the columns in 
the same transaction, it's enough that some of the concurrent 
transactions update one column, while other transactions update the 
other column.



 Is there a business use case that requires that?


I don't know. Conceivably any use case where you have two unique 
constraints to begin with.


- Heikki





Re: [HACKERS] Promise index tuples for UPSERT

2014-10-06 Thread Heikki Linnakangas

On 10/06/2014 03:21 PM, Heikki Linnakangas wrote:

On 10/06/2014 03:05 PM, Simon Riggs wrote:

My understanding of what you're saying is that if

* we have a table with >1 unique index
* and we update the values of the uniquely indexed columns (e.g. PK update)
* on both of the uniquely indexed column sets
then we get occasional deadlocks, just as we would do using current
UPDATE/INSERT.


Right. To be precise: you don't need to update both of the columns in
the same transaction, it's enough that some of the concurrent
transactions update one column, while other transactions update the
other column.


Ok, that didn't make much sense. With UPSERT, you have to specify values 
for both columns. But it's sufficient that you have a mix of 
transactions where only some are UPSERTs, and others are regular UPDATEs 
on one of the columns.


- Heikki





Re: [HACKERS] Promise index tuples for UPSERT

2014-10-06 Thread Simon Riggs
On 6 October 2014 13:21, Heikki Linnakangas hlinnakan...@vmware.com wrote:

 My understanding of what you're saying is that if

 * we have a table with >1 unique index
 * and we update the values of the uniquely indexed columns (e.g. PK update)
 * on both of the uniquely indexed column sets
 then we get occasional deadlocks, just as we would do using current
 UPDATE/INSERT.


 Right. To be precise: you don't need to update both of the columns in the
 same transaction, it's enough that some of the concurrent transactions
 update one column, while other transactions update the other column.

CREATE TABLE foo
(id1    integer not null primary key
,id2    integer not null unique
,val    integer);

Given the table above, which one do we mean?

1. When we mix UPDATE foo SET id2 = X WHERE id1 = Y;  and UPDATE foo
SET id1 = Y WHERE id2 = X; we can deadlock
2. When we mix UPDATE foo SET val = Z WHERE id1 = Y;  and UPDATE foo
SET val = W WHERE id2 = X; we can deadlock

(2) is a common use case, (1) is a very rare use case and most likely
a poor design

If the user wishes to protect against such deadlocks they retain the
option to use row locking. Yes?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Promise index tuples for UPSERT

2014-10-06 Thread Heikki Linnakangas

On 10/06/2014 04:44 PM, Simon Riggs wrote:

On 6 October 2014 13:21, Heikki Linnakangas hlinnakan...@vmware.com wrote:


My understanding of what you're saying is that if

* we have a table with >1 unique index
* and we update the values of the uniquely indexed columns (e.g. PK update)
* on both of the uniquely indexed column sets
then we get occasional deadlocks, just as we would do using current
UPDATE/INSERT.



Right. To be precise: you don't need to update both of the columns in the
same transaction, it's enough that some of the concurrent transactions
update one column, while other transactions update the other column.


CREATE TABLE foo
(id1    integer not null primary key
,id2    integer not null unique
,val    integer);

Given the table above, which one do we mean?

1. When we mix UPDATE foo SET id2 = X WHERE id1 = Y;  and UPDATE foo
SET id1 = Y WHERE id2 = X; we can deadlock
2. When we mix UPDATE foo SET val = Z WHERE id1 = Y;  and UPDATE foo
SET val = W WHERE id2 = X; we can deadlock

(2) is a common use case, (1) is a very rare use case and most likely
a poor design


Well, at least one of the statements has to be an UPSERT, and at least 
one of them has to update a column with a unique constraint on it. This 
pair of transactions could deadlock, for example:


Transaction 1:
INSERT INTO foo VALUES (Y, X, Z) ON CONFLICT IGNORE;
Transaction 2:
UPDATE foo SET id2 = X WHERE id1 = Y;

That's made-up syntax, but the idea is that the first transaction 
attempts to insert a row with values id1=Y, id2=X, val=Z. If that fails
because a row with id1=Y or id2=X already exists, then it's supposed
to do nothing.
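
One interleaving that can deadlock, sketched as comments; the exact lock
ordering depends on the value-locking implementation, so this shows the
shape of the hazard rather than a precise trace:

    -- Session 1: INSERT INTO foo VALUES (Y, X, Z) ON CONFLICT IGNORE;
    --   suppose it value-locks id2 = X first, then finds id1 = Y taken
    --   by a row that session 2 has just updated, so it waits on
    --   session 2.
    -- Session 2: UPDATE foo SET id2 = X WHERE id1 = Y;
    --   holds the row (found via id1 = Y) and now needs to insert
    --   id2 = X into the unique index, where it blocks on session 1's
    --   value lock.
    -- Each session now waits on the other; the deadlock detector aborts
    -- one of them.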



If the user wishes to protect against such deadlocks they retain the
option to use row locking. Yes?


Sorry, I didn't understand that. Row locking?

In general, this is of course a lot easier to implement if we restrict 
it so that it only works in some limited cases. That may be fine, but 
then we have to be able to document clearly what the limitations are, 
and throw an error if you violate those limitations.


- Heikki





Re: [HACKERS] Promise index tuples for UPSERT

2014-10-06 Thread Simon Riggs
On 6 October 2014 15:04, Heikki Linnakangas hlinnakan...@vmware.com wrote:
 On 10/06/2014 04:44 PM, Simon Riggs wrote:

 On 6 October 2014 13:21, Heikki Linnakangas hlinnakan...@vmware.com
 wrote:

 My understanding of what you're saying is that if

 * we have a table with >1 unique index
 * and we update the values of the uniquely indexed columns (e.g. PK
 update)
 * on both of the uniquely indexed column sets
 then we get occasional deadlocks, just as we would do using current
 UPDATE/INSERT.



 Right. To be precise: you don't need to update both of the columns in the
 same transaction, it's enough that some of the concurrent transactions
 update one column, while other transactions update the other column.


 CREATE TABLE foo
 (id1    integer not null primary key
 ,id2    integer not null unique
 ,val    integer);

 Given the table above, which one do we mean?

 1. When we mix UPDATE foo SET id2 = X WHERE id1 = Y;  and UPDATE foo
 SET id1 = Y WHERE id2 = X; we can deadlock
 2. When we mix UPDATE foo SET val = Z WHERE id1 = Y;  and UPDATE foo
 SET val = W WHERE id2 = X; we can deadlock

 (2) is a common use case, (1) is a very rare use case and most likely
 a poor design


 Well, at least one of the statements has to be an UPSERT, and at least one
 of them has to update a column with a unique constraint on it. This pair of
 transactions could deadlock, for example:

 Transaction 1:
 INSERT INTO foo VALUES (Y, X, Z) ON CONFLICT IGNORE;
 Transaction 2:
 UPDATE foo SET id2 = X WHERE id1 = Y;

 That's made-up syntax, but the idea is that the first transaction attempts
 to insert a row with values id1=Y, id2=X, val=Z. If that fails because a
 row with id1=Y or id2=X already exists, then it's supposed to do nothing.

Let's look at a real-world example:

CREATE TABLE citizen
(ssn         integer not null primary key
,email       text not null unique
,tax_amount  decimal);

Transaction 1:
INSERT INTO citizen VALUES (555123456, 'si...@2ndquadrant.com',
1000.00) ON CONFLICT IGNORE;
Transaction 2:
UPDATE citizen SET email = 'si...@2ndquadrant.com', tax_amount = 1000.00
WHERE ssn = 555123456;

OK, now I understand how a deadlock is possible. Thanks for your help.
Again I note that there is no isolation test that refers to this
situation, nor any documentation, internal or user-facing, that
describes the situation or its workaround.

My feeling is that this is an unlikely situation: two actors
concurrently updating the same data, AND in different ways, from two
different angles.

How likely is it that we would issue those two transactions
concurrently AND we would be concerned because this caused an error?
If the tax_amount was the same, it wouldn't matter that one failed.
If the tax_amount differed, we would want to know about the error,
not accept it in silence.

Are any of those things substantially worse than the current situation
using INSERT/UPDATE loops?

It might be nice if the above never deadlocked. What is the price of
ensuring that in terms of code maintainability and performance? What
would this do to COPY performance?


 If the user wishes to protect against such deadlocks they retain the
 option to use row locking. Yes?


 Sorry, I didn't understand that. Row locking?

I think that thought doesn't apply here.

 In general, this is of course a lot easier to implement if we restrict it so
 that it only works in some limited cases. That may be fine, but then we have
 to be able to document clearly what the limitations are, and throw an error
 if you violate those limitations.

Seems reasonable.

My point here is to establish that...

a) there are multiple ways to implement the UPSERT feature and none
should be thrown away too quickly
b) the current patch does not implement something we all agree on yet
c) not all requirements have been properly documented, understood or
agreed by hackers

If we want to move forwards we need to agree things based upon clarity
and real world usage.

It may be that people on reading this now believe Peter's HW locking
approach is the best. I'm happy to go with consensus.

My feeling is that substantially more work is required on explaining
the details around multiple unique index constraints, trigger
behaviour and various other corner cases.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Promise index tuples for UPSERT

2014-10-06 Thread Peter Geoghegan
On Mon, Oct 6, 2014 at 5:33 PM, Simon Riggs si...@2ndquadrant.com wrote:
 Let's look at a real-world example:

 CREATE TABLE citizen
 (ssn         integer not null primary key
 ,email       text not null unique
 ,tax_amount  decimal);

 Transaction 1:
 INSERT INTO citizen VALUES (555123456, 'si...@2ndquadrant.com',
 1000.00) ON CONFLICT IGNORE;
 Transaction 2:
 UPDATE citizen SET email = 'si...@2ndquadrant.com', tax_amount = 1000.00
 WHERE ssn = 555123456;

 OK, now I understand how a deadlock is possible. Thanks for your help.
 Again I note that there is no isolation test that refers to this
 situation, nor any documentation, internal or user-facing, that
 describes the situation or its workaround.

This seems like a concern specific to other approaches to value
locking. But fair enough.

 My feeling is that this is an unlikely situation: two actors
 concurrently updating the same data, AND in different ways, from two
 different angles.

Hard to say for sure.

 How likely is it that we would issue those two transactions
 concurrently AND we would be concerned because this caused an error?
 If the tax_amount was the same, it wouldn't matter that one failed.
 If the tax_amount differed, we would want to know about the error,
 not accept it in silence.

 Are any of those things substantially worse than the current situation
 using INSERT/UPDATE loops?

Yes, because the new feature is supposed to make it so that you
yourself don't have to put your UPSERT statement in a loop with
subxacts. Taking away the burden of having to think about this stuff
is something I'm striving for here.

 It might be nice if the above never deadlocked. What is the price of
 ensuring that in terms of code maintainability and performance?

I am going to reserve judgement on that, at least for a little while.
It seems like the person proposing an alternative ought to have a
better sense of what the price of avoiding this is. I'd understand
what you were getting at more here if it immediately made our lives
easier in some obvious way. I don't see that it does, though I admit
that I may simply not understand where you're coming from. So sure,
let's not be prejudiced about what's important, but at the same time I
don't see that either Heikki or I have actually been inflexible to a
degree that hurts things WRT not giving up on important high-level-ish
goals.

I am not completely inflexible on "never error". I am very close to
totally inflexible, though. I think I could live with an error that
literally no one would ever see. For example, we could error if there
was an excessive number of retries, which I find acceptable because it
will never happen in the real world. I tend to think that what you're
talking about is pretty far from that, though.

 My point here is to establish that...

 a) there are multiple ways to implement the UPSERT feature and none
 should be thrown away too quickly
 b) the current patch does not implement something we all agree on yet
 c) not all requirements have been properly documented, understood or
 agreed by hackers

 If we want to move forwards we need to agree things based upon clarity
 and real world usage.

I certainly agree with that.

 It may be that people on reading this now believe Peter's HW locking
 approach is the best. I'm happy to go with consensus.

I bet you didn't think that you'd say that a week ago.  :-)

I hope I don't sound smug when I say that. I just mean, as you say,
that we all need to keep an open mind on this. A healthy respect for
the problem is recommended. I think it's still possible that there are
problems with design #1, even on its own terms.

 My feeling is that substantially more work is required on explaining
 the details around multiple unique index constraints, trigger
 behaviour and various other corner cases.

Probably. Ideally, we should do that in a way driven by real-world
prototypes. In that spirit, I attach a new version of my patch, but
now implemented using approach #2 to value locking. I haven't spent
all that much time testing this (at least recently, in this form), but
it does pass all existing tests, including my stress-tests when run
for half an hour.

A lot of those corner cases you mention are big concerns. It's much
easier to identify these issues by breaking real implementations. So
surprisingly, to a certain extent (with something like this) it makes
sense to have requirements driven by actual implementations. If we
cannot do this iteratively, we are likely to fail. That's just how it
is, I think.
-- 
Peter Geoghegan


0001-Make-UPDATE-privileges-distinct-from-INSERT-privileg.patch.gz
Description: GNU Zip compressed data


0006-User-visible-documentation-for-INSERT-.-ON-CONFLICT-.patch.gz
Description: GNU Zip compressed data


0005-Internal-documentation-for-INSERT-.-ON-CONFLICT-UPDA.patch.gz
Description: GNU Zip compressed data


0004-Tests-for-INSERT-.-ON-CONFLICT-UPDATE-IGNORE.patch.gz
Description: GNU Zip compressed data



Re: [HACKERS] Promise index tuples for UPSERT

2014-10-03 Thread Heikki Linnakangas

On 10/03/2014 11:07 AM, Simon Riggs wrote:

On 1 October 2014 20:54, Heikki Linnakangas hlinnakan...@vmware.com wrote:

On 10/01/2014 02:34 PM, Simon Riggs wrote:



...

When later insert scans see the promise tuple they perform
XactLockTableWait() and when they get control they look again for the
key. If they find a promise tuple with an aborted xid they replace
that value with their own xid and continue as a). Otherwise b).



XactLockTableWait() waits until the end of transaction; that's not what you
want here. If the backend that inserted the promise tuple decides to not proceed
with the insertion, and removes the promise tuple, the backend waiting on it
needs to be woken up more or less immediately, not when the transaction
completes.


There is no "remove promise tuple" part of the above design.


I had this exact same issue in my earlier patch versions, fixed it with a
new kind of heavy-weight lock, and a new field in PGPROC
(http://www.postgresql.org/message-id/52d00d2d.6030...@vmware.com). That
wasn't very pretty, but it got the job done. Some other design might be
better.


Currently, if some fool developer decides to insert rows that duplicate
values already in the table, he inserts a heap row, then sees a
potential conflict in the index and waits for the conflicting xact to
end. His fool insert, his wait. That's how btree does this now.
I don't see any reason why we need to do better for UPSERT.

If the row already exists we need to be able to quickly change into an
update, but we still only support one write xact at a time.


That lowers the bar from what I thought everyone agreed on. Namely, if 
two backends run a similar UPSERT command concurrently on a table that 
has more than one unique constraint, they might deadlock, causing one of 
them to throw an error instead of INSERTing or UPDATEing anything.


I'm sure that's useful enough in many applications, but I'd like to have 
a more robust implementation. The shorter we can keep the list of 
caveats, the better.



The value in the index needs to be protected by a block level lock, so
we can check it quickly, but the eventual heap work is serialized by
transactional semantics.

I think a little perspective is due here and we should stick to the
main use case, not cater for bizarre edge cases.


I'm trying to bisect your thoughts on exactly what use cases you think 
we must support, and which ones you consider bizarre edge cases, and 
what exactly is acceptable behavior in those edge cases.



What we are looking for is something that can decide whether it is an
insert or an update and progress quickly with that. It needs to be
correct, but there is no requirement to be faster than currently
possible, in the case of concurrency.


I believe we all agree on that.


Any form of tuple locking that uses the general lock manager will not
be usable. I can't see it is worth the overhead of doing that to
protect against deadlocks that would only be experienced by people
doing foolish things.


Maybe, maybe not, but let's define the acceptable behavior first, and 
think about the implementation second. I'm pretty sure all of the 
approaches discussed so far can be made fast enough, and the bloat 
issues can be made small enough, that it doesn't matter much which one 
we choose from a performance point of view. The differences are in what 
use cases they can support, and the maintainability of the code.


- Heikki





Re: [HACKERS] Promise index tuples for UPSERT

2014-10-03 Thread Peter Geoghegan
On Fri, Oct 3, 2014 at 2:03 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 That lowers the bar from what I thought everyone agreed on. Namely, if two
 backends run a similar UPSERT command concurrently on a table that has more
 than one unique constraint, they might deadlock, causing one of them to
 throw an error instead of INSERTing or UPDATEing anything.

It lowers the bar to a level that I am not willing to limbo dance
under. You don't even need two unique constraints. Nothing as
complicated as that is required.

When this happens with MySQL, they have the good sense to call it a
bug [1], and even fix it. I find the comparison with conventional
insertion entirely unconvincing.

 I'm sure that's useful enough in many applications, but I'd like to have a
 more robust implementation. The shorter we can keep the list of caveats, the
 better.

INSERT and UPDATE are supposed to be fairly well balanced here.
Conflicts are the norm.

 The value in the index needs to be protected by a block level lock, so
 we can check it quickly, but the eventual heap work is serialized by
 transactional semantics.

 I think a little perspective is due here and we should stick to the
 main use case, not cater for bizarre edge cases.


 I'm trying to bisect your thoughts on exactly what use cases you think we
 must support, and which ones you consider bizarre edge cases, and what
 exactly is acceptable behavior in those edge cases.

Lots of concurrency is not an edge-case.

 Any form of tuple locking that uses the general lock manager will not
 be usable. I can't see it is worth the overhead of doing that to
 protect against deadlocks that would only be experienced by people
 doing foolish things.

 Maybe, maybe not, but let's define the acceptable behavior first, and think
 about the implementation second.

+1. Updating a lot with UPSERT is not foolish. That's all it took to
make earlier prototypes deadlock.

 I'm pretty sure all of the approaches
 discussed so far can be made fast enough, and the bloat issues can be made
 small enough, that it doesn't matter much which one we choose from a
 performance point of view. The differences are in what use cases they can
 support, and the maintainability of the code.

+1

What do we get for giving up on not having unprincipled deadlocks
here? What's the advantage? Assuming that this is a bizarre edge-case
(note: it isn't), what do we get in return for giving up on fixing it?

[1] MySQL bug #52020
-- 
Peter Geoghegan




Re: [HACKERS] Promise index tuples for UPSERT

2014-10-03 Thread Simon Riggs
On 3 October 2014 10:03, Heikki Linnakangas hlinnakan...@vmware.com wrote:

 That lowers the bar from what I thought everyone agreed on. Namely, if two
 backends run a similar UPSERT command concurrently on a table that has more
 than one unique constraint, they might deadlock, causing one of them to
 throw an error instead of INSERTing or UPDATEing anything.

Now we get to a productive discussion; this is good.

When we first gather requirements, obviously everyone agrees to a long
list of things, since there is initially not much reason to say no. As
we move towards implementation we begin to understand the true price of
meeting each requirement. It was good that this detail was raised, and
sensible to attempt to avoid unprincipled deadlocks. But if the price of
avoiding them is high, it is worth reconsidering how important that
is.

My view is that I can't see the above use case happening in real
situations, except by infrequent mistake. In most cases, unique
indexes represent some form of object identity, and those don't change
frequently in the real world. So I'd be surprised to see two unique
fields changed at the same time without that representing some form of
business process error that people would like to see fail anyway.
If someone has an example of that in a real, common case then I would
like to see it, and I would revise my view accordingly.

We are frequently hampered by trying to design something that can sing
and dance at the same time. That thought is exactly why we are looking
at UPSERT now, not MERGE. So trimming our objectives to what makes
sense is an accepted part of this project already.

 Any form of tuple locking that uses the general lock manager will not
 be usable. I can't see it is worth the overhead of doing that to
 protect against deadlocks that would only be experienced by people
 doing foolish things.


 Maybe, maybe not, but let's define the acceptable behavior first, and think
 about the implementation second.

Hand in hand, I think, given the other constraints of time, review,
maintainability, etc.

 I'm pretty sure all of the approaches
 discussed so far can be made fast enough, and the bloat issues can be made
 small enough, that it doesn't matter much which one we choose from a
 performance point of view. The differences are in what use cases they can
 support, and the maintainability of the code.

The discussion of approaches has up to now focused only on what
impossibilities exist, with a "we must do this because feature A can't
do aspect X". I haven't yet seen much discussion of maintainability of
code, but I would guess simpler is better, overall.

Realistically, I won't be coding any separate approaches, so this is
down to Peter and maybe yourself Heikki. I hope only to avoid
foreclosing viable and simple approaches for the wrong reasons. There
are many other considerations that make up the final view.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Promise index tuples for UPSERT

2014-10-03 Thread Peter Geoghegan
On Fri, Oct 3, 2014 at 2:50 AM, Simon Riggs si...@2ndquadrant.com wrote:
 My view is that I can't see the above use case happening in real
 situations, except by infrequent mistake. In most cases, unique
 indexes represent some form of object identity, and those don't change
 frequently in the real world. So I'd be surprised to see two unique
 fields changed at the same time without that representing some form of
 business process error that people would like to see fail anyway.

Are we talking about two different things here?

Unprincipled deadlocks can be seen without updating any constrained
column in the UPSERT. The test-case that originally highlighted the
issue only had one unique index, and it was *not* in the update's
targetlist. See:

https://wiki.postgresql.org/wiki/Value_locking#.22Unprincipled_Deadlocking.22_and_value_locking

-- 
Peter Geoghegan




Re: [HACKERS] Promise index tuples for UPSERT

2014-10-03 Thread Simon Riggs
On 3 October 2014 10:32, Peter Geoghegan p...@heroku.com wrote:
 On Fri, Oct 3, 2014 at 2:03 AM, Heikki Linnakangas
 hlinnakan...@vmware.com wrote:
 That lowers the bar from what I thought everyone agreed on. Namely, if two
 backends run a similar UPSERT command concurrently on a table that has more
 than one unique constraint, they might deadlock, causing one of them to
 throw an error instead of INSERTing or UPDATEing anything.

 It lowers the bar to a level that I am not willing to limbo dance
 under. You don't even need two unique constraints. Nothing as
 complicated as that is required.

 When this happens with MySQL, they have the good sense to call it a
 bug [1], and even fix it. I find the comparison with conventional
 insertion entirely unconvincing.

Is there a test case that demonstrates the problem?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Promise index tuples for UPSERT

2014-10-03 Thread Peter Geoghegan
On Fri, Oct 3, 2014 at 3:04 AM, Simon Riggs si...@2ndquadrant.com wrote:
 Is there a test case that demonstrates the problem?

Yes. See my e-mail to Heikki here:

http://www.postgresql.org/message-id/cam3swzshbe29kpod44cvc3vpzjgmder6k_6fghiszeozgmt...@mail.gmail.com

Testcase is attached.


-- 
Peter Geoghegan




Re: [HACKERS] Promise index tuples for UPSERT

2014-10-03 Thread Simon Riggs
On 3 October 2014 10:57, Peter Geoghegan p...@heroku.com wrote:
 On Fri, Oct 3, 2014 at 2:50 AM, Simon Riggs si...@2ndquadrant.com wrote:
 My view is that I can't see the above use case happening in real
 situations, except by infrequent mistake. In most cases, unique
 indexes represent some form of object identity, and those don't change
 frequently in the real world. So I'd be surprised to see two unique
 fields changed at the same time without that representing some form of
 business process error that people would like to see fail anyway.

 Are we talking about two different things here?

 Unprincipled deadlocks can be seen without updating any constrained
 column in the UPSERT. The test-case that originally highlighted the
 issue only had one unique index, and it was *not* in the update's
 targetlist. See:

 https://wiki.postgresql.org/wiki/Value_locking#.22Unprincipled_Deadlocking.22_and_value_locking

I followed that to a wiki page, then clicked again to an old email. No
test case.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Promise index tuples for UPSERT

2014-10-03 Thread Heikki Linnakangas

On 10/03/2014 01:05 PM, Peter Geoghegan wrote:

On Fri, Oct 3, 2014 at 3:04 AM, Simon Riggs si...@2ndquadrant.com wrote:

Is there a test case that demonstrates the problem?


Yes. See my e-mail to Heikki here:

http://www.postgresql.org/message-id/cam3swzshbe29kpod44cvc3vpzjgmder6k_6fghiszeozgmt...@mail.gmail.com

Testcase is attached.


Simon's approach would actually pass that test case just fine. It 
inserts the (promise) index tuple first, and heap tuple only after that. 
It will fail the test case with more than one unique index, however.


- Heikki




Re: [HACKERS] Promise index tuples for UPSERT

2014-10-03 Thread Peter Geoghegan
On Fri, Oct 3, 2014 at 3:54 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 Simon's approach would actually pass that test case just fine. It inserts
 the (promise) index tuple first, and heap tuple only after that. It will
 fail the test case with more than one unique index, however.

Oh, I see. Still, I don't think you need to UPDATE a
uniquely-constrained attribute - even if updating constrained
attributes is rare (dubious), non-HOT updates will have the same
effect, no? I still think that's unacceptable.

In any case, I still don't see what this buys us over the other two
designs. What's the pay-off for giving up on the general avoidance of
unprincipled deadlocks?

-- 
Peter Geoghegan




[HACKERS] Promise index tuples for UPSERT

2014-10-01 Thread Simon Riggs
Summary of algorithm to use promise tuples for concurrency control
during UPSERT

1. Perform a btree search to the location of the key, if it exists.
a) If an unkilled index tuple exists, we decide this is an UPDATE and
drop straight through to step 2
b) If it does not exist, insert a promise tuple into the unique index(es)
- marked with the xid of the inserting transaction, but using the key.
This happens while the page is locked, so it is not possible to insert
a second promise tuple concurrently.
Record the btree blockid on the index scan and move to step 3.
When later insert scans see the promise tuple they perform
XactLockTableWait() and when they get control they look again for the
key. If they find a promise tuple with an aborted xid they replace
that value with their own xid and continue as a). Otherwise b).

2. Find existing heap tuple
Find the heap tuple.
Check that it is actually valid. If not, go back to (1), kill the prior
tuple, and follow the 1b) path.
If it is valid, perform heap_update as normal.

3. Insert new heap tuple
Perform heap_insert
Re-find index tuple using the btree blockid recorded at step 1; this
may require moving right until we find the actual value we are looking
for, so block splits don't negatively affect this approach.
Once re-found we change the index tuple from a promise tuple to a
normal index tuple, by setting tid and removing promise flag. Tuple
remains same length because the value was known when promise tuple
inserted, so this is an inplace update.
Insert other index values normally.

If a transaction that inserted a promise tuple dies, how is that cleaned up?
Any user that sees a dead promise tuple with a value they want will clean it up.
Dead promise tuples can be removed as needed, just as killed tuples
currently are.
VACUUM can remove dead transactions from index as it scans, no problems.

Index bloat is less of a problem than with normal inserts since there
are additional ways of removing promise tuples. Only one index tuple
at a time can have a specific value, so we would actually reduce heap
bloat caused by concurrent inserts.

It's possible we find existing rows that we can't see according to our
snapshot. That is handled in exactly the same way as we treat
UPDATEs.

There is a potential loop here in that we might find an index tuple,
follow it, find out the tuple is actually dead then return to insert a
promise tuple, only to find that someone else just did that and we
have to wait/re-follow the link to update the new tuple. That is an
extremely unlikely race condition, though possible; there is no heavy
locking to prevent that in this approach. In principle deadlocks are
possible from that, but that is not any different from the current
case where people who insert the same key at the same time might cause
deadlocks. It's not a common application pattern and not something we
should be protecting against.

All of this is only needed for unique indexes.

It can be handled by a new API call for acquire_value_lock() and
release_value_lock(), or similar.
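
For illustration, here is one possible shape for that API, using the
usual PostgreSQL types (Relation, Datum, ItemPointer). Only the two
function names come from the paragraph above; the signatures and the
opaque ValueLock type are assumptions, not anything that exists in
code:

/* Hypothetical value-locking API sketch. */
typedef struct ValueLock ValueLock;     /* opaque, owned by the index AM */

/*
 * Block concurrent inserters of the same value in 'index' until
 * release. A btree AM could implement this by inserting a promise
 * tuple; another AM might take a heavyweight value lock instead.
 */
extern ValueLock *acquire_value_lock(Relation index,
                                     Datum *values, bool *isnull);

/*
 * Fulfil the promise by pointing it at the new heap tuple
 * (tid != NULL), or retract it (tid == NULL), waking any waiters.
 */
extern void release_value_lock(ValueLock *vlock, ItemPointer tid);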

Advantages
* We don't do anything weird in the heap - if this breaks, data is not corrupt
* There is no heap bloat or index bloat above existing levels
* The approach leverages existing mechanism for transaction waiting
* Optimistic approach to value locking will improve performance over
heavyweight locking

Disadvantages
* Not written yet - 1 month to code, still possible for 9.5, doesn't look hard

Other stated possible disadvantages
* Violates existing invariant that every index tuple has a
corresponding heap tuple, which goes back to the Berkeley days.
Currently, we always create heap tuples first, and physically delete
them last. [Why is that a problem? Could be, but why?]
(Deleting them last implies something is being touched in that
regard by this suggestion. I'm not sure what you are referring to)

* Relatedly, concern about breaking VACUUM. VACUUMing of btrees occurs
with a list of TIDs to kill in the index, built from the heap; the
current bulk-delete callback cannot be used for this. A
super-exclusive lock is needed to delete tuples in a btree page (i.e.
by VACUUM). Normally, skipping LockBufferForCleanup() (as added by
commit bbb6e559) works okay in the heap because it tends to imply that
the list of tids won't be bulk-deleted in the index, where we
currently cannot skip on failure to quickly acquire a super-exclusive
lock. So this could make the problem addressed by bbb6e559 return to
some extent.
[Don't see any problems; just test the xid as we scan a promise tuple
and remove it if needed]
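
A minimal sketch of that bracketed suggestion, assuming hypothetical
promise-tuple accessors (is_promise(), promise_xid(),
remove_promise()); TransactionIdIsInProgress() and
TransactionIdDidCommit() are the real PostgreSQL transaction-status
calls, declared here only to keep the sketch self-contained:

#include <stdbool.h>

typedef struct IndexTuple IndexTuple;
typedef unsigned int TransactionId;

extern bool is_promise(const IndexTuple *itup);
extern TransactionId promise_xid(const IndexTuple *itup);
extern void remove_promise(IndexTuple *itup);
extern bool TransactionIdIsInProgress(TransactionId xid);
extern bool TransactionIdDidCommit(TransactionId xid);

/*
 * While scanning: a promise tuple whose inserting transaction is no
 * longer running and never committed (it aborted or crashed) left no
 * heap tuple behind, so the promise can simply be removed.
 */
static void
maybe_reap_promise(IndexTuple *itup)
{
    TransactionId xid;

    if (!is_promise(itup))
        return;                 /* ordinary index tuple: leave it */

    xid = promise_xid(itup);
    if (!TransactionIdIsInProgress(xid) && !TransactionIdDidCommit(xid))
        remove_promise(itup);   /* dead promise: reap on the spot */
}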

Index-only bloat becomes a possibility. Implications are not well understood.
[FUD, no reason to believe there is a problem.]

We have to re-find any items using an ordinary index scan, not a tid.
Must match our xid to that.
[Explained above, easy and efficient.]

Doesn't have a strategy for dealing with unprincipled deadlocks, at
least at the moment. Presumably some aspects of #2 could be adopted
here.
[FUD. Unprincipled deadlock still not properly explained as yet. No
way of 

Re: [HACKERS] Promise index tuples for UPSERT

2014-10-01 Thread Heikki Linnakangas

On 10/01/2014 02:34 PM, Simon Riggs wrote:

Summary of algorithm to use promise tuples for concurrency control
during UPSERT

1. Perform a btree search to the location of the key, if it exists.
a) If an unkilled index tuple exists, we decide this is an UPDATE and
drop straight through to step 2.
b) If it does not exist, insert a promise tuple into the unique
index(es) - marked with the xid of the inserting transaction, but
using the key. This happens while the page is locked, so it is not
possible to insert a second promise tuple concurrently.
Record the btree blockid from the index scan and move to step 3.
When later insert scans see the promise tuple they perform
XactLockTableWait() and, when they get control, they look again for
the key. If they find a promise tuple with an aborted xid they replace
that value with their own xid and continue as in a). Otherwise as in
b).


XactLockTableWait() waits until the end of the transaction; that's not
what you want here. If the backend that inserted the promise tuple
decides not to proceed with the insertion, and removes the promise
tuple, the backend waiting on it needs to be woken up more or less
immediately, not when the transaction completes.


I had this exact same issue in my earlier patch versions, fixed it with 
a new kind of heavy-weight lock, and a new field in PGPROC 
(http://www.postgresql.org/message-id/52d00d2d.6030...@vmware.com). That 
wasn't very pretty, but it got the job done. Some other design might be 
better.
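
For what it's worth, a sketch of the kind of primitive Heikki
describes, with a per-backend token that is advertised only for as
long as the promise tuple is unresolved. Every name here is
hypothetical; the mechanism in the linked patch may differ:

/* Hypothetical short-lived wait primitive; not an existing API. */
typedef unsigned int TransactionId;

extern unsigned int speculative_insert_begin(void);     /* lock a token */
extern void speculative_insert_end(unsigned int token); /* wake waiters */
extern void speculative_token_wait(TransactionId xid, unsigned int token);
extern void insert_promise_tuple_with_token(const char *key,
                                            unsigned int token);

static void
promise_insert(const char *key)
{
    unsigned int token = speculative_insert_begin();

    insert_promise_tuple_with_token(key, token);
    /* ... heap_insert(), then fulfil the promise, or retract it ... */
    speculative_insert_end(token);  /* waiters wake here, well before
                                     * the transaction commits/aborts */
}

/*
 * Waiter side: sleep on the inserter's token instead of calling
 * XactLockTableWait(xid), so we return as soon as the promise is
 * fulfilled or retracted rather than at end of transaction.
 */
static void
promise_wait(TransactionId xid, unsigned int token)
{
    speculative_token_wait(xid, token);
}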


- Heikki





Re: [HACKERS] Promise index tuples for UPSERT

2014-10-01 Thread Peter Geoghegan
On Wed, Oct 1, 2014 at 4:34 AM, Simon Riggs si...@2ndquadrant.com wrote:
 Summary of algorithm to use promise tuples for concurrency control
 during UPSERT

 Index bloat is less of a problem than with normal inserts since there
 are additional ways of removing promise tuples. Only one index tuple
 at a time can have a specific value, so we would actually reduce heap
 bloat caused by concurrent inserts.

I am not all that concerned about bloat. I didn't think it was a major
advantage of #1 relative to #2 when I investigated the differences,
and looked at the numbers. And I'm speaking as the advocate of the
design with zero bloat.

 It's possible we find existing rows that we can't see according to our
 snapshot. That is handled in exactly the same way as we treat
 UPDATEs.

This isn't a special case in the patch. It's more that REPEATABLE READ
and SERIALIZABLE have a special responsibility to make sure they're
not updating a tuple that is invisible to the MVCC snapshot (not
instantaneously invisible, by which I mean invisible in the
HeapTupleSatisfiesUpdate() sense). Otherwise, we don't care if it's
MVCC-visible or not.

I imagine that by "That is handled in exactly the same way as we
treat UPDATEs", you mean that we do the EvalPlanQual() stuff. We only
do that in the event of a concurrent UPDATE/DELETE, though. Otherwise,
we trust the underlying relation scan to accurately represent that
tuples pulled up are visible. So I don't understand the comparison
(but tell me if I've misunderstood).

 There is a potential loop here in that we might find an index tuple,
 follow it, find out the tuple is actually dead, then return to insert
 a promise tuple, only to find that someone else just did that and we
 have to wait/re-follow the link to update the new tuple. That is an
 extremely unlikely race condition, though possible; there is no heavy
 locking to prevent it in this approach. In principle deadlocks are
 possible from that, but that is no different from the current case
 where people who insert the same key at the same time might cause
 deadlocks. It's not a common application pattern and not something we
 should be protecting against.

People are going to use this feature in cases where they could almost
use an UPDATE. It will happen if you let it happen. But even if you
don't, a guarantee is infinitely more useful than a non-guarantee to
app developers. I'll be up-front about this: you have very little
chance of getting me to budge on this point. I *really* hate the idea
of allowing this kind of thing. I don't think I'm alone here.

 All of this is only needed for unique indexes.

 It can be handled by a new API call for acquire_value_lock() and
 release_value_lock(), or similar.

 Advantages
 * We don't do anything weird in the heap - if this breaks, data is not corrupt

Index corruption leads to wrong answers. I don't think this is a very
good argument, fwiw.

 * There is no heap bloat or index bloat above existing levels

Fair enough.

 * The approach leverages existing mechanism for transaction waiting

That's not really an advantage. That more or less applies to all designs.

 * Optimistic approach to value locking will improve performance over
 heavyweight locking

There is no evidence that promise index tuples will perform better. #2
didn't perform better than #1.

 Disadvantages
 * Not written yet - 1 month to code, still possible for 9.5, doesn't look 
 hard

Maybe, but that is beside the point, which is: If there were any
fundamental problems with either #1 or #2, I think I'd have figured
them out by now. They are less risky today - it might be that #3 turns
out to have the same properties, but I cannot be sure. That counts for
something. I feel perfectly entitled to hold that kind of uncertainty
against any design, tbh.

 Other stated possible disadvantages
 * Violates existing invariant that every index tuple has a
 corresponding heap tuple, which goes back to the Berkeley days.
 Currently, we always create heap tuples first, and physically delete
 them last. [Why is that a problem? Could be, but why?]

Unknown unknowns. Huge amounts of code must be audited to figure out
that it isn't an issue. So I don't know, but then I'm not the one
making the proposal.

 (Deleting them last implies something is being touched in that
 regard by this suggestion. I'm not sure what you are referring to)

The uncertain scope of the problem is a big part of the problem.

 * Relatedly, concern about breaking VACUUM. VACUUMing of btrees occurs
 with a list of TIDs to kill in the index, built from the heap; the
 current bulk-delete callback cannot be used for this. A
 super-exclusive lock is needed to delete tuples in a btree page (i.e.
 by VACUUM). Normally, skipping LockBufferForCleanup() (as added by
 commit bbb6e559) works okay in the heap because it tends to imply that
 the list of tids won't be bulk-deleted in the index, where we
 currently cannot skip on failure to quickly acquire a super-exclusive
 lock. So this could make the problem addressed by
 bbb6e559