subject:"\[HACKERS\] INSERT ... ON CONFLICT IGNORE \(and UPDATE\) 3.0"

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-24 Thread Peter Geoghegan

On Thu, Apr 23, 2015 at 6:07 PM, Peter Geoghegan  wrote:
> Curious about what you think about my proposal to go back to putting
> the inference specification WHERE clause within the parenthesis, since
> this solves several problems, including clarifying to users that the
> predicate is part of the inference clause.

I've *provisionally* pushed code that goes back to the old way,
Andres: 
https://github.com/petergeoghegan/postgres/commit/2a5d80b27d2c5832ad26dde4651c64dd2004f401

Perhaps this is the least worst way, after all.
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-24 Thread Stephen Frost

* Peter Geoghegan (p...@heroku.com) wrote:
> On Fri, Apr 24, 2015 at 5:50 PM, Stephen Frost  wrote:
> > * Peter Geoghegan (p...@heroku.com) wrote:
> >> I came up with something that is along the lines of what we discussed.
> >> I'll wait for you to push Dean's code, and rebase on top of that
> >> before submitting what I came up with. Maybe this can be rolled into
> >> my next revision if I work through Andres' most recent feedback
> >> without much delay.
> >
> > This is done (finally!).  Please take a look and let me know if you have
> > any questions or concerns.  Happy to chat again also, of course.
> 
> Great, thanks.
> 
> I didn't actually wait for you (as earlier indicated) before posting
> the new approach to RLS in V3.4. However, I have some decent tests for
> the new behaviors (I did test-driven development), so I think that
> when I rebase on top of the new RLS stuff tomorrow, I'll find that it
> won't be too difficult.

Fantastic, glad to hear it!

Thanks!

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-24 Thread Peter Geoghegan

On Fri, Apr 24, 2015 at 5:50 PM, Stephen Frost  wrote:
> * Peter Geoghegan (p...@heroku.com) wrote:
>> I came up with something that is along the lines of what we discussed.
>> I'll wait for you to push Dean's code, and rebase on top of that
>> before submitting what I came up with. Maybe this can be rolled into
>> my next revision if I work through Andres' most recent feedback
>> without much delay.
>
> This is done (finally!).  Please take a look and let me know if you have
> any questions or concerns.  Happy to chat again also, of course.

Great, thanks.

I didn't actually wait for you (as earlier indicated) before posting
the new approach to RLS in V3.4. However, I have some decent tests for
the new behaviors (I did test-driven development), so I think that
when I rebase on top of the new RLS stuff tomorrow, I'll find that it
won't be too difficult.

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-24 Thread Stephen Frost

Peter,

* Peter Geoghegan (p...@heroku.com) wrote:
> I came up with something that is along the lines of what we discussed.
> I'll wait for you to push Dean's code, and rebase on top of that
> before submitting what I came up with. Maybe this can be rolled into
> my next revision if I work through Andres' most recent feedback
> without much delay.

This is done (finally!).  Please take a look and let me know if you have
any questions or concerns.  Happy to chat again also, of course.

Thanks!

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-24 Thread Peter Geoghegan

On Fri, Apr 24, 2015 at 12:31 AM, Heikki Linnakangas  wrote:
> Ok, I see now that I totally screwed up the conditions on "waitMode" in that
> commit. I just pushed a fix to your github repository.

I can confirm that the commit you just pushed prevents the
implementation from attempting to lock an invisible tuple, where
previously it would reliably fall given the same testcase.

Thanks
-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-24 Thread Heikki Linnakangas


On 04/24/2015 02:52 AM, Peter Geoghegan wrote:

I found a bug that seems to be down to commit
e3144183562d08e347f832f0b29daefe8bac617b on Github:

"""
commit e3144183562d08e347f832f0b29daefe8bac617b
Author: Heikki Linnakangas 
Date:   Thu Apr 23 18:38:11 2015 +0300

 Minor cleanup of check_exclusion_or_unique_constraint.

 To improve readability.

"""

At least, that's what a "git bisect" session showed.


Ok, I see now that I totally screwed up the conditions on "waitMode" in 
that commit. I just pushed a fix to your github repository.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-23 Thread Peter Geoghegan

On Thu, Apr 23, 2015 at 12:45 PM, Peter Geoghegan  wrote:
>>> * We need to figure out the tuple lock strength details. I think this
>>> is doable, but it is the greatest challenge to committing ON CONFLICT
>>> UPDATE at this point. Andres feels that we should require no greater
>>> lock strength than an equivalent UPDATE. I suggest we get to this
>>> after committing the IGNORE variant. We probably need to discuss this
>>> some more.
>>
>> I want to see a clear way forward before we commit parts. It doesn't
>> have to be completely iron-clad, but the way forward should be pretty
>> clear. What's the problem you're seeing right now making this work? A
>> couple days on jabber you seemed to see a way doing this?
>
> I was really just identifying it as the biggest problem the patch now
> faces, and I want to face those issues down ASAP. Of course, that's
> good, because as you say it is a small problem in an absolute sense.
> The second most significant open item is rebasing on top of the recent
> RLS changes, IMV.


OK, I pushed out code to do this a while ago. I must admit that I
probably significantly over-estimated the difficulty of doing this.

With Heikki's problematic commit reverted this works fine  (I have not
actually reverted it myself...I'll leave it to Heikki to clean that up
when he gets around to it). The usually jjanes_upsert stress tests
show up no problems.

Curious about what you think about my proposal to go back to putting
the inference specification WHERE clause within the parenthesis, since
this solves several problems, including clarifying to users that the
predicate is part of the inference clause.
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-23 Thread Peter Geoghegan

On Thu, Apr 23, 2015 at 1:08 PM, Heikki Linnakangas  wrote:
> That said, I'd actually like to see a separate heap_super_delete() function
> for that, rather than piggybacking on heap_delete(). It's a quite different
> operation. There'll be some duplication, but seems better than a maze of
> if-else's in heap_delete.

I found a bug that seems to be down to commit
e3144183562d08e347f832f0b29daefe8bac617b on Github:

"""
commit e3144183562d08e347f832f0b29daefe8bac617b
Author: Heikki Linnakangas 
Date:   Thu Apr 23 18:38:11 2015 +0300

Minor cleanup of check_exclusion_or_unique_constraint.

To improve readability.

"""

At least, that's what a "git bisect" session showed.

Basically, at high client counts (128 clients only), after a few
iterations of the B-Tree test, the latest version of jjanes_upsert
would see this happen (error originates fromExecOnConflictUpdate(),
with custom instrumentation added to identify invisible tuple):

2015-04-23 15:10:48 PDT [ txid:  0 ]: LOG:  database system was shut
down at 2015-04-23 15:10:09 PDT
2015-04-23 15:10:48 PDT [ txid:  0 ]: LOG:  database system is ready
to accept connections
2015-04-23 15:12:55 PDT [ txid:  472418 ]: ERROR:  attempted to lock
invisible tuple (140,39)
2015-04-23 15:12:55 PDT [ txid:  472418 ]: STATEMENT:  insert into
upsert_race_test (index, filler, count) values ('3896',
random_characters(), '1') on conflict (index)
 update set count=TARGET.count + EXCLUDED.count, filler =
EXCLUDED.filler
 where TARGET.index = EXCLUDED.index
 returning count

This seemed to only show up when fsync = on on my laptop, whereas in
the past some race conditions that I've found were easier to recreate
with fsync = off.

I attach some notes from my bisect/debugging session, including
pg_xlogdump output (from a psql session - I like to manipulate
pg_xlogdump output using SQL). That's probably not all that
interesting, but I attach it all the same. Hopefully this is something
that Heikki can easily debug, since the commit implicated here only
concerned readability. A simple oversight? If you want, Heikki, I can
try and debug it, but it seems like something we're better off having
you sign off on if possible.

Thanks
-- 
Peter Geoghegan
error for dump:

pg@hamster:~/postgresql/src/backend/executor$ postgres   
(git)-[insert_conflict_ignore]
2015-04-23 15:10:48 PDT [ txid:  0 ]: LOG:  database system was shut down at 
2015-04-23 15:10:09 PDT
2015-04-23 15:10:48 PDT [ txid:  0 ]: LOG:  database system is ready to accept 
connections
2015-04-23 15:12:55 PDT [ txid:  472418 ]: ERROR:  attempted to lock invisible 
tuple (140,39)
2015-04-23 15:12:55 PDT [ txid:  472418 ]: STATEMENT:  insert into 
upsert_race_test (index, filler, count) values ('3896', random_characters(), 
'1') on conflict (index)
  update set count=TARGET.count + EXCLUDED.count, filler = 
EXCLUDED.filler
  where TARGET.index = EXCLUDED.index
  returning count

This was after 3 full iterations of jjanes_upsert, B-Tree only with 128 clients 
only (fsync = on, the default). Settings:

max_connections = 150
shared_buffers = 6GB
work_mem = 5GB
maintenance_work_mem = 5GB
autovacuum = off
max_wal_size=10GB
wal_level = logical
max_replication_slots = 4
#log_min_duration_statement=0
#log_error_verbosity=verbose
log_line_prefix='%t [ %x ]: '
wal_keep_segments=500
max_prepared_transactions=5
shared_preload_libraries='pg_stat_statements'

pg_xlogdump output up to and including the above xact's abort:

(18545) /postgres=# select * from xlogdump_records where r_lsn <= '0/82F7F10' 
order by r_lsn desc limit 50;
rmgr | len_rec | len_tot |   tx   |   r_lsn   | prev_lsn  | 
descr   
   |   relation
-+-+-++---+---++---
 Transaction |   8 |  34 | 472418 | 0/82F7F10 | 0/82F7EE0 | ABORT 
2015-04-23 15:12:55.784621 PDT  
 | [null]
 Heap|   2 |  48 | 472420 | 0/82F7EE0 | 0/82F7EA0 | 
HEAP_CONFIRM off 144 blkref #0: rel 1663/12488/16427 blk 33 
   | upsert_race_test
 Btree   |   2 |  64 | 472420 | 0/82F7EA0 | 0/82F7E48 | INSERT_LEAF 
off 313 blkref #0: rel 1663/12488/16433 blk 22  
   | upsert_race_test_pkey
 Heap|   3 |  83 | 472420 | 0/82F7E48 | 0/82F7E18 | INSERT off 
144 blkref #0: rel 1663/12488/16427 blk 33  
| upsert_race_test
 Transaction |  20 |  46 | 472419 | 0/82F7E18 | 0/82F7DD8 | COMMIT 
2015-04-23 15:12:55.784507 PDT

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-23 Thread Andres Freund

On 2015-04-23 23:08:34 +0300, Heikki Linnakangas wrote:
> The heapam API is not that stable, we've added arguments to those functions
> every once in a while, and I don't recall any complaints.

I heard some, but not too many, that's true. I know that I've written
code that'd be broken/needed even more ifdefs ;)

> That said, I'd actually like to see a separate heap_super_delete() function
> for that, rather than piggybacking on heap_delete(). It's a quite different
> operation. There'll be some duplication, but seems better than a maze of
> if-else's in heap_delete.

+many. I've requested that changes a couple times now.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-23 Thread Heikki Linnakangas


On 04/23/2015 10:53 PM, Andres Freund wrote:

On 2015-04-23 12:45:59 -0700, Peter Geoghegan wrote:

On Thu, Apr 23, 2015 at 12:55 AM, Andres Freund  wrote:

I think you misread my statement: I'm saying we don't need the new
argument anymore, even if we still do the super-deletion in
heap_delete(). Now that the speculative insertion will not be visible
(as in seen on a tuple they could delete) to other backends we can just
do the super deletion if we see that the tuple is a promise one.


I disagree. I think it's appropriate that the one and only "super"
heap_delete() caller make a point of indicating that that's what it's
doing. The cost is only that the two other such callers must say that
they're not doing it. Super deletion is a special thing, that logical
decoding knows all about for example, and I think it's appropriate
that callers ask explicitly.


Unconvinced. Not breaking an API has its worth.


The heapam API is not that stable, we've added arguments to those 
functions every once in a while, and I don't recall any complaints. So 
I'm with Peter that super-deletion should be requested explicitly by the 
caller.


That said, I'd actually like to see a separate heap_super_delete() 
function for that, rather than piggybacking on heap_delete(). It's a 
quite different operation. There'll be some duplication, but seems 
better than a maze of if-else's in heap_delete.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-23 Thread Peter Geoghegan

On Thu, Apr 23, 2015 at 12:53 PM, Andres Freund  wrote:
> Unconvinced. Not breaking an API has its worth.

Yeah, and I acknowledge that - but it's not something that it's
appropriate to encapsulate IMV.

Let's just leave it to Heikki...I'd say he has the deciding vote,
especially as the reviewer that is more in charge of the executor
stuff than anything else.

>> The second most significant open item is rebasing on top of the recent
>> RLS changes, IMV.
>
> Not sure I agree. That part seems pretty mechanical to me.

Hopefully so.

> * The docs aren't suitable for endusers. I think this will take a fair
>   amount of work.

It's hard to explain why something like the collation field in the
inference specification matters to users...because it's only supported
at all due to forwards compatibility concerns. It's hard to document
certain things like that in an accessible way. In general, much of the
effort of the last year was making the feature play nice, and
considering a bunch of usability edge cases that are unlikely to come
up, but still must be documented.

> * We're not yet sure about the syntax yet. In addition to the keyword
>   issue I'm unconvinced about having two WHERE clauses in the same
>   statement. I think that'll end up confusing users a fair bit. Might
>   still be the best way forward.

My previous approach to allowing an index predicate did at least
clearly show that the predicate belonged to the inference
specification, since it appeared between parenthesis. Maybe we should
do that after all? I don't think it much matters if the inference
specification is not identical to CREATE INDEX. I don't want to give
up inference of partial indexes, and I don't want to give up allowing
the UPDATE to have a limited WHERE clause, and we can't just spell
those differently here. So what alternative does that leave?

Anyone else have an opinion?

> * The executor integration still isn't pretty, although Heikki is making
>   strides there

That's just clean-up, though. I'm not worried about the risk of Heikki
not succeeding at that.

> * I don't think anybody seriously has looked at the planner side yet.

Good point. That certainly needs to be looked at (and I include the
rewriter parts in that). It's really not that much code, but (ideally)
a subject matter expert should look into the whole ExcludedExpr dance
in particular.

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-23 Thread Andres Freund

On 2015-04-23 12:45:59 -0700, Peter Geoghegan wrote:
> On Thu, Apr 23, 2015 at 12:55 AM, Andres Freund  wrote:
> > I think you misread my statement: I'm saying we don't need the new
> > argument anymore, even if we still do the super-deletion in
> > heap_delete(). Now that the speculative insertion will not be visible
> > (as in seen on a tuple they could delete) to other backends we can just
> > do the super deletion if we see that the tuple is a promise one.
> 
> I disagree. I think it's appropriate that the one and only "super"
> heap_delete() caller make a point of indicating that that's what it's
> doing. The cost is only that the two other such callers must say that
> they're not doing it. Super deletion is a special thing, that logical
> decoding knows all about for example, and I think it's appropriate
> that callers ask explicitly.

Unconvinced. Not breaking an API has its worth.

> The second most significant open item is rebasing on top of the recent
> RLS changes, IMV.

Not sure I agree. That part seems pretty mechanical to me.

* The docs aren't suitable for endusers. I think this will take a fair
  amount of work.

* We're not yet sure about the syntax yet. In addition to the keyword
  issue I'm unconvinced about having two WHERE clauses in the same
  statement. I think that'll end up confusing users a fair bit. Might
  still be the best way forward.
* The executor integration still isn't pretty, although Heikki is making
  strides there
* I don't think anybody seriously has looked at the planner side yet.

Greetings,

Andres Freund

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-23 Thread Peter Geoghegan

On Thu, Apr 23, 2015 at 12:55 AM, Andres Freund  wrote:
> I think you misread my statement: I'm saying we don't need the new
> argument anymore, even if we still do the super-deletion in
> heap_delete(). Now that the speculative insertion will not be visible
> (as in seen on a tuple they could delete) to other backends we can just
> do the super deletion if we see that the tuple is a promise one.

I disagree. I think it's appropriate that the one and only "super"
heap_delete() caller make a point of indicating that that's what it's
doing. The cost is only that the two other such callers must say that
they're not doing it. Super deletion is a special thing, that logical
decoding knows all about for example, and I think it's appropriate
that callers ask explicitly.

>> > * breinbaas on IRC just mentioned that it'd be nice to have upsert as a
>> >   link in the insert. Given that that's the pervasive term that doesn't
>> >   seem absurd.
>>
>> Not sure what you mean. Where would the link appear?
>
> The index, i.e. it'd just be another indexterm. It seems good to give
> people a link.

Oh, okay. Sure. Done on Github.

>> * We need to figure out the tuple lock strength details. I think this
>> is doable, but it is the greatest challenge to committing ON CONFLICT
>> UPDATE at this point. Andres feels that we should require no greater
>> lock strength than an equivalent UPDATE. I suggest we get to this
>> after committing the IGNORE variant. We probably need to discuss this
>> some more.
>
> I want to see a clear way forward before we commit parts. It doesn't
> have to be completely iron-clad, but the way forward should be pretty
> clear. What's the problem you're seeing right now making this work? A
> couple days on jabber you seemed to see a way doing this?

I was really just identifying it as the biggest problem the patch now
faces, and I want to face those issues down ASAP. Of course, that's
good, because as you say it is a small problem in an absolute sense.
The second most significant open item is rebasing on top of the recent
RLS changes, IMV.

I can see yourself and Heikki continuing to change small stylistic
things, which I expect to continue for a little while. That's fine,
but naturally I want to be aggressive about closing off these open
issues that are not just general clean-up, but have some small level
of risk of becoming more significant blockers.

> The reason I think this has to use the appropriate lock level is that
> it'll otherwise re-introduce the deadlocks that fk locks resolved. Given
> how much pain we endured to get fk locks, that seems like a bad deal.

Right.

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-23 Thread Peter Geoghegan

On Thu, Apr 23, 2015 at 9:02 AM, Heikki Linnakangas  wrote:
> That code in ExecWithCheckOptions is not translatable. See style guide:
> http://www.postgresql.org/docs/devel/static/nls-programmer.html#NLS-GUIDELINES

It's probably going to need to change when I rebase on top of
Dean's/Stephen's work, anyway.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-23 Thread Heikki Linnakangas


On 04/20/2015 07:37 AM, Peter Geoghegan wrote:


if (wco->commandType == CMD_INSERT)
command = "INSERT-applicable ";
else if (wco->commandType == CMD_UPDATE)
command = "UPDATE-applicable ";
else if (wco->commandType == CMD_DELETE)
command = "DELETE-applicable ";
else if (wco->commandType == CMD_SELECT)
command = "SELECT-applicable ";

ereport(ERROR,

(errcode(ERRCODE_WITH_CHECK_OPTION_VIOLATION),
 errmsg("new row violates %sWITH CHECK OPTION %sfor 
\"%s\"",
command ? command : "",
wco->secBarrier ? "(originally security 
barrier) ":"",
wco->viewname),
val_desc ? errdetail("Failing row contains 
%s.", val_desc) :
   0));


That code in ExecWithCheckOptions is not translatable. See style guide: 
http://www.postgresql.org/docs/devel/static/nls-programmer.html#NLS-GUIDELINES


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-23 Thread Bruce Momjian

On Thu, Apr 23, 2015 at 05:02:19PM +0200, Andres Freund wrote:
> On 2015-04-23 15:52:40 +0100, Geoff Winkless wrote:
> > When I set out I was really only hoping to express a preference as a user;
> > on balance I would really rather not have DO IGNORE, if it were possible to
> > avoid, because it's really ugly, but DO UPDATE/DO NOTHING I could just
> > about cope with (and means you don't need to add IGNORE as a keyword,
> > win!), although it still mildly pains me that there's an additional
> > unnecessary word.
> 
> Yea, DO NOTHING is a good alternative. And I do like we're adding one
> keyword less (which is also good for the parser's
> size/performance).

No question that DO IGNORE sounds awkward.  DO NOTHING also matches
CREATE RULE --- another plus.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-23 Thread Andres Freund

On 2015-04-23 15:52:40 +0100, Geoff Winkless wrote:
> When I set out I was really only hoping to express a preference as a user;
> on balance I would really rather not have DO IGNORE, if it were possible to
> avoid, because it's really ugly, but DO UPDATE/DO NOTHING I could just
> about cope with (and means you don't need to add IGNORE as a keyword,
> win!), although it still mildly pains me that there's an additional
> unnecessary word.

Yea, DO NOTHING is a good alternative. And I do like we're adding one
keyword less (which is also good for the parser's
size/performance).

DO {UPDATE ... | NOTHING | LOCK} doesn't sound too bad to me (yes, LOCK
doesn't exist yet, except by writing UPDATE .. WHERE false ;)).

Greetings,

Andres Freund

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-23 Thread Geoff Winkless

On 23 April 2015 at 14:50, Andres Freund  wrote:

> > Maybe I'm misreading it, but isn't index_predicate meant to be inside
> the
> > brackets?
> >
> >
> http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-insert.html
>
> That has changed since.

Oh, helpful. :)

I'll shut up. I have a feeling that my objection is really with the very
idea of unreserved keywords and I have a feeling that there will be rather
more people shouting me down if I go off on that particular rant; meanwhile
it's 20 years since I used yacc in earnest and it's too hazy to be able to
argue about what it is or isn't capable of.

When I set out I was really only hoping to express a preference as a user;
on balance I would really rather not have DO IGNORE, if it were possible to
avoid, because it's really ugly, but DO UPDATE/DO NOTHING I could just
about cope with (and means you don't need to add IGNORE as a keyword,
win!), although it still mildly pains me that there's an additional
unnecessary word.

But I certainly don't object enough to hold up you guys doing the actual
work for my benefit (among others, obviously!).

G

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-23 Thread Andres Freund

On 2015-04-23 14:34:02 +0100, Geoff Winkless wrote:
> > A syntax error. DO is a reserved keyword. Update is just unreserved (and
> > thus can be used as a column label). Ignore is unreserved with the patch
> > and was unreserved before.  We obviously can make both reserved, but of so
> > we have to do it for real, not by hiding the conflicts
> >
> 
> Sorry, I misunderstood: so it's not the fact that it can't be used as a
> column label (because it can) but the fact that it can't then be referenced
> within a WHERE clause without quoting

Meh. You can use any keyword in quotes - because then they're not
keywords anymore.

> INSERT INTO mytable (somevalue) VALUES (999) ON CONFLICT ('myindex') WHERE
> update UPDATE update=1
> 
> but I would have to do
> 
> INSERT INTO mytable (somevalue) VALUES (999) ON CONFLICT ('myindex') WHERE
> "do" UPDATE "do"=1

Yes.

> Maybe I'm misreading it, but isn't index_predicate meant to be inside the
> brackets?
> 
> http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-insert.html

That has changed since. And for good reason: It's pretty to have the
WHERE clause inside the brackets when that's not the case for CREATE
INDEX. But more importantly with multiple columns for stuff like ON
CONFLICT (a, b WHERE foo) it's unclear where the WHERE is actually
attached to. We have that problem with aggregates and it repeatedly
caused confusion.

> As I said, it's my personal belief that anyone using keywords (of any
> kind) unquoted deserves what they get, but I see your point.

Given that IGNORE isn't even a keyword right now (9.5 master) that
policy isn't particularly meaningful anyway.

> I think I object to the fact that you're talking about adding extra
> syntactic sugar to work around a parser-implementation problem, not an
> actual syntax problem (since UPDATE SET is unambiguous, isn't it?).

I fail to see the point of such an objection. We have an LALR parser
(generated by bison). That implies a certain expressiveness. You're
suggesting that we change to a different kind of parser?

I don't think it's necessarily unambiguous. I'm not particularly
motivated to prove it though - the point is that we rely on bison to
prevent ambiguities. That only works if we're listening. And not if
we're silencing warnings about ambiguities over the whole grammar.

Greetings,

Andres Freund

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-23 Thread Geoff Winkless

On 23 April 2015 at 13:51, Andres Freund  wrote:

> On April 23, 2015 3:34:07 PM GMT+03:00, Geoff Winkless <
> pgsqlad...@geoff.dj> wrote:
> 
> >
>  And what's to stop me having "SELECT true
> 
> AS

>do" in the where clause (as per your UPDATE objection)?
>
> A syntax error. DO is a reserved keyword. Update is just unreserved (and
> thus can be used as a column label). Ignore is unreserved with the patch
> and was unreserved before.  We obviously can make both reserved, but of so
> we have to do it for real, not by hiding the conflicts
>

Sorry, I misunderstood: so it's not the fact that it can't be used as a
column label (because it can) but the fact that it can't then be referenced
within a WHERE clause without quoting
. Which is in itself utterly horrible, but that's a separate argument and I
can at least now understand your point.

So I could end up with

INSERT INTO mytable (somevalue) VALUES (999) ON CONFLICT ('myindex') WHERE
update UPDATE update=1

but I would have to do

INSERT INTO mytable (somevalue) VALUES (999) ON CONFLICT ('myindex') WHERE
"do" UPDATE "do"=1

?

> >Apologies for butting in but can I (as a user) express a preference as
> >a
> 
> user against DO?
>
> Sure. If you propose an alternative ;)
>

Maybe I'm misreading it, but isn't index_predicate meant to be inside the
brackets?

http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-insert.html

certainly states that.

It's not one right now. And ignore isn't a keyword at all atm.
>

As I said, it's my personal belief that anyone using keywords (of any
kind) unquoted deserves what they get, but I see your point.

I think I object to the fact that you're talking about adding extra
syntactic sugar to work around a parser-implementation problem, not an
actual syntax problem (since UPDATE SET is unambiguous, isn't it?).

(Please don't top post)
>

Mea culpa. I blame google :)

FWIW "DO IGNORE" just reads disgustingly. If you do finally go down the DO
route, perhaps "DO NOTHING"? :)

Geoff

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-23 Thread Andres Freund

On April 23, 2015 3:34:07 PM GMT+03:00, Geoff Winkless  
wrote:
>Apologies for butting in but can I (as a user) express a preference as
>a
>user against DO?

Sure. If you propose an alternative ;)

>Firstly, it looks horrible. And what's to stop me having "SELECT true
>AS
>do" in the where clause (as per your UPDATE objection)?

A syntax error. DO is a reserved keyword. Update is just unreserved (and thus 
can be used as a column label). Ignore is unreserved with the patch and was 
unreserved before.  We obviously can make both reserved, but of so we have to 
do it for real, not by hiding the conflicts 

>Shouldn't UPDATE be a reserved keyword anyway? AIUI ANSI suggests so.
>
>http://developer.mimer.se/validator/sql-reserved-words.tml

It's not one right now. And ignore isn't a keyword at all atm.

(Please don't top post)

Andres

--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-23 Thread Petr Jelinek


On 23/04/15 14:34, Geoff Winkless wrote:

Apologies for butting in but can I (as a user) express a preference as a
user against DO?

Firstly, it looks horrible. And what's to stop me having "SELECT true AS
do" in the where clause (as per your UPDATE objection)?



DO is already reserved keyword. There is also some precedence for this 
in CREATE RULE. But I agree that it's not ideal syntax.



Shouldn't UPDATE be a reserved keyword anyway? AIUI ANSI suggests so.

http://developer.mimer.se/validator/sql-reserved-words.tml

I had always assumed it was; anyone who produced a query for me that
contained update in an unusual context would get slapped heavily.


Postgres currently has UPDATE as unreserved keyword and more importantly 
IGNORE is not keyword at all so making it a new reserved keyword is not 
nice at all.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-23 Thread Geoff Winkless

Apologies for butting in but can I (as a user) express a preference as a
user against DO?

Firstly, it looks horrible. And what's to stop me having "SELECT true AS
do" in the where clause (as per your UPDATE objection)?

Shouldn't UPDATE be a reserved keyword anyway? AIUI ANSI suggests so.

http://developer.mimer.se/validator/sql-reserved-words.tml

I had always assumed it was; anyone who produced a query for me that
contained update in an unusual context would get slapped heavily.

Geoff

On 23 April 2015 at 11:54, Andres Freund  wrote:

> On 2015-04-22 16:40:07 -0700, Peter Geoghegan wrote:
> > On Wed, Apr 22, 2015 at 3:23 PM, Peter Geoghegan  wrote:
> > > * We need to sort out those issues with the grammar, since that only
> > > really applies to the inference specification. Maybe the WHERE clause
> > > that the inference specification accepts can be broken out. No ON
> > > CONFLICT UPDATE specific issues left there, AFAICT though.
> >
> > I pushed some code that deals with the predicate being within
> parenthesis:
> >
> >
> https://github.com/petergeoghegan/postgres/commit/358854645279523310f998dfc9cb3fe3e165ce1e
>
> And the way you've used nonassoc here doesn't look correct. You're
> hiding legitimate ambiguities in the grammar. UPDATE is a unreserved
> keyword, so for
>
> ... ON CONFLICT '(' index_params ')' where_clause OnConflictUpdateStmt
>
> it won't be able to discern whether an UPDATE in the WHERE clause is
> part of the where_clause or OnConflictUpdate.
>
> This is legal:
> SELECT * FROM (SELECT true as update) f WHERE update;
> i.e. 'update' can be the last part of a WHERE clause.
>
> Essentially what you're trying to do with the nonassic is hiding that
> UPDATE and IGNORE need to be reserved keywords with the syntax you're
> proposing. We can either make them reserved or change the syntax.
>
> One way to avoid making them reserved keywords - which would be somewhat
> painful - is to add a 'DO' before the IGNORE/UPDATE. I.e. something like
>
>   ON CONFLICT opt_conflict_expr DO OnConflictUpdateStmt
> | ON CONFLICT opt_conflict_expr DO IGNORE
>
> Greetings,
>
> Andres Freund
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-23 Thread Andres Freund

On 2015-04-22 16:40:07 -0700, Peter Geoghegan wrote:
> On Wed, Apr 22, 2015 at 3:23 PM, Peter Geoghegan  wrote:
> > * We need to sort out those issues with the grammar, since that only
> > really applies to the inference specification. Maybe the WHERE clause
> > that the inference specification accepts can be broken out. No ON
> > CONFLICT UPDATE specific issues left there, AFAICT though.
>
> I pushed some code that deals with the predicate being within parenthesis:
>
> https://github.com/petergeoghegan/postgres/commit/358854645279523310f998dfc9cb3fe3e165ce1e

And the way you've used nonassoc here doesn't look correct. You're
hiding legitimate ambiguities in the grammar. UPDATE is a unreserved
keyword, so for

... ON CONFLICT '(' index_params ')' where_clause OnConflictUpdateStmt

it won't be able to discern whether an UPDATE in the WHERE clause is
part of the where_clause or OnConflictUpdate.

This is legal:
SELECT * FROM (SELECT true as update) f WHERE update;
i.e. 'update' can be the last part of a WHERE clause.

Essentially what you're trying to do with the nonassic is hiding that
UPDATE and IGNORE need to be reserved keywords with the syntax you're
proposing. We can either make them reserved or change the syntax.

One way to avoid making them reserved keywords - which would be somewhat
painful - is to add a 'DO' before the IGNORE/UPDATE. I.e. something like

  ON CONFLICT opt_conflict_expr DO OnConflictUpdateStmt
| ON CONFLICT opt_conflict_expr DO IGNORE

Greetings,

Andres Freund

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-23 Thread Andres Freund

On 2015-04-22 15:23:16 -0700, Peter Geoghegan wrote:
> On Tue, Apr 21, 2015 at 7:57 AM, Andres Freund  wrote:
> > * Iff we're going to have the XLOG_HEAP_AFFIRM record, I'd rather have
> >   that guide the logical decoding code. Seems slightly cleaner.
> 
> I thought that you didn't think that would always work out? That in
> some possible scenario it could break?

I don't think there's a real problem. You obviously have to do it right
(i.e. only abort insertion if there's a insert/update/delete or commit).

Speaking of commits: Without having rechecked: I think you're missing
cleanup of th pending insertion on commit.

> > * Gram.y needs a bit more discussion:
> >   * Can anybody see a problem with changing the precedence of DISTINCT &
> > ON to nonassoc? Right now I don't see a problem given both are
> > reserved keywords already.
> 
> Why did you push code that solved this in a roundabout way, but
> without actually reverting my original nonassoc changes (which would
> now not result in shift/reduce conflicts?).

I pushed the changes to a separate repo so you could polish them ;)

> What should we do about that?

I'm prety sure we should not do the %nonassoc stuff.

> I don't like that you seem to have regressed
> diagnostic output [1].

Meh^5. This is the same type of errors we get in literally hundreds of
other syntax errors. And in contrast to the old error it'll actually
have a correct error possition.

> Surely it's simpler to use the nonassoc approach?

I think it's much harder to forsee all consequences of that.

> > * SPEC_IGNORE,  /* INSERT of "ON CONFLICT IGNORE" */ looks like
> >   a wrongly copied comment.
> 
> Not sure what you mean here. Please clarify.

I'm not sure either ;)

> > * I wonder if we now couldn't avoid changing heap_delete's API - we can
> >   always super delete if we find a speculative insertion now. It'd be
> >   nice not to break out of core callers if not necessary.
> 
> Maybe, but if there is a useful way to break out only a small subset
> of heap_delete(), I'm not seeing it.

I think you misread my statement: I'm saying we don't need the new
argument anymore, even if we still do the super-deletion in
heap_delete(). Now that the speculative insertion will not be visible
(as in seen on a tuple they could delete) to other backends we can just
do the super deletion if we see that the tuple is a promise one.

> > * breinbaas on IRC just mentioned that it'd be nice to have upsert as a
> >   link in the insert. Given that that's the pervasive term that doesn't
> >   seem absurd.
> 
> Not sure what you mean. Where would the link appear?

The index, i.e. it'd just be another indexterm. It seems good to give
people a link.

> * We need to figure out the tuple lock strength details. I think this
> is doable, but it is the greatest challenge to committing ON CONFLICT
> UPDATE at this point. Andres feels that we should require no greater
> lock strength than an equivalent UPDATE. I suggest we get to this
> after committing the IGNORE variant. We probably need to discuss this
> some more.

I want to see a clear way forward before we commit parts. It doesn't
have to be completely iron-clad, but the way forward should be pretty
clear. What's the problem you're seeing right now making this work? A
couple days on jabber you seemed to see a way doing this?

The reason I think this has to use the appropriate lock level is that
it'll otherwise re-introduce the deadlocks that fk locks resolved. Given
how much pain we endured to get fk locks, that seems like a bad deal.

Greetings,

Andres Freund

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-22 Thread Peter Geoghegan

On Wed, Apr 22, 2015 at 3:23 PM, Peter Geoghegan  wrote:
> * We need to sort out those issues with the grammar, since that only
> really applies to the inference specification. Maybe the WHERE clause
> that the inference specification accepts can be broken out. No ON
> CONFLICT UPDATE specific issues left there, AFAICT though.

I pushed some code that deals with the predicate being within parenthesis:

https://github.com/petergeoghegan/postgres/commit/358854645279523310f998dfc9cb3fe3e165ce1e

(it now follows the attributes/expressions indexes, in the style of
CREATE INDEX).

We still need to reconcile these changes to the grammar with your own,
Andres. I'm going to wait to hear back on what you think about that.
Note that this removal:

-%nonassoc DISTINCT
-%nonassoc ON

was incidental to the commit (this is the code you could have removed
when you modified the grammar, adding a new production, but didn't).
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-22 Thread Peter Geoghegan

On Tue, Apr 21, 2015 at 7:57 AM, Andres Freund  wrote:
> I'm not 100% sure Heikki and I am on exactly the same page here :P
>
> I'm looking at git diff $(git merge-base upstream/master HEAD).. where
> HEAD is e1a5822d164db0.

Merged your stuff into my Github branch. Heikki is pushing changes
there directly now.

> * The logical stuff looks much saner.

Cool.

> * Please add tests for the logical decoding stuff. Probably both a plain
>   regression and and isolationtester test in
>   contrib/test_decoding. Including one that does spooling to disk.

Working on it...hope to push that to Github soon.

> * I don't like REORDER_BUFFER_CHANGE_INTERNAL_INSERT/DELETE as names. Why not
>   _SPECINSERT and _SPECDELETE or such?

Changed that on Github.

> * Iff we're going to have the XLOG_HEAP_AFFIRM record, I'd rather have
>   that guide the logical decoding code. Seems slightly cleaner.

I thought that you didn't think that would always work out? That in
some possible scenario it could break?

> * Still not a fan of the name 'arbiter' for the OC indexes.

What do you prefer? Seems to describe what we're talking about
reasonably well to me.

> * Gram.y needs a bit more discussion:
>   * Can anybody see a problem with changing the precedence of DISTINCT &
> ON to nonassoc? Right now I don't see a problem given both are
> reserved keywords already.

Why did you push code that solved this in a roundabout way, but
without actually reverting my original nonassoc changes (which would
now not result in shift/reduce conflicts?). What should we do about
that? Seems your unsure (otherwise you'd have reverted my thing). I
don't like that you seem to have regressed diagnostic output [1].
Surely it's simpler to use the nonassoc approach? I think this works
by giving the relevant keywords an explicit priority lower than '(',
so that a rule with ON CONFLICT '(' will shift rather than reducing a
conflicting rule (CONFLICT is an unreserved keyword). I wrote the code
so long ago that I can't really remember why I thought it was the
right thing, though.

>   * UpdateInsertStmt is a horrible name. OnConflictUpdateStmt maybe?

Done on Github.

>   * '(' index_params where_clause ')' is imo rather strange. The where
> clause is inside the parens? That's quite different from the
> original index clause.

I don't know. Maybe I was lazy about fixing shift/reduce conflicts.   :-)

I'll look at this some more.

> * SPEC_IGNORE,  /* INSERT of "ON CONFLICT IGNORE" */ looks like
>   a wrongly copied comment.

Not sure what you mean here. Please clarify.

> * The indentation in RoerderBufferCommit is clearly getting out of hand,
>   I've queued up a commit cleaning this up in my repo, feel free to merge.

Done on Github.

> * I don't think we use the term 'ordinary table' in error messages so
>   far.

Fixed on Github.

> * I still think it's unacceptable to redefine
>   XLOG_HEAP_LAST_MULTI_INSERT as XLOG_HEAP_SPECULATIVE_TUPLE like you
>   did. I'll try to find something better.

I did what you suggested in your follow-up e-mail (changes are on Github [2]).

> * I wonder if we now couldn't avoid changing heap_delete's API - we can
>   always super delete if we find a speculative insertion now. It'd be
>   nice not to break out of core callers if not necessary.

Maybe, but if there is a useful way to break out only a small subset
of heap_delete(), I'm not seeing it. Most of the callers that need a
new NULL argument are heap_insert() callers, actually. There are now 3
heap_delete() callers, up from 2.

> * breinbaas on IRC just mentioned that it'd be nice to have upsert as a
>   link in the insert. Given that that's the pervasive term that doesn't
>   seem absurd.

Not sure what you mean. Where would the link appear? It is kind of
hard to categorize that text so that we're strictly either talking
about INSERT or UPSERT. Might be possible, though.

> I think this is getting closer to a commit. Let's get this done.

Great!

The blockers to committing the IGNORE patch I see are:

* We need to tweak some of the logical decoding stuff a bit more, I
feel. Firm up on the details of whether we treat a confirmation record
as a logical decoding change that is involved in the new dance during
transaction reassembly.

* We need to sort out those issues with the grammar, since that only
really applies to the inference specification. Maybe the WHERE clause
that the inference specification accepts can be broken out. No ON
CONFLICT UPDATE specific issues left there, AFAICT though.

That really seems totally doable in just a couple of days.

The blockers for committing the ON CONFLICT UPDATE patch are larger, but doable:

* We need to figure out the tuple lock strength details. I think this
is doable, but it is the greatest challenge to committing ON CONFLICT
UPDATE at this point. Andres feels that we should require no greater
lock strength than an equivalent UPDATE. I suggest we get to this
after committing the IGNORE variant. We probably need to

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-21 Thread Andres Freund

On 2015-04-21 16:57:45 +0200, Andres Freund wrote:
> * I still think it's unacceptable to redefine
>   XLOG_HEAP_LAST_MULTI_INSERT as XLOG_HEAP_SPECULATIVE_TUPLE like you
>   did. I'll try to find something better.

I think we should "just" split this into different flag values for
insert/update/delete.

I.e. something like

/* flags for heap insert and multi insert */
#define XLH_INSERT_ALL_VISIBLE_CLEARED
#define XLH_INSERT_LAST_MULTI_INSERT
#define XLH_INSERT_IS_SPECULATIVE
#define XLH_INSERT_CONTAINS_NEW_TUPLE

/* flags for update */
#define XLH_UPDATE_OLD_ALL_VISIBLE_CLEARED
#define XLH_UPDATE_NEW_ALL_VISIBLE_CLEARED
#define XLH_UPDATE_CONTAINS_OLD_TUPLE
#define XLH_UPDATE_CONTAINS_OLD_KEY
#define XLH_UPDATE_CONTAINS_NEW_TUPLE
#define XLH_UPDATE_PREFIX_FROM_OLD
#define XLH_UPDATE_SUFFIX_FROM_OLD

/* flags for delete */
#define XLH_DELETE_ALL_VISIBLE_CLEARED
#define XLH_DELETE_CONTAINS_OLD_TUPLE
#define XLH_DELETE_CONTAINS_OLD_KEY

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-21 Thread Andres Freund

On 2015-04-19 21:37:51 -0700, Peter Geoghegan wrote:
> Attached patch, V3.4, implements what I believe you and Heikki have in
> mind here.

I'm not 100% sure Heikki and I am on exactly the same page here :P

I'm looking at git diff $(git merge-base upstream/master HEAD).. where
HEAD is e1a5822d164db0.

* The logical stuff looks much saner.

* Please add tests for the logical decoding stuff. Probably both a plain
  regression and and isolationtester test in
  contrib/test_decoding. Including one that does spooling to disk.

* I don't like REORDER_BUFFER_CHANGE_INTERNAL_INSERT/DELETE as names. Why not
  _SPECINSERT and _SPECDELETE or such?

* Iff we're going to have the XLOG_HEAP_AFFIRM record, I'd rather have
  that guide the logical decoding code. Seems slightly cleaner.

* Still not a fan of the name 'arbiter' for the OC indexes.

* Gram.y needs a bit more discussion:
  * Can anybody see a problem with changing the precedence of DISTINCT &
ON to nonassoc? Right now I don't see a problem given both are
reserved keywords already.
The reason the conflict exists AFAICS is because something like
INSERT INTO foo SELECT DISTINCT ON CONFLICT IGNORE;
is allowed by the grammar. The need for the nonassoc could be
avoided by requiring DISTINCT to be followed by a column. We
currently *do* enforce that, just not in the parser (c.f.
transformDistinctClause). That requires one more production in
simple_select, and a nonoptional distinct clause.
I've queued up a commit cleaning this up in my repo, feel free to
merge and polish.
  * UpdateInsertStmt is a horrible name. OnConflictUpdateStmt maybe?
  * '(' index_params where_clause ')' is imo rather strange. The where
clause is inside the parens? That's quite different from the
original index clause.
* SPEC_IGNORE,  /* INSERT of "ON CONFLICT IGNORE" */ looks like
  a wrongly copied comment.
* The indentation in RoerderBufferCommit is clearly getting out of hand,
  I've queued up a commit cleaning this up in my repo, feel free to merge.
* I don't think we use the term 'ordinary table' in error messages so
  far.
* I still think it's unacceptable to redefine
  XLOG_HEAP_LAST_MULTI_INSERT as XLOG_HEAP_SPECULATIVE_TUPLE like you
  did. I'll try to find something better.
* I wonder if we now couldn't avoid changing heap_delete's API - we can
  always super delete if we find a speculative insertion now. It'd be
  nice not to break out of core callers if not necessary.
* breinbaas on IRC just mentioned that it'd be nice to have upsert as a
  link in the insert. Given that that's the pervasive term that doesn't
  seem absurd.

I think this is getting closer to a commit. Let's get this done.

Greetings,

Andres Freund

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-20 Thread Peter Geoghegan

On Sun, Apr 19, 2015 at 9:37 PM, Peter Geoghegan  wrote:
> Attached patch, V3.4, implements what I believe you and Heikki have in
> mind here.

Any plans to look at this, Svenne? You are signed up as a reviewer on
the commitfest app. A review of the documentation, and interactions
with other features like inheritance, updatable views and postgres_fdw
would be useful at this point. Obviously I've gone to a lot of effort
to document how things fit together at a high level on the UPSERT wiki
page, but these aspects have not been commented on all that much.

Thanks
-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-17 Thread Dean Rasheed

On 17 April 2015 at 12:54, Stephen Frost  wrote:
> Dean, I've been working through your patches over the past couple of
> days (apologies for the lack of updates, just been busy) and hope to
> push them very shortly (ie: by the end of the weekend).
>

Cool. Thanks.

> One thing that I was hoping to discuss a bit is that I've gone ahead and
> added another set of hooks, so we can have both "permissive" and
> "restrictive" policies be provided from the hook.  It's a bit late to
> make the grammar and other changes which would be required to add a
> "restrictive" policy option to the built-in RLS, but adding the hooks is
> relatively low-impact.
>

Sounds interesting. Perhaps that discussion should be moved to a new thread.

> I'm also going to be including a test_rls_hooks module into
> src/test/modules which will test those hooks and provide an example of
> how to use them.
>

Good idea. I had been thinking that it would be good to test RLS hooks.

> As for the discussion- there was some concern raised about extensions
> being able to "override" built-in policies by using the hooks a certain
> way.  I don't entirely follow the logic behind that concern as an
> extension has the ability to read the files on disk directly from C
> code, should it be written to do so, and so not providing a hook to add
> "permissive" policies is denying useful functionality for very question
> gain, in my view at least.
>
> Thoughts?
>

Yeah, perhaps that concern is somewhat overblown and shouldn't stand
in the way of allowing a hook to add permissive policies.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-17 Thread Peter Geoghegan

On Fri, Apr 17, 2015 at 1:30 AM, Andres Freund  wrote:
>> However, like my second approach, there would be a "speculative
>> affirmation" WAL record.
>
> I think there should be one, but it's not required for the approach. The
> 'pending speculative insertion' can just be completed whenever there's a
> insert/update/delete that's not a super deletion.
>
> I.e. in the REORDER_BUFFER_CHANGE_INSERT/... case ReorderBufferCommit()
> just add something like:
>
> if (pending_insertion != NULL)
> {
> if (new_record_is_super_deletion)
> ReorderBufferReturnTupleBuf(pending_insertion);
> else
> rb->apply_change(rb, txn, relation, pending_insertion);
> }
> ...
>
> You'll have to be careful to store the pending_insertion *after*
> ReorderBufferToastReplace(), but that should not be a problem.

Okay. It seems like what you're saying is that I should be prepared to
have to deal with a REORDER_BUFFER_CHANGE_INTERNAL_SNAPSHOT change (or
multiple such changes) from within ReorderBufferCommit() between a
speculative insertion and a super deletion, but that I can safely
assume that once some other insert/update/delete is encountered (or
once all changes have been drained from the reorder buffer), I can
then apply the speculative insertion as a regular insertion.

Is that what you're saying, in a nutshell? IOW, when you said this:

"""
I earlier thought that'd not be ok because there could be a new catalog
snapshot inbetween, but I was mistaken: The locking on the source
transaction prevents problems.
"""

What you meant was that you'd decided that this pattern is not broken,
*provided* that the implementation simply account for the fact that a
REORDER_BUFFER_CHANGE_INTERNAL_SNAPSHOT change could come before some
other (non-REORDER_BUFFER_CHANGE_INTERNAL_DELETE, A.K.A.
non-superdelete) change came? And that we might need to be more
"patient" about deciding that a speculative insertion succeeded (more
"patient" than considering only one single non-superdelete change,
that can be anything else)?

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-17 Thread Peter Geoghegan

On Fri, Apr 17, 2015 at 11:19 AM, Heikki Linnakangas  wrote:
>> because he wanted to preserve those by doing the MagicOffsetNumber
>> thing.
>
>
> Yes, that's the way to go.
>
> Glad we cleared that up :-).

Got it, thanks!


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-17 Thread Heikki Linnakangas


On 04/17/2015 09:02 PM, Peter Geoghegan wrote:

I'm slightly surprised that Heikki now wants
to use an infomask2 bit (if that is what you mean),


No, I don't.


because he wanted to preserve those by doing the MagicOffsetNumber
thing.


Yes, that's the way to go.

Glad we cleared that up :-).

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-17 Thread Peter Geoghegan

On Fri, Apr 17, 2015 at 1:38 AM, Andres Freund  wrote:
>> Do you just mean that you think that speculative insertions should be
>> explicitly affirmed by a second record (making it not a speculative
>> tuple, but rather, a fully fledged tuple)? IOW, an MVCC snapshot has
>> no chance of seeing a tuple until it was affirmed by a second in-place
>> modification, regardless of tuple xmin xact commit status?
>
> Yes. I think

Good.

> a) HEAP_SPECULATIVE should never be visible outside in a
> committed transaction. That allows us to redefine what exactly the bit
> means and such after a simple restart. On IM Heiki said he wants to
> replace this by a bit in the item pointer, but I don't think that
> changes things substantially.

I guess you envisage that HEAP_SPECULATIVE is an infomask2 bit? I
haven't been using one of those for a couple of revisions now,
preferring to use the offset MagicOffsetNumber to indicate a
speculative t_ctid value. I'm slightly surprised that Heikki now wants
to use an infomask2 bit (if that is what you mean), because he wanted
to preserve those by doing the MagicOffsetNumber thing. But I guess
we'd still be preserving the bit under this scheme (since it's always
okay to do something different with the bit after a restart).

Why is it useful to consume an infomask2 bit after all? Why did Heikki
change his mind - due to wanting representational redundancy? Or do I
have it all wrong?

> b) t_ctid should not contain a speculative token in committed
> (i.e. visible to other sessions using mvcc semantics) tuple. Right now
> (i.e. master) t_ctid will point to itself for non-updated tuples. I
> don't think it's good to have something in there that's not an actual
> ctid in there. The number of places that look into t_ctid for
> in-progress insertions of other sessions is smaller than the ones that
> look at tuples in general.

Right. So if a tuple is committed, it should not have set
HEAP_SPECULATIVE/have a t_ctid offset of MagicOffsetNumber. But a
non-ctid t_ctid (a speculative token) remains possible for
non-committed tuples visible to some types of snapshots (in
particular, dirty snapshots).

> c) Cleaning the flag/ctid after a completed speculative insertion makes
> it far less likely that we end up waiting on a other backend when we
> wouldn't have to. If a session inserts a large number of tuples
> speculatively (surely *not* a unlikely thing in the long run) it gets
> rather likely that tokens are reused. Which means if another backend
> touches these in-progress records it's quite possible that the currently
> acquired token is the same as the one on a tuple that has actually
> finished inserting.

It's more important than that, actually. If the inserter fails to
clear its tuple's speculative token, and then releases their token
lmgr lock, it will cause livelock with many upserting sessions.
Coordinating which other session gets to lazily clear the speculative
token (cleaning up after the original inserter) seemed quite hazardous
when I looked into it. Maybe you could fix it by interleaving buffer
locks and lmgr locks, but we can't do that.

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-17 Thread Peter Geoghegan

On Fri, Apr 17, 2015 at 4:54 AM, Stephen Frost  wrote:
> Dean, I've been working through your patches over the past couple of
> days (apologies for the lack of updates, just been busy) and hope to
> push them very shortly (ie: by the end of the weekend).
>
> One thing that I was hoping to discuss a bit is that I've gone ahead and
> added another set of hooks, so we can have both "permissive" and
> "restrictive" policies be provided from the hook.  It's a bit late to
> make the grammar and other changes which would be required to add a
> "restrictive" policy option to the built-in RLS, but adding the hooks is
> relatively low-impact.

I came up with something that is along the lines of what we discussed.
I'll wait for you to push Dean's code, and rebase on top of that
before submitting what I came up with. Maybe this can be rolled into
my next revision if I work through Andres' most recent feedback
without much delay.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-17 Thread Stephen Frost

Dean,

* Dean Rasheed (dean.a.rash...@gmail.com) wrote:
> In all of this, I think we should try to keep things as simple as
> possible, to give the end user a chance to understand it --- although
> I'm not sure I've achieved that with my explanation :-)

Thanks a lot for this.  It goes along with my thinking also and matches,
I believe, what I had explained to Peter on our call.

Peter, please let me know if you agree.

Dean, I've been working through your patches over the past couple of
days (apologies for the lack of updates, just been busy) and hope to
push them very shortly (ie: by the end of the weekend).

One thing that I was hoping to discuss a bit is that I've gone ahead and
added another set of hooks, so we can have both "permissive" and
"restrictive" policies be provided from the hook.  It's a bit late to
make the grammar and other changes which would be required to add a
"restrictive" policy option to the built-in RLS, but adding the hooks is
relatively low-impact.

I'm also going to be including a test_rls_hooks module into
src/test/modules which will test those hooks and provide an example of
how to use them.

As for the discussion- there was some concern raised about extensions
being able to "override" built-in policies by using the hooks a certain
way.  I don't entirely follow the logic behind that concern as an
extension has the ability to read the files on disk directly from C
code, should it be written to do so, and so not providing a hook to add
"permissive" policies is denying useful functionality for very question
gain, in my view at least.

Thoughts?

Thanks!

Stephen

signature.asc
Description: Digital signature

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-17 Thread Andres Freund

On 2015-04-16 09:43:54 -0700, Peter Geoghegan wrote:
> On Thu, Apr 16, 2015 at 2:23 AM, Andres Freund  wrote:
> > I'm, completely independent of logical decoding, of the *VERY* strong
> > opinion that 'speculative insertions' should never be visible when
> > looking with normal snapshots. For one it allows to simplify
> > considerations around wraparound (which has proven to be a very good
> > idea, c.f. multixacts + vacuum causing data corruption years after it
> > was thought to be harmless). For another it allows to reclaim/redefine
> > the bit after a database restart/upgrade. Given how complex this is and
> > how scarce flags are that seems like a really good property.
> >
> > And avoiding those flags to be visible to the outside requires a WAL
> > record, otherwise it won't be correct on the standby.
> 
> I'm a bit distracted here, and not sure exactly what you mean. What's
> a normal snapshot?

Normal visibility semantics, i.e. SnapshotMVCC, not SnapshotDirty.

> Do you just mean that you think that speculative insertions should be
> explicitly affirmed by a second record (making it not a speculative
> tuple, but rather, a fully fledged tuple)? IOW, an MVCC snapshot has
> no chance of seeing a tuple until it was affirmed by a second in-place
> modification, regardless of tuple xmin xact commit status?

Yes. I think

a) HEAP_SPECULATIVE should never be visible outside in a
committed transaction. That allows us to redefine what exactly the bit
means and such after a simple restart. On IM Heiki said he wants to
replace this by a bit in the item pointer, but I don't think that
changes things substantially.

b) t_ctid should not contain a speculative token in committed
(i.e. visible to other sessions using mvcc semantics) tuple. Right now
(i.e. master) t_ctid will point to itself for non-updated tuples. I
don't think it's good to have something in there that's not an actual
ctid in there. The number of places that look into t_ctid for
in-progress insertions of other sessions is smaller than the ones that
look at tuples in general.

c) Cleaning the flag/ctid after a completed speculative insertion makes
it far less likely that we end up waiting on a other backend when we
wouldn't have to. If a session inserts a large number of tuples
speculatively (surely *not* a unlikely thing in the long run) it gets
rather likely that tokens are reused. Which means if another backend
touches these in-progress records it's quite possible that the currently
acquired token is the same as the one on a tuple that has actually
finished inserting.

Greetings,

Andres Freund

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-17 Thread Andres Freund

On 2015-04-16 10:25:29 -0700, Peter Geoghegan wrote:
> On Thu, Apr 16, 2015 at 2:18 AM, Andres Freund  wrote:
> > If we go that way that's the way I think it should be done: Whenever we
> > encounter a speculative record we 'unlink' it from the changes that will
> > be reused for spooling from disk and do nothing further. Then we just
> > continue reading through the records.
>
> You mean we continue iterating through *changes* from ReorderBufferCommit()?

changes, records, not much of a difference here.

> > If the next thing we encounter is
> > a super-deletion we throw away that record. If it's another type of
> > change (or even bettter a 'speculative insertion succeeded' record)
> > insert it.
>
> By "insert it", I gather you mean report to the the logical decoding
> plugin as an insert change.

Yes.

> We'd have coordination between records originally decoded into
> changes, maybe "intercepting" them during xact reassembly, like my
> first approach. We'd mostly do the same thing as the first approach,
> actually. The major difference would be that there'd be explicit
> "speculative affirmation" WAL records. But we wouldn't rely on those
> affirmation records within ReorderBufferCommit() - we'd rely on the
> *absence* of a super deletion WAL record (to report an insert change
> to the decoding plugin). To emphasize, like my first approach, it
> would be based on an *absence* of a super deletion WAL record.

> However, like my second approach, there would be a "speculative
> affirmation" WAL record.

I think there should be one, but it's not required for the approach. The
'pending speculative insertion' can just be completed whenever there's a
insert/update/delete that's not a super deletion.

I.e. in the REORDER_BUFFER_CHANGE_INSERT/... case ReorderBufferCommit()
just add something like:

if (pending_insertion != NULL)
{
if (new_record_is_super_deletion)
ReorderBufferReturnTupleBuf(pending_insertion);
else
rb->apply_change(rb, txn, relation, pending_insertion);
}
...

You'll have to be careful to store the pending_insertion *after*
ReorderBufferToastReplace(), but that should not be a problem.


> Right now, I'm tired, so bear with me. What I think I'm not quite
> getting here is how the new type of "affirmation" record affects
> visibility (or anything else, actually). Clearly dirty snapshots need
> to see the record to set a speculative insertion token for their
> caller to wait on (and HeapTupleSatisfiesVacuum() cannot see the
> speculatively inserted tuple as reclaimable, of course). They need
> this *before* the speculative insertion is affirmed, of course.
>
> Maybe you mean: If the speculative insertion xact is in progress, then
> the tuple is visible to several types of snapshots (dirty, vacuum,
> self, any). If it is not, then tuples are not visible because they are
> only speculative (and not *confirmed*). If they were confirmed, and
> the xact was committed, then those tuples are logically and physically
> indistinguishable from tuples that were inserted in the ordinary
> manner.
>
> Is that it? Basically, the affirmation records have nothing much to do
> with logical decoding in particular. But you still need to super
> delete, so that several types of snapshots ((dirty, vacuum, self, any)
> *stop* seeing the tuple as visible independent of the inserting xact
> being in progress.

I have no idea what this has to do with the email you responded to?
Maybe it's more meant as a response to my separate email that I want the
HEAP_SPECULATIVE to be cleared?

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-17 Thread Dean Rasheed

On 9 April 2015 at 22:18, Peter Geoghegan  wrote:
> The big idea (the fine details of which Stephen appeared to be in
> tentative agreement with [1]) is that an UPSERT is a hybrid between an
> INSERT and an UPDATE, and not simply an INSERT and separate UPDATE
> tied together. So at the very least the exact behavior of such a
> hybrid cannot be assumed to be any particular way just from
> generalizing from known behaviors for INSERT and UPDATE (which is a
> usability issue, since the fine details of RLS are already complex
> enough without UPSERT).
>

I think that you're over-complicating this. From a usability point of
view, I think that the best approach is to keep this as simple as
possible and make the behaviour consistent with an INSERT and an
UPDATE tied together, as is suggested by the new syntax. The key point
is that, if you are using the RLS INSERT and UPDATE policies for this
new command, the rule should be that the user has permission to
insert/update a new/existing row if and only if they would have had
permission to do so using a stand-alone INSERT/UPDATE command.

On the other hand, if you believe that the behaviour should be
something other than that, then I think that it would need a new
dedicated kind of RLS policy for this command because, as I will
attempt to explain below, merging together the quals from INSERT and
UPDATE policies leads to logical problems.

> The INSERT policies are only enforced when a tuple is inserted
> because, when the alternative path isn't taken then it's really just
> an INSERT.
>

Agreed.

> For the UPDATE path, where the stickiness/hybridness begins...
> 
> I'd appreciate it if you explicitly outlined what policies you feel
> should be enforce at each of the 3 junctures within an UPSERT (post
> INSERT, pre-UPDATE, post-UPDATE). I would also like you to be very
> explicit about whether or not RLS WITH CHECK policies and USING quals
> (presumably enforced as RLS WITH CHECK policies) from both INSERT and
> UPDATE policies should be enforced at each point. In particular,
> should I ever not treat RLS WCO and USING quals equivalently? (recall
> that we don't want to elide an UPDATE silently, which makes much less
> sense for UPSERT...I had assumed that whatever else, we'd always treat
> WCO and USING quals from UPDATE (and ALL) policies equivalently, but
> perhaps not).

OK, I'll try to explicitly outline how I think this ought to work:

For INSERTs and UPDATEs, there are 3 kinds of RLS quals that come into play:

1). insertCheckQuals - the logical OR of the quals from all INSERT
WITH CHECK policy clauses. These give the user permission to insert
into the table, provided that the new rows satisfy at least one of
these quals.

2). updateUsingQuals - the logical OR of the quals from all UPDATE
USING policy clauses. These give the user permission to update
existing rows in the table, where the existing rows satisfy at least
one of these quals.

3). updateCheckQuals - the logical OR of the quals from all UPDATE
WITH CHECK policy clauses. These give the user permission to update
the table, provided that the new rows satisfy at least one of these
quals.

In general these may all differ from one another.

If we go forward with the idea that RLS quals should be checked before
attempting to insert/update any data, as we do for regular permission
checks, then stand-alone INSERT and UPDATE commands would work
conceptually as follows:

INSERT:
  1. Check NEW against insertCheckQuals (error if the result is not true)
  2. Do the actual insert of NEW

UPDATE:
  1. Check OLD against updateUsingQuals (skip rows that don't match)
  2. Check NEW against updateCheckQuals (error if the result is not true)
  3. Do the actual update (OLD -> NEW)

Of course that's an over-simplification. The updateUsingQuals are
actually merged with the user-supplied WHERE clause in a way dependent
on the presence of leaky functions, but conceptually it matches the
above description.

I think that there is universal agreement that an INSERT .. ON
CONFLICT UPDATE that follows the insert path ought to match the pure
INSERT case, and only check the insertCheckQuals. That just leaves the
question of what to do on the update path, where things get more
complicated because there are 3 tuples involved in the process:

1). t1 - the tuple originally proposed for insertion, but which could
not be inserted due to a conflict with an existing row in the table
(aka EXCLUDED).

2). t2 - the existing row in the table that prevented t1 from being
inserted (aka TARGET).

3). t3 - the final new row resulting from the update path. In general,
we allow this to be quite different from t1 and t2.

If we think of INSERT .. ON CONFLICT UPDATE as an INSERT and an UPDATE
tied together, then logically the following would happen if the update
path were taken:

INSERT .. ON CONFLICT UPDATE (update path):
  1. Check t1 against insertCheckQuals (error if not true)
  2. Detect that t1 conflicts with t2
  3. Test user-supplied auxilia

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-16 Thread Peter Geoghegan

On Thu, Apr 16, 2015 at 2:18 AM, Andres Freund  wrote:
> On 2015-04-15 18:53:15 +0300, Heikki Linnakangas wrote:
>> Hmm, ok, I've read the "INSERT ... ON CONFLICT UPDATE and logical decoding"
>> thread now, and I have to say that IMHO it's a lot more sane to handle this
>> in ReorderBufferCommit() like Peter first did, than to make the main
>> insertion path more complex like this.
>
> I don't like Peter's way much. For one it's just broken. For another
> it's quite annoying to trigger disk reads to figure out what actual type
> of record something is.
>
> If we go that way that's the way I think it should be done: Whenever we
> encounter a speculative record we 'unlink' it from the changes that will
> be reused for spooling from disk and do nothing further. Then we just
> continue reading through the records.

You mean we continue iterating through *changes* from ReorderBufferCommit()?

> If the next thing we encounter is
> a super-deletion we throw away that record. If it's another type of
> change (or even bettter a 'speculative insertion succeeded' record)
> insert it.

By "insert it", I gather you mean report to the the logical decoding
plugin as an insert change.

> That'll still require some uglyness, but it should not be too
> bad.

So, to be clear, what you have in mind is sort of a hybrid between my
first and second approaches (mostly my first approach).

We'd have coordination between records originally decoded into
changes, maybe "intercepting" them during xact reassembly, like my
first approach. We'd mostly do the same thing as the first approach,
actually. The major difference would be that there'd be explicit
"speculative affirmation" WAL records. But we wouldn't rely on those
affirmation records within ReorderBufferCommit() - we'd rely on the
*absence* of a super deletion WAL record (to report an insert change
to the decoding plugin). To emphasize, like my first approach, it
would be based on an *absence* of a super deletion WAL record.

However, like my second approach, there would be a "speculative
affirmation" WAL record. The new speculative affirmation WAL record
would however be quite different to what my second approach to logical
decoding (the in-place update record thing that was in V3.3) actually
did. In particular, it would be far more simple, and the tuple would
be built from the original insertion record within logical decoding.

Right now, I'm tired, so bear with me. What I think I'm not quite
getting here is how the new type of "affirmation" record affects
visibility (or anything else, actually). Clearly dirty snapshots need
to see the record to set a speculative insertion token for their
caller to wait on (and HeapTupleSatisfiesVacuum() cannot see the
speculatively inserted tuple as reclaimable, of course). They need
this *before* the speculative insertion is affirmed, of course.

Maybe you mean: If the speculative insertion xact is in progress, then
the tuple is visible to several types of snapshots (dirty, vacuum,
self, any). If it is not, then tuples are not visible because they are
only speculative (and not *confirmed*). If they were confirmed, and
the xact was committed, then those tuples are logically and physically
indistinguishable from tuples that were inserted in the ordinary
manner.

Is that it? Basically, the affirmation records have nothing much to do
with logical decoding in particular. But you still need to super
delete, so that several types of snapshots ((dirty, vacuum, self, any)
*stop* seeing the tuple as visible independent of the inserting xact
being in progress.

> I earlier thought that'd not be ok because there could be a new catalog
> snapshot inbetween, but I was mistaken: The locking on the source
> transaction prevents problems.

I thought this was the case.

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-16 Thread Peter Geoghegan

On Thu, Apr 16, 2015 at 2:23 AM, Andres Freund  wrote:
> I'm, completely independent of logical decoding, of the *VERY* strong
> opinion that 'speculative insertions' should never be visible when
> looking with normal snapshots. For one it allows to simplify
> considerations around wraparound (which has proven to be a very good
> idea, c.f. multixacts + vacuum causing data corruption years after it
> was thought to be harmless). For another it allows to reclaim/redefine
> the bit after a database restart/upgrade. Given how complex this is and
> how scarce flags are that seems like a really good property.
>
> And avoiding those flags to be visible to the outside requires a WAL
> record, otherwise it won't be correct on the standby.

I'm a bit distracted here, and not sure exactly what you mean. What's
a normal snapshot?

Do you just mean that you think that speculative insertions should be
explicitly affirmed by a second record (making it not a speculative
tuple, but rather, a fully fledged tuple)? IOW, an MVCC snapshot has
no chance of seeing a tuple until it was affirmed by a second in-place
modification, regardless of tuple xmin xact commit status?

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-16 Thread Heikki Linnakangas


On 04/16/2015 12:18 PM, Andres Freund wrote:

On 2015-04-15 18:53:15 +0300, Heikki Linnakangas wrote:

Hmm, ok, I've read the "INSERT ... ON CONFLICT UPDATE and logical decoding"
thread now, and I have to say that IMHO it's a lot more sane to handle this
in ReorderBufferCommit() like Peter first did, than to make the main
insertion path more complex like this.


I don't like Peter's way much. For one it's just broken. For another
it's quite annoying to trigger disk reads to figure out what actual type
of record something is.

If we go that way that's the way I think it should be done: Whenever we
encounter a speculative record we 'unlink' it from the changes that will
be reused for spooling from disk and do nothing further. Then we just
continue reading through the records. If the next thing we encounter is
a super-deletion we throw away that record. If it's another type of
change (or even bettter a 'speculative insertion succeeded' record)
insert it. That'll still require some uglyness, but it should not be too
bad.


Sounds good to me.

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-16 Thread Andres Freund

On 2015-04-15 17:58:54 +0300, Heikki Linnakangas wrote:
> When the speculative insertion is finished, write a new kind of a WAL record
> for that. The record only needs to contain the ctid of the tuple. Replaying
> that record will clear the flag on the heap tuple that said that it was a
> speculative insertion.
>
> In logical decoding, decode speculative insertions like any other insertion.
> To decode a super-deletion record, scan the reorder buffer for the
> transaction to find the corresponding speculative insertion record for the
> tuple, and remove it.
>
> BTW, that'd work just as well without the new WAL record to finish a
> speculative insertion. Am I missing something?

I'm, completely independent of logical decoding, of the *VERY* strong
opinion that 'speculative insertions' should never be visible when
looking with normal snapshots. For one it allows to simplify
considerations around wraparound (which has proven to be a very good
idea, c.f. multixacts + vacuum causing data corruption years after it
was thought to be harmless). For another it allows to reclaim/redefine
the bit after a database restart/upgrade. Given how complex this is and
how scarce flags are that seems like a really good property.

And avoiding those flags to be visible to the outside requires a WAL
record, otherwise it won't be correct on the standby.

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-16 Thread Andres Freund

On 2015-04-15 18:53:15 +0300, Heikki Linnakangas wrote:
> Hmm, ok, I've read the "INSERT ... ON CONFLICT UPDATE and logical decoding"
> thread now, and I have to say that IMHO it's a lot more sane to handle this
> in ReorderBufferCommit() like Peter first did, than to make the main
> insertion path more complex like this.

I don't like Peter's way much. For one it's just broken. For another
it's quite annoying to trigger disk reads to figure out what actual type
of record something is.

If we go that way that's the way I think it should be done: Whenever we
encounter a speculative record we 'unlink' it from the changes that will
be reused for spooling from disk and do nothing further. Then we just
continue reading through the records. If the next thing we encounter is
a super-deletion we throw away that record. If it's another type of
change (or even bettter a 'speculative insertion succeeded' record)
insert it. That'll still require some uglyness, but it should not be too
bad.

I earlier thought that'd not be ok because there could be a new catalog
snapshot inbetween, but I was mistaken: The locking on the source
transaction prevents problems.

> Another idea is to never spill the latest record to disk, at least if it was
> a speculative insertion. Then you would be sure that when you see the
> super-deletion record, the speculative insertion it refers to is still in
> memory. That seems simple.

It's not guaranteed to be the last record, there can be records
inbetween (snapshots from other backends at the very least).

Greetings,

Andres Freund

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-15 Thread Heikki Linnakangas


On 04/15/2015 06:01 PM, Andres Freund wrote:

On 2015-04-15 17:58:54 +0300, Heikki Linnakangas wrote:

On 04/15/2015 07:51 AM, Peter Geoghegan wrote:

+heap_finish_speculative(Relation relation, HeapTuple tuple, bool conflict)
+{
+   if (!conflict)
+   {
+   /*
+* Update the tuple in-place, in the common case where no 
conflict was
+* detected during speculative insertion.
+*
+* When heap_insert is called in respect of a speculative 
tuple, the
+* page will actually have a tuple inserted.  However, during 
recovery
+* replay will add an all-zero tuple to the page instead, which 
is the
+* same length as the original (but the tuple header is still 
WAL
+* logged and will still be restored at that point).  If and 
when the
+* in-place update record corresponding to releasing a value 
lock is
+* replayed, crash recovery takes the final tuple value from 
there.
+* Thus, speculative heap records require two WAL records.
+*
+* Logical decoding interprets an in-place update associated 
with a
+* speculative insertion as a regular insert change.  In other 
words,
+* the in-place record generated affirms that a speculative 
insertion
+* completed successfully.
+*/
+   heap_inplace_update(relation, tuple);
+   }
+   else
+   {


That's a bizarre solution.


I tend to agree, but for different reasons.


In logical decoding, decode speculative insertions like any other insertion.
To decode a super-deletion record, scan the reorder buffer for the
transaction to find the corresponding speculative insertion record for the
tuple, and remove it.


Not that easy. That buffer is spilled to disk and such. As discussed.


Hmm, ok, I've read the "INSERT ... ON CONFLICT UPDATE and logical 
decoding" thread now, and I have to say that IMHO it's a lot more sane 
to handle this in ReorderBufferCommit() like Peter first did, than to 
make the main insertion path more complex like this.


Another idea is to never spill the latest record to disk, at least if it 
was a speculative insertion. Then you would be sure that when you see 
the super-deletion record, the speculative insertion it refers to is 
still in memory. That seems simple.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-15 Thread Andres Freund

On 2015-04-15 17:58:54 +0300, Heikki Linnakangas wrote:
> On 04/15/2015 07:51 AM, Peter Geoghegan wrote:
> >+heap_finish_speculative(Relation relation, HeapTuple tuple, bool conflict)
> >+{
> >+if (!conflict)
> >+{
> >+/*
> >+ * Update the tuple in-place, in the common case where no 
> >conflict was
> >+ * detected during speculative insertion.
> >+ *
> >+ * When heap_insert is called in respect of a speculative 
> >tuple, the
> >+ * page will actually have a tuple inserted.  However, during 
> >recovery
> >+ * replay will add an all-zero tuple to the page instead, which 
> >is the
> >+ * same length as the original (but the tuple header is still 
> >WAL
> >+ * logged and will still be restored at that point).  If and 
> >when the
> >+ * in-place update record corresponding to releasing a value 
> >lock is
> >+ * replayed, crash recovery takes the final tuple value from 
> >there.
> >+ * Thus, speculative heap records require two WAL records.
> >+ *
> >+ * Logical decoding interprets an in-place update associated 
> >with a
> >+ * speculative insertion as a regular insert change.  In other 
> >words,
> >+ * the in-place record generated affirms that a speculative 
> >insertion
> >+ * completed successfully.
> >+ */
> >+heap_inplace_update(relation, tuple);
> >+}
> >+else
> >+{
> 
> That's a bizarre solution.

I tend to agree, but for different reasons.

> In logical decoding, decode speculative insertions like any other insertion.
> To decode a super-deletion record, scan the reorder buffer for the
> transaction to find the corresponding speculative insertion record for the
> tuple, and remove it.

Not that easy. That buffer is spilled to disk and such. As discussed.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-15 Thread Heikki Linnakangas


On 04/15/2015 07:51 AM, Peter Geoghegan wrote:

+heap_finish_speculative(Relation relation, HeapTuple tuple, bool conflict)
+{
+   if (!conflict)
+   {
+   /*
+* Update the tuple in-place, in the common case where no 
conflict was
+* detected during speculative insertion.
+*
+* When heap_insert is called in respect of a speculative 
tuple, the
+* page will actually have a tuple inserted.  However, during 
recovery
+* replay will add an all-zero tuple to the page instead, which 
is the
+* same length as the original (but the tuple header is still 
WAL
+* logged and will still be restored at that point).  If and 
when the
+* in-place update record corresponding to releasing a value 
lock is
+* replayed, crash recovery takes the final tuple value from 
there.
+* Thus, speculative heap records require two WAL records.
+*
+* Logical decoding interprets an in-place update associated 
with a
+* speculative insertion as a regular insert change.  In other 
words,
+* the in-place record generated affirms that a speculative 
insertion
+* completed successfully.
+*/
+   heap_inplace_update(relation, tuple);
+   }
+   else
+   {


That's a bizarre solution. May I suggest a much simpler one:

Make the heap-insert record the same for normal and speculative 
insertions, except for a flag that's set if it's a speculative one. 
Replay as usual.


When the speculative insertion is finished, write a new kind of a WAL 
record for that. The record only needs to contain the ctid of the tuple. 
Replaying that record will clear the flag on the heap tuple that said 
that it was a speculative insertion.


In logical decoding, decode speculative insertions like any other 
insertion. To decode a super-deletion record, scan the reorder buffer 
for the transaction to find the corresponding speculative insertion 
record for the tuple, and remove it.


BTW, that'd work just as well without the new WAL record to finish a 
speculative insertion. Am I missing something?



--- a/src/include/storage/off.h
+++ b/src/include/storage/off.h
@@ -26,6 +26,7 @@ typedef uint16 OffsetNumber;
 #define InvalidOffsetNumber((OffsetNumber) 0)
 #define FirstOffsetNumber  ((OffsetNumber) 1)
 #define MaxOffsetNumber((OffsetNumber) (BLCKSZ / 
sizeof(ItemIdData)))
+#define MagicOffsetNumber  (MaxOffsetNumber + 1)
 #define OffsetNumberMask   (0x)/* valid uint16 
bits */


IMHO it would be nicer if the magic value was more constant, e.g. 0x 
or 0xfffe, instead of basing it on MaxOffsetNumber which depends on 
block size. I would rather not include MaxOffsetNumber of anything 
derived from it in the on-disk dormat.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-09 Thread Peter Geoghegan

On Wed, Mar 18, 2015 at 2:59 AM, Dean Rasheed  wrote:
> Yes, I read that, and I agree with the intention to not leak data
> according to both the INSERT and UPDATE policies, however...
>
>> You're seeing a failure that applies to the target tuple of the UPDATE
>> (the tuple that we can't leak the contents of). I felt it was best to
>> check all policies against the target/existing tuple, including both
>> WITH CHECK OPTIONS and USING quals (which are both enforced).
>>
>
> I think that's an incorrect implementation of the RLS UPDATE policy.
> The WITH CHECK quals of a RLS policy are intended to be applied to the
> NEW data, not the existing data. This patch is applying the WITH CHECK
> quals to both the existing and NEW tuples, which runs counter to the
> way RLS polices are normally enforced, and I think that will just lead
> to confusion.

The big idea (the fine details of which Stephen appeared to be in
tentative agreement with [1]) is that an UPSERT is a hybrid between an
INSERT and an UPDATE, and not simply an INSERT and separate UPDATE
tied together. So at the very least the exact behavior of such a
hybrid cannot be assumed to be any particular way just from
generalizing from known behaviors for INSERT and UPDATE (which is a
usability issue, since the fine details of RLS are already complex
enough without UPSERT).

The INSERT policies are only enforced when a tuple is inserted
because, when the alternative path isn't taken then it's really just
an INSERT.

For the UPDATE path, where the stickiness/hybridness begins, we
enforce the target tuple passes both INSERT policies, and UPDATE
policies (including USING quals as WCO). The theory here is that if
you're entitled to INSERT it, you ought to be entitled to INSERT the
existing tuple in order to take the UPDATE path. And we bunch the
UPDATE USING quals + WCO for the sake of (conceptual, not
implementation) simplicity - they're already all WCO at that point.

Finally, the final tuple (generated using the EXCLUDED and TARGET
tuples, from the UPDATE) must pass the UPDATE WCO (including any that
originated as USING quals, a distinction that "no longer exists") as
well as INSERT policies. If you couldn't INSERT the tuple in the first
place (when there wasn't a conflict), then why should you be able to
UPSERT it? This is substantively the "same" row, no? You (the user)
are tacitly asserting that you don't care about whether the INSERT or
UPDATE path is taken anyway, so why should you care? Surely you'd want
this to fail as early as possible, rather than leaving it to chance. I
really do expect that people are only going to do simple
transformations in their UPDATE handler (e.g. "ON CONFLICT UPDATE set
count = TARGET.count + EXCLUDED.count"), so in practice it usually
doesn't matter.

Note that other systems that I looked at don't even support RLS with
SQL MERGE at all. So we have no precedent to consider that I'm aware
of, other than simply not supporting RLS, which would not be
outrageous IMV. I felt, given the ambiguity about how this should
differ from ordinary INSERTs + UPDATEs, that something quite
restrictive but not entirely restrictive (not supporting RLS, just
throwing an error all the time) was safest. In any case I doubt that
this will actually come up all that often.

> The problem with that is that the user will see errors saying that the
> data violates the RLS WITH CHECK policy, when they might quite
> reasonably argue that it doesn't. That's not really being
> conservative. I'd argue it's a bug.

Again, I accept that that's a valid interpretation of it. I have my
own opinion, but I will take the path of least resistance on this
point. What do other people think?

I'd appreciate it if you explicitly outlined what policies you feel
should be enforce at each of the 3 junctures within an UPSERT (post
INSERT, pre-UPDATE, post-UPDATE). I would also like you to be very
explicit about whether or not RLS WITH CHECK policies and USING quals
(presumably enforced as RLS WITH CHECK policies) from both INSERT and
UPDATE policies should be enforced at each point. In particular,
should I ever not treat RLS WCO and USING quals equivalently? (recall
that we don't want to elide an UPDATE silently, which makes much less
sense for UPSERT...I had assumed that whatever else, we'd always treat
WCO and USING quals from UPDATE (and ALL) policies equivalently, but
perhaps not).

Alternatively, perhaps you'd prefer to state your objection in terms
of the exact modifications you'd make to the above outline of the
existing behavior. I don't think I totally follow what you're saying
yet (which is the problem with being cleverer generally!). It is easy
to explain: The insert path is the same as always. Otherwise, both the
before and after tuple have all RLS policies (including USING quals)
enforced as WCOs. I think that it might be substantially easier to
explain that than to explain what you have in mind...let's see.

Thanks

[1] 
http://www.postgresql.org/message-id/20150

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-08 Thread Peter Geoghegan

On Tue, Apr 7, 2015 at 10:59 PM, Peter Geoghegan  wrote:
> The documentation has been updated to reflect all of this.

By the way, for the convenience of reviewers I continue to maintain a
mirror of pre-built documentation as outlined here:

https://wiki.postgresql.org/wiki/UPSERT#Documentation

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-04-07 Thread Peter Geoghegan

On Mon, Mar 30, 2015 at 9:20 AM, Peter Geoghegan  wrote:
>> * The code uses LockTupleExclusive to lock rows. That means the fkey key
>>   locking doesn't work; That's annoying. This means that using upsert
>>   will potentially cause deadlocks in other sessions :(. I think you'll
>>   have to determine what lock to acquire by fetching the tuple, doing
>>   the key comparison, and acquire the appropriate lock. That should be
>>   fine.
>
> Looking into the implementation of this. As we discussed, the
> difficulty about avoiding a lock escalation within ExecUpdate() is
> that we must fetch the row, run EvalPlanQual() to check if the new row
> version generated by updating will require a LockTupleExclusive or
> LockTupleNoKeyExclusive, and then lock the row using the right
> lockmode, and only then call ExecUpdate(). Right now, UPSERT benefits
> from fetching and locking the row together, so going this way imposes
> a little additional complexity. It's probably worth it, though.

Why do you think deadlocks will be a particular concern? Sure, it
could make the difference between deadlocking and not deadlocking,
which is bad, but it's not obvious to me that the risk would be any
worse than the risk of deadlocking with FKs in 9.2. Is that really all
that bad?


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-31 Thread Peter Geoghegan

On Tue, Mar 31, 2015 at 2:26 PM, Peter Geoghegan  wrote:
> Andres' wish to do things that way is at least partially motivated by
> having logical decoding just work.

I should add that there appears to be some need to terminate the loop
of speculative token waiting. By that I mean that since we're not
looking at the proc array to get a speculative token from
HeapTupleSatisfiesDirty() now, there is a livelock hazard. That goes
away when the speculative inserter cleans up after itself, as Andres
proposed. It would also go away if any speculative waiter cleaned up
after the inserter, which you suggested (that would be kind of
invasive to places like _bt_doinsert(), though). Finally, it would
also work if HeapTupleSatisfiesDirty() tested if the token was still
held directly, before reporting a speculative token, by for example
attempting to briefly acquire a ShareLock on the token (but that would
mean that the extra lock acquisition would be required unless and
until someone updated that originally-speculative tuple, in doing so
finally changing its t_ctid).

I think that we definitely have to do something like this, in any
case. Maybe just have SpeculativeTokenWait deal with the clean up is
cleanest, if we're not going to have inserters clean-up after
themselves immediately per Andres' suggestion.
-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-31 Thread Peter Geoghegan

On Tue, Mar 31, 2015 at 1:09 PM, Heikki Linnakangas  wrote:
> I'm pretty sceptical of that. ISTM you'll need to do modify the page twice
> for each insertion, first to insert the promise tuple, and then to turn the
> promise tuple into a real tuple. And WAL-log both updates. That's going to
> hurt performance.

Andres' wish to do things that way is at least partially motivated by
having logical decoding just work. The co-ordination I'm currently
doing across changes within transaction reassembly is pretty ugly.
Andres has strongly suggested that it's broken, too, since a snapshot
change could occur between a speculative insertion and its super
deletion within transaction resassembly, thus invalidating the
assumption that the next change not being a super deletion means there
is no such super deletion change (i.e. the insert should be treated as
"real").

Anyway, if we don't do this, we'll need to make sure my changes to
transaction reassembly are sound. Hopefully that's an easy fix.

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-31 Thread Heikki Linnakangas

On 03/30/2015 07:20 PM, Peter Geoghegan wrote:

>* I think we should decouple the insertion and wal logging more. I think
>   the promise tuple insertion should be different from the final
>   insertion of the actual tuple. For one it seems cleaner to me, for
>   another it will avoid the uglyness around logical decoding. I think
>   also that the separation will make it more realistic to use something
>   like this for a COPY variant that doesn't raise unique violations and
>   such.

Your COPY argument swung this for me. I'm looking into the implementation.

I'm pretty sceptical of that. ISTM you'll need to do modify the page 
twice for each insertion, first to insert the promise tuple, and then to 
turn the promise tuple into a real tuple. And WAL-log both updates. 
That's going to hurt performance.

To recover COPY from unique violations, you can just do the same as 
INSERT ON CONFLICT IGNORE does, and super-delete the inserted tuple on 
conflict. To recover from any random error, you'll need to abort the 
(sub)transaction anyway, and I don't see how it helps to separate the 
insertion of the promise tuple and the "finalization" of the insertion.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-30 Thread Peter Geoghegan

On Sat, Mar 28, 2015 at 6:36 PM, Andres Freund  wrote:
> Just had a longer chat with Peter about this patch.

It was a very useful chat. Thanks for making yourself available to do it.

> * Not a fan of the heap flags usage, the reusage seems sketch to me
> * Explain should show the arbiter index in text as well
> * AddUniqueSpeculative is a bad name, it should refer IndexInfo
> * Work on the ExecInsert() comments
> * Let's remove the planner choosing the "cheapest" arbiter index; it
>   should do all.
> * s/infer_unique_index/infer_arbiter_index/

OK.

> * Not supporting inheritance properly makes me uncomfortable. I don't
>   think users will think that's a acceptable/reasonable restriction.

I'll look into making the inference specification deduce a child relation index.

> * Let's not use t_ctid directly, but add a wrapper

We talked about a union. This seems quite doable.

> * The code uses LockTupleExclusive to lock rows. That means the fkey key
>   locking doesn't work; That's annoying. This means that using upsert
>   will potentially cause deadlocks in other sessions :(. I think you'll
>   have to determine what lock to acquire by fetching the tuple, doing
>   the key comparison, and acquire the appropriate lock. That should be
>   fine.

Looking into the implementation of this. As we discussed, the
difficulty about avoiding a lock escalation within ExecUpdate() is
that we must fetch the row, run EvalPlanQual() to check if the new row
version generated by updating will require a LockTupleExclusive or
LockTupleNoKeyExclusive, and then lock the row using the right
lockmode, and only then call ExecUpdate(). Right now, UPSERT benefits
from fetching and locking the row together, so going this way imposes
a little additional complexity. It's probably worth it, though.

> * I think we should decouple the insertion and wal logging more. I think
>   the promise tuple insertion should be different from the final
>   insertion of the actual tuple. For one it seems cleaner to me, for
>   another it will avoid the uglyness around logical decoding. I think
>   also that the separation will make it more realistic to use something
>   like this for a COPY variant that doesn't raise unique violations and
>   such.

Your COPY argument swung this for me. I'm looking into the implementation.

> * We discussed the infererence and that it should just reuse (code,
>   grammar, docs) the column specification from create index.

Agreed.

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-28 Thread Andres Freund

Hi,

Just had a longer chat with Peter about this patch.

* Not a fan of the heap flags usage, the reusage seems sketch to me
* Explain should show the arbiter index in text as well
* AddUniqueSpeculative is a bad name, it should refer IndexInfo
* Work on the ExecInsert() comments
* Let's remove the planner choosing the "cheapest" arbiter index; it
  should do all.
* s/infer_unique_index/infer_arbiter_index/
* Not supporting inheritance properly makes me uncomfortable. I don't
  think users will think that's a acceptable/reasonable restriction.
* Let's not use t_ctid directly, but add a wrapper
* The code uses LockTupleExclusive to lock rows. That means the fkey key
  locking doesn't work; That's annoying. This means that using upsert
  will potentially cause deadlocks in other sessions :(. I think you'll
  have to determine what lock to acquire by fetching the tuple, doing
  the key comparison, and acquire the appropriate lock. That should be
  fine.
* I think we should decouple the insertion and wal logging more. I think
  the promise tuple insertion should be different from the final
  insertion of the actual tuple. For one it seems cleaner to me, for
  another it will avoid the uglyness around logical decoding. I think
  also that the separation will make it more realistic to use something
  like this for a COPY variant that doesn't raise unique violations and
  such.
* We discussed the infererence and that it should just reuse (code,
  grammar, docs) the column specification from create index.
* Some more stuff I don't recall.


Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-26 Thread Gavin Flower


On 27/03/15 09:14, Peter Geoghegan wrote:

On Thu, Mar 26, 2015 at 2:51 PM, Heikki Linnakangas  wrote:

[...]

Oops - You're right. I find it interesting that this didn't result in
breaking my test cases.


[...]

Reminds of the situation where I got an A++ for a COBOL programming 
assignment that successfully handled the test data provided - then I 
found a major bug when 'idly' reviewing my code!  The lecturer (also a 
highly experienced developer) was amused when I pointed it out to her, 
and she said I still deserved the A++!



Cheers,
Gavin


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-26 Thread Peter Geoghegan

On Thu, Mar 26, 2015 at 2:51 PM, Heikki Linnakangas  wrote:
>> Attached revision, V3.1, implements this slightly different way of
>> figuring out if an insertion is speculative, as discussed. We reuse
>> t_ctid for speculatively inserted tuples. That's where the counter
>> goes. I think that this is a significant improvement, since there is
>> no longer any need to touch the proc array for any reason, without
>> there being any significant disadvantage that I'm aware of. I also
>> fixed some bitrot, and a bug with index costing (the details aren't
>> terribly interesting - tuple width wasn't being calculated correctly).
>
> Cool. Quickly looking at the patch though - does it actually work as it is?

The test cases pass, including jjanes_upsert, and stress tests that
test for unprincipled deadlocks. But yes, I am entirely willing to
believe that something that was written in such haste could be broken.
My manual testing was pretty minimal.

Sorry for posting a shoddy patch, but I thought it was more important
to show you that this is perfectly workable ASAP.

> RelationPutHeapTuple will overwrite the ctid field when the tuple is put on
> the page, so I don't think the correct token will make it to disk as the
> patch stands.

Oops - You're right. I find it interesting that this didn't result in
breaking my test cases. I guess that not having proc array locking
might have made the difference with unprincipled deadlocks, which I
could not recreate (and row locking saves us from breaking UPSERT, I
think - although if so the token lock would still certainly be needed
for the IGNORE variant). It is interesting that this wasn't obviously
broken for UPSERT, though. I think it at least suggests that when
testing, we need to be more careful with taking a working UPSERT as a
proxy for a working ON CONFLICT IGNORE.

> Also, there are a few places where we currently check if
> t_ctid equals the tuple's location, and try to follow t_ctid if it doesn't.
> I think those need to be taught that t_ctid can also be a token.

I did fix at least some of those. I thought that the choke point for
doing that was fairly small, entirely confined to one or two routines
with heapam.c. But it would surely be better to follow your suggestion
of using an invalid/magic tuple offset value to be sure that it cannot
possibly occur elsewhere. And I'm still using that infomask2 bit,
which is probably not really necessary. So that needs to change too.

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-26 Thread Heikki Linnakangas


On 03/26/2015 08:00 PM, Peter Geoghegan wrote:

On Wed, Mar 25, 2015 at 12:42 PM, Peter Geoghegan  wrote:

My next revision will have a more polished version of this scheme. I'm
not going to immediately act on Robert's feedback elsewhere (although
I'd like to), owing to time constraints - no reason to deny you the
opportunity to review the entirely unrelated low-level speculative
locking mechanism due to that.


Attached revision, V3.1, implements this slightly different way of
figuring out if an insertion is speculative, as discussed. We reuse
t_ctid for speculatively inserted tuples. That's where the counter
goes. I think that this is a significant improvement, since there is
no longer any need to touch the proc array for any reason, without
there being any significant disadvantage that I'm aware of. I also
fixed some bitrot, and a bug with index costing (the details aren't
terribly interesting - tuple width wasn't being calculated correctly).


Cool. Quickly looking at the patch though - does it actually work as it 
is? RelationPutHeapTuple will overwrite the ctid field when the tuple is 
put on the page, so I don't think the correct token will make it to disk 
as the patch stands. Also, there are a few places where we currently 
check if t_ctid equals the tuple's location, and try to follow t_ctid if 
it doesn't. I think those need to be taught that t_ctid can also be a token.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-25 Thread Peter Geoghegan

On Wed, Mar 18, 2015 at 2:41 PM, Heikki Linnakangas  wrote:
> Here's what I had in mind: the inserter tags the tuple with the speculative
> insertion token, by storing the token in the t_ctid field. If the inserter
> needs to super-delete the tuple, it sets xmax like in a regular deletion,
> but also sets another flag to indicate that it was a super-deletion.

I was able to quickly hack up a prototype of this in my hotel room at
pgConf.US. It works fine at first blush, passing the jjanes_upsert
stress tests and my own regression tests without a problem. Obviously
it needs more testing and clean-up before posting, but I was pleased
with how easy this was.

> When another backend inserts, and notices that it has a potential conflict
> with the first tuple, it tries to acquire a hw-lock on the token. In most
> cases, the inserter has long since completed the insertion, and the
> acquisition succeeds immediately but you have to check because the token is
> not cleared on a completed insertion.

You don't even have to check/take a ShareLock on the token when the
other xact committed/aborted, because you know that if it is there,
then based on that (and based on the fact that it wasn't super
deleted) the tuple is visible/committed, or (in the event of
other-xact-abort) not visible/aborted. In other words, we continue to
only check for a speculative token when the inserting xact is in
flight - we just take the token from the heap now instead. Not much
needs to change, AFAICT.

> Regarding the physical layout: We can use a magic OffsetNumber value above
> MaxOffsetNumber to indicate that the t_ctid field stores a token rather than
> a regular ctid value. And another magic t_ctid value to indicate that a
> tuple has been super-deleted. The token and the super-deletion flag are
> quite ephemeral, they are not needed after the inserting transaction has
> completed, so it's nice to not consume the valuable infomask bits for these
> things. Those states are conveniently not possible on an updated tuple, when
> we would need the t_ctid field for it's current purpose.

Haven't done anything about this yet. I'm just using an infomask2 bit
for now. Although that was only because I forgot that you suggested
this before having a go at implementing this new t_ctid scheme!

My next revision will have a more polished version of this scheme. I'm
not going to immediately act on Robert's feedback elsewhere (although
I'd like to), owing to time constraints - no reason to deny you the
opportunity to review the entirely unrelated low-level speculative
locking mechanism due to that.

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-23 Thread Peter Geoghegan

On Thu, Mar 19, 2015 at 1:02 PM, Robert Haas  wrote:
> I think this is pretty lousy.  The reasons why the user wants things
> that way is because they created a UNIQUE index and it got bloated
> somehow with lots of dead tuples.  So they made a new UNIQUE index on
> the same column and then they're planning to do a DROP INDEX
> CONCURRENTLY on the old one, which is maybe even now in progress.  And
> now they start getting duplicate key failures, the avoidance of which
> was their whole reason for using UPSERT in the first place.  If I were
> that user, I'd report that as a bug, and if someone told me that it
> was intended behavior, I'd say "oh, so you deliberately designed this
> feature to not work some of the time?".
>
> ISTM that we need to (1) decide which operator we're using to compare
> and then (2) tolerate conflicts in every index that uses that
> operator.  In most cases there will only be one, but if there are
> more, so be it.


On reflection, I see your point. I'll try and do something about this too.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-19 Thread Robert Haas

On Wed, Mar 18, 2015 at 3:40 PM, Peter Geoghegan  wrote:
>>> I think Heikki's concern is something different, although I am not
>>> altogether up to speed on this and may be confused.  The issue is:
>>> suppose that process A and process B are both furiously upserting into
>>> the same table with the same key column but, because they have
>>> different costing parameters, process A consistently chooses index X
>>> and process B consistently chooses index Y.  In that situation, will
>>> we deadlock, livelock, error out, bloat, or get any other undesirable
>>> behavior, or is that A-OK?
>>
>> Right, that's what I had in mind.
>
> Oh, I see. I totally failed to understand that that was the concern.
>
> I think it'll be fine. The pre-check is going to look for a heap tuple
> using one or the other of (say) a pair of equivalent indexes. We might
> miss the heap tuple because we picked an index that had yet to have a
> physical index tuple inserted, and then hit a conflict on optimistic
> insertion (the second phase). But there is no reason to think that
> that won't happen anyway. The ordering of operations isn't critical.
>
> The one issue you might still have is a duplicate violation, because
> you happened to infer the unique index that does not get to tolerate
> unique violations as conflicts that can be recovered from (and there
> was a race, which is unlikely). I don't really care about this,
> though. You really are inferring one particular unique index, and for
> reasons like this I think it would be a mistake to try to pretend that
> the unique index is 100% an implementation detail. That's why I called
> the new clause a unique index inference specification.
>
> This hypothetical set of unique indexes will always have n-1 redundant
> unique indexes - they must closely match. That's something that calls
> into question why the user wants things this way to begin with. Also,
> note that one unique index will consistently "win", since the
> insertion order is stable (the relcache puts them in OID order). So it
> will not be all over the map.

I think this is pretty lousy.  The reasons why the user wants things
that way is because they created a UNIQUE index and it got bloated
somehow with lots of dead tuples.  So they made a new UNIQUE index on
the same column and then they're planning to do a DROP INDEX
CONCURRENTLY on the old one, which is maybe even now in progress.  And
now they start getting duplicate key failures, the avoidance of which
was their whole reason for using UPSERT in the first place.  If I were
that user, I'd report that as a bug, and if someone told me that it
was intended behavior, I'd say "oh, so you deliberately designed this
feature to not work some of the time?".

ISTM that we need to (1) decide which operator we're using to compare
and then (2) tolerate conflicts in every index that uses that
operator.  In most cases there will only be one, but if there are
more, so be it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-18 Thread Peter Geoghegan

On Wed, Mar 18, 2015 at 9:19 AM, Robert Haas  wrote:
> On Tue, Mar 17, 2015 at 3:11 PM, Heikki Linnakangas  wrote:
>> I've been thinking that it would be nice to be able to specify a constraint
>> name. Naming an index directly feels wrong, as in relational and SQL
>> philosophy, indexes are just an implementation detail, but naming a
>> constraint is a fair game. It would also be nice to be able to specify "use
>> the primary key".
>
> Intuitively, I think you should specify an operator name, not a
> constraint name.  That's what we do for, e.g., exclusion constraints,
> and it feels right.  People sometimes create and drop indexes (and
> thus, perhaps, the constraints that depend on them) for maintenance
> reasons where a change in semantics will be unwelcome.

I think we should use a constraint name. That is the plain reality of
what we're doing, and is less ambiguous than an operator. 99% of the
time you'll be happy with an unspecified, across-the-board IGNORE, or
won't be using exclusion constraints anyway (so we can infer a unique
index).

A constraint name covers all reasonable cases, since partial unique
indexes are otherwise covered (partial unique indexes do not have a
pg_constraint entry). Oracle has a hint for ignoring particular, named
unique indexes (not constraints). I realize that Oracle hints are not
supposed to affect semantics, but this is actually true (Google it).
This is a bit ugly, but less ugly as the hint, since as Heikki points
out we're only naming a constraint. Naming a constraint reflects the
reality of how the feature needs to work, and has a sort of precedent
from other systems.

> But I don't
> accept Peter's argument that it's OK to be indifferent to which
> particular equality semantics are being used.

I am not suggesting actual indifference makes sense. I am leaving it
up to the definition of available indexes. And there are no known
cases where it could matter anyway, unless you consider the ===
operator for tuples to be a counter example. And you need multiple
conflicting unique indexes on the exact same attributes/expressions on
attributes to begin with. Isn't that a highly esoteric thing to have
to worry about? Perhaps to the extent that literally no one will ever
have to care anyway? It's an oddball use-case, if ever I saw one.

Note: the issue of caring about equality semantics across B-Tree
opclasses of the same type, and the issue of naming unique indexes are
separate issues, AFAICT. No one should confuse them. The only
crossover is that the oddball use-case mentioned could use the named
constraint thing as an escape hatch.

As I've said, I think it's misguided to try to make unique indexes
100% an implementation detail. It's going to fall apart in edge cases,
like the one with multiple unique indexes that I mentioned in my last
e-mail. No one thinks of them that way, including users.

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-18 Thread Peter Geoghegan

On Wed, Mar 18, 2015 at 11:41 AM, Heikki Linnakangas  wrote:
> AFAICS, there is no need to go and clear the tag after the insert has
> completed.
>
> Here's what I had in mind: the inserter tags the tuple with the speculative
> insertion token, by storing the token in the t_ctid field. If the inserter
> needs to super-delete the tuple, it sets xmax like in a regular deletion,
> but also sets another flag to indicate that it was a super-deletion.
>
> When another backend inserts, and notices that it has a potential conflict
> with the first tuple, it tries to acquire a hw-lock on the token. In most
> cases, the inserter has long since completed the insertion, and the
> acquisition succeeds immediately but you have to check because the token is
> not cleared on a completed insertion. It would be slightly faster for the
> second backend if the inserter actually removed the token after the
> insertion has completed: it wouldn't need to check the lock if the tuple was
> not tagged in the first place. But that'd be just a tiny optimization, for
> the rare case that there is a conflict, and surely not worth it.
>
> Hmm, I just realized a little fly in that ointment: if the inserter inserts
> enough tuples to wrap around the token counter (4 billion?) in a single
> transaction, another backend that inserts could confuse a new speculative
> insertion with one that completed a long time ago. The risk seems small
> enough that we could just accept it. I don't think anything particularly bad
> would happen on a false positive here. Or we could also use all 48 bits
> available in the t_ctid to push it truly in the realm of ain't-gonna-happen.
> Or we could use (relfilenode, block,  token) as the lock tag for the
> SpeculativeInsertionLock.

Maybe we'd use all 48-bits anyway, but it's not a major concern either
way. Speculative token locks (by design) are only held for an instant.
I think a spurious wait would be entirely benign, and very unlikely.

> Regarding the physical layout: We can use a magic OffsetNumber value above
> MaxOffsetNumber to indicate that the t_ctid field stores a token rather than
> a regular ctid value. And another magic t_ctid value to indicate that a
> tuple has been super-deleted. The token and the super-deletion flag are
> quite ephemeral, they are not needed after the inserting transaction has
> completed, so it's nice to not consume the valuable infomask bits for these
> things. Those states are conveniently not possible on an updated tuple, when
> we would need the t_ctid field for it's current purpose.

You seem pretty convinced that this is the way to go. I'll try and
produce a revision that works this way shortly. I don't think that it
will be all that hard to come up with something to prove the idea out.

>> I think Heikki's concern is something different, although I am not
>> altogether up to speed on this and may be confused.  The issue is:
>> suppose that process A and process B are both furiously upserting into
>> the same table with the same key column but, because they have
>> different costing parameters, process A consistently chooses index X
>> and process B consistently chooses index Y.  In that situation, will
>> we deadlock, livelock, error out, bloat, or get any other undesirable
>> behavior, or is that A-OK?
>
> Right, that's what I had in mind.

Oh, I see. I totally failed to understand that that was the concern.

I think it'll be fine. The pre-check is going to look for a heap tuple
using one or the other of (say) a pair of equivalent indexes. We might
miss the heap tuple because we picked an index that had yet to have a
physical index tuple inserted, and then hit a conflict on optimistic
insertion (the second phase). But there is no reason to think that
that won't happen anyway. The ordering of operations isn't critical.

The one issue you might still have is a duplicate violation, because
you happened to infer the unique index that does not get to tolerate
unique violations as conflicts that can be recovered from (and there
was a race, which is unlikely). I don't really care about this,
though. You really are inferring one particular unique index, and for
reasons like this I think it would be a mistake to try to pretend that
the unique index is 100% an implementation detail. That's why I called
the new clause a unique index inference specification.

This hypothetical set of unique indexes will always have n-1 redundant
unique indexes - they must closely match. That's something that calls
into question why the user wants things this way to begin with. Also,
note that one unique index will consistently "win", since the
insertion order is stable (the relcache puts them in OID order). So it
will not be all over the map.

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-18 Thread Heikki Linnakangas


On 03/18/2015 06:23 PM, Robert Haas wrote:

On Tue, Mar 17, 2015 at 6:27 PM, Peter Geoghegan  wrote:

On Tue, Mar 17, 2015 at 12:11 PM, Heikki Linnakangas  wrote:

I'm still not sure the way the speculative locking works is the best
approach. Instead of clearing xmin on super-deletion, a new flag on the heap
tuple seems more straightforward. And we could put the speculative insertion
token in t_ctid, instead of stashing it in the PGPROC array. That would
again seem more straightforward.


I see the appeal of that approach. What concerns me about that
approach is that it makes it the problem of the inserter to confirm
its insertion, even though overwhelmingly, speculative insertions
succeeds (the race window is small due to the pre-check). The current
speculative token is legitimately a very short lived thing, a thing
that I hesitate to write to disk at all. So our optimistic approach to
inserting speculatively loses its optimism, which I don't like, if you
know what I mean.


Modifying a shared buffer is not the same as writing to disk.


AFAICS, there is no need to go and clear the tag after the insert has 
completed.


Here's what I had in mind: the inserter tags the tuple with the 
speculative insertion token, by storing the token in the t_ctid field. 
If the inserter needs to super-delete the tuple, it sets xmax like in a 
regular deletion, but also sets another flag to indicate that it was a 
super-deletion.


When another backend inserts, and notices that it has a potential 
conflict with the first tuple, it tries to acquire a hw-lock on the 
token. In most cases, the inserter has long since completed the 
insertion, and the acquisition succeeds immediately but you have to 
check because the token is not cleared on a completed insertion. It 
would be slightly faster for the second backend if the inserter actually 
removed the token after the insertion has completed: it wouldn't need to 
check the lock if the tuple was not tagged in the first place. But 
that'd be just a tiny optimization, for the rare case that there is a 
conflict, and surely not worth it.


Hmm, I just realized a little fly in that ointment: if the inserter 
inserts enough tuples to wrap around the token counter (4 billion?) in a 
single transaction, another backend that inserts could confuse a new 
speculative insertion with one that completed a long time ago. The risk 
seems small enough that we could just accept it. I don't think anything 
particularly bad would happen on a false positive here. Or we could also 
use all 48 bits available in the t_ctid to push it truly in the realm of 
ain't-gonna-happen. Or we could use (relfilenode, block,  token) as the 
lock tag for the SpeculativeInsertionLock.


Regarding the physical layout: We can use a magic OffsetNumber value 
above MaxOffsetNumber to indicate that the t_ctid field stores a token 
rather than a regular ctid value. And another magic t_ctid value to 
indicate that a tuple has been super-deleted. The token and the 
super-deletion flag are quite ephemeral, they are not needed after the 
inserting transaction has completed, so it's nice to not consume the 
valuable infomask bits for these things. Those states are conveniently 
not possible on an updated tuple, when we would need the t_ctid field 
for it's current purpose.



If I'm reading this correctly, if there are multiple indexes that match the
unique index inference specification, we pick the cheapest one. Isn't that
unstable? Two backends running the same INSERT ON CONFLICT statement might
pick different indexes, and the decision might change over time as the table
is analyzed. I think we should have a more robust rule. Could we easily just
use all matching indexes?


Robert feel pretty strongly that there should be a costing aspect to
this. Initially I was skeptical of this, but just did what he said all
the same. But I've since come around to his view entirely because we
now support partial unique indexes.


I think Heikki's concern is something different, although I am not
altogether up to speed on this and may be confused.  The issue is:
suppose that process A and process B are both furiously upserting into
the same table with the same key column but, because they have
different costing parameters, process A consistently chooses index X
and process B consistently chooses index Y.  In that situation, will
we deadlock, livelock, error out, bloat, or get any other undesirable
behavior, or is that A-OK?


Right, that's what I had in mind.

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-18 Thread Robert Haas

On Tue, Mar 17, 2015 at 6:27 PM, Peter Geoghegan  wrote:
> On Tue, Mar 17, 2015 at 12:11 PM, Heikki Linnakangas  wrote:
>> I'm still not sure the way the speculative locking works is the best
>> approach. Instead of clearing xmin on super-deletion, a new flag on the heap
>> tuple seems more straightforward. And we could put the speculative insertion
>> token in t_ctid, instead of stashing it in the PGPROC array. That would
>> again seem more straightforward.
>
> I see the appeal of that approach. What concerns me about that
> approach is that it makes it the problem of the inserter to confirm
> its insertion, even though overwhelmingly, speculative insertions
> succeeds (the race window is small due to the pre-check). The current
> speculative token is legitimately a very short lived thing, a thing
> that I hesitate to write to disk at all. So our optimistic approach to
> inserting speculatively loses its optimism, which I don't like, if you
> know what I mean.

Modifying a shared buffer is not the same as writing to disk.

>> If I'm reading this correctly, if there are multiple indexes that match the
>> unique index inference specification, we pick the cheapest one. Isn't that
>> unstable? Two backends running the same INSERT ON CONFLICT statement might
>> pick different indexes, and the decision might change over time as the table
>> is analyzed. I think we should have a more robust rule. Could we easily just
>> use all matching indexes?
>
> Robert feel pretty strongly that there should be a costing aspect to
> this. Initially I was skeptical of this, but just did what he said all
> the same. But I've since come around to his view entirely because we
> now support partial unique indexes.

I think Heikki's concern is something different, although I am not
altogether up to speed on this and may be confused.  The issue is:
suppose that process A and process B are both furiously upserting into
the same table with the same key column but, because they have
different costing parameters, process A consistently chooses index X
and process B consistently chooses index Y.  In that situation, will
we deadlock, livelock, error out, bloat, or get any other undesirable
behavior, or is that A-OK?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-18 Thread Robert Haas

On Tue, Mar 17, 2015 at 3:11 PM, Heikki Linnakangas  wrote:
> I've been thinking that it would be nice to be able to specify a constraint
> name. Naming an index directly feels wrong, as in relational and SQL
> philosophy, indexes are just an implementation detail, but naming a
> constraint is a fair game. It would also be nice to be able to specify "use
> the primary key".

Intuitively, I think you should specify an operator name, not a
constraint name.  That's what we do for, e.g., exclusion constraints,
and it feels right.  People sometimes create and drop indexes (and
thus, perhaps, the constraints that depend on them) for maintenance
reasons where a change in semantics will be unwelcome.  But I don't
accept Peter's argument that it's OK to be indifferent to which
particular equality semantics are being used.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-18 Thread Dean Rasheed

On 17 March 2015 at 23:25, Peter Geoghegan  wrote:
>> Possibly I'm missing something though.
>
> I think that you may have. Did you read the commit message/docs of the
> RLS commit 0004-*? You must consider the second point here, I believe:
>
> 
> The 3 places that RLS policies are enforced are:
>
> * Against row actually inserted, after insertion proceeds successfully
>   (INSERT-applicable policies only).
>
> * Against row in target table that caused conflict.  The implementation
>   is careful not to leak the contents of that row in diagnostic
>   messages (INSERT-applicable *and* UPDATE-applicable policies).
>
> * Against the version of the row added by to the relation after
>   ExecUpdate() is called (INSERT-applicable *and* UPDATE-applicable
>   policies).
>
> 
>

Yes, I read that, and I agree with the intention to not leak data
according to both the INSERT and UPDATE policies, however...

> You're seeing a failure that applies to the target tuple of the UPDATE
> (the tuple that we can't leak the contents of). I felt it was best to
> check all policies against the target/existing tuple, including both
> WITH CHECK OPTIONS and USING quals (which are both enforced).
>

I think that's an incorrect implementation of the RLS UPDATE policy.
The WITH CHECK quals of a RLS policy are intended to be applied to the
NEW data, not the existing data. This patch is applying the WITH CHECK
quals to both the existing and NEW tuples, which runs counter to the
way RLS polices are normally enforced, and I think that will just lead
to confusion.

> I can see why you might not like that behavior, but it is the intended
> behavior. I thought that this whole intersection of RLS + UPSERT is
> complex enough that it would be best to be almost as conservative as
> possible in what fails and what succeeds. The one exception is when
> the insert path is actually taken, since the statement is an INSERT
> statement.

The problem with that is that the user will see errors saying that the
data violates the RLS WITH CHECK policy, when they might quite
reasonably argue that it doesn't. That's not really being
conservative. I'd argue it's a bug.

Regards,
Dean

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-17 Thread Peter Geoghegan

On Mon, Mar 16, 2015 at 8:10 AM, Dean Rasheed  wrote:
> (Note there is some bitrot in gram.y that prevents the first patch
> from applying cleanly to HEAD)

That's trivially fixable. I'll have those fixes in the next revision,
once I firm some things up with Heikki.

> I tested using the attached script, and one test didn't behave as I
> expected. I believe the following should have been a valid upsert
> (following the update path) but actually it failed:
>
> INSERT INTO t1 VALUES (4, 0) ON CONFLICT (a) UPDATE SET b = 1;
>
> AFAICT, it is applying a WITH CHECK OPTION with qual "b > 0 AND a % 2
> = 0" to the about-to-be-updated tuple (a=4, b=0), which is wrong
> because the "b > 0" check (policy p3) should only be applied to the
> post-update tuple.
>
> Possibly I'm missing something though.

I think that you may have. Did you read the commit message/docs of the
RLS commit 0004-*? You must consider the second point here, I believe:

The 3 places that RLS policies are enforced are:

* Against row actually inserted, after insertion proceeds successfully
  (INSERT-applicable policies only).

* Against row in target table that caused conflict.  The implementation
  is careful not to leak the contents of that row in diagnostic
  messages (INSERT-applicable *and* UPDATE-applicable policies).

* Against the version of the row added by to the relation after
  ExecUpdate() is called (INSERT-applicable *and* UPDATE-applicable
  policies).

You're seeing a failure that applies to the target tuple of the UPDATE
(the tuple that we can't leak the contents of). I felt it was best to
check all policies against the target/existing tuple, including both
WITH CHECK OPTIONS and USING quals (which are both enforced).

I can see why you might not like that behavior, but it is the intended
behavior. I thought that this whole intersection of RLS + UPSERT is
complex enough that it would be best to be almost as conservative as
possible in what fails and what succeeds. The one exception is when
the insert path is actually taken, since the statement is an INSERT
statement.
-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-17 Thread Peter Geoghegan

On Tue, Mar 17, 2015 at 12:11 PM, Heikki Linnakangas  wrote:
> I'm still not sure the way the speculative locking works is the best
> approach. Instead of clearing xmin on super-deletion, a new flag on the heap
> tuple seems more straightforward. And we could put the speculative insertion
> token in t_ctid, instead of stashing it in the PGPROC array. That would
> again seem more straightforward.

I see the appeal of that approach. What concerns me about that
approach is that it makes it the problem of the inserter to confirm
its insertion, even though overwhelmingly, speculative insertions
succeeds (the race window is small due to the pre-check). The current
speculative token is legitimately a very short lived thing, a thing
that I hesitate to write to disk at all. So our optimistic approach to
inserting speculatively loses its optimism, which I don't like, if you
know what I mean.

OTOH, apart from avoiding stashing things in PGPROC, I do see another
advantage to what you outline. Doing things this way (and making the
insertion and confirmation of a speculative insertion a two-phased
thing) unambiguously fixes all problems with logical decoding, without
adding unexpected cross-change-coordination tricks during transaction
reassembly, and really without much further thought. All we do is get
a new case to decode, that ultimately produces a
REORDER_BUFFER_CHANGE_INSERT change, which Andres more or less wanted
to do anyway.

Under this scheme with using t_ctid, heap_insert() would do minimal
WAL logging, necessary for the same reasons that some WAL logging is
required within heap_lock_tuple() (so the new record type is very
similar to the existing, simple xl_heap_lock record type). We'd use
one of the two remaining bits within t_infomask2 for this, so that
waiters will know to interpret t_ctid as a speculative insertion token
(and then wait on it). Then, after speculative insertion succeeded,
we'd WAL log something closer to an UPDATE to confirm that insertion
was in fact successful, which is where the contents of the new tuple
is actually finally WAL-logged (like UPDATEs used to be, but without a
new tuple being WAL-logged at all, since there of course is none).

Is that more or less what you have in mind?

> A couple of quick random comments:
>
>> /*
>>  * plan_speculative_use_index
>>  *  Use the planner to decide speculative insertion arbiter
>> index
>>  *
>>  * Among indexes on target of INSERT ... ON CONFLICT, decide which index
>> to
>>  * use to arbitrate taking alternative path.  This should be called
>>  * infrequently in practice, because it's unusual for more than one index
>> to
>>  * be available that can satisfy a user-specified unique index inference
>>  * specification.
>>  *
>>  * Note: caller had better already hold some type of lock on the table.
>>  */
>> Oid
>> plan_speculative_use_index(PlannerInfo *root, List *indexList)
>> {
>> ...
>> /* Locate cheapest IndexOptInfo for the target index */
>
>
> If I'm reading this correctly, if there are multiple indexes that match the
> unique index inference specification, we pick the cheapest one. Isn't that
> unstable? Two backends running the same INSERT ON CONFLICT statement might
> pick different indexes, and the decision might change over time as the table
> is analyzed. I think we should have a more robust rule. Could we easily just
> use all matching indexes?

Robert feel pretty strongly that there should be a costing aspect to
this. Initially I was skeptical of this, but just did what he said all
the same. But I've since come around to his view entirely because we
now support partial unique indexes.

You can add a predicate to a unique index inference specification, and
you're good, even if there is no partial index on those
attributes/expressions exactly matching the DML/inference predicate -
we use predicate_implied_by() for this, which works with an equivalent
non-partial unique index. This is because a non-partial unique index
covers those attributes sufficiently. There may be value in preferring
the cheap partial index, and then verifying that it actually did cover
the final inserted tuple version (which is how things work now). If we
cannot discriminate against one particular unique index, making sure
it covers the inserted heap tuple (by throwing a user-visible
ERRCODE_TRIGGERED_ACTION_EXCEPTION error if it doesn't, as I currently
do within ExecInsertIndexTuples()) is on shaky ground as a
user-visible behavior. I like that behavior, though - seems less
surprising than letting a failure to actually cover a selective
partial unique index predicate (that of the one and only arbiter
index) slide. You can only get this ERRCODE_TRIGGERED_ACTION_EXCEPTION
error when you actually had a predicate in your unique index inference
specification in the first place, so seems likely that you want the
error. But, I also like supporting unique indexes that are non-partial
even in the presence of a predicate in the unique

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-17 Thread Peter Geoghegan

On Thu, Mar 5, 2015 at 3:44 PM, Peter Geoghegan  wrote:
> Bruce Momjian kindly made available a server for stress-testing [1].
> I'm using jjanes_upsert for this task (I stopped doing this for a
> little while, and was in need of a new server).

BTW, this was run for about another week on Bruce's server, and no
further issues arose affecting either the B-Tree or exclusion
constraint implementations. I've stopped with it for now, because it
feels unlikely that I'm going to find any more bugs this way.

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-17 Thread Heikki Linnakangas


On 03/05/2015 03:18 AM, Peter Geoghegan wrote:

Attached patch series forms what I'm calling V3.0 of the INSERT ... ON
CONFLICT IGNORE/UPDATE feature. (Sorry about all the threads. I feel
this development justifies a new thread, though.)


I'm still not sure the way the speculative locking works is the best 
approach. Instead of clearing xmin on super-deletion, a new flag on the 
heap tuple seems more straightforward. And we could put the speculative 
insertion token in t_ctid, instead of stashing it in the PGPROC array. 
That would again seem more straightforward.


A couple of quick random comments:


/*
 * plan_speculative_use_index
 *  Use the planner to decide speculative insertion arbiter index
 *
 * Among indexes on target of INSERT ... ON CONFLICT, decide which index to
 * use to arbitrate taking alternative path.  This should be called
 * infrequently in practice, because it's unusual for more than one index to
 * be available that can satisfy a user-specified unique index inference
 * specification.
 *
 * Note: caller had better already hold some type of lock on the table.
 */
Oid
plan_speculative_use_index(PlannerInfo *root, List *indexList)
{
...
/* Locate cheapest IndexOptInfo for the target index */


If I'm reading this correctly, if there are multiple indexes that match 
the unique index inference specification, we pick the cheapest one. 
Isn't that unstable? Two backends running the same INSERT ON CONFLICT 
statement might pick different indexes, and the decision might change 
over time as the table is analyzed. I think we should have a more robust 
rule. Could we easily just use all matching indexes?




... Deferred unique constraints are not supported as
+   arbiters of whether an alternative ON CONFLICT path
+   should be taken.


We really need to find a shorter term for "arbiter of whether an 
alternative path should be taken". Different variations of that term are 
used a lot, and it's tedious to read.



* There is still an unresolved semantics issue with unique index
inference and non-default opclasses. I think it's sufficient that the
available/defined unique indexes dictate our idea of a unique
violation (that necessitates taking the alternative path). Even in a
world where there exists a non-default opclass with an "equals"
operator that does not match that of the default opclass (that is not
really the world we live in, because the only counter-example known is
made that way specifically to *discourage* its use by users), this
seems okay to me. It seems okay to me because surely the relevant
definition of equality is the one actually chosen for the available
unique index. If there exists an ambiguity for some strange reason
(i.e. there are two unique indexes, on the same attribute(s), but with
different "equals" operators), then its a costing issue, so the
behavior given is essentially non-predictable (it could end up being
either...but hey, those are the semantics). I have a very hard time
imagining how that could ever be the case, even when we have (say)
case insensitive opclasses for the text type. A case insensitive
opclass is stricter than a case sensitive opclass.  Why would a user
ever want both on the same attribute(s) of the same table? Is the user
really more or less expecting to never get a unique violation on the
non-arbitrating unique index, despite all this?

If reviewers are absolutely insistent that this theoretical ambiguity
is a problem, we can add an optional CREATE INDEX style opclass
specification (I'm already using the IndexElems representation from
CREATE INDEX for the inference specification, actually, so that would
be easy enough). I really have a hard time believing that the ugliness
is a problem for those hypothetical users that eventually consider
"equals" operator ambiguity among opclasses among available unique
indexes to be a problem. I haven't just gone and implemented this
already because I didn't want to document that an opclass
specification will be accepted. Since there is literally no reason why
anyone would care today, I see no reason to add what IMV would really
just be cruft.


I've been thinking that it would be nice to be able to specify a 
constraint name. Naming an index directly feels wrong, as in relational 
and SQL philosophy, indexes are just an implementation detail, but 
naming a constraint is a fair game. It would also be nice to be able to 
specify "use the primary key".


Attached patch contains a few more things I saw at a quick read.

- Heikki

diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml
index 46b8db8..d6fa98c 100644
--- a/doc/src/sgml/fdwhandler.sgml
+++ b/doc/src/sgml/fdwhandler.sgml
@@ -1014,6 +1014,10 @@ GetForeignServerByName(const char *name, bool missing_ok);
  source provides.
 
 
+
 
  INSERT with an ON CONFLICT clause is not
  supported with a unique index inference specification.  When
diff --git a/doc/src/sgml/ref/create_view.sgml

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-16 Thread Dean Rasheed

On 5 March 2015 at 23:44, Peter Geoghegan  wrote:
> On Wed, Mar 4, 2015 at 5:18 PM, Peter Geoghegan  wrote:
>> Attached patch series forms what I'm calling V3.0 of the INSERT ... ON
>> CONFLICT IGNORE/UPDATE feature. (Sorry about all the threads. I feel
>> this development justifies a new thread, though.)
>

Hi,

I had a play with this, mainly looking at the interaction with RLS.

(Note there is some bitrot in gram.y that prevents the first patch
from applying cleanly to HEAD)

I tested using the attached script, and one test didn't behave as I
expected. I believe the following should have been a valid upsert
(following the update path) but actually it failed:

INSERT INTO t1 VALUES (4, 0) ON CONFLICT (a) UPDATE SET b = 1;

AFAICT, it is applying a WITH CHECK OPTION with qual "b > 0 AND a % 2
= 0" to the about-to-be-updated tuple (a=4, b=0), which is wrong
because the "b > 0" check (policy p3) should only be applied to the
post-update tuple.

Possibly I'm missing something though.

Regards,
Dean
\set ECHO all
DROP TABLE IF EXISTS t1;
CREATE TABLE t1(a int PRIMARY KEY, b int);

CREATE POLICY p1 ON t1 FOR INSERT WITH CHECK (b = 0);
CREATE POLICY p2 ON t1 FOR UPDATE USING (a % 2 = 0);
CREATE POLICY p3 ON t1 FOR UPDATE WITH CHECK (b > 0);

ALTER TABLE t1 ENABLE ROW LEVEL SECURITY;
SET row_security = force;

-- Valid inserts
INSERT INTO t1 VALUES (1, 0);
INSERT INTO t1 VALUES (2, 0);

-- Insert that should fail policy p1
INSERT INTO t1 VALUES (3, 1);

-- Valid update
UPDATE t1 SET b = 1 WHERE a = 2;

-- Update that should fail policy p2 (not an error, just no rows updated)
UPDATE t1 SET b = 1 WHERE a = 1;

-- Update that should fail policy p3
UPDATE t1 SET b = 0 WHERE a = 2;

-- Valid upserts (insert path)
INSERT INTO t1 VALUES (3, 0) ON CONFLICT (a) UPDATE SET b = 0;
INSERT INTO t1 VALUES (4, 0) ON CONFLICT (a) UPDATE SET b = 0;

-- Upsert (insert path) that should fail policy p1
INSERT INTO t1 VALUES (5, 1) ON CONFLICT (a) UPDATE SET b = 1;

-- Upsert (update path) that should fail policy p1
INSERT INTO t1 VALUES (3, 1) ON CONFLICT (a) UPDATE SET b = 1;

-- Valid upsert (update path)
-- XXX: Should pass, but fails WCO on t1
--  Failing WCO: (b > 0 AND a % 2 = 0) = p3 AND p2
INSERT INTO t1 VALUES (4, 0) ON CONFLICT (a) UPDATE SET b = 1;

-- Upsert (update path) that should fail policy p2
INSERT INTO t1 VALUES (3, 0) ON CONFLICT (a) UPDATE SET b = 1;

-- Upsert (update path) that should fail policy p3
INSERT INTO t1 VALUES (4, 0) ON CONFLICT (a) UPDATE SET b = 0;

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0

2015-03-05 Thread Peter Geoghegan

On Wed, Mar 4, 2015 at 5:18 PM, Peter Geoghegan  wrote:
> Attached patch series forms what I'm calling V3.0 of the INSERT ... ON
> CONFLICT IGNORE/UPDATE feature. (Sorry about all the threads. I feel
> this development justifies a new thread, though.)

Bruce Momjian kindly made available a server for stress-testing [1].
I'm using jjanes_upsert for this task (I stopped doing this for a
little while, and was in need of a new server).

At the very highest client count I'm testing (128), I see unprincipled
deadlocks for the exclusion constraint case very infrequently:

"
2015-03-05 14:09:36 EST [ 64987901 ]: ERROR:  deadlock detected
2015-03-05 14:09:36 EST [ 64987901 ]: DETAIL:  Process 7044 waits for
ShareLock on promise tuple with token 1 of transaction 64987589;
blocked by process 7200.
Process 7200 waits for ShareLock on transaction 64987901;
blocked by process 7044.
Process 7044: insert into upsert_race_test (index, count)
values ('541','-1') on conflict
  update set count=TARGET.count + EXCLUDED.count
  where TARGET.index = EXCLUDED.index
  returning count
Process 7200: insert into upsert_race_test (index, count)
values ('541','-1') on conflict
  update set count=TARGET.count + EXCLUDED.count
  where TARGET.index = EXCLUDED.index
  returning count
2015-03-05 14:09:36 EST [ 64987901 ]: HINT:  See server log for query details.
2015-03-05 14:09:36 EST [ 64987901 ]: STATEMENT:  insert into
upsert_race_test (index, count) values ('541','-1') on conflict
  update set count=TARGET.count + EXCLUDED.count
  where TARGET.index = EXCLUDED.index
  returning count

"

(Reminder: Exclusion constraints doing UPSERTs, and not just IGNOREs,
are artificial; this has been enabled within the parser solely for the
benefit of this stress-testing. Also, the B-Tree AM does not have nor
require "livelock insurance").

This only happens after just over 30 minutes, while consistently doing
128 client exclusion constraint runs. This is pretty close to the most
stressful thing that you could throw at the implementation, so that's
really not too bad.

I believe that this regression occurred as a direct result of adding
"livelock insurance". Basically, we cannot be 100% sure that the
interleaving of WAL-logged things within and across transactions won't
be such that the "only", "oldest" session that gets to wait in the
second stage (the second possible call to
check_exclusion_or_unique_constraint(), from ExecInsertIndexTuples())
will really be the "oldest" XID. Another *older* xact could just get
in ahead of us, waiting on our promise tuple as we wait on its xact
end (maybe it updates some third tuple that we didn't see in that has
already committed...not 100% sure). Obviously XID assignment order
does not guarantee that things like heap insertion and index tuple
insertion occur in serial XID order, especially with confounding
factors like super deletion. And so, every once in a long while we
deadlock.

Now, the very fact that this happens at all actually demonstrates the
need for "livelock insurance", IMV. The fact that we reliably
terminate is *reassuring*, because livelocks are in general a
terrifying possibility. We cannot *truly* solve the unprincipled
deadlock problem without adding livelocks, I think. But what we have
here is very close to eliminating unprincipled deadlocks, while not
also adding any livelocks, AFAICT. I'd argue that that's good enough.

Of course, when I remove "livelock insurance", the problem ostensibly
goes away (I've waited on such a stress test session for a couple of
hours now, so this conclusion seems very likely to be correct). I
think that we should do nothing about this, other than document it as
possible in our comments on "livelock insurance". Again, it's very
unlikely to be a problem in the real world, especially since B-Trees
are unaffected.

Also, I should point out that one of those waiters doesn't look like
an insertion-related wait at all: "7200 waits for ShareLock on
transaction 64987901; blocked by process 7044". Looks like row locking
is an essential part of this deadlock, and ordinarily that isn't even
possible for exclusion constraints (unless the patch is hacked to make
the parser accept exclusion constraints for an UPSERT, as it has been
here). Not quite sure what exact  XactLockTableWait() call did this,
but don't think it was any of them within
check_exclusion_or_unique_constraint(), based on the lack of details
(TID and so on are not shown).

[1] http://momjian.us/main/blogs/pgblog/2012.html#January_20_2012
-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

74 matches

Mail list logo