Re: [HACKERS] Optimising Foreign Key checks

2013-06-10 Thread Noah Misch
On Sun, Jun 09, 2013 at 10:51:43AM +0100, Simon Riggs wrote:
 On 9 June 2013 02:12, Noah Misch n...@leadboat.com wrote:
  On Sat, Jun 08, 2013 at 08:20:42PM -0400, Robert Haas wrote:
  On Sat, Jun 8, 2013 at 5:41 PM, Noah Misch n...@leadboat.com wrote:
   Likewise; I don't see why we couldn't perform an optimistic check ASAP and
   schedule a final after-statement check when an early check fails.  That
   changes performance characteristics without changing semantics.
 
  ...this seems like it might have some promise; but what if the action
  we're performing isn't idempotent?  And how do we know?
 
  The action discussed so far is RI_FKey_check_ins().  It acquires a KEY SHARE
  lock (idempotent by nature) on a row that it finds using B-tree equality
  (presumed IMMUTABLE, thus idempotent).  RI_FKey_check_upd() is nearly the same
  action, so the same argument holds.  Before treating any other operation in
  the same way, one would need to conduct similar analysis.
 
 As long as we are talking about FKs only, then this approach can work.
 All we are doing is applying the locks slightly earlier than before.
 Once locked they will prevent any later violations, so we are safe
 from anybody except *ourselves* making changes that would invalidate
 the earlier check.  Trouble is, there are various ways I can see that
 happening, so making a check early doesn't allow you to avoid making
 the check later as well.

An UPDATE or DELETE that invalidates the check by modifying the PK row will
fire the usual RI_FKey_*_{upd,del} trigger on the PK table.  That will (a)
fail the transaction, (b) CASCADE to delete the new FK row, or (c) update the
new FK row's key column to NULL/DEFAULT.  If (a) happens we're of course fine.
If (b) or (c) happens, the FK's AFTER check already today becomes a no-op due
to the visibility test in RI_FKey_check().  Consequently, I don't think later
actions of the SQL statement can put us in a position to need a second check.
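
For concreteness, a minimal sketch of the three outcomes (the schema here is
hypothetical, not from the thread):

  CREATE TABLE pk (id int PRIMARY KEY);
  -- (a) NO ACTION/RESTRICT: deleting a still-referenced PK row errors out
  CREATE TABLE fk_a (id int REFERENCES pk);
  -- (b) CASCADE: the new FK row is deleted, so its queued check finds nothing
  CREATE TABLE fk_b (id int REFERENCES pk ON DELETE CASCADE);
  -- (c) SET NULL (or SET DEFAULT): the key column is rewritten, same effect
  CREATE TABLE fk_c (id int REFERENCES pk ON DELETE SET NULL);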

 AFAICS there are weird cases where changing the way FKs execute will
 change the way complex trigger applications will execute. I don't see
 a way to avoid that other than do nothing. Currently, we execute the
 checks following the normal order of execution rules for triggers.
 Every idea we've had so far changes that in some way; variously in
 major or minor ways, but changed nonetheless.

I've tried to envision a trigger-creates-missing-references scenario that
would notice the difference.  The trigger in question would, I'm presuming, be
an AFTER ROW INSERT trigger named such that it fires before the FK trigger.
The initial optimistic FK check would fail, so we would queue a traditional FK
AFTER trigger.  From that point, the scenario proceeds exactly as it proceeds
today.  Could you detail a problem scenario you have in mind?
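
For reference, the shape of trigger I have in mind, as a hypothetical sketch
(all names invented here):

  CREATE FUNCTION add_missing_parent() RETURNS trigger AS $$
  BEGIN
    -- create the referenced row if it is absent
    INSERT INTO parent (id)
    SELECT NEW.parent_id
    WHERE NOT EXISTS (SELECT 1 FROM parent WHERE id = NEW.parent_id);
    RETURN NULL;
  END;
  $$ LANGUAGE plpgsql;

  -- AFTER ROW triggers fire in name order; the quoted, digit-leading name
  -- sorts before the internal "RI_ConstraintTrigger_..." and so fires first.
  CREATE TRIGGER "0_add_missing_parent" AFTER INSERT ON child
  FOR EACH ROW EXECUTE PROCEDURE add_missing_parent();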

 Even the approach of deferring checks to allow them to be applied in a
 batch means we might change the way applications execute in detail.
 However, since the only possible change there would be to decrease the
 number of self-induced failures, that seems OK.
 
 So the question is how much change do we want to introduce? I'll guess
 "not much", rather than "lots" or "none".

The batch would need to fire at the trigger firing position of the *last*
queue entry it covers.  If you run a final FK check earlier than that, other
AFTER triggers that expect to run before the FK check and affect its outcome
may not yet have run.  In contrast, an FK check running later than usual is
mostly fine; whether a tuple has been locked does not affect the semantics of
later ordinary actions in the transaction.  (I say "ordinary" to exclude a
function like pgrowlocks() that makes a point to discern.  Also note that the
reasoning about timing only applies to definitive FK checks that can throw
errors; a tentative, soft-failure check is acceptable anytime after the new
row is in place.)

One can, however, at least construct problem cases.  When a query calls
functions that perform non-transactional actions, changing the timing of an
ERROR with respect to those calls changes application behavior.  Taking locks
in a different order affects the incidence of deadlocks.  Does compatibility
to that degree have much value?  I'd be happy to file those in the "not much
change" category.

 Proposal: Have a WHEN clause that accumulates values to be checked in
 a hash table up to work_mem in size, allowing us to eliminate the most
 common duplicates (just not *all* duplicates). If the value isn't a
 duplicate (or at least the first seen tuple with that value), we will
 queue up a check for later. That approach gives us *exactly* what we
 have now and works with the two common cases: i) few, mostly
 duplicated values, ii) many values, but clustered together. Then apply
 changes in batches at end of statement.

I'm still fine with this proposal, but it does not dramatically sidestep these
sorts of tricky situations.  Suppose a COPY inserts rows with fkcol=1 at TIDs
(0,3), 

Re: [HACKERS] Optimising Foreign Key checks

2013-06-10 Thread Noah Misch
On Mon, Jun 10, 2013 at 09:05:40AM +0100, Simon Riggs wrote:
 Your earlier comments argue that it is OK to make an early check. The
 above seems to argue the opposite, not sure.

I'll attempt to summarize.  If we execute a traditional error-throwing FK
check any earlier than we execute it today, applications with certain triggers
will notice a behavior change (probably not OK).  However, we *can* safely
execute an optimistic FK check as early as just after ExecInsertIndexTuples().
If the optimistic check is successful, later activity cannot invalidate its
success as concerns that particular inserted tuple.

 IIUYC we can do this:
 
 * search hash table for a value, if found, skip check and continue
 * if entry in hash not found make an immediate FK check
 * if the check passes, store value in hash table, if it fits
 * if check does not pass or value doesn't fit, queue up an after
 trigger queue entry

Why should "doesn't fit" prompt an after-statement recheck?

You do need a mechanism to invalidate table entries or the entire table.  As a
first cut at that, perhaps have parent table RI triggers empty any local hash
tables of the same FK relationship.  Note that invalidating table entries does
not invalidate skips already done on the strength of those entries.

 except we want to batch things a little, so same algo just with a
 little batching.
 
 * search hash table for a value, if found, skip check and continue
 * if entry in hash not found add to next batch of checks and continue
 * when batch full make immediate FK checks for whole batch in one SQL stmt
 * if a check passes, store value in hash table, if it fits
 * if check does not pass or value doesn't fit, queue up an after
 trigger queue entry
 * when executing queue, use batches to reduce number of SQL stmts

I think this all can be made to work, too.
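
For illustration, the "immediate FK checks for whole batch" step could reuse
the batched statement suggested elsewhere in this thread (a sketch; batch
size arbitrary):

  SELECT pk.pkcol FROM ONLY pktable pk
  WHERE pk.pkcol = ANY (ARRAY[$1, $2, ..., $100])
  FOR KEY SHARE OF pk;

Keys returned here have passed and can go into the hash table; keys missing
from the result would get traditional after-trigger queue entries.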

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] Optimising Foreign Key checks

2013-06-09 Thread Simon Riggs
On 9 June 2013 02:12, Noah Misch n...@leadboat.com wrote:
 On Sat, Jun 08, 2013 at 08:20:42PM -0400, Robert Haas wrote:
 On Sat, Jun 8, 2013 at 5:41 PM, Noah Misch n...@leadboat.com wrote:
  Likewise; I don't see why we couldn't perform an optimistic check ASAP and
  schedule a final after-statement check when an early check fails.  That
  changes performance characteristics without changing semantics.

 ...this seems like it might have some promise; but what if the action
 we're performing isn't idempotent?  And how do we know?

 The action discussed so far is RI_FKey_check_ins().  It acquires a KEY SHARE
 lock (idempotent by nature) on a row that it finds using B-tree equality
 (presumed IMMUTABLE, thus idempotent).  RI_FKey_check_upd() is nearly the same
 action, so the same argument holds.  Before treating any other operation in
 the same way, one would need to conduct similar analysis.

As long as we are talking about FKs only, then this approach can work.
All we are doing is applying the locks slightly earlier than before.
Once locked they will prevent any later violations, so we are safe
from anybody except *ourselves* making changes that would invalidate
the earlier check.  Trouble is, there are various ways I can see that
happening, so making a check early doesn't allow you to avoid making
the check later as well.

AFAICS there are weird cases where changing the way FKs execute will
change the way complex trigger applications will execute. I don't see
a way to avoid that other than do nothing. Currently, we execute the
checks following the normal order of execution rules for triggers.
Every idea we've had so far changes that in some way; variously in
major or minor ways, but changed nonetheless.

Even the approach of deferring checks to allow them to be applied in a
batch means we might change the way applications execute in detail.
However, since the only possible change there would be to decrease the
number of self-induced failures, that seems OK.

So the question is how much change do we want to introduce? I'll guess
"not much", rather than "lots" or "none".

Proposal: Have a WHEN clause that accumulates values to be checked in
a hash table up to work_mem in size, allowing us to eliminate the most
common duplicates (just not *all* duplicates). If the value isn't a
duplicate (or at least the first seen tuple with that value), we will
queue up a check for later. That approach gives us *exactly* what we
have now and works with the two common cases: i) few, mostly
duplicated values, ii) many values, but clustered together. Then apply
changes in batches at end of statement.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Optimising Foreign Key checks

2013-06-09 Thread Andres Freund
On 2013-06-01 09:41:13 +0100, Simon Riggs wrote:
 FK checks can be expensive, especially when loading large volumes of
 data into an existing table or partition. A couple of ideas for
 improving performance are discussed here:

Another idea would be to optimize away the row level locks if we have a
table level lock that already provides more protection than the asked
for row level lock. In bulk data loading scenarios that's not uncommon
and can matter quite a bit, especially if loading in parallel, since
both the WAL traffic and the dirtying of pages are considerably
reduced.

Should be implementable relatively easily in heap_lock_tuple without
being visible to the outside.
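
For example, a bulk load that already does something like the following could
have its per-row KEY SHARE locks skipped internally (a sketch; the file path
is a placeholder):

  BEGIN;
  -- SHARE mode blocks all UPDATE/DELETE on the referenced table, strictly
  -- stronger protection than the per-row KEY SHARE locks the FK checks take
  LOCK TABLE pgbench_branches IN SHARE MODE;
  COPY pgbench_accounts FROM '/path/to/accounts.dat';
  COMMIT;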

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Optimising Foreign Key checks

2013-06-09 Thread Greg Stark
On Sun, Jun 9, 2013 at 10:51 AM, Simon Riggs si...@2ndquadrant.com wrote:
 AFAICS there are weird cases where changing the way FKs execute will
 change the way complex trigger applications will execute. I don't see
 a way to avoid that other than do nothing. Currently, we execute the
 checks following the normal order of execution rules for triggers.
 Every idea we've had so far changes that in some way; variously in
 major or minor ways, but changed nonetheless.

The obvious case to handle would be if someone has a trigger to
automatically create any missing references.


-- 
greg




Re: [HACKERS] Optimising Foreign Key checks

2013-06-09 Thread Simon Riggs
On 9 June 2013 14:59, Greg Stark st...@mit.edu wrote:
 On Sun, Jun 9, 2013 at 10:51 AM, Simon Riggs si...@2ndquadrant.com wrote:
 AFAICS there are weird cases where changing the way FKs execute will
 change the way complex trigger applications will execute. I don't see
 a way to avoid that other than do nothing. Currently, we execute the
 checks following the normal order of execution rules for triggers.
 Every idea we've had so far changes that in some way; variously in
 major or minor ways, but changed nonetheless.

 The obvious case to handle would be if someone has a trigger to
 automatically create any missing references.

Exactly my thoughts.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Optimising Foreign Key checks

2013-06-08 Thread Noah Misch
On Tue, Jun 04, 2013 at 02:45:17PM +0100, Simon Riggs wrote:
  On Sun, Jun 02, 2013 at 10:45:21AM +0100, Simon Riggs wrote:
  For clarity the 4 problems are
  1. SQL execution overhead
  2. Memory usage
  3. Memory scrolling
  4. Locking overhead, specifically FPWs and WAL records from FK checks
  probably in that order or thereabouts.

 Let's rethink things to put a few options on the table and see what we get...

 2. Don't store FK events in the after trigger queue at all, but apply
 them as we go. That solves problems2 and 3. That could be considered
 to be in violation of the SQL standard, which requires us to apply the
 checks at the end of the statement. We already violate the standard
 with regard to uniqueness checks, so doing it here doesn't seem
 unacceptable.

I wouldn't like to see that compliance bug propagate to other constraint
types.  What clauses in the standard demand end-of-statement timing, anyway?

What if we followed the example of deferred UNIQUE: attempt FK checks as we go
and enqueue an after-trigger recheck when such an initial test fails?
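
For reference, the deferred-UNIQUE precedent already works that way today:

  CREATE TABLE t (id int UNIQUE DEFERRABLE INITIALLY IMMEDIATE);
  INSERT INTO t VALUES (1), (2);
  -- The transient duplicate makes the eager index check fail softly; a
  -- recheck is queued and re-run at end of statement, where it passes.
  UPDATE t SET id = id + 1;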

 Implementation: Given we know that COPY uses a ring buffer, and given
 your earlier thoughts on use of a batched SQL, I have a new
 suggestion. Every time the ring buffer fills, we go through the last
 buffers accessed, pulling out all the PKs and then issue them as a
 single SQL statement (as above). We can do that manually, or using the
 block scan mentioned previously. This uses batched SQL to solve
 problem1. It doesn't build up a large data structure in memory,
 problem2, and it also solves problem3 by accessing data blocks before
 they fall out of the ring buffer. If there are no duplicates in the
 referenced table, then this behaviour will do as much as possible to
 make accesses to the referenced table also use a small working set.
 (We may even wish to consider making the batched SQL use a separate
 ring buffer for RI accesses). That approach doesn't make any
 assumptions about duplicates.

If this can be made standard-compliant, it sounds most fruitful.

 Perhaps another way would be to avoid very large COPY statements
 altogether, breaking down loads into smaller pieces.

True.  It would be nice for the system to not need such hand-holding, but
that's a largely-adequate tool for coping in the field.

Thanks,
nm

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] Optimising Foreign Key checks

2013-06-08 Thread Simon Riggs
On 8 June 2013 15:30, Noah Misch n...@leadboat.com wrote:
 On Tue, Jun 04, 2013 at 02:45:17PM +0100, Simon Riggs wrote:
  On Sun, Jun 02, 2013 at 10:45:21AM +0100, Simon Riggs wrote:
  For clarity the 4 problems are
  1. SQL execution overhead
  2. Memory usage
  3. Memory scrolling
  4. Locking overhead, specifically FPWs and WAL records from FK checks
  probably in that order or thereabouts.

 Let's rethink things to put a few options on the table and see what we get...

 2. Don't store FK events in the after trigger queue at all, but apply
 them as we go. That solves problems2 and 3. That could be considered
 to be in violation of the SQL standard, which requires us to apply the
 checks at the end of the statement. We already violate the standard
 with regard to uniqueness checks, so doing it here doesn't seem
 unacceptable.

 I wouldn't like to see that compliance bug propagate to other constraint
 types.  What clauses in the standard demand end-of-statement timing, anyway?

 What if we followed the example of deferred UNIQUE: attempt FK checks as we go
 and enqueue an after-trigger recheck when such an initial test fails?

The copy I have access to (2008 draft), 4.17.2 Checking of constraints

The checking of a constraint depends on its constraint mode within
the current SQL-transaction. Whenever an SQL-statement is executed,
every constraint whose mode is immediate is checked, at a certain
point after any changes to SQL-data and schemas resulting from that
execution have been effected, to see if it is satisfied. A constraint
is satisfied if and only if the applicable search condition included
in its descriptor evaluates to True or Unknown. If any constraint is
not satisfied, then any changes to SQL-data or schemas resulting from
executing that statement are canceled. (See the General Rules of
Subclause 13.5, “SQL procedure statement”.)

NOTE 31 — This includes SQL-statements that are executed as a direct
result or an indirect result of executing a different SQL-statement.
It also includes statements whose effects explicitly or implicitly
include setting the constraint mode to immediate. 


I can't see anything there that stops me applying locks as we go, but
I feel like someone will object...

This point seems crucial to the particular approach we take, so I need
wider input.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Optimising Foreign Key checks

2013-06-08 Thread Noah Misch
On Sat, Jun 08, 2013 at 09:39:14PM +0100, Simon Riggs wrote:
 On 8 June 2013 15:30, Noah Misch n...@leadboat.com wrote:
  On Tue, Jun 04, 2013 at 02:45:17PM +0100, Simon Riggs wrote:

  2. Don't store FK events in the after trigger queue at all, but apply
  them as we go. That solves problems2 and 3. That could be considered
  to be in violation of the SQL standard, which requires us to apply the
  checks at the end of the statement. We already violate the standard
  with regard to uniqueness checks, so doing it here doesn't seem
  unacceptable.
 
  I wouldn't like to see that compliance bug propagate to other constraint
  types.  What clauses in the standard demand end-of-statement timing, anyway?
 
   What if we followed the example of deferred UNIQUE: attempt FK checks as we
   go and enqueue an after-trigger recheck when such an initial test fails?
 
 The copy I have access to (2008 draft), 4.17.2 Checking of constraints
 
 The checking of a constraint depends on its constraint mode within
 the current SQL-transaction. Whenever an SQL-statement is executed,
 every constraint whose mode is immediate is checked, at a certain
 point after any changes to SQL-data and schemas resulting from that
 execution have been effected, to see if it is satisfied. A constraint
 is satisfied if and only if the applicable search condition included
 in its descriptor evaluates to True or Unknown. If any constraint is
 not satisfied, then any changes to SQL-data or schemas resulting from
 executing that statement are canceled. (See the General Rules of
 Subclause 13.5, “SQL procedure statement”.)
 
 NOTE 31 — This includes SQL-statements that are executed as a direct
 result or an indirect result of executing a different SQL-statement.
 It also includes statements whose effects explicitly or implicitly
 include setting the constraint mode to immediate. 

This does appear to specify FK timing semantics like PostgreSQL gives today.
Namely, it does not permit a FK-induced error when later actions of the query
that prompted the check could possibly remedy the violation.

 I can't see anything there that stops me applying locks as we go, but

Likewise; I don't see why we couldn't perform an optimistic check ASAP and
schedule a final after-statement check when an early check fails.  That
changes performance characteristics without changing semantics.

 I feel like someone will object...
 
 This point seems crucial to the particular approach we take, so I need
 wider input.

Agreed.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] Optimising Foreign Key checks

2013-06-08 Thread Robert Haas
On Sat, Jun 8, 2013 at 5:41 PM, Noah Misch n...@leadboat.com wrote:
 This does appear to specify FK timing semantics like PostgreSQL gives today.
 Namely, it does not permit a FK-induced error when later actions of the query
 that prompted the check could possibly remedy the violation.

Yeah.  Standard or no standard, I think we'd have unhappy users if we
broke that.

 I can't see anything there that stops me applying locks as we go, but

Not sure I follow that bit but...

 Likewise; I don't see why we couldn't perform an optimistic check ASAP and
 schedule a final after-statement check when an early check fails.  That
 changes performance characteristics without changing semantics.

...this seems like it might have some promise; but what if the action
we're performing isn't idempotent?  And how do we know?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Optimising Foreign Key checks

2013-06-08 Thread Noah Misch
On Sat, Jun 08, 2013 at 08:20:42PM -0400, Robert Haas wrote:
 On Sat, Jun 8, 2013 at 5:41 PM, Noah Misch n...@leadboat.com wrote:
  Likewise; I don't see why we couldn't perform an optimistic check ASAP and
  schedule a final after-statement check when an early check fails.  That
  changes performance characteristics without changing semantics.
 
 ...this seems like it might have some promise; but what if the action
 we're performing isn't idempotent?  And how do we know?

The action discussed so far is RI_FKey_check_ins().  It acquires a KEY SHARE
lock (idempotent by nature) on a row that it finds using B-tree equality
(presumed IMMUTABLE, thus idempotent).  RI_FKey_check_upd() is nearly the same
action, so the same argument holds.  Before treating any other operation in
the same way, one would need to conduct similar analysis.
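
For reference, the check in question boils down to the per-row query quoted
elsewhere in this thread; repeating it merely re-takes a lock already held:

  SELECT 1 FROM ONLY pktable x WHERE pkcol = $1 FOR KEY SHARE OF x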

Thanks,
nm

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] Optimising Foreign Key checks

2013-06-05 Thread Greg Stark
On Sat, Jun 1, 2013 at 9:41 AM, Simon Riggs si...@2ndquadrant.com wrote:
 COMMIT;
 The inserts into order_line repeatedly execute checks against the same
 ordid. Deferring and then de-duplicating the checks would optimise the
 transaction.

 Proposal: De-duplicate multiple checks against same value. This would
 be implemented by keeping a hash of rows that we had already either
 inserted and/or locked as the transaction progresses, so we can use
 the hash to avoid queuing up after triggers.


Fwiw the reason we don't do that now is that the rows might be later
deleted within the same transaction (or even the same statement I
think). If they are then the trigger needs to be skipped for that row
but still needs to happen for other rows. So you need to do some kind
of book-keeping to keep track of that. The easiest way was just to do
the check independently for each row. I think there's a comment about
this in the code.
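
A sketch of that unusual case, using the order/order_line example from this
thread (and assuming the FK is DEFERRABLE, so the queued check outlives the
inserting statement):

  BEGIN;
  SET CONSTRAINTS ALL DEFERRED;
  INSERT INTO order_line VALUES (1, 1);      -- queues an FK check for orderid 1
  DELETE FROM order_line WHERE orderid = 1;  -- the row dies before the check runs
  COMMIT;  -- the check must skip the dead row, but still fire for any other
           -- orderid=1 row that survived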

I think you're right that this should be optimized because in the vast
majority of cases you don't end up deleting rows and we're currently
doing lots of redundant checks. But you need to make sure you don't
break the unusual case entirely.

-- 
greg




Re: [HACKERS] Optimising Foreign Key checks

2013-06-05 Thread Hannu Krosing
On 06/05/2013 11:37 AM, Greg Stark wrote:
 On Sat, Jun 1, 2013 at 9:41 AM, Simon Riggs si...@2ndquadrant.com wrote:
 COMMIT;
 The inserts into order_line repeatedly execute checks against the same
 ordid. Deferring and then de-duplicating the checks would optimise the
 transaction.

 Proposal: De-duplicate multiple checks against same value. This would
 be implemented by keeping a hash of rows that we had already either
 inserted and/or locked as the transaction progresses, so we can use
 the hash to avoid queuing up after triggers.

 Fwiw the reason we don't do that now is that the rows might be later
 deleted within the same transaction (or even the same statement I
 think). If they are then the trigger needs to be skipped for that row
 but still needs to happen for other rows. So you need to do some kind
 of book-keeping to keep track of that. The easiest way was just to do
 the check independently for each row. I think there's a comment about
 this in the code.
A simple counter on each value should also solve this.
Increment for each row, decrement for each deletion,
then run the tests on values where counter is > 0.
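
Roughly, in illustrative SQL (the real counter would live in backend memory,
not in a table):

  CREATE TEMP TABLE pending_fk_checks (fk_value int PRIMARY KEY, refcount int);
  -- FK row inserted: refcount + 1 for its key value
  -- FK row deleted:  refcount - 1 for its key value
  -- at check time, only keys still referenced need the real FK probe:
  SELECT fk_value FROM pending_fk_checks WHERE refcount > 0;
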
 I think you're right that this should be optimized because in the vast
 majority of cases you don't end up deleting rows and we're currently
 doing lots of redundant checks. But you need to make sure you don't
 break the unusual case entirely.



-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ





Re: [HACKERS] Optimising Foreign Key checks

2013-06-04 Thread Simon Riggs
On 4 June 2013 01:54, Noah Misch n...@leadboat.com wrote:
 On Sun, Jun 02, 2013 at 10:45:21AM +0100, Simon Riggs wrote:
 For clarity the 4 problems are
 1. SQL execution overhead
 2. Memory usage
 3. Memory scrolling
 4. Locking overhead, specifically FPWs and WAL records from FK checks
 probably in that order or thereabouts.

 The above is why I went for a technique that avoided SQL execution
 entirely, as well as conserving memory by de-duplicating the values in
 a hash table as we go, which avoids all three problems. The fourth was
 solved by the more extreme approach to locking.

 That nicely frames the benefits of your proposals.  Makes sense.

 I think it might be worth considering joining the after trigger queue
 directly to the referenced table(s), something like this...

 CREATE OR REPLACE FUNCTION after_trigger_queue() RETURNS SETOF tid AS $$
 ...
 $$ LANGUAGE SQL;

 EXPLAIN
 SELECT 1 FROM ONLY order
 WHERE orderid IN
 (SELECT orderid FROM ONLY order_line
  WHERE ctid IN (SELECT after_trigger_queue FROM after_trigger_queue()))
 FOR KEY SHARE;

 Agreed.

Hmm, although the above is cute, it still falls down by requiring lots
of memory (problem 2) and churning memory again (problem 3).

Let's rethink things to put a few options on the table and see what we get...

1. Store all the after trigger events in a big queue, then execute
later. That requires us to scroll it to disk to solve problem2, but it
still falls foul of problem3, which becomes severe on big data.

Implementation: Implement scrolling to disk for the after trigger
queue. Then implement event batching in the RI handler code, similar
to what is suggested above or directly as suggested by Noah on
previous post.

2. Don't store FK events in the after trigger queue at all, but apply
them as we go. That solves problems2 and 3. That could be considered
to be in violation of the SQL standard, which requires us to apply the
checks at the end of the statement. We already violate the standard
with regard to uniqueness checks, so doing it here doesn't seem
unacceptable.

Implementation: Given we know that COPY uses a ring buffer, and given
your earlier thoughts on use of a batched SQL, I have a new
suggestion. Every time the ring buffer fills, we go through the last
buffers accessed, pulling out all the PKs and then issue them as a
single SQL statement (as above). We can do that manually, or using the
block scan mentioned previously. This uses batched SQL to solve
problem1. It doesn't build up a large data structure in memory,
problem2, and it also solves problem3 by accessing data blocks before
they fall out of the ring buffer. If there are no duplicates in the
referenced table, then this behaviour will do as much as possible to
make accesses to the referenced table also use a small working set.
(We may even wish to consider making the batched SQL use a separate
ring buffer for RI accesses). That approach doesn't make any
assumptions about duplicates.

3. Strip the PK values from the rows at access time and store them in a
hash table, de-duplicating as we go. If that gets too big, write
seldom accessed values to a temporary table which will automatically
scroll to disk when it exceeds temp_buffers. We don't create the temp
table until we hit work_mem, so we only create that for larger
statements. Then at the end of the statement, copy the hash table into
the temp table and then join to the referenced table directly. We
don't guarantee the list of PK values is completely unique, but we
will have significantly reduced the number of duplicates to check.
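
The end-of-statement step could then be a single pass over the spilled values
(a sketch; the temp table name is hypothetical):

  -- lock every matched PK row in one statement
  SELECT 1 FROM ONLY pktable pk
  WHERE pk.pkcol IN (SELECT fk_value FROM spilled_fk_values)
  FOR KEY SHARE OF pk;

  -- any spilled value with no PK partner is an FK violation
  SELECT s.fk_value FROM spilled_fk_values s
  WHERE NOT EXISTS (SELECT 1 FROM ONLY pktable pk WHERE pk.pkcol = s.fk_value);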

Comparison:
Approach 1 suffers from memory scrolling, problem3. But it follows the
SQL standard.
Approach 2 solves all performance problems but doesn't follow the
letter of the standard (AIUI - anyone see differently?)
Approach 3 solves all performance problems and follows the SQL standard

Perhaps another way would be to avoid very large COPY statements
altogether, breaking down loads into smaller pieces. For example
pg_dump could issue COPY in smaller chunks rather than one big COPY.
That would mostly avoid problem2 and problem3 and is more realistic
with wild data.

Based on that thought, implementation option 1 looks most reasonable.
Which in short summary is a) scroll after trigger queue to disk and b)
batch SQL values together as Noah originally suggested, or in the SQL
above.

Let me know what you think of that analysis.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Optimising Foreign Key checks

2013-06-03 Thread Jim Nasby

On 6/2/13 4:45 AM, Simon Riggs wrote:

  Will this add too much cost where it doesn't help?  I don't know what to
  predict there.  There's the obvious case of trivial transactions with no more
  than one referential integrity check per FK, but there's also the case of a
  transaction with many FK checks all searching different keys.  If the hash hit
  rate (key duplication rate) is low, the hash can consume considerably more
  memory than the trigger queue without preventing many RI queries.  What sort
  of heuristic could we use to avoid pessimizing such cases?

 I've struggled with that for a while now. Probably all we can say is
 that there might be one, and if there is not, then manual decoration
 of the transaction will be the way to go.


Just an idea... each backend could keep a store that indicates what FKs this 
would help with. For example, any time we hit a transaction that exercises the 
same FK more than once, we stick the OID of the FK constraint (or maybe of the 
two tables) into a hash that's in that backend's top memory context. (Or if we 
want to be real fancy, shared mem).
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net




Re: [HACKERS] Optimising Foreign Key checks

2013-06-03 Thread Simon Riggs
On 3 June 2013 19:41, Jim Nasby j...@nasby.net wrote:
 On 6/2/13 4:45 AM, Simon Riggs wrote:

 Will this add too much cost where it doesn't help?  I don't know what to
 predict there.  There's the obvious case of trivial transactions with no more
 than one referential integrity check per FK, but there's also the case of a
 transaction with many FK checks all searching different keys.  If the hash hit
 rate (key duplication rate) is low, the hash can consume considerably more
 memory than the trigger queue without preventing many RI queries.  What sort
 of heuristic could we use to avoid pessimizing such cases?

 I've struggled with that for a while now. Probably all we can say is
 that there might be one, and if there is not, then manual decoration
 of the transaction will be the way to go.


 Just an idea... each backend could keep a store that indicates what FKs this
 would help with. For example, any time we hit a transaction that exercises
 the same FK more than once, we stick the OID of the FK constraint (or maybe
 of the two tables) into a hash that's in that backend's top memory context.
 (Or if we want to be real fancy, shared mem).

Yes, that principle would work. We could just store that on the
relcache entry for a table.

It requires a little bookkeeping to implement that heuristic. I'm sure
other ways exist as well.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Optimising Foreign Key checks

2013-06-03 Thread Noah Misch
On Sun, Jun 02, 2013 at 10:45:21AM +0100, Simon Riggs wrote:
 For clarity the 4 problems are
 1. SQL execution overhead
 2. Memory usage
 3. Memory scrolling
 4. Locking overhead, specifically FPWs and WAL records from FK checks
 probably in that order or thereabouts.
 
 The above is why I went for a technique that avoided SQL execution
 entirely, as well as conserving memory by de-duplicating the values in
 a hash table as we go, which avoids all three problems. The fourth was
 solved by the more extreme approach to locking.

That nicely frames the benefits of your proposals.  Makes sense.

 I think it might be worth considering joining the after trigger queue
 directly to the referenced table(s), something like this...
 
 CREATE OR REPLACE FUNCTION after_trigger_queue() RETURNS SETOF tid AS $$
 ...
 $$ LANGUAGE SQL;
 
 EXPLAIN
 SELECT 1 FROM ONLY order
 WHERE orderid IN
(SELECT orderid FROM ONLY order_line
 WHERE ctid IN (SELECT after_trigger_queue FROM after_trigger_queue()))
 FOR KEY SHARE;

Agreed.

 But we could optimise that even further if we had a BlockScan, which
 would be a block-oriented version of the tid scan where we simply
 provide a bitmap of blocks needing to be scanned, just like the output
 of an BitmapIndexScan. The reason for mentioning that here is that
 parallel query will eventually need the ability to do a scan of a
 subset of blocks, as does tablesample. So I can see 3 callers of such
 a Scan type.

Interesting.  I was going to say that could lock more keys than needed, but
perhaps you would afterward filter by xmin/cmin.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] Optimising Foreign Key checks

2013-06-02 Thread Simon Riggs
On 1 June 2013 21:27, Noah Misch n...@leadboat.com wrote:
 On Sat, Jun 01, 2013 at 09:41:13AM +0100, Simon Riggs wrote:
 FK checks can be expensive, especially when loading large volumes of
 data into an existing table or partition. A couple of ideas for
 improving performance are discussed here:

 1. Use Case: Bulk loading
 COPY pgbench_accounts;  -- references pgbench_branches with many
 repeated values

 Proposal: Transactions that need multiple checks can be optimised by
 simply LOCKing the whole referenced table, once. We can then hold the
 referenced table as a Hash, like we do with a Hash Join (it's almost
 exactly the same thing). This works in two ways: it speeds up checks
 and it also reduces the locking overhead.

 2. Use Case: Transactional repetition
 BEGIN;
 INSERT INTO order VALUES (ordid, )
 INSERT INTO order_line VALUES (ordid, 1, .)
 INSERT INTO order_line VALUES (ordid, 2, .)
 INSERT INTO order_line VALUES (ordid, 3, .)
 INSERT INTO order_line VALUES (ordid, 4, .)

 Proposal: De-duplicate multiple checks against same value. This would
 be implemented by keeping a hash of rows that we had already either
 inserted and/or locked as the transaction progresses, so we can use
 the hash to avoid queuing up after triggers.

 I find (2) more promising than (1).  It helps automatically, and it helps in
 more cases.  The main advantage of (1) is avoiding the writes of tuple locks
 onto the PK table.  Therefore, I expect (1) to distinguish itself over (2)
 when the referenced values are _not_ repeated too much.  If referenced values
 are repeated, tuple locking costs would tend to disappear into the noise after
 the deduplication of (2).

 In both cases we are building up a local hash table with values and
 then using those values to avoid queuing constraint triggers. So code
 is similar for both.

 Thoughts?

Thanks for your detailed reply.


 Will this add too much cost where it doesn't help?  I don't know what to
 predict there.  There's the obvious case of trivial transactions with no more
 than one referential integrity check per FK, but there's also the case of a
 transaction with many FK checks all searching different keys.  If the hash hit
 rate (key duplication rate) is low, the hash can consume considerably more
 memory than the trigger queue without preventing many RI queries.  What sort
 of heuristic could we use to avoid pessimizing such cases?

I've struggled with that for a while now. Probably all we can say is
that there might be one, and if there is not, then manual decoration
of the transaction will be the way to go.

 Same-transaction UPDATE or DELETE of the PK table, as well as subtransaction
 abort, potentially invalidates hash entries.  I recommend thinking relatively
 early about how best to handle that.

Thanks, good point.

 Before deciding what to think overall, I needed to see a benchmark.  I ran
 this simple one based on your scenarios:

 BEGIN;
 CREATE TABLE order (orderid int PRIMARY KEY);
 CREATE TABLE order_line (
 orderid int,
 lineno int,
 PRIMARY KEY (orderid, lineno),
 FOREIGN KEY (orderid) REFERENCES order
 );
 INSERT INTO order VALUES (1);
 INSERT INTO order_line SELECT 1, n FROM generate_series(1,100) t(n);
 ROLLBACK;

 See attached output from perf report -s parent -g graph,5,caller; I suggest
 browsing under less -S.  It confirms that the expensive part is something
 your proposals would address.

Thanks for posting this. It shows nicely that there is a problem with
repeated execution of SQL in constraint triggers. However, that isn't
the only problem since we must also consider memory usage and memory
scrolling. Currently, the after trigger queue builds up in memory and
doesn't scroll to disk, so overall memory usage is a problem in many
cases. Memory scrolling is also a problem since the after trigger
queue records the tids against which we need to execute triggers;
this causes us to re-access blocks that had scrolled out of memory,
thus defeating the bulk access strategy, as well as spoiling cache.

For clarity the 4 problems are
1. SQL execution overhead
2. Memory usage
3. Memory scrolling
4. Locking overhead, specifically FPWs and WAL records from FK checks
probably in that order or thereabouts.

The above is why I went for a technique that avoided SQL execution
entirely, as well as conserving memory by de-duplicating the values in
a hash table as we go, which avoids all three problems. The fourth was
solved by the more extreme approach to locking.


 A different way to help the bulk loading case would be to lock more keys with
 a single query.  Currently, we issue a query like this for every INSERTed row:

   SELECT 1 FROM ONLY pktable WHERE pkcol = $1 FOR KEY SHARE OF x

 We could instead consider queued tuples in batches:

   SELECT 1 FROM ONLY pktable WHERE pkcol = ANY (ARRAY[$1,$2,...,$100])
   FOR KEY SHARE OF x

 This would have the advantage of not adding a long-lived, potentially-large
 data structure and not depending on the rate of referenced key duplication.

[HACKERS] Optimising Foreign Key checks

2013-06-01 Thread Simon Riggs
FK checks can be expensive, especially when loading large volumes of
data into an existing table or partition. A couple of ideas for
improving performance are discussed here:

1. Use Case: Bulk loading
COPY pgbench_accounts;  -- references pgbench_branches with many
repeated values

Proposal: Transactions that need multiple checks can be optimised by
simply LOCKing the whole referenced table, once. We can then hold the
referenced table as a Hash, like we do with a Hash Join (it's almost
exactly the same thing). This works in two ways: it speeds up checks
and it also reduces the locking overhead.

This would require explicit permission of the user, which would be
given by a new table parameter, set on the referenced table.

  WITH (foreign_key_lock_level = row | table)

Setting this would lock out changes on that table, so would only be
suitable for read-mostly tables. But that is exactly the most
frequently referenced table in a FK anyway, reference tables, so the
optimisation is appropriate in probably the majority of cases.
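
Usage might look like this (a sketch of the proposed, not-yet-existing
parameter):

  ALTER TABLE pgbench_branches SET (foreign_key_lock_level = table);
  -- subsequent bulk loads into referencing tables would take one table-level
  -- lock on pgbench_branches instead of a KEY SHARE tuple lock per key
  COPY pgbench_accounts FROM '/path/to/accounts.dat';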

2. Use Case: Transactional repetition
BEGIN;
INSERT INTO order VALUES (ordid, )
INSERT INTO order_line VALUES (ordid, 1, .)
INSERT INTO order_line VALUES (ordid, 2, .)
INSERT INTO order_line VALUES (ordid, 3, .)
INSERT INTO order_line VALUES (ordid, 4, .)
...
COMMIT;
The inserts into order_line repeatedly execute checks against the same
ordid. Deferring and then de-duplicating the checks would optimise the
transaction.

Proposal: De-duplicate multiple checks against same value. This would
be implemented by keeping a hash of rows that we had already either
inserted and/or locked as the transaction progresses, so we can use
the hash to avoid queuing up after triggers.

We could also use this technique to de-duplicate checks within a
single statement.

In both cases we are building up a local hash table with values and
then using those values to avoid queuing constraint triggers. So code
is similar for both.

Thoughts?

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Optimising Foreign Key checks

2013-06-01 Thread Noah Misch
On Sat, Jun 01, 2013 at 09:41:13AM +0100, Simon Riggs wrote:
 FK checks can be expensive, especially when loading large volumes of
 data into an existing table or partition. A couple of ideas for
 improving performance are discussed here:
 
 1. Use Case: Bulk loading
 COPY pgbench_accounts;  -- references pgbench_branches with many
 repeated values
 
 Proposal: Transactions that need multiple checks can be optimised by
 simply LOCKing the whole referenced table, once. We can then hold the
 referenced table as a Hash, like we do with a Hash Join (it's almost
 exactly the same thing). This works in two ways: it speeds up checks
 and it also reduces the locking overhead.

 2. Use Case: Transactional repetition
 BEGIN;
 INSERT INTO order VALUES (ordid, )
 INSERT INTO order_line VALUES (ordid, 1, .)
 INSERT INTO order_line VALUES (ordid, 2, .)
 INSERT INTO order_line VALUES (ordid, 3, .)
 INSERT INTO order_line VALUES (ordid, 4, .)

 Proposal: De-duplicate multiple checks against same value. This would
 be implemented by keeping a hash of rows that we had already either
 inserted and/or locked as the transaction progresses, so we can use
 the hash to avoid queuing up after triggers.

I find (2) more promising than (1).  It helps automatically, and it helps in
more cases.  The main advantage of (1) is avoiding the writes of tuple locks
onto the PK table.  Therefore, I expect (1) to distinguish itself over (2)
when the referenced values are _not_ repeated too much.  If referenced values
are repeated, tuple locking costs would tend to disappear into the noise after
the deduplication of (2).

 In both cases we are building up a local hash table with values and
 then using those values to avoid queuing constraint triggers. So code
 is similar for both.
 
 Thoughts?

Will this add too much cost where it doesn't help?  I don't know what to
predict there.  There's the obvious case of trivial transactions with no more
than one referential integrity check per FK, but there's also the case of a
transaction with many FK checks all searching different keys.  If the hash hit
rate (key duplication rate) is low, the hash can consume considerably more
memory than the trigger queue without preventing many RI queries.  What sort
of heuristic could we use to avoid pessimizing such cases?

Same-transaction UPDATE or DELETE of the PK table, as well as subtransaction
abort, potentially invalidates hash entries.  I recommend thinking relatively
early about how best to handle that.


Before deciding what to think overall, I needed to see a benchmark.  I ran
this simple one based on your scenarios:

BEGIN;
CREATE TABLE order (orderid int PRIMARY KEY);
CREATE TABLE order_line (
orderid int,
lineno int,
PRIMARY KEY (orderid, lineno),
FOREIGN KEY (orderid) REFERENCES order
);
INSERT INTO order VALUES (1);
INSERT INTO order_line SELECT 1, n FROM generate_series(1,100) t(n);
ROLLBACK;

See attached output from perf report -s parent -g graph,5,caller; I suggest
browsing under less -S.  It confirms that the expensive part is something
your proposals would address.


A different way to help the bulk loading case would be to lock more keys with
a single query.  Currently, we issue a query like this for every INSERTed row:

  SELECT 1 FROM ONLY pktable WHERE pkcol = $1 FOR KEY SHARE OF x

We could instead consider queued tuples in batches:

  SELECT 1 FROM ONLY pktable WHERE pkcol = ANY (ARRAY[$1,$2,...,$100])
  FOR KEY SHARE OF x

This would have the advantage of not adding a long-lived, potentially-large
data structure and not depending on the rate of referenced key duplication.
But it would not help non-bulk scenarios.  (However, the user could make your
example for (2) become a bulk scenario by deferring the FK constraint.)
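
For the record, deferring the FK in example (2) would look like this (a
sketch; assumes the constraint was declared DEFERRABLE, and the constraint
name is invented):

  BEGIN;
  SET CONSTRAINTS order_line_orderid_fkey DEFERRED;
  INSERT INTO order VALUES (ordid, ...);
  INSERT INTO order_line VALUES (ordid, 1, ...);
  INSERT INTO order_line VALUES (ordid, 2, ...);
  COMMIT;  -- all queued FK checks run here, where batching could apply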

Thanks,
nm

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com
# Events: 20K cycles
#
# Overhead  Parent symbol
#
100.00%  [other]
  --- (nil)
      --96.88%-- __libc_start_main
                 generic_start_main.isra.0
         --96.77%-- main
                    PostmasterMain
            --95.62%-- PostgresMain
               --95.51%-- PortalRun
                          PortalRunMulti
                  --95.50%-- ProcessQuery
                     --71.37%-- standard_ExecutorFinish