Re: Deferred constraint trigger semantics

Bryn Llewellyn Wed, 11 May 2022 15:43:59 -0700

> [email protected] wrote:
> 
>> [email protected] wrote:


Thanks for the point-by-point reply, David.

>> ...makes no mention of what you might expect to see in an AFTER EACH ROW 
>> trigger.
> 
> ...the absence of a comment declaring a guarantee of order means that, like 
> the comment for the row-level BEFORE trigger, the row-level AFTER row 
> ordering is not guaranteed (even if one cannot produce a counter-example in 
> today's codebase).

Got it!

>> ...unless anybody contradicts me.
> 
> Caveat emptor...? I wouldn't be surprised that doing so is technically 
> possible in all cases - as to whether a particular algorithm is sound, to 
> some extent, isn't something we try to predict. We do try to describe all the 
> known interactions though - and let the user work within what those mean for 
> them.

Got it again!

>> ...implies that there's no such unpredictability in the AFTER EACH ROW cases.
> 
> I would not read it that way. In general, absence of mention of 
> predictability like this means there is none - that some other sentence goes 
> into more detail doesn't change that.

OK.

>> But there has to be a subtle caveat here for the deferred constraint trigger 
>> when the txn changes two or more tables, all of which participate in the 
>> query that the trigger function issues… The "raise info" output below 
>> illustrates my point (n changes from 5 to 8).
> 
> I'm failing to see the deferral aspect of that example. First statement 
> finishes, sees the 5 inserts, next statement finishes, sees 3 more inserts. 
> Not, both statements finish, triggers fire, triggers see all 8 inserts (which 
> I suspect they will if you actually perform deferral).

Oops. I did a copy-and-paste error on going from my test env. to email and 
missed out the "deferral" that I'd intended. For completeness, here's the test 
that I meant:

create table t1(k serial primary key, v int not null);
create table t2(k serial primary key, v int not null);

create function trg_fn()
  returns trigger
  language plpgsql
as $body$
declare
  n int not null := 0;
begin
  n := (select count(*) from t1) + (select count(*) from t2);
  raise info 'trg fired. new.v = %, n = %', new.v, n;
  return new;
end;
$body$;

create constraint trigger trg
after insert on t1
for each row
execute function trg_fn();

create constraint trigger trg
after insert on t2
initially deferred
for each row
execute function trg_fn();

set default_transaction_isolation = 'read committed';
do $body$
begin
  insert into t1(v)
  values (10), (20), (30), (40), (50);

  insert into t2(v)
  values (60), (70), (80);
end;
$body$;

It adds the "initially deferred" decoration to the "create constraint trigger" 
statement. This is (still) the result:

INFO:  trg fired. new.v = 10, n = 5
INFO:  trg fired. new.v = 20, n = 5
INFO:  trg fired. new.v = 30, n = 5
INFO:  trg fired. new.v = 40, n = 5
INFO:  trg fired. new.v = 50, n = 5
INFO:  trg fired. new.v = 60, n = 8
INFO:  trg fired. new.v = 70, n = 8
INFO:  trg fired. new.v = 80, n = 8

Even though both inserts have completed by commit time, only the trigger firing 
caused by the second statement sees the final state that obtains the moment 
before commit. The first statement sees only the state after it finishes and 
before the second statement executes. You said « I suspect [that both 
statements will see the final state] if you actually perform deferral ». My 
test shows that this is not the case.

Did I misunderstand you? Or does this result surprise you? If it does, do you 
think that this is a bug?

> ...Your test case for the deferred constraint, that supposedly allows for the 
> insertion of invalid data per the specification of the constraint trigger, 
> isn't something I've worked through yet; and as written reads like a bug 
> report.

It's not a report of a PG bug. Rather, it shows how an application programmer 
might write a bug in *their* code. When "set constraints all immediate" is 
used, it opens a race condition window between its execution and the commit. 
I'd speculated on that earlier. So I felt that I should show a self-contained 
demo of this possible trap for the application programmer. As long as  "set 
constraints all immediate" is not used, the demo shows proper behavior. Of 
course, it moves the constraint violation error to commit time—and this means 
that PL/pgSQL code cannot handle it (as discussed at length elsewhere).

I appreciate that using  "set constraints all immediate" is useful in many 
scenarios to allow handling the violation error in PL/pgSQL. (Thanks again for 
that tip, David.) For example, and as I reason it, the famous "mandatory 
one-to-one relationship" use-case is safe when you use this technique. This 
use-case needs mutual FK constraints between the two tables. But you have to 
insert one of the two rows (call it "main") that are so related before the 
other (call it "extra"). And so the mutual FK constraints cannot be satisfied 
until both new rows are in place. Therefore the "main" table's FK constraint to 
the "extra" table (when the insertion order that I described is used) must be 
deferred. But, because this is insert, no other session can see your 
uncommitted work in progress. So there's no race condition window. And once the 
new row-pair is committed, other non-deferred constraints can ensure that 
either deleting the "extra" row is RESTRICTed or deleting the "main" row 
CASCADEs. So the business rule is easily enforced once the new row-pair is in 
place.

It does take a fair effort of reasoning, for each specific use-case, to 
determine the safety of using "set constraints all immediate". But programming 
is just like that!

>> Is there a fundamental reason why a deferred AFTER EACH STATEMENT constraint 
>> trigger is not allowed? Nothing in what is explained in the "Overview of 
>> Trigger Behavior" and "Visibility of Data Changes" sections lets me see why 
>> the present restriction is needed.
> 
> I imagine having to keep around a working set of what are the changed records 
> is both memory intensive and also problematic should a future statement make 
> yet more changes to the table… And given that constraints are defined per-row 
> everywhere else there is a pull to not push the envelope of our extension too 
> far.

At the conceptual level, there are both per-row constraints and the kind that 
can be precisely specified (albeit not implemented) using SQL Assertion syntax.

With respect to « having to keep around a working set of what are the changed 
records » I think that the complexity that you envisaged is avoided by the 
(emergent) rule that an AFTER EACH STATEMENT trigger cannot see "old" and "new" 
values. In other words, all you can sensibly do in its function is ordinary SQL 
that sees the current state at the moment it fires.

To my surprise, it *is* legal to write code that accesses "old" and "new" 
values. But, because many rows can be affected by a single statement, and the 
trigger fires just once, the meanings of "old" and "new" are undefined. I've 
seen that, in any test that I do, both are always set to NULL (which seems 
reasonable). For example:

create table t(k serial primary key, v int not null);

create function trg_fn()
  returns trigger
  language plpgsql
as $body$
begin
  raise info 'old.v is %', coalesce(old.v::text, 'null as expected');
  raise info 'new.v is %', coalesce(new.v::text, 'null as expected');
  return new;
end;
$body$;

create trigger trg
after insert on t
for each statement
execute function trg_fn();

set default_transaction_isolation = 'read committed';
insert into t(v)
values (10), (20), (30), (40), (50);

It produces this output:

INFO:  old.v is null as expected
INFO:  new.v is null as expected

A deferred AFTER EACH STATEMENT constraint trigger would be a good tool for the 
kind of business rule that a SQL Assertion specifies. Yes, it would fire more 
times than is actually needed. But nevertheless, the timing point better 
expresses the intention—and there would, at least, be fewer firings than with 
an AFTER EACH ROW trigger.

If what I pointed out above (not all deferred constraint triggers see the final 
state just before commit) is regarded as a bug and fixed, then the application 
programming would be straightforward. As things are now, tricky (but feasible, 
I believe) programming would be needed in the multi-table case to make only the 
very last deferred trigger function execution perform the actual test.

Re: Deferred constraint trigger semantics

Reply via email to