Trigger to Count Number of Logical Replication Table Changes.

2023-11-16 Thread Avi Weinberg
I'm using Postgres (13 and 15) logical replication to sync data from two 
servers. I would like to have an update counter whenever data is changed. The 
counter can be incremented by 1 even if multiple rows are updated, but it is 
also ok to be incremented the counter by the number of rows updated (but it 
seems less efficient to me).
I need the counter to increase after initial sync as well as after regular 
logical replication sync.
Triggers not to work without ENABLE ALWAYS.
In addition, If I try trigger that is "FOR EACH STATEMENT" it works only for 
initial sync and not for regular logical replication sync.
Having per row set_time_trig  trigger takes about 1 minute when updating 50k 
rows in one transaction (all I need is to increase update_count by 1, why spend 
1 minute for it) . How can I improve this?
CREATE TABLE IF NOT EXISTS tst.t2
(
id bigint NOT NULL,
c1 int,
CONSTRAINT pk_t2 PRIMARY KEY (id)
);

CREATE TABLE IF NOT EXISTS tst.time_audit_tbl
(
table_name character varying(63) COLLATE pg_catalog."default" NOT NULL,
update_count integer DEFAULT 0,
CONSTRAINT updated_time_audit_unique UNIQUE (table_name)
);


CREATE FUNCTION tst.set_time() RETURNS trigger
LANGUAGE plpgsql SECURITY DEFINER
AS $$
DECLARE
  updated_count int;
BEGIN
UPDATE tst.time_audit_tbl SET update_count = update_count + 1 WHERE 
table_name = CONCAT(TG_TABLE_SCHEMA, '.', TG_TABLE_NAME);
GET DIAGNOSTICS updated_count = ROW_COUNT;
IF updated_count = 0 THEN
  RAISE EXCEPTION 'set_updated_time().  Table not found %.%', 
TG_TABLE_SCHEMA, TG_TABLE_NAME;
END IF;

RETURN coalesce(NEW, OLD);
END;
$$;


CREATE  TRIGGER set_time_trig
AFTER INSERT OR DELETE OR UPDATE
ON tst.t2
FOR EACH ROW
EXECUTE FUNCTION tst.set_time();

ALTER TABLE tst.t2 ENABLE ALWAYS TRIGGER set_time_trig;

IMPORTANT - This email and any attachments is intended for the above named 
addressee(s), and may contain information which is confidential or privileged. 
If you are not the intended recipient, please inform the sender immediately and 
delete this email: you should not copy or use this e-mail for any purpose nor 
disclose its contents to any person.


Re: New addition to the merge sql standard

2023-11-16 Thread Alvaro Herrera
On 2023-Nov-16, Nick DeCoursin wrote:

> In my opinion, it would be better for merge to offer the functionality to
> simply ignore the rows that cause unique violation exceptions instead of
> tanking the whole query.

"ignore" may not be what you want, though.  Perhaps the fact that insert
(coming from the NOT MATCHED clause) fails (== conflicts with a tuple
concurrently inserted in an unique or exclusion constraint) should
transform the row operation into a MATCHED case, so it'd fire the other
clauses in the overall MERGE operation.  Then you could add a WHEN
MATCHED DO NOTHING case which does the ignoring that you want; or just
let them be handled by WHEN MATCHED UPDATE or whatever.  But you may
need some way to distinguish rows that appeared concurrently from rows
that were there all along.

In regards to the SQL standard, I hope what you're saying is merely not
documented by them.  If it indeed isn't, it may be possible to get them
to accept some new behavior, and then I'm sure we'd consider
implementing it.  If your suggestion goes against what they already
have, I'm afraid you'd be doomed.  So the next question is, how do other
implementations handle this case you're talking about?  SQL Server, DB2
and Oracle being the relevant ones.

Assuming the idea is good and there are no conflicts, then maybe it's
just lack of round tuits.

Happen to have some?

I vaguely recall thinking about this, and noticing that implementing
something of this sort would require messing around with the ExecInsert
interface.  It'd probably require splitting it in pieces, similar to how
ExecUpdate was split.

There are some comments in the code about possible "live-locks" where
merge would be eternally confused between inserting a new row which it
then wants to delete; or something like that.  For sure we would need to
understand the concurrent behavior of this new feature very clearly.


An interesting point is that our inserts *wait* to see whether the
concurrent insertion commits or aborts, when a unique constraint is
involved.  I'm not sure you want to have MERGE blocking on concurrent
inserts.  This is all assuming READ COMMITTED semantics; on REPEATABLE
READ or higher, I think you're just screwed, because of course MERGE is
not going to get a snapshot that sees the rows inserted by transactions
that started after.

You'd need to explore all this very carefully.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/




New addition to the merge sql standard

2023-11-16 Thread Nick DeCoursin
Dear Postgres Administrators,

There was a great article of `merge` by Lukas Fittl here:
https://pganalyze.com/blog/5mins-postgres-15-merge-vs-insert-on-conflict

In his article, he highlights one of the severe disadvantages to merge:

The comment that he essentially made is that the downside of MERGE's
> handling of concurrency is that when you concurrently INSERT, so at the
> same time as you're executing the MERGE statement, there is another INSERT
> going on, then MERGE might not notice that. MERGE would go into its INSERT
> logic, and then it would get a unique violation.


This means that any individual row insert during the insert logic of the
merge query can cause a unique violation and tank the entire merge query.

I explained this in more detail here:
https://stackoverflow.com/questions/77479975/postgres-merge-silently-ignore-unique-constraint-violation

In my opinion, it would be better for merge to offer the functionality to
simply ignore the rows that cause unique violation exceptions instead of
tanking the whole query.

Thank you,
Nick