On 4/6/23 17:49, Louis Tian wrote:
Hi Peter,

Thanks for your reply. Appreciate the help and discussion.

  In general UPSERT (or any definition of it that I can think of) does
not imply idempotency.

"Idempotence is the property of certain operations in mathematics and computer 
science whereby they can be applied multiple times without changing the result beyond the 
initial application." from Wikipedia.
the concept of Idempotence when applies to HTTP is consistent with the above. 
https://developer.mozilla.org/en-US/docs/Glossary/Idempotent. Or are you going 
by a different defintion that I am not aware of?
If you execute the same upsert multiple times, the state of the database 
remains the same as if only execute once.
If a row already exists, the first statement will update the row so does any 
subsequent statements. executing the same update multiple time is the same as 
executing it only once.
If the row doesn't exist, the first statement will insert that row and any 
subsequent will try to update, but the update has no real effect since it the 
value is exactly the same as the insert.
So by defintion, upsert is idempotent.

No it is not as Israel Brewster pointed out.

To his example I would add:

alter some_table add column ts_upsert_update timestamptz;

insert into some_table values('foo', 'bar') on conflict(tbl_id) do update set foo_fld = excluded.foo_fld, bar_fld = some_table.bar_fld, ts_upsert_update = now();

You are substituting whatever definition you have in your head for the definition as it actually exists.


It could just be a unique index or a unique constraint. So you can
upsert on any individual unique constraint/index, or the primary key.
Of course there might be several on a given table, but you can only
use one as the "conflict arbiter" per statement.

Understand that I can use any unique constraint with on conflict.
But semantically the only correct one is the primary key, since that's what 
identifies a row logically.
In that sense, any unique column(s) is a potential candidate for primary key.
It's more of a pedantic point rather than pragmatic one.
It's less of a problem for PostgreSQL where the semantic importance of primary 
key is not manifested at implementation level, since all index points to the 
tuple directly
Whereas it is more import for Databaes like MySQL where the secondary index 
points to the primary key index.

Again you are dealing with the imagined instead of the reality. Along that line you left out that a 'exclusion constraint violation error' can also trigger the ON CONFLICT.


Use some pesudo code might be helpful here to explain the difference.

How on conflict works at the moment.

try {
  insert row
} catch (duplicated key error) {
   update row
}

How I think it upsert should work

And therein lies your problem, you are imagining something that does not exist and more to the point will most likely not exist as it would break all code that depends on above behavior.


if (new.id exists) {
  update row
} else {
  insert row
}


I am not expecting an error here. The problem is with no conflict it always go 
down the insert path first and results in a not null constraint error.

To be expected, as after all the command is:

INSERT INTO <some_table> ...

While I am expecting the insert is never executed in the first place when that 
row already exist (as identified by it primary key). So the update execute 
without error.
I hope the pesudo code above is enough to clarify the difference?


Cheers,
Louis Tian



--
Adrian Klaver
adrian.kla...@aklaver.com



Reply via email to