On 4/6/23 17:49, Louis Tian wrote:
Hi Peter,
Thanks for your reply. Appreciate the help and discussion.
In general UPSERT (or any definition of it that I can think of) does
not imply idempotency.
"Idempotence is the property of certain operations in mathematics and computer
science whereby they can be applied multiple times without changing the result beyond the
initial application." from Wikipedia.
the concept of Idempotence when applies to HTTP is consistent with the above.
https://developer.mozilla.org/en-US/docs/Glossary/Idempotent. Or are you going
by a different defintion that I am not aware of?
If you execute the same upsert multiple times, the state of the database
remains the same as if only execute once.
If a row already exists, the first statement will update the row so does any
subsequent statements. executing the same update multiple time is the same as
executing it only once.
If the row doesn't exist, the first statement will insert that row and any
subsequent will try to update, but the update has no real effect since it the
value is exactly the same as the insert.
So by defintion, upsert is idempotent.
No it is not as Israel Brewster pointed out.
To his example I would add:
alter some_table add column ts_upsert_update timestamptz;
insert into some_table values('foo', 'bar') on conflict(tbl_id) do
update set foo_fld = excluded.foo_fld, bar_fld = some_table.bar_fld,
ts_upsert_update = now();
You are substituting whatever definition you have in your head for the
definition as it actually exists.
It could just be a unique index or a unique constraint. So you can
upsert on any individual unique constraint/index, or the primary key.
Of course there might be several on a given table, but you can only
use one as the "conflict arbiter" per statement.
Understand that I can use any unique constraint with on conflict.
But semantically the only correct one is the primary key, since that's what
identifies a row logically.
In that sense, any unique column(s) is a potential candidate for primary key.
It's more of a pedantic point rather than pragmatic one.
It's less of a problem for PostgreSQL where the semantic importance of primary
key is not manifested at implementation level, since all index points to the
tuple directly
Whereas it is more import for Databaes like MySQL where the secondary index
points to the primary key index.
Again you are dealing with the imagined instead of the reality. Along
that line you left out that a 'exclusion constraint violation error' can
also trigger the ON CONFLICT.
Use some pesudo code might be helpful here to explain the difference.
How on conflict works at the moment.
try {
insert row
} catch (duplicated key error) {
update row
}
How I think it upsert should work
And therein lies your problem, you are imagining something that does not
exist and more to the point will most likely not exist as it would break
all code that depends on above behavior.
if (new.id exists) {
update row
} else {
insert row
}
I am not expecting an error here. The problem is with no conflict it always go
down the insert path first and results in a not null constraint error.
To be expected, as after all the command is:
INSERT INTO <some_table> ...
While I am expecting the insert is never executed in the first place when that
row already exist (as identified by it primary key). So the update execute
without error.
I hope the pesudo code above is enough to clarify the difference?
Cheers,
Louis Tian
--
Adrian Klaver
[email protected]