Peter Geoghegan-3 wrote
>> with semantics like this:
>>
>> 1. Search the table, using any type of scan you like, for a row
>> matching the given predicate.
>
> Perhaps I've misunderstood, but this is fundamentally different to
> what I'd always thought would need to happen. I always understood that
> the UPSERT should be insert-driven, and so not really have an initial
> scan at all in the same sense that every Insert lacks one. Moreover,
> I've always thought that everyone agreed on that. We go to insert, and
> then in the course of inserting index tuples do something with dirty
> snapshots. That's how we get information on conflicts.
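For concreteness, the two orderings contrasted in the quote above (search for a matching row first, versus go straight to the insert and react to a conflict) might be sketched as below. This is a toy, single-session model only: a Python dict stands in for a table with a unique index, and the names `UniqueViolation`, `upsert_scan_first`, and `upsert_insert_first` are made up for illustration. Nothing here models dirty snapshots, MVCC, or concurrent sessions, which is where the real difficulty lies.

```python
# Toy model: a dict keyed by primary key stands in for a table with a
# unique index. No concurrency, snapshots, or MVCC are modeled.


class UniqueViolation(Exception):
    """Raised when an insert hits an existing key (hypothetical name)."""


def insert(table, key, value):
    """Plain insert that fails on a duplicate key."""
    if key in table:
        raise UniqueViolation(key)
    table[key] = value


def upsert_scan_first(table, key, value):
    """Look for the row first; update it if found, otherwise insert."""
    if key in table:
        table[key] = value
        return "updated"
    insert(table, key, value)
    return "inserted"


def upsert_insert_first(table, key, value):
    """Attempt the insert; on a conflict, fall back to an update."""
    try:
        insert(table, key, value)
        return "inserted"
    except UniqueViolation:
        table[key] = value
        return "updated"
```

In this single-session toy both functions leave the table in the same state; they differ only in which path is taken first, so which one does less wasted work depends on whether most incoming keys already exist.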
SQL standard notwithstanding, there really is no way to pick one
implementation and not have it be sub-optimal in some situations. Unless
there is a high barrier, why not introduce syntax:

SCAN FIRST; INSERT FIRST

that allows the user to specify the behavior that they expect would be most
efficient given their existing/new data ratio?

>> Having said all that, I believe the INSERT ON CONFLICT syntax is more
>> easily comprehensible than previous proposals. But I still tend to
>> agree with Andres that an explicit UPSERT syntax or something like it,
>> that captures all of the MVCC games inside itself, is likely
>> preferable from a user standpoint, whatever the implementation ends up
>> looking like.
>
> Okay then. If you both feel that way, I will come up with something
> closer to what you sketch. But for now I still strongly feel it ought
> to be driven by an insert. Perhaps I've misunderstood you entirely,
> though.

Getting a little syntax crazy here, but having all of:

UPSERT [SCAN|INSERT] FIRST
INSERT ON CONFLICT UPDATE - same as INSERT FIRST
UPDATE ON MISSING INSERT - same as SCAN FIRST

with the corresponding 2 implementations would make the user interface
slightly more complicated but able to be conformed to the actual data that
the user has.

You could basically perform a two-phase pass where you run the
user-requested algorithm, then attempt the alternate algorithm for all
failures, and then error if both fail.

I am not at all fluent on the concurrency issues here, or on the MVCC
violations and retries that might be considered, but at a high level there
is disagreement here simply because both answers are "correct", and ideally
both can be provided to the user.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Making-joins-involving-ctid-work-for-the-benefit-of-UPSERT-tp5811919p5812640.html