Peter Geoghegan <p...@heroku.com> wrote: > On Tue, Oct 7, 2014 at 5:23 AM, Simon Riggs <si...@2ndquadrant.com> wrote: >> IIRC it wasn't agreed that we needed to identify which indexes in the >> upsert SQL statement itself, since this would be possible in other >> ways and would require programmers to know which unique constraints >> are declared. > > Kevin seemed quite concerned about that. That is something that seems > hard to reconcile with supporting the MERGE syntax. Perhaps Kevin can > comment on that, since he was in favor of both being able to specify > user intent by accepting a unique index, while also being in favor of > the MERGE syntax.
Well, I mostly wanted to make sure we properly considered what the implications were of using the standard syntax without other keywords or decorations before deciding to go the non-standard route. In spite of an alarming tendency for people to assume that meant that I didn't understand the desired semantics, I feel enough people have understood the question and weighed in in favor of an explicit choice between semantics, rather than inferring concurrency handling based on the availability of the index necessary for the slicker behavior. I'm willing to concede that overall consensus is leaning toward the view that UPSERT semantics should be conditioned on explicit syntax; I'll drop that much going forward. Granting that, I will say that I lean toward either the MERGE syntax with CONCURRENTLY being the flag to use UPSERT semantics, or a separate UPSERT command which is as close to identical to the MERGE syntax (other than the opening verb) as possible. I see that as still needing the ON clause so that you can specify which values match which columns from the target table. I'm fine with starting with the syntax in the standard, which has no DELETE or IGNORE options (as of the latest version I've seen). So the syntax I'm suggesting is close to what Simon is suggesting, but a more compliant form would be: MERGE CONCURRENTLY INTO foo USING (VALUES (valuelist) aliases) ON (conditions) WHEN NOT MATCHED INSERT [ (columnlist) ] VALUES (valuelist) WHEN MATCHED UPDATE SET colname = expression [, ...] Rather than pseudo-randomly picking a unique index or using a constraint or index name, the ON condition would need to allow matching based on equality to all columns of a unique index which only referenced NOT NULL columns; we would pick an index which matched those conditions. In any event, the unique index would be required if CONCURRENTLY was specified. Using column matching to pick the index (like we do when specifying a FOREIGN KEY constraint) is more in keeping with other SQL statements, and seems generally safer to me. It would also make it fairly painless for people to switch concurrency techniques for what is, after all, exactly the same operation except for differences in handling of concurrent conflicting DML. As I said, I'm also OK with using UPSERT in place of MERGE CONCURRENTLY. I also feel that if we could allow: USING (VALUES (valuelist) [, ...]) that would be great. In fact, I don't see why that can't be pretty much any relation, but it doesn't have to be for a first cut. A relation would allow a temporary table to be loaded with a batch of rows where the intent is to UPSERT every row in the batch, without needing to write a loop to do it. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers