Peter Geoghegan <p...@heroku.com> wrote:

> I still strongly feel it ought to be driven by an insert

Could you clarify that? Do you mean that we should write the row to the heap before reading the index to see whether it would be a duplicate? If so, I think that is a bad idea. This feature will sometimes be used to apply a new data set which has changed little from the old, and an insert-driven approach will perform poorly for that use case, causing a lot of bloat. It would certainly work well when most of the incoming rows are expected to be INSERTs rather than UPDATEs, but I'm not sure that justifies extreme bloat in the other cases. (The sketch below shows where the bloat comes from.)
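To make the bloat concern concrete, here is a minimal sketch using today's subtransaction retry pattern, which behaves the same way at the heap level as an insert-first upsert: the failed INSERT writes a heap tuple before the unique index rejects it, and that tuple is dead as soon as the subtransaction aborts. The kv table and its values are hypothetical.

-- Hypothetical table receiving an "apply this data set" stream.
CREATE TABLE kv (k integer PRIMARY KEY, v text);
INSERT INTO kv VALUES (1, 'old');

DO $$
BEGIN
    BEGIN
        -- Insert first: the heap tuple is written before the unique
        -- index rejects it, so when this subtransaction aborts the
        -- tuple is left behind, dead, until VACUUM reclaims it.
        INSERT INTO kv VALUES (1, 'new');
    EXCEPTION WHEN unique_violation THEN
        UPDATE kv SET v = 'new' WHERE k = 1;
    END;
END
$$;

-- Re-applying a mostly-unchanged data set repeats this once per
-- duplicate row, accumulating one dead tuple each time; with the
-- pgstattuple extension installed, SELECT * FROM pgstattuple('kv')
-- shows dead_tuple_count growing accordingly.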
Also, just a reminder: I'm going to squawk loudly if the implementation does not do something predictable and sane for the case where the table has more than one UNIQUE index and you attempt to UPSERT a row that duplicates one row on one index and a different row on another index. The example discussed during your PGCon talk was something like a city table with two columns, each with a UNIQUE constraint, containing:

 city_id | city_name
---------+-----------
       1 | Toronto
       2 | Ottawa

... and an UPSERT comes through for (1, 'Ottawa'). We would all like for that never to happen, but it will. There must be sane and documented behavior in that case (sketched below).

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
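For concreteness, a minimal reproduction of the two-index scenario above; the table definition is implied by the example, and the upsert itself is described only in comments, since the feature under discussion has no settled syntax yet:

CREATE TABLE city (
    city_id   integer UNIQUE,
    city_name text    UNIQUE
);

INSERT INTO city VALUES (1, 'Toronto'), (2, 'Ottawa');

-- An upsert of (1, 'Ottawa') now conflicts with two different rows:
--   (1, 'Toronto') via the unique index on city_id, and
--   (2, 'Ottawa')  via the unique index on city_name.
-- Updating either row to (1, 'Ottawa') would still violate the other
-- constraint, so the implementation must define and document the
-- outcome here: an error, a no-op, or something else.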