On Tue, Jan 7, 2014 at 8:46 PM, Peter Geoghegan <p...@heroku.com> wrote:
> I've worked on a simple set of tests, written quickly in bash, that I
> think exercise interesting cases:
>
> https://github.com/petergeoghegan/upsert
>
> Perhaps most notably, these tests make comparisons between the
> performance of ordinary inserts with a serial primary key table, and
> effectively equivalent upserts that always insert.
While I realize that everyone is busy, I'm concerned about the lack of discussion here. It's been 6 full days since I posted my benchmark, which I expected to quickly clear some things up, or at least garner interest, and yet no one has commented here since. Here is a summary of the situation, at least as I understand it:

* My patch has been shown to perform much better than the alternative "promise tuples" proposal. The previously published benchmark, referred to above, makes this evident for workloads with lots of contention [1]. Now, to cover everything, I've gone on to benchmark inserts into a table foo(serial, int4, text) that lock the row using the new infrastructure. The SERIAL column is the primary key. I'm trying to characterize the overhead of the extended value locking here, by showing the same case (a worst case) with and without the overhead. Here are the results:

http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/vs-vanilla-insert/

(asynchronous commits, logged table)

With both extremes covered, the data suggests that my patch performs very well by *any* standard. But if we consider how things compare to the alternative proposal, all indications are that performance is far superior (at least for representative cases without too many unique indexes, not that I suspect things are much different with many). Previous concerns about the cost of extended leaf page locking ought to be almost put to rest by this benchmark, because inserting a succession of sequential integers into a btree index is a particularly bad case, and yet in practice the implementation does very well. (With my patch, we're testing the same statement with an ON DUPLICATE KEY LOCK FOR UPDATE part, but there are naturally no conflicts on the SERIAL PK; on master we're testing the same INSERT statement without that, inserting sequence values just as before, only without the worst-case value locking overhead. The two statements are sketched below.)

* The alternative exclusion* patch still deadlocks in an unprincipled fashion when simple, idiomatic usage encounters contention (an example of the kind of usage I mean is also sketched below). Heikki intends to produce a revision that fixes the problem, though having considered it carefully myself, I don't know what mechanism he has in mind, and frankly I'm skeptical. More importantly, I have to question whether we should continue to pursue that alternative approach, given what we now know about its performance characteristics. It could be improved, but not by terribly much, particularly for the case where there is plenty of update contention, which [1] showed to be approximately 2-3 times slower than extended page locking (*and* that's with it already looking for would-be duplicates *first*). I'm trying to be as fair as possible, and yet the difference is huge. It's going to be really hard to beat something where the decision about whether to insert or update comes so late: the decision is made as late as possible, is based on strong indicators of the likely outcome, and the cost of making the wrong choice is very low. With shared buffer locks held while calling _bt_check_unique(), we still lock out concurrent would-be duplicate insertion, and so don't need to restart from scratch (to insert instead) in the way the alternative proposal's largely AM-naive approach does.

* I am not aware of anyone considering that there are any open items remaining; I've addressed everything raised so far. Progress here is entirely blocked on waiting for review feedback.
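For concreteness, the two statements compared in the vs-vanilla-insert benchmark look roughly like this (column names and values are illustrative, not the exact ones the test driver generates):

  -- table equivalent to foo(serial, int4, text), with the SERIAL column as PK
  CREATE TABLE foo (a serial PRIMARY KEY, b int4, c text);

  -- master: plain insert; the SERIAL PK means there are never any conflicts
  INSERT INTO foo (b, c) VALUES (1, 'val');

  -- patch: the same insert, but exercising the extended value locking
  INSERT INTO foo (b, c) VALUES (1, 'val') ON DUPLICATE KEY LOCK FOR UPDATE;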
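And by simple, idiomatic usage I mean a wCTE-based upsert along roughly these lines (table and values are illustrative, and the exact spelling of the rejects projection may differ from this sketch), run concurrently from multiple sessions against the same key:

  -- illustrative table
  CREATE TABLE upsert_ex (key int4 PRIMARY KEY, val text);

  -- insert, or else lock the conflicting row and update it
  WITH rej AS (
    INSERT INTO upsert_ex (key, val)
    VALUES (1, 'inserted')
    ON DUPLICATE KEY LOCK FOR UPDATE
    RETURNING REJECTS key
  )
  UPDATE upsert_ex SET val = 'updated'
  FROM rej
  WHERE upsert_ex.key = rej.key;

Two sessions repeatedly running a statement like that for the same key is exactly the kind of contention that makes the exclusion* patch deadlock, whereas with my patch one session simply inserts and the other locks the existing row and updates it.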
With the new priorConflict lock strength optimization, my patch is in some ways similar to what Heikki proposed (in the exclusion* patch). It's as if the first phase, the locking operation, is an index scan with an identity crisis. It can decide to continue to be an "index scan" (albeit an oddball one with an insertion scankey, which by holding shared buffer locks prevents concurrent duplicate insertion and so makes for very efficient uniqueness checks), or it can decide to actually insert, at the last possible moment. The second phase is picked up with much of the work already complete from the first, so the amount of wasted work is very close to zero in all cases. How can anything beat that?

If the main argument for the exclusion approach is that it works with exclusion constraints, then I can still go and make what I've done work there too (for the IGNORE case, which I maintain is the only exclusion constraint variant of this that is useful to users). In any case, I think making anything work for exclusion constraints should be a relatively low priority.

I'd like to hear more opinions on what I've done here, if anyone has bandwidth to spare. I doubt I need to remind anybody that this is a feature of considerable strategic importance. We need this, and we've been unnecessarily at a disadvantage relative to other systems by not having it for all these years. Every application developer wants this feature - it's a *very* frequent complaint.

[1] http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/upsert-cmp-2/

-- 
Peter Geoghegan