On Tue, Oct 7, 2014 at 11:25 AM, Peter Geoghegan <p...@heroku.com> wrote:
> Now, we're not talking about a huge advantage here (I should re-test
> that).

I attach raw output when running the bash scripts insert.sh and
update.sh. These are benchmarks that concern performance in terms of
total system throughput (TPS). The scripts are available from my
stress-test suite: https://github.com/petergeoghegan/upsert

These scripts were originally designed to compare UPSERT with an
unsympathetic "gold-standard" for performance: "equivalent" INSERTs
and UPDATEs. I looked at a few runs of 60 seconds, on unlogged tables,
making the most direct comparison between UPSERTs and "equivalent"
INSERTs and UPDATEs that is possible. To be clear, by "equivalent" I
mean UPSERTS where we know we'll only UPDATE (in the case of
update.sh), and UPSERTS where we know we'll only INSERT (in the case
of insert.sh).

Both #1 and #2 do respectably as compared to "equivalent" INSERTs and
UPDATEs. There may be even less sympathetic though more representative
cases, but certainly for these simple cases, performance is solid
across the board. I got these numbers on my laptop, and it may be
necessary to devise a more rigorous benchmark later, but performance
is quite consistent between runs shown here.

Approach #1 wins out with UPDATEs. The heavyweight-lock avoidance
stuff is enough to compensate for the fact that we never INSERT (and
never need B-Tree leaf page heavyweight locks). Median TPS was
19,310.79 for #1. Whereas for #2, it was 18,872.63 TPS. This is the
case even though the "pre-check" for #2 is always appropriate, while
we still acquire page-level heavyweight locks sometimes with #1.

INSERTs see #2 win, and by a wider margin than #1 beat #2 with
UPDATEs. However, insert.sh is by design very unsympathetic towards
#1. It uses a serial primary key, so every INSERT uselessly obtains a
HW lock on the same leaf page for the duration of heap insertion.
Anyway, the median INSERT TPS numbers is 17,759.53 for #1, and
20,441.57 TPS for #2. So you're pretty much seeing the full brunt of
page heavyweight locking, and it isn't all that bad. However, Heikki
has said something about being more clever with when and how #2 is
made to pre-check (which is always wasted here); so it's possible to
imagine INSERTs becoming faster for #2, while that probably isn't the
case for #1. I think that if I wanted to, I could get #1 to do much
better on another case where page heavyweight locking is more varied.
My goal here was to do the opposite, though.

-- 
Peter Geoghegan
hwlocking (#1):
===============

 ******** UPDATES (#1) ********
pg@hamster:~/upsert$ ./update.sh
running ./update.sh benchmark 10000000 times
Test upsert as a replacement for update...no insert part, for comparison 
against raw update performance
UPSERT with no inserts, only updates, vs. equivalent updates
running ./update.sh benchmark
trying 60 second run (new infrastructure, extended hwlocking)
NOTICE:  table "foo" does not exist, skipping
DROP TABLE
CREATE TABLE
pre-inserting tuples to update
CHECKPOINT
transaction type: Custom query
scaling factor: 1000000
query mode: simple
number of clients: 8
number of threads: 1
duration: 60 s
number of transactions actually processed: 1158697
latency average: 0.414 ms
tps = 19308.753179 (including connections establishing)
tps = 19313.317204 (excluding connections establishing)
trying 60 second run (traditional updates, equivalent to master)
DROP TABLE
CREATE TABLE
pre-inserting tuples to update
CHECKPOINT
transaction type: Custom query
scaling factor: 1000000
query mode: simple
number of clients: 8
number of threads: 1
duration: 60 s
number of transactions actually processed: 1552490
latency average: 0.309 ms
tps = 25873.503004 (including connections establishing)
tps = 25878.810475 (excluding connections establishing)



running ./update.sh benchmark
trying 60 second run (new infrastructure, extended hwlocking)
DROP TABLE
CREATE TABLE
pre-inserting tuples to update
CHECKPOINT
transaction type: Custom query
scaling factor: 1000000
query mode: simple
number of clients: 8
number of threads: 1
duration: 60 s
number of transactions actually processed: 1158654
latency average: 0.414 ms
tps = 19310.786710 (including connections establishing)
tps = 19314.838618 (excluding connections establishing)
trying 60 second run (traditional updates, equivalent to master)
DROP TABLE
CREATE TABLE
pre-inserting tuples to update
CHECKPOINT
transaction type: Custom query
scaling factor: 1000000
query mode: simple
number of clients: 8
number of threads: 1
duration: 60 s
number of transactions actually processed: 1557929
latency average: 0.308 ms
tps = 25964.116323 (including connections establishing)
tps = 25969.841506 (excluding connections establishing)



running ./update.sh benchmark
trying 60 second run (new infrastructure, extended hwlocking)
DROP TABLE
CREATE TABLE
pre-inserting tuples to update
CHECKPOINT
transaction type: Custom query
scaling factor: 1000000
query mode: simple
number of clients: 8
number of threads: 1
duration: 60 s
number of transactions actually processed: 1156364
latency average: 0.415 ms
tps = 19270.091404 (including connections establishing)
tps = 19273.968473 (excluding connections establishing)
trying 60 second run (traditional updates, equivalent to master)
DROP TABLE
CREATE TABLE
pre-inserting tuples to update
CHECKPOINT
transaction type: Custom query
scaling factor: 1000000
query mode: simple
number of clients: 8
number of threads: 1
duration: 60 s
number of transactions actually processed: 1582520
latency average: 0.303 ms
tps = 26375.209370 (including connections establishing)
tps = 26380.587907 (excluding connections establishing)

 ******** INSERTS (#1) ********
pg@hamster:~/upsert$ ./insert.sh 
running ./insert.sh benchmark 10000000 times
Test upsert as a replacement for insert...no update part, for comparison 
against raw insert performance
Note that a SERIAL PK is used here, to maximize lock contention 
running ./insert.sh benchmark
trying 60 second run (new infrastructure, extended hwlocking)
DROP TABLE
CREATE TABLE
transaction type: Custom query
scaling factor: 150000
query mode: simple
number of clients: 8
number of threads: 4
duration: 60 s
number of transactions actually processed: 1115938
latency average: 0.430 ms
tps = 18597.840257 (including connections establishing)
tps = 18599.343148 (excluding connections establishing)
trying 60 second run (traditional inserts, equivalent to master)
DROP TABLE
CREATE TABLE
transaction type: Custom query
scaling factor: 150000
query mode: simple
number of clients: 8
number of threads: 1
duration: 60 s
number of transactions actually processed: 1616974
latency average: 0.297 ms
tps = 26905.090310 (including connections establishing)
tps = 26910.548614 (excluding connections establishing)



running ./insert.sh benchmark
trying 60 second run (new infrastructure, extended hwlocking)
DROP TABLE
CREATE TABLE
transaction type: Custom query
scaling factor: 150000
query mode: simple
number of clients: 8
number of threads: 4
duration: 60 s
number of transactions actually processed: 1050853
latency average: 0.457 ms
tps = 17513.047963 (including connections establishing)
tps = 17514.834648 (excluding connections establishing)
trying 60 second run (traditional inserts, equivalent to master)
DROP TABLE
CREATE TABLE
transaction type: Custom query
scaling factor: 150000
query mode: simple
number of clients: 8
number of threads: 1
duration: 60 s
number of transactions actually processed: 1611726
latency average: 0.298 ms
tps = 26860.299912 (including connections establishing)
tps = 26865.700452 (excluding connections establishing)



running ./insert.sh benchmark
trying 60 second run (new infrastructure, extended hwlocking)
DROP TABLE
CREATE TABLE
transaction type: Custom query
scaling factor: 150000
query mode: simple
number of clients: 8
number of threads: 4
duration: 60 s
number of transactions actually processed: 1065632
latency average: 0.450 ms
tps = 17759.534360 (including connections establishing)
tps = 17761.232238 (excluding connections establishing)
trying 60 second run (traditional inserts, equivalent to master)
DROP TABLE
CREATE TABLE
transaction type: Custom query
scaling factor: 150000
query mode: simple
number of clients: 8
number of threads: 1
duration: 60 s
number of transactions actually processed: 1540704
latency average: 0.312 ms
tps = 25676.640722 (including connections establishing)
tps = 25682.510058 (excluding connections establishing)



running ./insert.sh benchmark
trying 60 second run (new infrastructure, extended hwlocking)
DROP TABLE
CREATE TABLE
transaction type: Custom query
scaling factor: 150000
query mode: simple
number of clients: 8
number of threads: 4
duration: 60 s
number of transactions actually processed: 1044714
latency average: 0.459 ms
tps = 17410.476984 (including connections establishing)
tps = 17412.336106 (excluding connections establishing)
trying 60 second run (traditional inserts, equivalent to master)
DROP TABLE
CREATE TABLE
transaction type: Custom query
scaling factor: 150000
query mode: simple
number of clients: 8
number of threads: 1
duration: 60 s
number of transactions actually processed: 1575568
latency average: 0.305 ms
tps = 26256.201708 (including connections establishing)
tps = 26261.845552 (excluding connections establishing)



running ./insert.sh benchmark
trying 60 second run (new infrastructure, extended hwlocking)
DROP TABLE
CREATE TABLE
transaction type: Custom query
scaling factor: 150000
query mode: simple
number of clients: 8
number of threads: 4
duration: 60 s
number of transactions actually processed: 1073967
latency average: 0.447 ms
tps = 17898.406523 (including connections establishing)
tps = 17900.024292 (excluding connections establishing)
trying 60 second run (traditional inserts, equivalent to master)
DROP TABLE
CREATE TABLE
transaction type: Custom query
scaling factor: 150000
query mode: simple
number of clients: 8
number of threads: 1
duration: 60 s
number of transactions actually processed: 1555066
latency average: 0.309 ms
tps = 25916.316217 (including connections establishing)
tps = 25921.476894 (excluding connections establishing)


promise tuples (#2):
===================

 ******** UPDATES (#2) ********
pg@hamster:~/upsert$ ./update.sh
running ./update.sh benchmark 10000000 times
Test upsert as a replacement for update...no insert part, for comparison 
against raw update performance
UPSERT with no inserts, only updates, vs. equivalent updates
running ./update.sh benchmark
trying 60 second run (new infrastructure, extended hwlocking)
NOTICE:  table "foo" does not exist, skipping
DROP TABLE
CREATE TABLE
pre-inserting tuples to update
CHECKPOINT
transaction type: Custom query
scaling factor: 1000000
query mode: simple
number of clients: 8
number of threads: 1
duration: 60 s
number of transactions actually processed: 1123128
latency average: 0.427 ms
tps = 18717.037479 (including connections establishing)
tps = 18720.705456 (excluding connections establishing)
trying 60 second run (traditional updates, equivalent to master)
DROP TABLE
CREATE TABLE
pre-inserting tuples to update
CHECKPOINT
transaction type: Custom query
scaling factor: 1000000
query mode: simple
number of clients: 8
number of threads: 1
duration: 60 s
number of transactions actually processed: 1538212
latency average: 0.312 ms
tps = 25634.910723 (including connections establishing)
tps = 25639.996456 (excluding connections establishing)



running ./update.sh benchmark
trying 60 second run (new infrastructure, extended hwlocking)
DROP TABLE
CREATE TABLE
pre-inserting tuples to update
CHECKPOINT
transaction type: Custom query
scaling factor: 1000000
query mode: simple
number of clients: 8
number of threads: 1
duration: 60 s
number of transactions actually processed: 1132236
latency average: 0.424 ms
tps = 18868.135507 (including connections establishing)
tps = 18872.629751 (excluding connections establishing)
trying 60 second run (traditional updates, equivalent to master)
DROP TABLE
CREATE TABLE
pre-inserting tuples to update
CHECKPOINT
transaction type: Custom query
scaling factor: 1000000
query mode: simple
number of clients: 8
number of threads: 1
duration: 60 s
number of transactions actually processed: 1527868
latency average: 0.314 ms
tps = 25462.839591 (including connections establishing)
tps = 25468.147238 (excluding connections establishing)



running ./update.sh benchmark
trying 60 second run (new infrastructure, extended hwlocking)
DROP TABLE
CREATE TABLE
pre-inserting tuples to update
CHECKPOINT
transaction type: Custom query
scaling factor: 1000000
query mode: simple
number of clients: 8
number of threads: 1
duration: 60 s
number of transactions actually processed: 1134058
latency average: 0.423 ms
tps = 18896.552117 (including connections establishing)
tps = 18900.606925 (excluding connections establishing)
trying 60 second run (traditional updates, equivalent to master)
DROP TABLE
CREATE TABLE
pre-inserting tuples to update
CHECKPOINT
transaction type: Custom query
scaling factor: 1000000
query mode: simple
number of clients: 8
number of threads: 1
duration: 60 s
number of transactions actually processed: 1566905
latency average: 0.306 ms
tps = 26114.340380 (including connections establishing)
tps = 26119.575030 (excluding connections establishing)

 ******** INSERTS (#2) ********
pg@hamster:~/upsert$ ./insert.sh 
running ./insert.sh benchmark 10000000 times
Test upsert as a replacement for insert...no update part, for comparison 
against raw insert performance
Note that a SERIAL PK is used here, to maximize lock contention 
running ./insert.sh benchmark
trying 60 second run (new infrastructure, extended hwlocking)
DROP TABLE
CREATE TABLE
transaction type: Custom query
scaling factor: 150000
query mode: simple
number of clients: 8
number of threads: 4
duration: 60 s
number of transactions actually processed: 1295151
latency average: 0.371 ms
tps = 21584.352765 (including connections establishing)
tps = 21586.111732 (excluding connections establishing)
trying 60 second run (traditional inserts, equivalent to master)
DROP TABLE
CREATE TABLE
transaction type: Custom query
scaling factor: 150000
query mode: simple
number of clients: 8
number of threads: 1
duration: 60 s
number of transactions actually processed: 1595229
latency average: 0.301 ms
tps = 26585.172063 (including connections establishing)
tps = 26590.385164 (excluding connections establishing)



running ./insert.sh benchmark
trying 60 second run (new infrastructure, extended hwlocking)
DROP TABLE
CREATE TABLE
transaction type: Custom query
scaling factor: 150000
query mode: simple
number of clients: 8
number of threads: 4
duration: 60 s
number of transactions actually processed: 1243224
latency average: 0.386 ms
tps = 20720.270153 (including connections establishing)
tps = 20722.015799 (excluding connections establishing)
trying 60 second run (traditional inserts, equivalent to master)
DROP TABLE
CREATE TABLE
transaction type: Custom query
scaling factor: 150000
query mode: simple
number of clients: 8
number of threads: 1
duration: 60 s
number of transactions actually processed: 1559009
latency average: 0.308 ms
tps = 25981.598368 (including connections establishing)
tps = 25987.410897 (excluding connections establishing)



running ./insert.sh benchmark
trying 60 second run (new infrastructure, extended hwlocking)
DROP TABLE
CREATE TABLE
transaction type: Custom query
scaling factor: 150000
query mode: simple
number of clients: 8
number of threads: 4
duration: 60 s
number of transactions actually processed: 1226602
latency average: 0.391 ms
tps = 20441.565424 (including connections establishing)
tps = 20443.592909 (excluding connections establishing)
trying 60 second run (traditional inserts, equivalent to master)
DROP TABLE
CREATE TABLE
transaction type: Custom query
scaling factor: 150000
query mode: simple
number of clients: 8
number of threads: 1
duration: 60 s
number of transactions actually processed: 1561024
latency average: 0.307 ms
tps = 26015.389541 (including connections establishing)
tps = 26020.686144 (excluding connections establishing)



running ./insert.sh benchmark
trying 60 second run (new infrastructure, extended hwlocking)
DROP TABLE
CREATE TABLE
transaction type: Custom query
scaling factor: 150000
query mode: simple
number of clients: 8
number of threads: 4
duration: 60 s
number of transactions actually processed: 1226969
latency average: 0.391 ms
tps = 20448.181466 (including connections establishing)
tps = 20450.289756 (excluding connections establishing)
trying 60 second run (traditional inserts, equivalent to master)
DROP TABLE
CREATE TABLE
transaction type: Custom query
scaling factor: 150000
query mode: simple
number of clients: 8
number of threads: 1
duration: 60 s
number of transactions actually processed: 1496496
latency average: 0.321 ms
tps = 24901.224739 (including connections establishing)
tps = 24906.522484 (excluding connections establishing)



running ./insert.sh benchmark
trying 60 second run (new infrastructure, extended hwlocking)
DROP TABLE
CREATE TABLE
transaction type: Custom query
scaling factor: 150000
query mode: simple
number of clients: 8
number of threads: 4
duration: 60 s
number of transactions actually processed: 1197352
latency average: 0.401 ms
tps = 19955.269006 (including connections establishing)
tps = 19957.244382 (excluding connections establishing)
trying 60 second run (traditional inserts, equivalent to master)
DROP TABLE
CREATE TABLE
transaction type: Custom query
scaling factor: 150000
query mode: simple
number of clients: 8
number of threads: 1
duration: 60 s
number of transactions actually processed: 1542751
latency average: 0.311 ms
tps = 25709.177045 (including connections establishing)
tps = 25714.432820 (excluding connections establishing)



running ./insert.sh benchmark
trying 60 second run (new infrastructure, extended hwlocking)
DROP TABLE
CREATE TABLE
transaction type: Custom query
scaling factor: 150000
query mode: simple
number of clients: 8
number of threads: 4
duration: 60 s
number of transactions actually processed: 1181995
latency average: 0.406 ms
tps = 19697.758449 (including connections establishing)
tps = 19699.876610 (excluding connections establishing)
trying 60 second run (traditional inserts, equivalent to master)
DROP TABLE
CREATE TABLE
transaction type: Custom query
scaling factor: 150000
query mode: simple
number of clients: 8
number of threads: 1
duration: 60 s
number of transactions actually processed: 1536436
latency average: 0.312 ms
tps = 25603.964182 (including connections establishing)
tps = 25609.142963 (excluding connections establishing)



running ./insert.sh benchmark
trying 60 second run (new infrastructure, extended hwlocking)
DROP TABLE
CREATE TABLE
transaction type: Custom query
scaling factor: 150000
query mode: simple
number of clients: 8
number of threads: 4
duration: 60 s
number of transactions actually processed: 1200230
latency average: 0.400 ms
tps = 20002.224154 (including connections establishing)
tps = 20004.182990 (excluding connections establishing)
trying 60 second run (traditional inserts, equivalent to master)
DROP TABLE
CREATE TABLE
transaction type: Custom query
scaling factor: 150000
query mode: simple
number of clients: 8
number of threads: 1
duration: 60 s
number of transactions actually processed: 1539267
latency average: 0.312 ms
tps = 25652.835154 (including connections establishing)
tps = 25657.859502 (excluding connections establishing)
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to