Hello hackers,

Andres Freund diagnosed a case of $SUBJECT in a customer's 9.6 system.
I've written a minimal reproducer and a prototype patch to address the
root cause.

The problem is that StandbyReleaseLocks() does a linear search of all
known AccessExclusiveLocks when a transaction ends.  Luckily, since
v10 (commit 9b013dc2) that is skipped for transactions that haven't
taken any AELs and aren't using 2PC, but that doesn't help all users.

It's fine if the AEL list is short, but if you do something that takes
a lot of AELs such as restoring a database with many tables or
truncating a lot of partitions while other transactions are in flight
then we start doing O(txrate * nlocks * nsubxacts) work and that can
hurt.

The reproducer script I've attached creates one long-lived transaction
that acquires 6,000 AELs and takes a nap, while 48 connections run
trivial 2PC transactions (I was also able to reproduce the effect
without 2PC by creating a throw-away temporary table in every
transaction, but it was unreliable due to contention slowing
everything down).  For me, the standby's startup process becomes 100%
pegged, replay_lag begins to climb and perf says something like:

+   97.88%    96.96%  postgres  postgres            [.] StandbyReleaseLocks

The attached patch splits the AEL list into one list per xid and
sticks them in a hash table.  That makes perf say something like:

+    0.60%     0.00%  postgres  postgres            [.] StandbyReleaseLocks

This seems like something we'd want to back-patch because the problem
affects all branches (the older releases more severely because they
lack the above-mentioned optimisation).

Thoughts?

-- 
Thomas Munro
http://www.enterprisedb.com
import psycopg2
import threading
import time

DSN = "dbname=postgres host=localhost"

def init():
  conn = psycopg2.connect(DSN)
  cursor = conn.cursor()
  cursor.execute("""DROP TABLE IF EXISTS foo""")
  cursor.execute("""CREATE TABLE foo (i serial primary key)""")
  conn.commit()

def big_slow():
  conn = psycopg2.connect(DSN)
  cursor = conn.cursor()
  # hold onto many AELs
  for i in range(6000):
    cursor.execute("create temp table foo%s (i int) on commit drop" % (i,))
  time.sleep(600)
  conn.rollback()

def many_small_xacts(thread_id):
  conn = psycopg2.connect(DSN)
  cursor = conn.cursor()
  for loops in range(100000):
    #cursor.execute("create temp table foo1 (i int) on commit drop");
    cursor.execute("begin")
    cursor.execute("insert into foo values (default)")
    cursor.execute("prepare transaction %s", [thread_id])
    cursor.execute("commit prepared %s", [thread_id])

init()
threading.Thread(target=big_slow).start()                              
for i in range(48):
  threading.Thread(target=many_small_xacts, args=['t' + str(i)]).start()                              

Attachment: 0001-Move-RecoveryLockList-into-a-hash-table.patch
Description: Binary data

Reply via email to