On 04.10.2010 14:02, Hans-Jürgen Schönig wrote:
it seems we have found a fairly nasty problem.
imagine a long transaction which piles up XX.XXX locks (counted in pg_locks)
inside the same transaction by doing some tasty savepoints, WITH HOLD cursors
and so on.
in this case we see that a normal count issued in a second database connection
will take ages. in a practical case we did a plain seqscan in connection 2:
instead of 262 ms (cached case) it started to head north linearly with the
number of locks taken by connection 1. in an extreme case it took around 1.5
hours or so (at XXX.XXX pg_locks entries).

i tracked down the issue quickly and made the following profile (with 10k locks
or so):

Flat profile:

Each sample counts as 0.01 seconds.
   %   cumulative   self              self     total
  time   seconds   seconds    calls   s/call   s/call  name
  32.49      6.01     6.01 98809118     0.00     0.00  SimpleLruReadPage_ReadOnly
  26.97     11.00     4.99 98837761     0.00     0.00  LWLockAcquire
  21.89     15.05     4.05 98837761     0.00     0.00  LWLockRelease
   8.70     16.66     1.61 98789370     0.00     0.00  SubTransGetParent
   4.38     17.47     0.81    19748     0.00     0.00  SubTransGetTopmostTransaction
   2.41     17.92     0.45 98851951     0.00     0.00  TransactionIdPrecedes
   0.59     18.03     0.11                             LWLockAssign
   0.54     18.13     0.10                             LWLockConditionalAcquire
   0.46     18.21     0.09    19748     0.00     0.00  TransactionLogFetch
   0.38     18.28     0.07                             SimpleLruReadPage
   0.27     18.33     0.05                             SubTransSetParent
   0.05     18.34     0.01   136778     0.00     0.00  AllocSetAlloc
   0.05     18.35     0.01    42996     0.00     0.00  slot_deform_tuple
   0.05     18.36     0.01    42660     0.00     0.00  TransactionIdIsCurrentTransactionId

it seems we are running into a nice shared buffer / locking contention here,
and the number of calls explodes (this profile comes from a seqscan on a
500.000-row table, 400 MB or so).

That doesn't seem related to the lock manager. Is the long-running transaction inserting a lot of tuples (by INSERT or UPDATE), with a lot of different subtransaction xids, into the same table that the seqscan scans? That profile looks like the seqscan is spending a lot of time swapping pg_subtrans pages in and out of the SLRU buffers.

Increasing NUM_SUBTRANS_BUFFERS should help. A more sophisticated solution would be to allocate SLRU buffers (for clog and the other SLRU caches as well) dynamically from shared_buffers. That's been discussed before, but no one has gotten around to it.
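For reference, there is no GUC for this: the buffer count is a compile-time constant, so raising it means recompiling. If memory serves (the exact location and default value may differ between PostgreSQL versions), it is defined roughly like this:

```c
/* src/include/access/subtrans.h -- location and default are from memory
 * and may vary across PostgreSQL versions */
#define NUM_SUBTRANS_BUFFERS 32
```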

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers