On 16.05.2013 00:18, Robert Haas wrote:
On Wed, May 15, 2013 at 11:35 AM, Heikki Linnakangas
<hlinnakan...@vmware.com>  wrote:
Shared memory space is limited, but we only need the watermarks for any
in-progress truncations. Let's keep them in shared memory, in a small
fixed-size array. That limits the number of concurrent truncations that can
be in-progress, but that should be ok.

Would it only limit the number of concurrent transactions that can be
in progress *due to vacuum*?  Or would it limit the TOTAL number of
concurrent truncations?  Because a table could have arbitrarily
many inheritance children, and you might try to truncate the whole
thing at once...

It would only limit the number of concurrent *truncations*. Vacuums in general would not count; only a vacuum that has reached the end of the vacuum process and is trying to truncate the heap would count.

To not slow down common backend
operations, the values (or lack thereof) are cached in relcache. To sync the
relcache when the values change, there will be a new shared cache
invalidation event to force backends to refresh the cached watermark values.
A backend (vacuum) can ensure that all backends see the new value by first
updating the value in shared memory, sending the sinval message, and waiting
until everyone has received it.

AFAIK, the sinval mechanism isn't really well-designed to ensure that
these kinds of notifications arrive in a timely fashion.  There's no
particular bound on how long you might have to wait.  Pretty much all
inner loops have CHECK_FOR_INTERRUPTS(), but they definitely do not
all have AcceptInvalidationMessages(), nor would that be safe or
practical.  The sinval code sends catchup interrupts, but only for the
purpose of preventing sinval overflow, not for timely receipt.

Currently, vacuum will have to wait for all transactions that have touched the relation to finish, to get the AccessExclusiveLock. If we don't change anything in the sinval mechanism, the wait would be similar - until all currently in-progress transactions have finished. It's not quite the same; you'd have to wait for all in-progress transactions to finish, not only those that have actually touched the relation. But on the plus side, you would not block new transactions from accessing the relation, so it's not too bad if it takes a long time.
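
Just to illustrate what I mean by that wait: it could be done the same way CREATE INDEX CONCURRENTLY waits out old transactions, with GetCurrentVirtualXIDs and VirtualXactLock. A rough sketch (those two functions exist today, the wrapper around them is hypothetical):

#include "postgres.h"
#include "access/transam.h"
#include "storage/lmgr.h"
#include "storage/lock.h"
#include "storage/procarray.h"

/*
 * Rough sketch only: wait for all transactions currently in progress to
 * commit or abort, like CREATE INDEX CONCURRENTLY does.  This wrapper
 * function itself is hypothetical.
 */
static void
WaitForInProgressTransactions(void)
{
	VirtualTransactionId *vxids;
	int			nvxids;
	int			i;

	/* Collect the vxids of all transactions currently in progress. */
	vxids = GetCurrentVirtualXIDs(InvalidTransactionId,
								  false,	/* include backends with no xmin */
								  false,	/* only our own database */
								  0,		/* don't exclude any vacuums */
								  &nvxids);

	/* Sleep until each of them has finished. */
	for (i = 0; i < nvxids; i++)
	{
		if (VirtualTransactionIdIsValid(vxids[i]))
			VirtualXactLock(vxids[i], true);
	}
}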

If we could use the catchup interrupts to speed that up though, that would be much better. I think vacuum could simply send a catchup interrupt, and wait until everyone has caught up. That would significantly increase the traffic in the sinval queue and the number of catchup interrupts, compared to what we have today, but I think it would still be ok. It would still only be a few sinval messages and catchup interrupts per truncation (ie. per vacuum).
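
Spelling out the whole sequence in vacuum, it would look something like this (all of the helper functions below are made up, they just name the steps):

/*
 * Hypothetical sketch only: none of these helpers exist, they just name
 * the steps of the protocol described above.
 */
static void
lazy_truncate_with_watermark(Relation onerel, BlockNumber new_rel_pages)
{
	/* 1. Publish the watermark in the shared memory array. */
	SetTruncationWatermark(RelationGetRelid(onerel), new_rel_pages);

	/*
	 * 2. Send the new kind of sinval message, so that backends refresh
	 * the watermark value cached in their relcache entries.
	 */
	SendTruncationWatermarkInval(RelationGetRelid(onerel));

	/*
	 * 3. Wait until every backend has processed the sinval queue past
	 * that message.  Sending catchup interrupts here keeps the wait
	 * short; otherwise we'd wait until all in-progress transactions end.
	 */
	WaitForSinvalCatchup();

	/*
	 * 4. Everyone now sees the watermark.  The rest of the truncation -
	 * checking that the pages above the watermark are still unused, and
	 * the actual truncation of the heap - would go here.
	 */

	/* 5. Finally, clear the watermark so the slot can be reused. */
	ClearTruncationWatermark(RelationGetRelid(onerel));
}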

Another problem is that sinval resets are bad for performance, and
anything we do that pushes more messages through sinval will increase
the frequency of resets.  Now if those operations are things that
are relatively uncommon, it's not worth worrying about - but if it's
something that happens on every relation extension, I think that's
likely to cause problems.

It would not be on every relation extension, only on truncation.

With the watermarks, truncation works like this:

1. Set soft watermark to the point where we think we can truncate the
relation. Wait until everyone sees it (send sinval message, wait).

I'm also concerned about how you plan to synchronize access to this
shared memory arena.

I was thinking of a simple lwlock, or perhaps one lwlock per slot in the arena. It would not be accessed very frequently, because the watermark values would be cached in the relcache. It would only need to be accessed when:

1. Truncating the relation, by vacuum, to set the watermark values.
2. By backends, to update the relcache, when they receive the sinval message sent by vacuum.
3. By backends, when writing above the cached watermark value. IOW, when extending a relation that's being truncated at the same time.

In particular, it would definitely not be accessed every time a backend currently needs to do an lseek. Nor every time a backend needs to extend a relation.
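
To be concrete, the shared memory structure I'm imagining is along these lines (the names and the number of slots are made up):

#include "postgres.h"
#include "storage/block.h"
#include "storage/lwlock.h"

/*
 * Sketch of the shared memory watermark array.  The names and the slot
 * count are made up; the point is just a small fixed-size array with one
 * lwlock per slot.
 */
#define MAX_CONCURRENT_TRUNCATIONS 8

typedef struct TruncationWatermarkSlot
{
	LWLockId	lock;			/* protects the fields below */
	Oid			reloid;			/* relation being truncated, or InvalidOid */
	BlockNumber soft_watermark;	/* point vacuum is trying to truncate to */
} TruncationWatermarkSlot;

typedef struct TruncationWatermarkArray
{
	TruncationWatermarkSlot slots[MAX_CONCURRENT_TRUNCATIONS];
} TruncationWatermarkArray;

static TruncationWatermarkArray *truncationWatermarks;	/* in shared memory */

/*
 * Look up the current watermark for a relation.  A backend would call
 * this only when it's about to write above the value cached in its
 * relcache entry, so it should be rare.
 */
static BlockNumber
GetTruncationWatermark(Oid reloid)
{
	BlockNumber result = InvalidBlockNumber;
	int			i;

	for (i = 0; i < MAX_CONCURRENT_TRUNCATIONS; i++)
	{
		TruncationWatermarkSlot *slot = &truncationWatermarks->slots[i];

		LWLockAcquire(slot->lock, LW_SHARED);
		if (slot->reloid == reloid)
			result = slot->soft_watermark;
		LWLockRelease(slot->lock);

		if (result != InvalidBlockNumber)
			break;
	}
	return result;
}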

- Heikki

