On 06/06/2017 07:24 AM, Ashutosh Bapat wrote:
On Tue, Jun 6, 2017 at 9:48 AM, Craig Ringer <cr...@2ndquadrant.com> wrote:
On 6 June 2017 at 12:13, Ashutosh Bapat <ashutosh.ba...@enterprisedb.com> wrote:

What happens when the epoch is so low that the rest of the XID does
not fit in the 32 bits of the tuple header? Or should such a case
never arise?

Storing an epoch implies that a row's xmin and xmax can't differ by
more than one epoch. So if you're updating/deleting an extremely old
tuple you'll presumably have to set xmin to FrozenTransactionId if it
isn't already, so you can set a new epoch and xmax.

If the page has multiple such tuples, will updating one tuple mean
updating the headers of the other tuples as well? Does this mean that
those tuples need to be locked against concurrent scans? Maybe not,
since such tuples will be visible to any concurrent scan anyway, and
updating xmin/xmax doesn't change their visibility. But we might have
to prevent multiple updates to the xmin/xmax because of concurrent
updates on the same page.

"Store the epoch in the page header" is actually a slightly simpler-to-visualize, but incorrect, version of what we actually need to do. If you only store the epoch, then all the XIDs on a page need to belong to the same epoch, which causes trouble when the current epoch changes. Just after the epoch changes, you cannot necessarily freeze all the tuples from the previous epoch, because they would not yet be visible to everyone.
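
A small arithmetic illustration of that problem, assuming the epoch is simply the upper 32 bits of a 64-bit XID. The helper name `epoch_of` and the two sample XIDs are made up for this sketch:

```python
def epoch_of(xid64: int) -> int:
    """Upper 32 bits of a 64-bit XID (assumed definition of 'epoch')."""
    return xid64 >> 32


# Two transactions only 10 XIDs apart, on either side of an epoch boundary.
older = (1 << 32) - 5
newer = (1 << 32) + 5

distance = newer - older                         # small: the XIDs are neighbours
same_epoch = epoch_of(older) == epoch_of(newer)  # yet their epochs differ
```

With one epoch per page, these two XIDs could not sit on the same page, even though the older one may still be too recent to freeze.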

The full picture is that we need to store one 64-bit XID "base" value in the page header, and all the xmin/xmax values in the tuple headers are offsets relative to that base. With that, you effectively have 64-bit XIDs, as long as the *difference* between any two XIDs on a page is not greater than 2^32. That can be guaranteed, as long as we don't allow a transaction to be in-progress for more than 2^32 XIDs. That seems like a reasonable limitation.
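
The base-plus-offset arithmetic can be sketched roughly as follows. This is only an illustration, not PostgreSQL code: the names `XID_SPAN`, `to_offset` and `to_xid64` are hypothetical, offsets are assumed to be unsigned with the base as the smallest representable XID, and special values such as FrozenTransactionId are ignored.

```python
XID_SPAN = 1 << 32  # number of distinct values a 32-bit tuple-header field holds


def to_offset(xid_base: int, xid64: int) -> int:
    """Encode a 64-bit XID as a 32-bit offset from the page's base XID."""
    delta = xid64 - xid_base
    if not 0 <= delta < XID_SPAN:
        # The page would have to be rebased before this XID can be stored.
        raise ValueError("XID does not fit relative to this page's base")
    return delta


def to_xid64(xid_base: int, offset: int) -> int:
    """Recover the full 64-bit XID from the page base and a stored offset."""
    return xid_base + offset
```

Note that the base need not be aligned to an epoch boundary, so XIDs from two adjacent epochs can coexist on one page as long as they are close enough together.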

But yes, when the "current XID - base XID in page header" becomes greater than 2^32, and you need to update a tuple on that page, you need to first freeze the page, update the base XID in the page header to a more recent value, and update the XID offsets on every tuple on the page accordingly. And to do that, you need to hold a lock on the page. If you don't move any tuples around at the same time, but just update the XID fields, an exclusive lock on the page is enough, i.e. you don't need to take a super-exclusive or vacuum lock. In any case, it happens so infrequently that it should not become a serious burden.
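
The rebasing step described above might look something like the sketch below. It is a simplified model under stated assumptions, not an actual implementation: only xmin is tracked, `FROZEN` is a stand-in marker, `horizon` is a hypothetical name for the oldest XID that some transaction might still not see as committed, and the caller is assumed to already hold an exclusive lock on the page.

```python
from typing import List, Optional, Tuple

FROZEN = None  # hypothetical stand-in for a frozen xmin


def rebase_page(
    xid_base: int,
    xmin_offsets: List[Optional[int]],
    new_base: int,
    horizon: int,
) -> Tuple[int, List[Optional[int]]]:
    """Move the page's base XID forward and rewrite every tuple's offset.

    Tuples older than the new base are frozen, which is only safe when
    the new base is not past the horizon (assumption of this sketch).
    """
    if not xid_base <= new_base <= horizon:
        raise ValueError("new base must lie between the old base and the horizon")

    rebased: List[Optional[int]] = []
    for off in xmin_offsets:
        if off is FROZEN:
            rebased.append(FROZEN)            # already frozen, nothing to do
            continue
        xid64 = xid_base + off                # absolute 64-bit XID
        if xid64 < new_base:
            rebased.append(FROZEN)            # visible to everyone, so freeze
        else:
            rebased.append(xid64 - new_base)  # same XID, new offset
    return new_base, rebased
```

Because the new base is never older than the old one, every surviving offset only shrinks, so it still fits in 32 bits after the rewrite.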

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
