On 2/14/07, Hannu Krosing <[EMAIL PROTECTED]> wrote:
OTOH, for same page HOT tuples, we have the command and trx ids stored twice first as cmax,xmax of the old tuple and as cmin,xmin of the updated tuple. One of these could probably be used for in-page HOT tuple pointer.
I think we recently merged cmin/cmax into a single combo cid. So I don't think we can use cmin for storing back pointer. Idea of using xmin seems feasible though. As we can't rely on get-xmax-from-next-in-chain behaviour for off-page
tuples, we need to move the other way, that is storing a pointer to previous version and getting xmin from there. That means keeping a chain back pointers in xmin fields for all but the oldest tuple in HOT chain/cycle.
Why can't we store the real xmin/xmax at the end of the chain and store forward pointers in the xmax ? I find the idea of having back pointers more compelling though. This requires a little shuffling around of xmin/xmax info when removing
dead HOT chain members, but this would void the need to do any index changes during HOT updates. Avoiding all index updates is probably good, as it means we dont need to do any index page locking.
I agree. If we can design a solution that does not require any index updates, that would save us a lot. One problem with such a approach though is that we might have to live with a dead tuple/stub until another update occurs and the tuple/stub is reused. So the proposed structure for HOT chain is like this:
* Oldest tuple in chain has real xmin/xmax, this needs a flag to be recognisable, of we may just start transaction counter at 64K so that any xmin that fits in 2 bytes is immediately recognized as hot back pointer. Or start cmin at 1 and use cmin=0 as pointer flag. * All newer tuples have real xmax, and in-page pointer to previous version in place of xmin. the real xmin is xmax of the previous tuple. When previous tuple is removed, xmin is stored in the new oldest tuple. * Index can point to any tuple in HOT chain. Finding the tuple by index * To find a tuple, one errives to (possibly) center of HOT chain, and the tuple may be found either by following the in-page pointer stored in xmin field or ctid pointers.
Can we quickly figure out based on the given snapshot and the root tuple xmin/xmax, whether to follow the forward or the backward pointer to arrive at a visible tuple ? * If the tuples ctid is pointing to off page, which means it must thus
be the newest one in HOT chain and also end of it. The chain ends also when the tuple is inserted by an rollbacked transaction. * there is a special case when the index pointer points to a rollbacked tuple. In this case the tuple can't be removed, but should be reused on next update. But this is the case for any HOT rollbacks.
One problem is with aborted HOT-updates. According to this scheme, xmin of the aborted version is stored in the previous version. What happens when that gets updated again and committed ? If we follow the back pointer from the aborted tuple to fetch the xmin, we would actually be fetching the txid of the committed transaction which later updated the tuple. This would fool us to incorrectly believe that the aborted tuple is LIVE. May be we can set XMIN_INVALID for the next tuple in the chain when a tuple being updated has t_ctid pointing to a tuple in the same page, other than itself. I am sure there are interesting interactions between vacuum-ing that needs to be considered as well. Thanks, Pavan -- EnterpriseDB http://www.enterprisedb.com