On 3/21/07, Bruce Momjian <[EMAIL PROTECTED]> wrote:
A different idea is to flag the _index_ as using HOT for the table or not, using a boolean in pg_index. The idea is that when a new index is created, it has its HOT boolean set to false and indexes all tuples and ignores HOT chains. Then doing lookups using that index, the new index does not follow HOT chains. We also add a boolean to pg_class to indicate no new HOT chains should be created and set that to false once the new index is created. Then, at some later time when all HOT chains are dead, we can enable HOT chain following for the new index and allow new HOT chains to be created. A more sophisticated idea would be to place an xid, rather than a boolean, in pg_index to indicate which chains were created after the index was created to control whether the index should follow that HOT chain, or ignore it. The xmax of the head of the HOT chain can be used as an indicator of when the chain was created. Transactions started before the pg_index xid could continue following the old rules and insert into the _new_ index for HOT chain additions, and new transactions would create HOT chains that could skip adding to the new index. Cleanup of the hybrid HOT chains (some indexes take part, some do not) would be more complex.
Bruce, thanks for bringing up this idea. As I think more about this idea, I think I am able to extend this further to solve the problems we discussed around it. One of my conerns were that the change the basic structure of heap with HOT-chains so that it should be possible to just look at the heap tuple and say whether it has any index pointer or not. The way I propose to extend/modify the idea is to use pg_index xid as suggested by Bruce to mark the index. This xid would guide the visibility of the index. As we all know, CREATE INDEX locks out UPDATEs on the table and further UPDATEs are possible only after the transaction creating the new index commits. When CREATE INDEX starts, it acquires ShareLock on the table. At this point we may have one or more HOT-update chains in the table. Tuples in this chain may be visible to one or more running transactions. The fact that we have ShareLock on the table means that all tuples in the chain except the one at the head either RECENTLY_DEAD or were UPDATEd by the same transaction that is now running CREATE INDEX. With this background, I propose to index ONLY the head of the HOT-chain. The TID of the root tuple is used instead of the actual TID of the tuple being indexed. This index will not be available to the transactions which are started before the CREATE INDEX transaction. Just like we use "indisvalid" flag to avoid including an invalid index in the plan, we use the pg_index "xid" to decide whether to use the index in the plan or not. Only transactions with txid > pg_index:xid can see the index and use it. In fact, the serializable transactions started before CREATE INDEX can not anyway see the index so all this is done to handle read-committed transactions. In this proposal we indexed only the latest version. But none of the transactions started after CREATE INDEX can anyway see the older tuples and hence we should be fine even if we don't index them in the new index. And none of the older transaction can see the index, so again we are safe. The design also helps us to preserve the heap HOT semantics and chain pruning and does not need VACUUM or any special handling. Can anyone spot a hole in this logic ? Comments ? Thanks, Pavan -- EnterpriseDB http://www.enterprisedb.com