Re: [PATCHES] [HACKERS] Including Snapshot Info with Indexes

Gokulakannan Somasundaram Wed, 23 Jan 2008 08:04:15 -0800

This fixes the bug in dealing with scans with 'or' conditions. I have also
attached the design document.



Thanks,
Gokul.

On Jan 16, 2008 7:25 PM, Gokulakannan Somasundaram <[EMAIL PROTECTED]>
wrote:

> Hi,
>     I did some more bug fixes and performance updates especially for
> select count(1) queries.
>
> Thanks,
> Gokul.
>

Thick Index Design:

Thick Index means index with snapshot information. It stores the snapshot 
information together with the IndexTuple. This would turn out to be useful, 
since the normal index scans need not refer to the heap for checking the heap 
tuple's visibility. This would also provide a new set of scans which will 
satisfy the complete query from Index. It will act like a mini Index Organized 
Table. This benefit comes at a cost, where in we need to update the index 
tuples, whenever there is an update / delete on the HeapTuple. Currently Thick 
indexes do not support Functional indexes.

Index Tuple Structure for thick index occupies 16 more bytes. ( Xmin - 4Bytes, 
Xmax - 4Bytes, Cid - 4 Bytes, t_infomask - 4 Bytes). In the t_infomask, it 
mainly uses the bits like  Xmin_Committed, Xmin_invalid etc. This 16 Byte field 
is called SnapshotFieldsData. Also there is a new column added into pg_index 
called 'indhassnapshot'.  This field, if true indicates the index being 
referred is Thick index
 The IndexScanDescData structure contains a pointer called SnapshotIndexParams. 
This structure contains various information, which can be used by a thick index 
to speed up queries of certain class.


+ typedef struct SnapshotIndexParamsData
+ {
+     bool         isIndexOnlyScan;
+     bool         is_minmax_scan;
+     bool         is_count_only_scan;
+     bool         cache_stack;
+     CmdType      operation;
+     HeapTuple    htup;
+     ScanKey      insertion_skey;  /* Insertion Scan Key */
+     void*      stack;
+     TupleTableSlot* slot;
+ } SnapshotIndexParamsData;


a) isIndexOnlyScan - Whether the scan need to goto the table to fetch extra 
attributes. If this is true, then thick index scan would start caching the 
index tuples. Currently the index tuples are being cached under the structure 
MinimalIndexTupleData

+ typedef struct MinimalIndexTupleData
+ {
+     bool  has_null;
+ } MinimalIndexTupleData;

While this might provide some benefit for 32 bit systems, the usefulness of 
this structure for 64 bit systems is questionable.


b) is_minmax_scan - This scan indicates that the scan can end as soon as it 
finds the first live tuple.
c) is_count_only_scan - This scan provides an optimization, using which we need 
not store the MinimalIndexTuple multiple times. Instead we just store a static 
all null tuple once and  use the count  which is already present in the 
IndexScanDesc
d) cache_stack  - This is another  optimization for updates involving  thick 
indexes.  The update operation involves a delete and insert. Since we already 
find out the leaf node where the tuple is present, we can insert the new tuple 
at the same place. This cache_stack gets turned on, when we find out that the 
attributes involving the thick index are not updated.
e) operation - informs whether it is a SELECT, INSERT, DELETE etc.
f) htup - the heap tuple being indexed. This is useful while we are deleting a 
thick index tuple. It would refer to the heap tuple for transaction snapshot 
information.This actually points to the tuple-tableslot where the old heap 
tuple is cached
g) insertion_skey - this is useful for the delete operation to delete the old 
tuple.
h) stack - refer to the cache_stack description
i) slot - This is  a pointer to the slot being used on a index scan

As pointed above the Minimal Index Tuple data has to get cached before we leave 
the index page. It gets cached under the BTScanOpaque structure under the name 
index_cache

Function ExecDeleteIndexTuples  is added for thick indexes to delete its index 
entry. It retraces the index tuple by forming a search scan key on the index 
tuple fields. ExecUpdateIndexTuples has also got updated to provide the 
necessary functionality for Thick index.

Vacuums:
Currently we store all the heap-tids which can be vacuumed in the heap tuple 
and we goto index and we scan the complete index and delete all the 
corresponding index tuples. with thick index, all the information necessary for 
Vacuum is already present in the Index. So the Tid array is not required. It 
just scans through the index and vacuums all the tuples. Actually Vacuum is not 
at all required for thick index, as we do a one page btree vacuum whenever 
necessary. But Freezing makes Vacuum necessary for thick indexes.
Also If a table contains only thick indexes, then tid array need not be 
created. This would make the table equivalent to a table with no indexes.


Pinning the index page:
Currently the heap tuple can't be cleaned up before ensuring all the index 
tuples are cleaned up. since we don't want any index scans to be using the 
index tuples while we are vacuuming the index page, we hold a pin. This 
basically means that the index page can't be vacuumed, which inturn means that 
the heap tuple being referred can't be vacuumed. But in case of thick indexes, 
the heap tuple can be vacuumed even while the index tuple exists, because the 
index tuple won't be considered by the index scan, if the tuple satisfies 
vacuum. Once a scan finds that the tuple satisfies vacuum, it is marked as 
LP_DEAD. So there is no concept of Super Exclusive Lock with Thick indexes.

A new recovery function has been written for updating the snapshot information 
during the heap tuple delete. This puts the necessary WAL information for 
recovery.

Cost Calculation:
There is a function called IndexOnlyScan which finds out whether the current 
scan can become a index only scan. If it becomes a Indexonlyscan, then the 
index scan is divided by a factor called DEFAULT_INDEX_ONLY_SCAN_FACTOR, which 
is currently set at 4.

For Bitmap Or scans, there are no index scans generated. Currently 
IndexOnlyScan is not supported for Bitmap Or scans. Since a MinimalIndexTuple 
cache gets created, the management of cache and finding out the duplicates 
inside the two or more scans present with 'Or' scan requires some more thinking.

There is a new function written to provide the Search scan key and Insertion 
scan key from the heap tuples. Search scan Key is using the 
oper_select_candidate function, which might run slightly faster with 8.3 Beta 
4. All my presented performance test results in the earlier threads were done 
on 8.3 Beta 1. All the HOT stuffs are skipped for Thick index scans, as HOT 
won't get applied to thick indexes.

Explain Plans:
   If IndexOnlyScan is chosen with a thick index, it gets notified in the 
Explain and Explain analyze statements.
   
Index Creation:
   Create thick index ... should be used in order to create a thick index on 
top of a column.

patchfile.tar.gz
Description: GNU Zip compressed data

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to [EMAIL PROTECTED] so that your
       message can get through to the mailing list cleanly

Re: [PATCHES] [HACKERS] Including Snapshot Info with Indexes

Reply via email to