This idea has been rejected due to poor performance results reported
later in the thread.

---------------------------------------------------------------------------

Heikki Linnakangas wrote:
> While thinking about index-organized-tables and similar ideas, it 
> occurred to me that there's some low-hanging-fruit: maintaining cluster 
> order on inserts by trying to place new heap tuples close to other 
> similar tuples. That involves asking the index am where on the heap the 
> new tuple should go, and trying to insert it there before using the FSM. 
> Using the new fillfactor parameter makes it more likely that there's 
> room on the page. We don't worry about the order within the page.
> 
> The API I'm thinking of introduces a new optional index am function, 
> amsuggestblock (suggestions for a better name are welcome). It gets the 
> same parameters as aminsert, and returns the heap block number that 
> would be the optimal place to put the new tuple. It would be called from 
> ExecInsert before inserting the heap tuple, and the suggestion is passed 
> on to heap_insert and RelationGetBufferForTuple.
> 
> I wrote a little patch to implement this for btree, attached.
> 
> This could be optimized by changing the existing aminsert API, because 
> as it is, an insert will have to descend the btree twice: once in 
> amsuggestblock and then in aminsert. amsuggestblock could keep the right 
> index page pinned so aminsert could locate it quicker. But I wanted to 
> keep this simple for now. Another improvement might be to allow 
> amsuggestblock to return a list of suggestions, but that makes it more 
> expensive to insert if there isn't room in the suggested pages, since 
> heap_insert will have to try them all before giving up.
> 
> Comments regarding the general idea or the patch? There should probably 
> be an index option to turn the feature on and off. You'll want to turn it 
> off when you first load a table, and turn it on after CLUSTER to keep it 
> clustered.
> 
> Since there's been discussion on keeping the TODO list more up-to-date, 
> I hereby officially claim the "Automatically maintain clustering on a 
> table" TODO item :). Feel free to bombard me with requests for status 
> reports. And just to be clear, I'm not trying to sneak this into 8.2 
> anymore, this is 8.3 stuff.
> 
> I won't be implementing the background daemon described in the TODO item, 
> since that would essentially be an online version of CLUSTER. Which sure 
> would be nice, but that's a different story.
> 
> - Heikki
> 
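The lookup that amsuggestblock performs for btree can be illustrated with a
minimal standalone sketch. It assumes a toy sorted array in place of a real
btree leaf page; ToyIndexEntry, toy_suggest_block and INVALID_BLOCK are
hypothetical names invented for this illustration, not PostgreSQL APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for InvalidBlockNumber: "no suggestion". */
#define INVALID_BLOCK 0xFFFFFFFFu

/* Hypothetical toy index entry: a key and the heap block holding its tuple. */
typedef struct
{
    int      key;
    unsigned heap_block;
} ToyIndexEntry;

/*
 * Return the heap block of the entry located where `key` would be inserted
 * in the sorted array `entries`, or INVALID_BLOCK if the index is empty.
 */
unsigned
toy_suggest_block(const ToyIndexEntry *entries, size_t n, int key)
{
    size_t lo = 0;
    size_t hi = n;

    if (n == 0)
        return INVALID_BLOCK;   /* empty index: no suggestion */
    assert(entries != NULL);

    /* Lower bound: first position whose key is >= the new key. */
    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;

        if (entries[mid].key < key)
            lo = mid + 1;
        else
            hi = mid;
    }

    /* Past the end: step back to the last entry, as the patch does. */
    if (lo == n)
        lo = n - 1;

    return entries[lo].heap_block;
}
```

With entries (10, block 0), (20, block 0), (30, block 1), (40, block 2), a new
key of 25 lands next to key 30, so the sketch suggests heap block 1. Like the
patch, it never looks at neighbouring pages when the position falls past the
end.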

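The order in which the patched RelationGetBufferForTuple tries pages can be
sketched as follows. This is a simplified model under stated assumptions:
ToyHeap, toy_choose_block and INVALID_BLOCK are hypothetical names, the free
space map is replaced by a plain linear scan, and no locking or buffer
management is modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for InvalidBlockNumber. */
#define INVALID_BLOCK 0xFFFFFFFFu

/* Hypothetical toy heap: free bytes per page plus insertion state. */
typedef struct
{
    const size_t *free_space;    /* free bytes on each page */
    size_t        npages;
    size_t        reserve;       /* bytes kept free by fillfactor */
    unsigned      cached_target; /* last insertion page, or INVALID_BLOCK */
} ToyHeap;

/*
 * Pick a page for a tuple of `len` bytes.  Returns INVALID_BLOCK when no
 * existing page fits, meaning the caller would extend the relation.
 */
unsigned
toy_choose_block(const ToyHeap *heap, size_t len, unsigned suggested)
{
    size_t i;

    assert(heap != NULL);

    /* 1. Suggested page: used whenever the tuple fits, ignoring the reserve. */
    if (suggested != INVALID_BLOCK && suggested < heap->npages &&
        heap->free_space[suggested] >= len)
        return suggested;

    /* 2. Cached target page: must also leave the fillfactor reserve free. */
    if (heap->cached_target != INVALID_BLOCK &&
        heap->cached_target < heap->npages &&
        heap->free_space[heap->cached_target] >= len + heap->reserve)
        return heap->cached_target;

    /* 3. Linear scan standing in for the FSM lookup (an assumption). */
    for (i = 0; i < heap->npages; i++)
    {
        if (heap->free_space[i] >= len + heap->reserve)
            return (unsigned) i;
    }

    /* 4. Nothing fits: extend the relation. */
    return INVALID_BLOCK;
}
```

The asymmetry in step 1 mirrors the reasoning in the patch: the space that
fillfactor reserves exists precisely so that tuples belonging on that page can
still be placed there later.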
[ text/x-patch is unsupported, treating like TEXT/PLAIN ]

> Index: doc/src/sgml/catalogs.sgml
> ===================================================================
> RCS file: /home/hlinnaka/pgcvsrepository/pgsql/doc/src/sgml/catalogs.sgml,v
> retrieving revision 2.129
> diff -c -r2.129 catalogs.sgml
> *** doc/src/sgml/catalogs.sgml        31 Jul 2006 20:08:55 -0000      2.129
> --- doc/src/sgml/catalogs.sgml        8 Aug 2006 16:17:21 -0000
> ***************
> *** 499,504 ****
> --- 499,511 ----
>         <entry>Function to parse and validate reloptions for an index</entry>
>        </row>
>   
> +      <row>
> +       <entry><structfield>amsuggestblock</structfield></entry>
> +       <entry><type>regproc</type></entry>
> +       <entry><literal><link linkend="catalog-pg-proc"><structname>pg_proc</structname></link>.oid</literal></entry>
> +       <entry>Get the best place in the heap to put a new tuple</entry>
> +      </row>
> + 
>       </tbody>
>      </tgroup>
>     </table>
> Index: doc/src/sgml/indexam.sgml
> ===================================================================
> RCS file: /home/hlinnaka/pgcvsrepository/pgsql/doc/src/sgml/indexam.sgml,v
> retrieving revision 2.16
> diff -c -r2.16 indexam.sgml
> *** doc/src/sgml/indexam.sgml 31 Jul 2006 20:08:59 -0000      2.16
> --- doc/src/sgml/indexam.sgml 8 Aug 2006 17:15:25 -0000
> ***************
> *** 391,396 ****
> --- 391,414 ----
>      <function>amoptions</> to test validity of options settings.
>     </para>
>   
> +   <para>
> + <programlisting>
> + BlockNumber
> + amsuggestblock (Relation indexRelation,
> +                 Datum *values,
> +                 bool *isnull,
> +                 Relation heapRelation);
> + </programlisting>
> +    Gets the optimal place in the heap for a new tuple. The parameters
> +    correspond to the parameters for <literal>aminsert</literal>.
> +    This function is called on the clustered index before a new tuple
> +    is inserted to the heap, and it should choose the optimal insertion
> +    target page on the heap in such manner that the heap stays as close
> +    as possible to the index order.
> +    <literal>amsuggestblock</literal> can return InvalidBlockNumber if
> +    the index am doesn't have a suggestion.
> +   </para>
> + 
>    </sect1>
>   
>    <sect1 id="index-scanning">
> Index: src/backend/access/heap/heapam.c
> ===================================================================
> RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/heap/heapam.c,v
> retrieving revision 1.218
> diff -c -r1.218 heapam.c
> *** src/backend/access/heap/heapam.c  31 Jul 2006 20:08:59 -0000      1.218
> --- src/backend/access/heap/heapam.c  8 Aug 2006 16:17:21 -0000
> ***************
> *** 1325,1330 ****
> --- 1325,1335 ----
>    * use_fsm is passed directly to RelationGetBufferForTuple, which see for
>    * more info.
>    *
> +  * suggested_blk can be set by the caller to hint heap_insert which
> +  * block would be the best place to put the new tuple in. heap_insert can
> +  * ignore the suggestion, if there's not enough room on that block. 
> +  * InvalidBlockNumber means no preference.
> +  *
>    * The return value is the OID assigned to the tuple (either here or by the
>    * caller), or InvalidOid if no OID.  The header fields of *tup are updated
>    * to match the stored tuple; in particular tup->t_self receives the actual
> ***************
> *** 1333,1339 ****
>    */
>   Oid
>   heap_insert(Relation relation, HeapTuple tup, CommandId cid,
> !                     bool use_wal, bool use_fsm)
>   {
>       TransactionId xid = GetCurrentTransactionId();
>       HeapTuple       heaptup;
> --- 1338,1344 ----
>    */
>   Oid
>   heap_insert(Relation relation, HeapTuple tup, CommandId cid,
> !                     bool use_wal, bool use_fsm, BlockNumber suggested_blk)
>   {
>       TransactionId xid = GetCurrentTransactionId();
>       HeapTuple       heaptup;
> ***************
> *** 1386,1392 ****
>   
>       /* Find buffer to insert this tuple into */
>       buffer = RelationGetBufferForTuple(relation, heaptup->t_len,
> !                                                                        InvalidBuffer, use_fsm);
>   
>       /* NO EREPORT(ERROR) from here till changes are logged */
>       START_CRIT_SECTION();
> --- 1391,1397 ----
>   
>       /* Find buffer to insert this tuple into */
>       buffer = RelationGetBufferForTuple(relation, heaptup->t_len,
> !                                                                        InvalidBuffer, use_fsm, suggested_blk);
>   
>       /* NO EREPORT(ERROR) from here till changes are logged */
>       START_CRIT_SECTION();
> ***************
> *** 1494,1500 ****
>   Oid
>   simple_heap_insert(Relation relation, HeapTuple tup)
>   {
> !     return heap_insert(relation, tup, GetCurrentCommandId(), true, true);
>   }
>   
>   /*
> --- 1499,1506 ----
>   Oid
>   simple_heap_insert(Relation relation, HeapTuple tup)
>   {
> !     return heap_insert(relation, tup, GetCurrentCommandId(), true, 
> !                                        true, InvalidBlockNumber);
>   }
>   
>   /*
> ***************
> *** 2079,2085 ****
>               {
>                       /* Assume there's no chance to put heaptup on same page. */
>                       newbuf = RelationGetBufferForTuple(relation, heaptup->t_len,
> !                                                                                        buffer, true);
>               }
>               else
>               {
> --- 2085,2092 ----
>                       /* Assume there's no chance to put heaptup on same page. */
>                       newbuf = RelationGetBufferForTuple(relation, heaptup->t_len,
> !                                                                                        buffer, true,
> !                                                                                        InvalidBlockNumber);
>               }
>               else
>               {
> ***************
> *** 2096,2102 ****
>                                */
>                               LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
>                               newbuf = RelationGetBufferForTuple(relation, heaptup->t_len,
> !                                                                                                buffer, true);
>                       }
>                       else
>                       {
> --- 2103,2110 ----
>                                */
>                               LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
>                               newbuf = RelationGetBufferForTuple(relation, heaptup->t_len,
> !                                                                                                buffer, true,
> !                                                                                                InvalidBlockNumber);
>                       }
>                       else
>                       {
> Index: src/backend/access/heap/hio.c
> ===================================================================
> RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/heap/hio.c,v
> retrieving revision 1.63
> diff -c -r1.63 hio.c
> *** src/backend/access/heap/hio.c     3 Jul 2006 22:45:37 -0000       1.63
> --- src/backend/access/heap/hio.c     9 Aug 2006 18:03:01 -0000
> ***************
> *** 93,98 ****
> --- 93,100 ----
>    *  any committed data of other transactions.  (See heap_insert's comments
>    *  for additional constraints needed for safe usage of this behavior.)
>    *
> +  *  If the caller has a suggestion, it's passed in suggestedBlock.
> +  *
>    *  We always try to avoid filling existing pages further than the fillfactor.
>    *  This is OK since this routine is not consulted when updating a tuple and
>    *  keeping it on the same page, which is the scenario fillfactor is meant
> ***************
> *** 103,109 ****
>    */
>   Buffer
>   RelationGetBufferForTuple(Relation relation, Size len,
> !                                               Buffer otherBuffer, bool use_fsm)
>   {
>       Buffer          buffer = InvalidBuffer;
>       Page            pageHeader;
> --- 105,112 ----
>    */
>   Buffer
>   RelationGetBufferForTuple(Relation relation, Size len,
> !                                               Buffer otherBuffer, bool use_fsm,
> !                                               BlockNumber suggestedBlock)
>   {
>       Buffer          buffer = InvalidBuffer;
>       Page            pageHeader;
> ***************
> *** 135,142 ****
>               otherBlock = InvalidBlockNumber;                /* just to keep compiler quiet */
>   
>       /*
> !      * We first try to put the tuple on the same page we last inserted a tuple
> !      * on, as cached in the relcache entry.  If that doesn't work, we ask the
>        * shared Free Space Map to locate a suitable page.  Since the FSM's info
>        * might be out of date, we have to be prepared to loop around and retry
>        * multiple times.      (To insure this isn't an infinite loop, we must update
> --- 138,147 ----
>               otherBlock = InvalidBlockNumber;                /* just to keep compiler quiet */
>   
>       /*
> !      * We first try to put the tuple on the page suggested by the caller, if
> !      * any. Then we try to put the tuple on the same page we last inserted a
> !      * tuple on, as cached in the relcache entry. If that doesn't work, we 
> !      * ask the
>        * shared Free Space Map to locate a suitable page.  Since the FSM's info
>        * might be out of date, we have to be prepared to loop around and retry
>        * multiple times.      (To insure this isn't an infinite loop, we must update
> ***************
> *** 144,152 ****
>        * not to be suitable.)  If the FSM has no record of a page with enough
>        * free space, we give up and extend the relation.
>        *
> !      * When use_fsm is false, we either put the tuple onto the existing target
> !      * page or extend the relation.
>        */
>       if (len + saveFreeSpace <= MaxTupleSize)
>               targetBlock = relation->rd_targblock;
>       else
> --- 149,167 ----
>        * not to be suitable.)  If the FSM has no record of a page with enough
>        * free space, we give up and extend the relation.
>        *
> !      * When use_fsm is false, we skip the fsm lookup if neither the suggested
> !      * nor the cached last insertion page has enough room, and extend the
> !      * relation.
> !      *
> !      * The fillfactor is taken into account when calculating the free space
> !      * on the cached target block, and when using the FSM. The suggested page
> !      * is used whenever there's enough room in it, regardless of the fillfactor,
> !      * because that's exactly the purpose the space is reserved for in the
> !      * first place.
>        */
> +     if (suggestedBlock != InvalidBlockNumber)
> +             targetBlock = suggestedBlock;
> +     else
>       if (len + saveFreeSpace <= MaxTupleSize)
>               targetBlock = relation->rd_targblock;
>       else
> ***************
> *** 219,224 ****
> --- 234,244 ----
>                */
>               pageHeader = (Page) BufferGetPage(buffer);
>               pageFreeSpace = PageGetFreeSpace(pageHeader);
> + 
> +             /* If we're trying the suggested block, don't care about fillfactor */
> +             if (targetBlock == suggestedBlock && len <= pageFreeSpace)
> +                     return buffer;
> + 
>               if (len + saveFreeSpace <= pageFreeSpace)
>               {
>                       /* use this page as future insert target, too */
> ***************
> *** 241,246 ****
> --- 261,275 ----
>                       ReleaseBuffer(buffer);
>               }
>   
> +             /* If we just tried the suggested block, try the cached target
> +              * block next, before consulting the FSM. */
> +             if(suggestedBlock == targetBlock)
> +             {
> +                     targetBlock = relation->rd_targblock;
> +                     suggestedBlock = InvalidBlockNumber;
> +                     continue;
> +             }
> + 
>               /* Without FSM, always fall out of the loop and extend */
>               if (!use_fsm)
>                       break;
> Index: src/backend/access/index/genam.c
> ===================================================================
> RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/index/genam.c,v
> retrieving revision 1.58
> diff -c -r1.58 genam.c
> *** src/backend/access/index/genam.c  31 Jul 2006 20:08:59 -0000      1.58
> --- src/backend/access/index/genam.c  8 Aug 2006 16:17:21 -0000
> ***************
> *** 259,261 ****
> --- 259,275 ----
>   
>       pfree(sysscan);
>   }
> + 
> + /*
> +  * This is a dummy implementation of amsuggestblock, to be used for index
> +  * access methods that don't or can't support it. It just returns 
> +  * InvalidBlockNumber, which means "no preference".
> +  *
> +  * This is probably not the best place for this function, but it doesn't
> +  * fit naturally anywhere else either.
> +  */
> + Datum
> + dummysuggestblock(PG_FUNCTION_ARGS)
> + {
> +     PG_RETURN_UINT32(InvalidBlockNumber);
> + }
> Index: src/backend/access/index/indexam.c
> ===================================================================
> RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/index/indexam.c,v
> retrieving revision 1.94
> diff -c -r1.94 indexam.c
> *** src/backend/access/index/indexam.c        31 Jul 2006 20:08:59 -0000      1.94
> --- src/backend/access/index/indexam.c        8 Aug 2006 16:17:21 -0000
> ***************
> *** 18,23 ****
> --- 18,24 ----
>    *          index_rescan    - restart a scan of an index
>    *          index_endscan   - end a scan
>    *          index_insert    - insert an index tuple into a relation
> +  *          index_suggestblock      - get desired insert location for a heap tuple
>    *          index_markpos   - mark a scan position
>    *          index_restrpos  - restore a scan position
>    *          index_getnext   - get the next tuple from a scan
> ***************
> *** 202,207 ****
> --- 203,237 ----
>                                                                         BoolGetDatum(check_uniqueness)));
>   }
>   
> + /* ----------------
> +  *          index_suggestblock - get desired insert location for a heap tuple
> +  *
> +  * The returned BlockNumber is the *heap* page that is the best place
> +  * to insert the given tuple to, according to the index am. The best
> +  * place is usually one that maintains the cluster order.
> +  * ----------------
> +  */
> + BlockNumber
> + index_suggestblock(Relation indexRelation,
> +                                Datum *values,
> +                                bool *isnull,
> +                                Relation heapRelation)
> + {
> +     FmgrInfo   *procedure;
> + 
> +     RELATION_CHECKS;
> +     GET_REL_PROCEDURE(amsuggestblock);
> + 
> +     /*
> +      * have the am's suggestblock proc do all the work.
> +      */
> +     return DatumGetUInt32(FunctionCall4(procedure,
> +                                                                       PointerGetDatum(indexRelation),
> +                                                                       PointerGetDatum(values),
> +                                                                       PointerGetDatum(isnull),
> +                                                                       PointerGetDatum(heapRelation)));
> + }
> + 
>   /*
>    * index_beginscan - start a scan of an index with amgettuple
>    *
> Index: src/backend/access/nbtree/nbtinsert.c
> ===================================================================
> RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/nbtree/nbtinsert.c,v
> retrieving revision 1.142
> diff -c -r1.142 nbtinsert.c
> *** src/backend/access/nbtree/nbtinsert.c     25 Jul 2006 19:13:00 -0000      1.142
> --- src/backend/access/nbtree/nbtinsert.c     9 Aug 2006 17:51:33 -0000
> ***************
> *** 146,151 ****
> --- 146,221 ----
>   }
>   
>   /*
> +  *  _bt_suggestblock() -- Find the heap block of the closest index tuple.
> +  *
> +  * The logic to find the target should match _bt_doinsert, otherwise
> +  * we'll be making bad suggestions.
> +  */
> + BlockNumber
> + _bt_suggestblock(Relation rel, IndexTuple itup, Relation heapRel)
> + {
> +     int                     natts = rel->rd_rel->relnatts;
> +     OffsetNumber offset;
> +     Page            page;
> +     BTPageOpaque opaque;
> + 
> +     ScanKey         itup_scankey;
> +     BTStack         stack;
> +     Buffer          buf;
> +     IndexTuple      curitup;
> +     BlockNumber suggestion = InvalidBlockNumber;
> + 
> +     /* we need an insertion scan key to do our search, so build one */
> +     itup_scankey = _bt_mkscankey(rel, itup);
> + 
> +     /* find the first page containing this key */
> +     stack = _bt_search(rel, natts, itup_scankey, false, &buf, BT_READ);
> +     if(!BufferIsValid(buf))
> +     {
> +             /* The index was completely empty. No suggestion then. */
> +             return InvalidBlockNumber;
> +     }
> +     /* we don't need the stack, so free it right away */
> +     _bt_freestack(stack);
> + 
> +     page = BufferGetPage(buf);
> +     opaque = (BTPageOpaque) PageGetSpecialPointer(page);
> + 
> +     /* Find the location in the page where the new index tuple would go to. */
> + 
> +     offset = _bt_binsrch(rel, buf, natts, itup_scankey, false);
> +     if (offset > PageGetMaxOffsetNumber(page))
> +     {
> +             /* _bt_binsrch returned a pointer to end-of-page. It means that
> +              * there were no equal items on the page, and the new item should
> +              * be inserted as the last tuple of the page. There could be equal
> +              * items on the next page, however.
> +              *
> +              * At the moment, we just ignore the potential equal items on the
> +              * right, and pretend there aren't any. We could instead walk right
> +              * to the next page to check that, but let's keep it simple for now.
> +              */
> +             offset = OffsetNumberPrev(offset);
> +     }
> +     if(offset < P_FIRSTDATAKEY(opaque))
> +     {
> +             /* We landed on an empty page. We could step left or right until
> +              * we find some items, but let's keep it simple for now. 
> +              */
> +     } else {
> +             /* We're now positioned at the index tuple that we're interested in. */
> + 
> +             curitup = (IndexTuple) PageGetItem(page, PageGetItemId(page, offset));
> +             suggestion = ItemPointerGetBlockNumber(&curitup->t_tid);
> +     }
> + 
> +     _bt_relbuf(rel, buf);
> +     _bt_freeskey(itup_scankey);
> + 
> +     return suggestion;
> + }
> + 
> + /*
>    *  _bt_check_unique() -- Check for violation of unique index constraint
>    *
>    * Returns InvalidTransactionId if there is no conflict, else an xact ID
> Index: src/backend/access/nbtree/nbtree.c
> ===================================================================
> RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/access/nbtree/nbtree.c,v
> retrieving revision 1.149
> diff -c -r1.149 nbtree.c
> *** src/backend/access/nbtree/nbtree.c        10 May 2006 23:18:39 -0000      1.149
> --- src/backend/access/nbtree/nbtree.c        9 Aug 2006 18:04:02 -0000
> ***************
> *** 228,233 ****
> --- 228,265 ----
>   }
>   
>   /*
> +  *  btsuggestblock() -- find the best place in the heap to put a new tuple.
> +  *
> +  *          This uses the same logic as btinsert to find the place where the index
> +  *          tuple would go if this was a btinsert call.
> +  *
> +  *          There's room for improvement here. An insert operation will descend
> +  *          the tree twice, first by btsuggestblock, then by btinsert. Things
> +  *          might have changed in between, so that the heap tuple is actually
> +  *          not inserted in the optimal page, but since this is just an
> +  *          optimization, it's ok if it happens sometimes.
> +  */
> + Datum
> + btsuggestblock(PG_FUNCTION_ARGS)
> + {
> +     Relation        rel = (Relation) PG_GETARG_POINTER(0);
> +     Datum      *values = (Datum *) PG_GETARG_POINTER(1);
> +     bool       *isnull = (bool *) PG_GETARG_POINTER(2);
> +     Relation        heapRel = (Relation) PG_GETARG_POINTER(3);
> +     IndexTuple      itup;
> +     BlockNumber suggestion;
> + 
> +     /* generate an index tuple */
> +     itup = index_form_tuple(RelationGetDescr(rel), values, isnull);
> + 
> +     suggestion = _bt_suggestblock(rel, itup, heapRel);
> + 
> +     pfree(itup);
> + 
> +     PG_RETURN_UINT32(suggestion);
> + }
> + 
> + /*
>    *  btgettuple() -- Get the next tuple in the scan.
>    */
>   Datum
> Index: src/backend/executor/execMain.c
> ===================================================================
> RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/executor/execMain.c,v
> retrieving revision 1.277
> diff -c -r1.277 execMain.c
> *** src/backend/executor/execMain.c   31 Jul 2006 01:16:37 -0000      1.277
> --- src/backend/executor/execMain.c   8 Aug 2006 16:17:21 -0000
> ***************
> *** 892,897 ****
> --- 892,898 ----
>       resultRelInfo->ri_RangeTableIndex = resultRelationIndex;
>       resultRelInfo->ri_RelationDesc = resultRelationDesc;
>       resultRelInfo->ri_NumIndices = 0;
> +     resultRelInfo->ri_ClusterIndex = -1;
>       resultRelInfo->ri_IndexRelationDescs = NULL;
>       resultRelInfo->ri_IndexRelationInfo = NULL;
>       /* make a copy so as not to depend on relcache info not changing... */
> ***************
> *** 1388,1394 ****
>               heap_insert(estate->es_into_relation_descriptor, tuple,
>                                       estate->es_snapshot->curcid,
>                                       estate->es_into_relation_use_wal,
> !                                     false);         /* never any point in using FSM */
>               /* we know there are no indexes to update */
>               heap_freetuple(tuple);
>               IncrAppended();
> --- 1389,1396 ----
>               heap_insert(estate->es_into_relation_descriptor, tuple,
>                                       estate->es_snapshot->curcid,
>                                       estate->es_into_relation_use_wal,
> !                                     false, /* never any point in using FSM */
> !                                     InvalidBlockNumber);
>               /* we know there are no indexes to update */
>               heap_freetuple(tuple);
>               IncrAppended();
> ***************
> *** 1419,1424 ****
> --- 1421,1427 ----
>       ResultRelInfo *resultRelInfo;
>       Relation        resultRelationDesc;
>       Oid                     newId;
> +     BlockNumber suggestedBlock;
>   
>       /*
>        * get the heap tuple out of the tuple table slot, making sure we have a
> ***************
> *** 1467,1472 ****
> --- 1470,1479 ----
>       if (resultRelationDesc->rd_att->constr)
>               ExecConstraints(resultRelInfo, slot, estate);
>   
> +     /* Ask the index am of the clustered index for the 
> +      * best place to put it */
> +     suggestedBlock = ExecSuggestBlock(slot, estate);
> + 
>       /*
>        * insert the tuple
>        *
> ***************
> *** 1475,1481 ****
>        */
>       newId = heap_insert(resultRelationDesc, tuple,
>                                               estate->es_snapshot->curcid,
> !                                             true, true);
>   
>       IncrAppended();
>       (estate->es_processed)++;
> --- 1482,1488 ----
>        */
>       newId = heap_insert(resultRelationDesc, tuple,
>                                               estate->es_snapshot->curcid,
> !                                             true, true, suggestedBlock);
>   
>       IncrAppended();
>       (estate->es_processed)++;
> Index: src/backend/executor/execUtils.c
> ===================================================================
> RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/executor/execUtils.c,v
> retrieving revision 1.139
> diff -c -r1.139 execUtils.c
> *** src/backend/executor/execUtils.c  4 Aug 2006 21:33:36 -0000       1.139
> --- src/backend/executor/execUtils.c  9 Aug 2006 18:05:05 -0000
> ***************
> *** 31,36 ****
> --- 31,37 ----
>    *          ExecOpenIndices                 \
>    *          ExecCloseIndices                 | referenced by InitPlan, EndPlan,
>    *          ExecInsertIndexTuples   /  ExecInsert, ExecUpdate
> +  *          ExecSuggestBlock                Referenced by ExecInsert
>    *
>    *          RegisterExprContextCallback    Register function shutdown callback
>    *          UnregisterExprContextCallback  Deregister function shutdown callback
> ***************
> *** 874,879 ****
> --- 875,881 ----
>       IndexInfo **indexInfoArray;
>   
>       resultRelInfo->ri_NumIndices = 0;
> +     resultRelInfo->ri_ClusterIndex = -1;
>   
>       /* fast path if no indexes */
>       if (!RelationGetForm(resultRelation)->relhasindex)
> ***************
> *** 913,918 ****
> --- 915,925 ----
>               /* extract index key information from the index's pg_index info */
>               ii = BuildIndexInfo(indexDesc);
>   
> +             /* Remember which index is the clustered one.
> +              * It's used to call the suggestblock-method on inserts */
> +             if(indexDesc->rd_index->indisclustered)
> +                     resultRelInfo->ri_ClusterIndex = i;
> + 
>               relationDescs[i] = indexDesc;
>               indexInfoArray[i] = ii;
>               i++;
> ***************
> *** 1062,1067 ****
> --- 1069,1137 ----
>       }
>   }
>   
> + /* ----------------------------------------------------------------
> +  *          ExecSuggestBlock
> +  *
> +  *          This routine asks the index am where a new heap tuple
> +  *          should be placed.
> +  * ----------------------------------------------------------------
> +  */
> + BlockNumber
> + ExecSuggestBlock(TupleTableSlot *slot,
> +                              EState *estate)
> + {
> +     ResultRelInfo *resultRelInfo;
> +     int                     i;
> +     Relation        relationDesc;
> +     Relation        heapRelation;
> +     ExprContext *econtext;
> +     Datum           values[INDEX_MAX_KEYS];
> +     bool            isnull[INDEX_MAX_KEYS];
> +     IndexInfo  *indexInfo;
> + 
> +     /*
> +      * Get information from the result relation info structure.
> +      */
> +     resultRelInfo = estate->es_result_relation_info;
> +     i = resultRelInfo->ri_ClusterIndex;
> +     if(i == -1)
> +             return InvalidBlockNumber; /* there was no clustered index */
> + 
> +     heapRelation = resultRelInfo->ri_RelationDesc;
> +     relationDesc = resultRelInfo->ri_IndexRelationDescs[i];
> +     indexInfo = resultRelInfo->ri_IndexRelationInfo[i];
> + 
> +     /* You can't cluster on a partial index */
> +     Assert(indexInfo->ii_Predicate == NIL);
> + 
> +     /*
> +      * We will use the EState's per-tuple context for evaluating 
> +      * index expressions (creating it if it's not already there).
> +      */
> +     econtext = GetPerTupleExprContext(estate);
> + 
> +     /* Arrange for econtext's scan tuple to be the tuple under test */
> +     econtext->ecxt_scantuple = slot;
> + 
> +     /*
> +      * FormIndexDatum fills in its values and isnull parameters with the
> +      * appropriate values for the column(s) of the index.
> +      */
> +     FormIndexDatum(indexInfo,
> +                                slot,
> +                                estate,
> +                                values,
> +                                isnull);
> + 
> +     /*
> +      * The index AM does the rest.
> +      */
> +     return index_suggestblock(relationDesc, /* index relation */
> +                              values,        /* array of index Datums */
> +                              isnull,        /* null flags */
> +                              heapRelation);
> + }
> + 
>   /*
>    * UpdateChangedParamSet
>    *          Add changed parameters to a plan node's chgParam set
> Index: src/include/access/genam.h
> ===================================================================
> RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/include/access/genam.h,v
> retrieving revision 1.65
> diff -c -r1.65 genam.h
> *** src/include/access/genam.h        31 Jul 2006 20:09:05 -0000      1.65
> --- src/include/access/genam.h        9 Aug 2006 17:53:44 -0000
> ***************
> *** 93,98 ****
> --- 93,101 ----
>                        ItemPointer heap_t_ctid,
>                        Relation heapRelation,
>                        bool check_uniqueness);
> + extern BlockNumber index_suggestblock(Relation indexRelation,
> +                      Datum *values, bool *isnull,
> +                      Relation heapRelation);
>   
>   extern IndexScanDesc index_beginscan(Relation heapRelation,
>                               Relation indexRelation,
> ***************
> *** 123,128 ****
> --- 126,133 ----
>   extern FmgrInfo *index_getprocinfo(Relation irel, AttrNumber attnum,
>                                 uint16 procnum);
>   
> + extern Datum dummysuggestblock(PG_FUNCTION_ARGS);
> + 
>   /*
>    * index access method support routines (in genam.c)
>    */
> Index: src/include/access/heapam.h
> ===================================================================
> RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/include/access/heapam.h,v
> retrieving revision 1.114
> diff -c -r1.114 heapam.h
> *** src/include/access/heapam.h       3 Jul 2006 22:45:39 -0000       1.114
> --- src/include/access/heapam.h       8 Aug 2006 16:17:21 -0000
> ***************
> *** 156,162 ****
>   extern void setLastTid(const ItemPointer tid);
>   
>   extern Oid heap_insert(Relation relation, HeapTuple tup, CommandId cid,
> !                     bool use_wal, bool use_fsm);
>   extern HTSU_Result heap_delete(Relation relation, ItemPointer tid,
>                       ItemPointer ctid, TransactionId *update_xmax,
>                       CommandId cid, Snapshot crosscheck, bool wait);
> --- 156,162 ----
>   extern void setLastTid(const ItemPointer tid);
>   
>   extern Oid heap_insert(Relation relation, HeapTuple tup, CommandId cid,
> !                     bool use_wal, bool use_fsm, BlockNumber suggestedblk);
>   extern HTSU_Result heap_delete(Relation relation, ItemPointer tid,
>                       ItemPointer ctid, TransactionId *update_xmax,
>                       CommandId cid, Snapshot crosscheck, bool wait);
> Index: src/include/access/hio.h
> ===================================================================
> RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/include/access/hio.h,v
> retrieving revision 1.32
> diff -c -r1.32 hio.h
> *** src/include/access/hio.h  13 Jul 2006 17:47:01 -0000      1.32
> --- src/include/access/hio.h  8 Aug 2006 16:17:21 -0000
> ***************
> *** 21,26 ****
>   extern void RelationPutHeapTuple(Relation relation, Buffer buffer,
>                                        HeapTuple tuple);
>   extern Buffer RelationGetBufferForTuple(Relation relation, Size len,
> !                                     Buffer otherBuffer, bool use_fsm);
>   
>   #endif   /* HIO_H */
> --- 21,26 ----
>   extern void RelationPutHeapTuple(Relation relation, Buffer buffer,
>                                        HeapTuple tuple);
>   extern Buffer RelationGetBufferForTuple(Relation relation, Size len,
> !                                     Buffer otherBuffer, bool use_fsm, BlockNumber suggestedblk);
>   
>   #endif   /* HIO_H */
> Index: src/include/access/nbtree.h
> ===================================================================
> RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/include/access/nbtree.h,v
> retrieving revision 1.103
> diff -c -r1.103 nbtree.h
> *** src/include/access/nbtree.h       7 Aug 2006 16:57:57 -0000       1.103
> --- src/include/access/nbtree.h       8 Aug 2006 16:17:21 -0000
> ***************
> *** 467,472 ****
> --- 467,473 ----
>   extern Datum btbulkdelete(PG_FUNCTION_ARGS);
>   extern Datum btvacuumcleanup(PG_FUNCTION_ARGS);
>   extern Datum btoptions(PG_FUNCTION_ARGS);
> + extern Datum btsuggestblock(PG_FUNCTION_ARGS);
>   
>   /*
>    * prototypes for functions in nbtinsert.c
> ***************
> *** 476,481 ****
> --- 477,484 ----
>   extern Buffer _bt_getstackbuf(Relation rel, BTStack stack, int access);
>   extern void _bt_insert_parent(Relation rel, Buffer buf, Buffer rbuf,
>                                 BTStack stack, bool is_root, bool is_only);
> + extern BlockNumber _bt_suggestblock(Relation rel, IndexTuple itup,
> +                      Relation heapRel);
>   
>   /*
>    * prototypes for functions in nbtpage.c
> Index: src/include/catalog/pg_am.h
> ===================================================================
> RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/include/catalog/pg_am.h,v
> retrieving revision 1.46
> diff -c -r1.46 pg_am.h
> *** src/include/catalog/pg_am.h       31 Jul 2006 20:09:05 -0000      1.46
> --- src/include/catalog/pg_am.h       8 Aug 2006 16:17:21 -0000
> ***************
> *** 65,70 ****
> --- 65,71 ----
>       regproc         amvacuumcleanup;        /* post-VACUUM cleanup function */
>       regproc         amcostestimate; /* estimate cost of an indexscan */
>       regproc         amoptions;              /* parse AM-specific parameters */
> +     regproc         amsuggestblock; /* suggest a block where to put heap tuple */
>   } FormData_pg_am;
>   
>   /* ----------------
> ***************
> *** 78,84 ****
>    *          compiler constants for pg_am
>    * ----------------
>    */
> ! #define Natts_pg_am                                         23
>   #define Anum_pg_am_amname                           1
>   #define Anum_pg_am_amstrategies                     2
>   #define Anum_pg_am_amsupport                        3
> --- 79,85 ----
>    *          compiler constants for pg_am
>    * ----------------
>    */
> ! #define Natts_pg_am                                         24
>   #define Anum_pg_am_amname                           1
>   #define Anum_pg_am_amstrategies                     2
>   #define Anum_pg_am_amsupport                        3
> ***************
> *** 102,123 ****
>   #define Anum_pg_am_amvacuumcleanup          21
>   #define Anum_pg_am_amcostestimate           22
>   #define Anum_pg_am_amoptions                        23
>   
>   /* ----------------
>    *          initial contents of pg_am
>    * ----------------
>    */
>   
> ! DATA(insert OID = 403 (  btree      5 1 1 t t t t f t btinsert btbeginscan btgettuple btgetmulti btrescan btendscan btmarkpos btrestrpos btbuild btbulkdelete btvacuumcleanup btcostestimate btoptions ));
>   DESCR("b-tree index access method");
>   #define BTREE_AM_OID 403
> ! DATA(insert OID = 405 (  hash       1 1 0 f f f f f f hashinsert hashbeginscan hashgettuple hashgetmulti hashrescan hashendscan hashmarkpos hashrestrpos hashbuild hashbulkdelete hashvacuumcleanup hashcostestimate hashoptions ));
>   DESCR("hash index access method");
>   #define HASH_AM_OID 405
> ! DATA(insert OID = 783 (  gist       100 7 0 f t t t t t gistinsert gistbeginscan gistgettuple gistgetmulti gistrescan gistendscan gistmarkpos gistrestrpos gistbuild gistbulkdelete gistvacuumcleanup gistcostestimate gistoptions ));
>   DESCR("GiST index access method");
>   #define GIST_AM_OID 783
> ! DATA(insert OID = 2742 (  gin       100 4 0 f f f f t f gininsert ginbeginscan gingettuple gingetmulti ginrescan ginendscan ginmarkpos ginrestrpos ginbuild ginbulkdelete ginvacuumcleanup gincostestimate ginoptions ));
>   DESCR("GIN index access method");
>   #define GIN_AM_OID 2742
>   
> --- 103,125 ----
>   #define Anum_pg_am_amvacuumcleanup          21
>   #define Anum_pg_am_amcostestimate           22
>   #define Anum_pg_am_amoptions                        23
> + #define Anum_pg_am_amsuggestblock           24
>   
>   /* ----------------
>    *          initial contents of pg_am
>    * ----------------
>    */
>   
> ! DATA(insert OID = 403 (  btree      5 1 1 t t t t f t btinsert btbeginscan btgettuple btgetmulti btrescan btendscan btmarkpos btrestrpos btbuild btbulkdelete btvacuumcleanup btcostestimate btoptions btsuggestblock));
>   DESCR("b-tree index access method");
>   #define BTREE_AM_OID 403
> ! DATA(insert OID = 405 (  hash       1 1 0 f f f f f f hashinsert hashbeginscan hashgettuple hashgetmulti hashrescan hashendscan hashmarkpos hashrestrpos hashbuild hashbulkdelete hashvacuumcleanup hashcostestimate hashoptions dummysuggestblock));
>   DESCR("hash index access method");
>   #define HASH_AM_OID 405
> ! DATA(insert OID = 783 (  gist       100 7 0 f t t t t t gistinsert gistbeginscan gistgettuple gistgetmulti gistrescan gistendscan gistmarkpos gistrestrpos gistbuild gistbulkdelete gistvacuumcleanup gistcostestimate gistoptions dummysuggestblock));
>   DESCR("GiST index access method");
>   #define GIST_AM_OID 783
> ! DATA(insert OID = 2742 (  gin       100 4 0 f f f f t f gininsert ginbeginscan gingettuple gingetmulti ginrescan ginendscan ginmarkpos ginrestrpos ginbuild ginbulkdelete ginvacuumcleanup gincostestimate ginoptions dummysuggestblock ));
>   DESCR("GIN index access method");
>   #define GIN_AM_OID 2742
>   
> Index: src/include/catalog/pg_proc.h
> ===================================================================
> RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/include/catalog/pg_proc.h,v
> retrieving revision 1.420
> diff -c -r1.420 pg_proc.h
> *** src/include/catalog/pg_proc.h     6 Aug 2006 03:53:44 -0000       1.420
> --- src/include/catalog/pg_proc.h     9 Aug 2006 18:06:44 -0000
> ***************
> *** 682,687 ****
> --- 682,689 ----
>   DESCR("btree(internal)");
>   DATA(insert OID = 2785 (  btoptions            PGNSP PGUID 12 f f t f s 2 17 "1009 16" _null_ _null_ _null_  btoptions - _null_ ));
>   DESCR("btree(internal)");
> + DATA(insert OID = 2852 (  btsuggestblock   PGNSP PGUID 12 f f t f v 4 23 "2281 2281 2281 2281" _null_ _null_ _null_ btsuggestblock - _null_ ));
> + DESCR("btree(internal)");
>   
>   DATA(insert OID = 339 (  poly_same             PGNSP PGUID 12 f f t f i 2 16 "604 604" _null_ _null_ _null_ poly_same - _null_ ));
>   DESCR("same as?");
> ***************
> *** 3936,3941 ****
> --- 3938,3946 ----
>   DATA(insert OID = 2749 (  arraycontained       PGNSP PGUID 12 f f t f i 2 16 "2277 2277" _null_ _null_ _null_ arraycontained - _null_ ));
>   DESCR("anyarray contained");
>   
> + DATA(insert OID = 2853 (  dummysuggestblock   PGNSP PGUID 12 f f t f v 4 23 "2281 2281 2281 2281" _null_ _null_ _null_      dummysuggestblock - _null_ ));
> + DESCR("dummy amsuggestblock implementation (internal)");
> + 
>   /*
>    * Symbolic values for provolatile column: these indicate whether the result
>    * of a function is dependent *only* on the values of its explicit arguments,
> Index: src/include/executor/executor.h
> ===================================================================
> RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/include/executor/executor.h,v
> retrieving revision 1.128
> diff -c -r1.128 executor.h
> *** src/include/executor/executor.h   4 Aug 2006 21:33:36 -0000       1.128
> --- src/include/executor/executor.h   8 Aug 2006 16:17:21 -0000
> ***************
> *** 271,276 ****
> --- 271,277 ----
>   extern void ExecCloseIndices(ResultRelInfo *resultRelInfo);
>   extern void ExecInsertIndexTuples(TupleTableSlot *slot, ItemPointer tupleid,
>                                         EState *estate, bool is_vacuum);
> + extern BlockNumber ExecSuggestBlock(TupleTableSlot *slot, EState *estate);
>   
>   extern void RegisterExprContextCallback(ExprContext *econtext,
>                                                       ExprContextCallbackFunction function,
> Index: src/include/nodes/execnodes.h
> ===================================================================
> RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/include/nodes/execnodes.h,v
> retrieving revision 1.158
> diff -c -r1.158 execnodes.h
> *** src/include/nodes/execnodes.h     4 Aug 2006 21:33:36 -0000       1.158
> --- src/include/nodes/execnodes.h     8 Aug 2006 16:17:21 -0000
> ***************
> *** 257,262 ****
> --- 257,264 ----
>    *          NumIndices                              # of indices existing on result relation
>    *          IndexRelationDescs              array of relation descriptors for indices
>    *          IndexRelationInfo               array of key/attr info for indices
> +  *          ClusterIndex                    index to the IndexRelationInfo array of the
> +  *                                                          clustered index, or -1 if there's none
>    *          TrigDesc                                triggers to be fired, if any
>    *          TrigFunctions                   cached lookup info for trigger functions
>    *          TrigInstrument                  optional runtime measurements for triggers
> ***************
> *** 272,277 ****
> --- 274,280 ----
>       int                     ri_NumIndices;
>       RelationPtr ri_IndexRelationDescs;
>       IndexInfo **ri_IndexRelationInfo;
> +     int         ri_ClusterIndex;
>       TriggerDesc *ri_TrigDesc;
>       FmgrInfo   *ri_TrigFunctions;
>       struct Instrumentation *ri_TrigInstrument;
> Index: src/include/utils/rel.h
> ===================================================================
> RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/include/utils/rel.h,v
> retrieving revision 1.91
> diff -c -r1.91 rel.h
> *** src/include/utils/rel.h   3 Jul 2006 22:45:41 -0000       1.91
> --- src/include/utils/rel.h   8 Aug 2006 16:17:21 -0000
> ***************
> *** 116,121 ****
> --- 116,122 ----
>       FmgrInfo        amvacuumcleanup;
>       FmgrInfo        amcostestimate;
>       FmgrInfo        amoptions;
> +     FmgrInfo        amsuggestblock;
>   } RelationAmInfo;
>   
>   

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
