On Tue, Sep 30, 2003 at 11:31:26PM -0700, James Rogers wrote: > The problem: My working set is typically several million rows (and growing) > at any one time, which has a tendency to thrash the buffers mercilessly. > Records are inserted in an order that does not reflect typical retrieval > such that a typical query has to hit many, many blocks to collect all the > rows in a given range query. CLUSTER isn't cutting it (more below). > Definitely sub-optimal.
I have a situation that is very similar to yours, and my problems are very similar to yours. I hope some of your ideas are implementable in some way, because I think they would solve some of my problems too. > 1.) B-tree organized tables. The primary key index also contains the > entire row. I think this is called a clustered index on some other database systems. Basically you want to replace the content of the btree item with the whole tuple, instead of the pointer to the heap element which contains the tuple. One thing to keep in mind is that index tuples are limited to BLCKSZ/3 IIRC, so this limits the size of tuples that can be put in such a "table". TOASTing may help alleviate this problem. As for other indexes, I'm not sure why you say this precludes the use of other indexes. The only thing they have to do is keep pointers to index elements, instead of heap elements. Doesn't sound impossible to me. Another thing to keep in mind: L&Y requires unique keys. In the current btree implementation this is worked around using the pointers-to-heap (ctid?) to differentiate items that have the same key. If you use a clustered index you won't have this pointer; maybe it will be required that this index is marked UNIQUE. This may not be a too onerous restriction, given that the index is a primary key after all. I don't have anything to say about 2). -- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>) "No deja de ser humillante para una persona de ingenio saber que no hay tonto que no le pueda enseņar algo." (Jean B. Say) ---------------------------(end of broadcast)--------------------------- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match