On 10/4/03 2:00 AM, "Hannu Krosing" <[EMAIL PROTECTED]> wrote: > > If the WHERE clause could use the same index (or any index with > visibility info) then there would be no need for "walking through the > tuples" in data relation. > > the typical usecase cited on [HACKERS] was time series data, where > inserts are roughly in (timestamp,id)order but queries in (id,timestamp) > order. Now if the index would include all relevant fields > (id,timestamp,data1,data2,...,dataN) then the query could run on index > only touching just a few pages and thus vastly improving performance. I > agree that this is not something everybody needs, but when it is needed > it is needed bad.
I would add that automatically index-organizing tuples isn't just useful for time-series data (though it is a good example), but can be used to substantially improve the query performance of any really large table in a number of different and not always direct ways. Once working sets routinely exceed the size of physical RAM, buffer access/utilization efficiency often becomes the center of performance tuning, but not one that many people know much about. One of the less direct ways of using btree-organized tables for improving scalability is to "materialize" table indexes of tables that *shouldn't* be btree-organized. Not only can you turn tables into indexes, but you can also turn indexes into tables, which can have advantages in some cases. For example, I did some scalability consulting at a well-known movie rental company with some very large Oracle databases running on big Sun boxen. One of the biggest problems was that their rental history table, which had a detailed record of every movie ever rented by every customer, had grown so large that the performance was getting painfully slow. To make matters worse, it and a few related tables had high concurrent usage, a mixture of many performance-sensitive queries grabbing windows of a customer's history plus a few broader OLAP queries which were not time sensitive. Everything was technically optimized in a relational and basic configuration sense, and the database guys at the company were at a loss on how to fix this problem. Performance of all queries was essentially bound by how fast pages could be moved between the disk and buffers. Issue #1: The history rows had quite a lot of columns and the OLAP processes used non-primary indexes, so the table was not particularly suitable for btree-organizing. Issue #2: Partitioning was not an option because it would have exceeded certain limits in Oracle (at that time, I don't know if that has changed). Issue #3: Although customer histories were being constantly queried, data needed most was really an index view of the customers history, not the details of the history itself. The solution I came up with was to use a synced btree-organized partial clone of the main history table that only contained a small number of key columns that mattered for generating customer history indexes in the applications that used them. While this substantially increased the disk space footprint for the same data (since we were cloning it), it greatly reduced the total number of cache misses for the typical query, only fetching the full history row pages when actually needed. In other words, basically concentrating more buffer traffic into a smaller number of page buffers. What we had was an exceedingly active but relatively compact materialized index of the history table that could essentially stay resident in RAM, and a much less active history table+indexes that while less likely to be buffered than before, had pages accessed at such a reduced frequency that there was a huge net performance gain because disk access plummeted. Average performance improvement for the time sensitive queries: 50-70x So btree-organized tables can do more than make tables behave like indexes. They can also make indexes behave like tables. Both are very useful in some cases when your working set exceeds the physical buffer space. For smaller databases this has much less utility and users need to understand the limitations, nonetheless when tables and databases get really big it becomes an important tool in the tool belt. Cheers, -James Rogers [EMAIL PROTECTED] ---------------------------(end of broadcast)--------------------------- TIP 8: explain analyze is your friend