> > 98304 22.07 5545984
> > 196608 45.60 11141120
> > 393216 92.53 22290432
> >
> > I tried probabilities from 0.67 to 0.999 and found that runtimes
> > didn't vary much (though this range is near the minimum), while index
> > size consistently grew as the probability of moving right decreased.
> > The runtime is nicely linear throughout the range.
>
> That looks brilliant!! (Bearing in mind that I have over 10 million
> tuples in my table, you can imagine what performance was like for me!)
> Is there any chance you could generate a patch against released 7.0.2
> to add just this functionality... It would be the kiss of life for my
> code!
>
> (Not in a hurry, I'm not back in work until Wednesday, as it happens)
>
> And, of course, what would /really/ get my code going speedily would
> be the partial indices mentioned elsewhere in this thread. If the
> backend could automagically drop keys containing > 10% (tunable) of
> the rows from the index, then my index would be (a) about 70% smaller,
> and (b) only used when it's faster. [This means it would have to
> maintain some simple histogram data, but I can't see that being much
> of an overhead.]
>
> For the short term, if I can get a working version of the above
> randomisation patch, I think I shall 'fake' a partial index by
> manually setting 'enable_seqscan=off' for all but the 4 or 5 most
> common categories. Those two factors combined will speed up my bulk
> inserts a lot.
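
The workaround described above could be sketched as follows. The table
and column names (`items`, `category`) and the category values are
illustrative assumptions, not taken from the thread:

```sql
-- For the handful of very common categories, let the planner
-- choose a sequential scan as usual.
SELECT * FROM items WHERE category = 'common_value';

-- For the many rare categories, disable sequential scans for the
-- current session so the planner falls back to the index.
SET enable_seqscan = off;
SELECT * FROM items WHERE category = 'rare_value';
SET enable_seqscan = on;
```

Since `enable_seqscan` is a session-level setting, the application has
to toggle it around each query depending on which category it is about
to look up.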
What would be really nifty is to take the most common value found by
VACUUM ANALYZE and force a sequential scan for lookups of that value
when it represents more than 50% of the entries in the table.
Added to TODO:
* Prevent index lookups (or index entries using partial index) on most
common values; instead use sequential scan
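
A sketch of what this TODO item amounts to, using features that arrived
in later PostgreSQL releases (the `pg_stats` view for per-column
statistics, and partial indexes in 7.2). The table and column names
(`items`, `category`) and the excluded values are hypothetical:

```sql
-- Inspect the planner's notion of the most common values and their
-- frequencies for the indexed column (collected by VACUUM ANALYZE).
SELECT most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'items' AND attname = 'category';

-- A partial index can then exclude the dominant values explicitly,
-- so the index stays small and is only consulted for the rare
-- categories where it actually wins.
CREATE INDEX items_category_rare_idx
    ON items (category)
    WHERE category NOT IN ('common_a', 'common_b');
```

With such an index, a lookup on a common value naturally falls back to
a sequential scan, which is the behaviour the TODO item asks for.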
--
Bruce Momjian | http://candle.pha.pa.us
[EMAIL PROTECTED] | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026