On Mon, 2008-10-13 at 08:30 -0400, Tom Lane wrote:
> Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> > No, I was thinking of something along the lines of:
> > INFO:  clustering "public.my_c"
> > INFO:  complete, was 33%, now 100% clustered
> > The only such measure that we have is the correlation, which isn't very 
> > good anyway, so I'm not sure if that's worthwhile.
> 
> It'd be possible to count the number of order reversals during the
> indexscan, ie the number of tuples with CTID lower than the previous
> one's.  But I'm not sure how useful that number really is.  Also it's
> not clear how to preserve such functionality if cluster is
> re-implemented with a sort.
> 

I assume here you mean a CTID with a lower page number, as the line
pointer wouldn't make any difference, right?
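
To make sure we're talking about the same thing, here is a rough sketch of
the count as I understand it. It is only an illustration in Python, not
anything from the backend: the function name and the sample data are made
up, and it assumes each CTID is available as a (page, line pointer) pair in
index-scan order. Only the page number is compared.

```python
def count_page_reversals(ctids):
    """Count tuples whose heap page number is lower than the previous one's.

    ctids: iterable of (page, line_pointer) pairs in index-scan order.
    The line pointer is ignored; only page-level reversals are counted.
    """
    reversals = 0
    prev_page = None
    for page, _line_pointer in ctids:
        if prev_page is not None and page < prev_page:
            reversals += 1
        prev_page = page
    return reversals


# Hypothetical index-scan order over a small table.
scan_order = [(0, 1), (0, 3), (2, 1), (1, 4), (1, 2), (3, 1), (0, 2)]
print(count_page_reversals(scan_order))
```

A fully clustered table would give zero, and a step backwards within the
same page (1,4) -> (1,2) is not counted.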

I think it would be a useful metric for deciding whether or not to use an
index scan (I don't know how easy it is to estimate this from a sample,
but CLUSTER could clearly get an exact number). It would solve the
problem where synchronized scans used by pg_dump could result in poor
correlation on restore, causing the planner not to choose index scans
(which is what prompted turning off sync scans for pg_dump).

Regards,
        Jeff Davis



