On Fri, Dec 2, 2016 at 3:50 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Thu, Dec 1, 2016 at 1:39 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmh...@gmail.com> writes:
>>> I think that the indexes only need to be scanned if the VACUUM finds
>>> dead tuples.  But even 1 dead tuple will cause a complete scan of
>>> every index.  I've complained about this before and I think there's
>>> room for improvement here, but nobody's been motivated enough to
>>> pursue this yet.
>>
>> The thing that's been speculated about in the past is having some
>> threshold larger than 1 on the minimum number of dead tuples needed
>> to cause a cleanup pass.
>
> Agreed.
>
>> It wouldn't be hard to implement, if you
>> could get consensus on what the threshold should be.
>
> Also agreed.
>
>> I'd think
>> some algorithm similar to the autovacuum thresholds might be
>> appropriate.  It's not quite clear how this would interact with
>> HOT pruning, though.
>
> What's the relevance of HOT pruning here?
>
> I was thinking that the relevant metric might be how many pages
> contain dead tuples, because what we really want to do to reduce the
> cost of future vacuuming and future index-only scans is get pages
> marked all-visible.  Say, if less than 2% of the pages in the table
> contain dead tuples and the space required to store the TIDs is less
> than 50% of maintenance_work_mem, skip the index scans.  The first of
> those thresholds, at least, would probably need to be configurable,
> but that kind of idea.

I think that this idea is better. If the number of pages containing
dead tuple is less than threshold (e.g.
vacuum_index_cleanup_scale_factor), we can skip  the cleanup index
scans.
I will write the patch and submit to next CF.

> The alternative that's been proposed is to do something based on the
> number of dead tuples but, as somebody pointed out in a previous
> discussion of this topic, one dead tuple per page throughout the whole
> table is a LOT worse than same number of dead tuples all on the same
> pages.  You don't want to keep scanning large chunks of the heap
> because you're too lazy to visit the indexes.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to