Thanks every for your help. I am not familiar with the internal of the vacuum freeze, just curious if there is no row change on the table(in other words, all pages are frozen), why could index page have dead tuple?
is it possible to scan data page first, if all data page are frozen, skipping the index page scan step. Perhaps there is other reason vacuum freeze does index page first, then is it possible to provide a option to skip index page scan step in vacuum freeze command? thanks James ________________________________ 发件人: Robert Haas <robertmh...@gmail.com> 发送时间: 2016年12月1日 13:50:49 收件人: Tom Lane 抄送: xu jian; Masahiko Sawada; pgsql-hackers@postgresql.org 主题: Re: [HACKERS] Re: [HACKERS] 答复: [HACKERS] postgres 1 个(共 2 个) can pg 9.6 vacuum freeze skip page on index? On Thu, Dec 1, 2016 at 1:39 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: > Robert Haas <robertmh...@gmail.com> writes: >> I think that the indexes only need to be scanned if the VACUUM finds >> dead tuples. But even 1 dead tuple will cause a complete scan of >> every index. I've complained about this before and I think there's >> room for improvement here, but nobody's been motivated enough to >> pursue this yet. > > The thing that's been speculated about in the past is having some > threshold larger than 1 on the minimum number of dead tuples needed > to cause a cleanup pass. Agreed. > It wouldn't be hard to implement, if you > could get consensus on what the threshold should be. Also agreed. > I'd think > some algorithm similar to the autovacuum thresholds might be > appropriate. It's not quite clear how this would interact with > HOT pruning, though. What's the relevance of HOT pruning here? I was thinking that the relevant metric might be how many pages contain dead tuples, because what we really want to do to reduce the cost of future vacuuming and future index-only scans is get pages marked all-visible. Say, if less than 2% of the pages in the table contain dead tuples and the space required to store the TIDs is less than 50% of maintenance_work_mem, skip the index scans. The first of those thresholds, at least, would probably need to be configurable, but that kind of idea. The alternative that's been proposed is to do something based on the number of dead tuples but, as somebody pointed out in a previous discussion of this topic, one dead tuple per page throughout the whole table is a LOT worse than same number of dead tuples all on the same pages. You don't want to keep scanning large chunks of the heap because you're too lazy to visit the indexes. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company