Hello, On Fri, Dec 27, 2019 at 2:02 PM Masahiko Sawada <masahiko.saw...@2ndquadrant.com> wrote: > On Fri, 27 Dec 2019 at 12:37, yuzuko <yuzukohos...@gmail.com> wrote: > > As Laurenz commented in this thread, I tried adding option > > to update parent's statistics during Autovacuum. To do that, > > I propose supporting 'autovacuum_enabled' option already > > exists on partitioned tables. > > > > In the attached patch, you can use 'autovacuum_enabled' option > > on partitioned table as usual, that is, a default value of this option > > is true. So if you don't need autovacuum on a partitioned table, > > you have to specify the option: > > CREATE TABLE p(i int) partition by range(i) with (autovacuum_enabled=0); > > > > I'm not sure but I wonder if a suitable value as a default of > > 'autovacuum_enabled' for partitioned tables might be false. > > Because autovacuum on *partitioned tables* requires scanning > > all children to make partitioned tables' statistics. > > But if the default value varies according to the relation, > > is it confusing? Any thoughts? > > I don't look at the patch deeply yet but your patch seems to attempt > to vacuum on partitioned table. IIUC partitioned tables don't need to > be vacuumed and its all child tables are vacuumed instead if we pass > the partitioned table to vacuum() function. But autovacuum on child > tables is normally triggered since their statistics are updated. > > I think it's a good idea to have that option but I think that doing > autovacuum on the parent table every time when autovacuum is triggered > on one of its child tables is very high cost especially when there are > a lot of child tables. Instead I thought it's more straight forward if > we compare the summation of the statistics of child tables (e.g. > n_live_tuples, n_dead_tuples etc) to vacuum thresholds when we > consider the needs of autovacuum on the parent table. What do you > think?
There's this old email where Tom outlines a few ideas about triggering auto-analyze on inheritance trees: https://www.postgresql.org/message-id/4823.1262132964%40sss.pgh.pa.us If I'm reading that correctly, the idea is to track only changes_since_analyze and none of the finer-grained stats like live/dead tuples for inheritance parents (partitioned tables) using some new pgstat infrastrcture, an idea that Hosoya-san also seems to be considering per an off-list discussion. Besides the complexity of getting that infrastructure in place, an important question is whether the current system of applying threshold and scale factor to changes_since_analyze should be used as-is for inheritance parents (partitioned tables), because if users set those parameters similarly to for regular tables, autovacuum might analyze partitioned tables more than necessary. We'll either need a different formula, or some commentary in the documentation about how partitioned tables might need different setting, or maybe both. By the way, maybe I'm misunderstanding what Sawada-san wrote above, but the only missing piece seems to be a way to trigger an *analyze* on the parent tables -- to collect optimizer statistics for the inheritance trees -- not vacuum, for which the existing system seems enough. Thanks, Amit