On Wed, 1 Feb 2023 at 16:53, Justin Pryzby <pry...@telsasoft.com> wrote: > > On Wed, Feb 01, 2023 at 04:21:35PM +0100, Matthias van de Meent wrote: > > On Wed, 1 Feb 2023 at 15:21, Ilya Gladyshev <ilya.v.gladys...@gmail.com> > > wrote: > > > > 1 февр. 2023 г., в 16:01, Alvaro Herrera <alvhe...@alvh.no-ip.org> > > > > написал(а): > > > > Hmm, count_leaf_partitions has to scan pg_inherits and do a syscache > > > > lookup for every single element therein ... this sounds slow. > > > > > > > > In one of the callsites, we already have the partition descriptor > > > > available. We could just scan partdesc->is_leaf[] and add one for each > > > > 'true' value we see there. > > > > > > The problem is that partdesc contains only direct children of the table > > > and we need all the children down the inheritance tree to count the total > > > number of leaf partitions in the first callsite. > > > > > > > In the other callsite, we had the table open just a few lines before the > > > > place you call count_leaf_partitions. Maybe we can rejigger things by > > > > examining its state before closing it: if relkind is not partitioned we > > > > know leaf_partitions=0, and only if partitioned we count leaf > > > > partitions. > > > > I think that would save some work. I also wonder if it's worth writing > > > > a bespoke function for counting leaf partitions rather than relying on > > > > find_all_inheritors. > > > > > > Sure, added this condition to avoid the extra work here. > > > > > > > > When creating an index on a partitioned table, this column is set > > > to > > > - the total number of partitions on which the index is to be > > > created. > > > + the total number of leaf partitions on which the index is to be > > > created or attached. > > > > I think we should also add a note about the (now) non-constant nature > > of the value, something along the lines of "This value is updated as > > we're processing and discovering partitioned tables in the partition > > hierarchy". > > But the TOTAL is constant, right ? Updating the total when being called > recursively is the problem these patches fix.
If that's the case, then I'm not seeing the 'fix' part of the patch. I thought this patch was fixing the provably incorrect TOTAL value where DONE > TOTAL due to the recursive operation overwriting the DONE/TOTAL values instead of updating them. In HEAD we set TOTAL to whatever number partitioned table we're currently processing has - regardless of whether we're the top level statement. With the patch we instead add the number of child relations to that count, for which REL_HAS_STORAGE(child) -- or at least, in the v3 posted by Ilya. Approximately immediately after updating that count we recurse to the child relations, and that only returns once it is done creating the indexes, so both TOTAL and DONE go up as we process more partitions in the hierarchy. An example hierarchy: CREATE TABLE parent (a, b) partition by list (a); CREATE TABLE a1 PARTITION OF parent FOR VALUES IN (1) PARTITION BY LIST (b); CREATE TABLE a1bd PARTITION OF a1 DEFAULT; CREATE TABLE a2 PARTITION OF parent FOR VALUES IN (2) PARTITION BY LIST (b); CREATE TABLE a2bd PARTITION OF a2 DEFAULT; INSERT INTO parent (a, b) SELECT * from generate_series(1, 2) a(a) cross join generate_series(1, 100000),b(b); CREATE INDEX ON parent(a,b); This will only discover that a2bd will need to be indexed after a1bd is done (or vice versa, depending on which order a1 and a2 are processed in the ). Kind regards, Matthias van de Meent