Hello Yuzuko, On 2020-Dec-02, yuzuko wrote:
> The problem Horiguchi-san mentioned is as follows: > [explanation] Hmm, I see. So the problem is that if some ancestor is analyzed first, then analyze of one of its partition will cause a redundant analyze of the ancestor, because the number of tuples that is propagated from the partition represents a set that had already been included in the ancestor's analysis. If the problem was just that, then I think it would be very simple to solve: just make sure to sort the tables to vacuum so that all leaves are vacuumed first, and then all ancestors, sorted from the bottom up. Problem solved. But I'm not sure that that's the whole story, for two reasons: one, two workers can run simultaneously, where one analyzes the partition and the other analyzes the ancestor. Then the order is not guaranteed (and each process will get no effect from remembering whether it did that one or not). Second, manual analyzes can occur in any order. Maybe it's more useful to think about this in terms of rememebering that partition P had changed_tuples set to N when we analyzed ancestor A. Then, when we analyze partition P, we send the message listing A as ancestor; on receipt of that message, we see M+N changed tuples in P, but we know that we had already seen N, so we only record M. I'm not sure how to implement this idea however, since on analyze of ancestor A we don't have the list of partitions, so we can't know the N for each partition.