On Thu, Dec 5, 2019 at 1:41 AM Robert Haas <robertmh...@gmail.com> wrote:
>
> On Mon, Dec 2, 2019 at 2:26 PM Masahiko Sawada
> <masahiko.saw...@2ndquadrant.com> wrote:
> > It's just an example, I'm not saying your idea is bad. ISTM the idea
> > is good on an assumption that all indexes take the same time or take a
> > long time, so I'd also like to consider whether this is true in
> > production and which approach is better if we don't have such an
> > assumption.
>
> I think his idea is good. You're not wrong when you say that there are
> cases where it could work out badly, but I think on the whole it's a
> clear improvement. Generally, the indexes should be of relatively
> similar size because index size is driven by table size; it's true
> that different AMs could result in different-size indexes, but it
> seems like a stretch to suppose that the indexes that don't support
> parallelism are also going to be the little tiny ones that go fast
> anyway, unless we have some evidence that this is really true. I also
> wonder whether we really need the option to disable parallel vacuum in
> the first place.
>
I think it could be required for cases where the AM doesn't have a way
(or it is difficult to come up with a way) to communicate the stats
allocated by the first ambulkdelete call to the subsequent ones until
amvacuumcleanup. Currently, we have such a case for the gist index; see
email thread [1]. Though we have come up with a way to avoid that for
gist indexes, I am not sure we can assume it is the case for every
possible index AM, especially when there is a provision for an index AM
to keep additional stats information. In the worst case, if we have to
modify some existing index AM as we did for the gist index, we need such
a provision so that it is possible. Ideally, the index AM should provide
a way to copy such stats, but we can't assume that, so we came up with
this option. We have used this for dummy_index_am, which also provides a
way to test it.

> Maybe I'm not looking in the right place, but I'm not
> finding anything in the way of comments or documentation explaining
> why some AMs don't support it. It's an important design point, and
> should be documented.
>

Agreed. We should do this.

> I also think PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION seems like a
> waste of space. For parallel queries, there is a trade-off between
> having the leader do work (which can speed up the query) and having it
> remain idle so that it can immediately absorb tuples from workers and
> keep them from having their tuple queues fill up (which can speed up
> the query). But here, at least as I understand it, there's no such
> trade-off. Having the leader fail to participate is just a loser.
> Maybe it's useful to test while debugging the patch,
>

Yeah, it is primarily a debugging/testing aid and it helped us in
discovering some issues. During development, it is being used for
testing purposes as well. This is the reason the code is under #ifdef.

> but why should
> the committed code support it?
>

I am also not sure whether we should commit this part of the code, and
that is why I said in one of the above emails that we should keep it as
a separate patch. We can later decide whether to commit this code. Now,
the point in its favor is that we already have a similar define
(DISABLE_LEADER_PARTICIPATION) for parallel create index, so having it
here is not a bad idea. I think it might help us in debugging some bugs
where we want to force the index to be vacuumed by some worker. We might
want to have something like force_parallel_mode for testing/debugging
purposes, but I am not sure which is better. I think having something as
a debugging aid for such features is good.

> To respond to another point from a different part of the email chain,
> the reason why LaunchParallelWorkers() does not take an argument for
> the number of workers is because I believed that the caller should
> always know how many workers they're going to want at the time they
> CreateParallelContext(). Increasing it later is not possible, because
> the DSM has already been sized based on the count provided. I grant
> that it would be possible to allow the number to be reduced later, but
> why would you want to do that? Why not get the number right when first
> creating the DSM?
>

Here, we have a need to reduce the number of workers. Index vacuum has
two different phases (index vacuum and index cleanup) which use the same
parallel context/DSM, but the two could have different requirements for
workers. The second phase (cleanup) would normally need fewer workers
because, if the work was done in the first phase, the second doesn't
need to repeat it; but we have exceptions like gin indexes, which need
workers for the second phase as well because gin makes a pass over the
index again even if we have cleaned the index in the first phase. Now,
consider the case where we have 3 btree indexes and 2 gin indexes: we
would need 5 workers for the index vacuum phase and 2 workers for the
index cleanup phase. There are other cases too.
We also considered having a separate DSM for each phase, but that
appeared to add overhead without much benefit.

> Is there any legitimate use case for parallel vacuum in combination
> with vacuum cost delay?
>

Yeah, we also initially thought that it is not legitimate to use a
parallel vacuum with a cost delay. But to get a wider view, we started a
separate thread [2], and there we reached the conclusion that we need a
solution for throttling [3].

> + * vacuum and for performing index cleanup. Note that all parallel workers
> + * live during either index vacuuming or index cleanup but the leader process
> + * neither exits from the parallel mode nor destroys the parallel context.
> + * For updating the index statistics, since any updates are not allowed during
> + * parallel mode we update the index statistics after exited from the parallel
>
> The first of these sentences ("Note that all...") is not very clear to
> me, and seems like it may amount to a statement that the leader
> doesn't try to destroy the parallel context too early, but since I
> don't understand it, maybe that's not what it is saying.
>

Your understanding is correct. How about if we modify it to something
like: "Note that parallel workers are alive only during index vacuum or
index cleanup, but the leader process neither exits parallel mode nor
destroys the parallel context until the entire parallel operation is
finished." OR something like: "The leader backend holds the parallel
context until both index vacuum and cleanup are finished. Both index
vacuum and cleanup separately perform the work with parallel workers."

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com