Hi, Due to bug #17245: [1] I spent a considerably amount of time looking at vacuum related code. And I found a few things that I think could stand improvement:
- There's pretty much no tests. This is way way too complicated feature for that. If there had been tests for the obvious edge case of some indexes being too small to be handled in parallel, but others needing parallelism, the mistake leading to #17245 would have been caught during development. - There should be error check verifying that all indexes have actually been vacuumed. It's way too easy to have bugs leading to index vacuuming being skipped. - The state machine is complicated. It's very unobvious that an index needs to be processed serially by the leader if shared_indstats == NULL. - I'm very confused by the existance of LVShared->bitmap. Why is it worth saving 7 bits per index for something like this (compared to a simple array of bools)? Nor does the naming explain what it's for. The presence of the bitmap requires stuff like SizeOfLVShared(), which accounts for some of the bitmap size, but not all? But even though we have this space optimized bitmap thing, we actually need more memory allocated for each index, making this whole thing pointless. - Imo it's pretty confusing to have functions like lazy_parallel_vacuum_indexes() (in 13, renamed in 14) that "Perform index vacuum or index cleanup with parallel workers.", based on lps->lvshared->for_cleanup. - I don't like some of the new names introduced in 14. E.g. "do_parallel_processing" is way too generic. - On a higher level, a lot of this actually doesn't seem to belong into vacuumlazy.c, but should be somewhere more generic. Pretty much none of this code is heap specific. And vacuumlazy.c is large enough without the parallel code. Greetings, Andres Freund [1] https://postgr.es/m/17245-ddf06aaf85735f36%40postgresql.org