Re: Behavior change in PostgreSQL 14Beta3 or bug?
On Mon, 2021-09-06 at 12:11 -0700, Peter Geoghegan wrote: > On Mon, Sep 6, 2021 at 9:21 AM Laurenz Albe wrote: > > #define BYPASS_THRESHOLD_PAGES 0.02 /* i.e. 2% of rel_pages */ > > > > So up to an additional 2% of all pages can have the all-visible bit > > unset with "index_cleanup = auto". > > > > That is probably not worth worrying, right? > > I don't think it's worth worrying about. > > The bypass-index-vacuuming feature may have had a bit of a messaging > problem. It was something we usually talked about as being about > skipping index vacuuming, because that's what it actually does. > However, the feature isn't really about doing less work during VACUUM. > It's actually about doing *more* work during VACUUM -- more useful > work. Especially setting visibility map bits. But also freezing. Now > you can very aggressively tune VACUUM to do these things more often, > with no fear of that being way too expensive because of index > vacuuming that has only marginal value. That makes sense; thanks for the detailed explanation. Yours, Laurenz Albe
Re: Behavior change in PostgreSQL 14Beta3 or bug?
On Mon, Sep 6, 2021 at 7:52 AM Daniel Westermann (DWE) wrote: > >Try running vacuum with index cleanup = on. > > Thank you, Peter Thanks for testing! -- Peter Geoghegan
Re: Behavior change in PostgreSQL 14Beta3 or bug?
On Mon, Sep 6, 2021 at 8:59 AM Tom Lane wrote: > Assuming that that choice was made appropriately, I think the advice you > propose here will just cause people to waste lots of cycles on VACUUM > runs that have only marginal effects. Right. The advice that they should receive (if any) is to tune autovacuum aggressively, and enable autovacuum log output. The log output reports on whether or not the implementation applied the optimization in each case. As I pointed out to Laurenz just now, users that care about index-only scans are actually the big beneficiaries here. Now they can set autovacuum_vacuum_insert_threshold very aggressively, without doing a useless round of index vacuuming just because one inserting transaction out of a million aborted. Once indexes are removed from the equation (to the extent that that makes sense), each round of vacuuming by autovacuum only needs to do work that is proportional to the number of unset-in-vm heap pages. I believe that that trade-off makes a lot of sense. Autovacuum has little chance of keeping anything like 100% of all pages set in the VM anyway. But it can get a lot closer to it in some cases now. -- Peter Geoghegan
Re: Behavior change in PostgreSQL 14Beta3 or bug?
On Mon, Sep 6, 2021 at 9:21 AM Laurenz Albe wrote: > #define BYPASS_THRESHOLD_PAGES 0.02/* i.e. 2% of rel_pages */ > > So up to an additional 2% of all pages can have the all-visible bit > unset with "index_cleanup = auto". > > That is probably not worth worrying, right? I don't think it's worth worrying about. I would say that, since I chose the exact threshold myself. The threshold was a bit arbitrary, of course. Note that Daniel's example had a non-HOT update, even though it's the kind of update that we imagine can use HOT (because it didn't modify an indexed column). He could have ensured a HOT update by lowering heap fill factor, but why should that be necessary if updates are rare anyway? The bypass-index-vacuuming feature may have had a bit of a messaging problem. It was something we usually talked about as being about skipping index vacuuming, because that's what it actually does. However, the feature isn't really about doing less work during VACUUM. It's actually about doing *more* work during VACUUM -- more useful work. Especially setting visibility map bits. But also freezing. Now you can very aggressively tune VACUUM to do these things more often, with no fear of that being way too expensive because of index vacuuming that has only marginal value. The threshold is not so much about any one VACUUM operation -- you have to think about the aggregate effect on the table over time. Most individual tables will never have the new optimization kick in even once, because the workload just couldn't possibly allow it -- the 2% threshold is vastly exceeded every single time. The cases that it actually applies to are pretty much insert-only tables, perhaps with some HOT updates. 100% clean inserts are probably very rare in the real world. I believe that it's *vastly* more likely that such a table will have pages that are ~98%+ free of LP_DEAD line pointers in heap pages (i.e., the thing that BYPASS_THRESHOLD_PAGES applies to). To get to 100% you cannot allow even one single insert transaction to abort since the last VACUUM. If you assume that BYPASS_THRESHOLD_PAGES is actually too low for your workload (which is the opposite problem), then it doesn't matter very much. The feature as coded should still have the desired effect of skipping index vacuuming in *most* cases where it's unnecessary (however you happen to define "unnecessary") -- the number of pages with LP_DEAD items will naturally increase over time without index vacuuming, until the threshold is crossed. Maybe still-unnecessary index vacuuming will still take place in 1 out of 5 cases with the feature. This is still much better than 5 out of 5. More importantly, you can now aggressively tune vacuuming without noticeably increasing the number of individual vacuums that still have the problem of unnecessary index vacuuming. So if you go from 5 vacuums per day to 20 through tuning alone, the number of vacuum operations that do unnecessary index vacuuming doesn't increase at all (except perhaps due to rounding effects). -- Peter Geoghegan
Re: Behavior change in PostgreSQL 14Beta3 or bug?
On Mon, 2021-09-06 at 11:59 -0400, Tom Lane wrote: > Laurenz Albe writes: > > It is not an incompatibility that warrants a mention in the release notes, > > but perhaps somthing in > > https://www.postgresql.org/docs/14/indexes-index-only-scans.html > > and/or > > https://www.postgresql.org/docs/14/routine-vacuuming.html#VACUUM-FOR-VISIBILITY-MAP > > could be added that recommends that people should consider frequent > > VACUUM with "index_cleanup = on" for best performance with index-only scans. > > If enough pages would change their all-visible state to make a significant > difference in index-only scan performance, VACUUM should not be skipping > the cleanup. If it is, the threshold for that is too aggressive. > > Assuming that that choice was made appropriately, I think the advice you > propose here will just cause people to waste lots of cycles on VACUUM > runs that have only marginal effects. #define BYPASS_THRESHOLD_PAGES 0.02/* i.e. 2% of rel_pages */ So up to an additional 2% of all pages can have the all-visible bit unset with "index_cleanup = auto". That is probably not worth worrying, right? Yours, Laurenz Albe
Re: Behavior change in PostgreSQL 14Beta3 or bug?
Laurenz Albe writes: > It is not an incompatibility that warrants a mention in the release notes, > but perhaps somthing in > https://www.postgresql.org/docs/14/indexes-index-only-scans.html > and/or > https://www.postgresql.org/docs/14/routine-vacuuming.html#VACUUM-FOR-VISIBILITY-MAP > could be added that recommends that people should consider frequent > VACUUM with "index_cleanup = on" for best performance with index-only scans. If enough pages would change their all-visible state to make a significant difference in index-only scan performance, VACUUM should not be skipping the cleanup. If it is, the threshold for that is too aggressive. Assuming that that choice was made appropriately, I think the advice you propose here will just cause people to waste lots of cycles on VACUUM runs that have only marginal effects. regards, tom lane
Re: Behavior change in PostgreSQL 14Beta3 or bug?
On Mon, 2021-09-06 at 07:46 -0700, Peter Geoghegan wrote: > It's a non-hot update, and so there is a single dead index tuple. You're > seeing > the new optimization that makes vacuum skip indexes in marginal cases. > > Try running vacuum with index cleanup = on. It occurs to me that this new default "auto" setting for "index_cleanup" may cause a performance regression for people who VACUUM tables frequently in order to get fast index-only scans. That is not a bug, but it would be good to alert the users. It is not an incompatibility that warrants a mention in the release notes, but perhaps somthing in https://www.postgresql.org/docs/14/indexes-index-only-scans.html and/or https://www.postgresql.org/docs/14/routine-vacuuming.html#VACUUM-FOR-VISIBILITY-MAP could be added that recommends that people should consider frequent VACUUM with "index_cleanup = on" for best performance with index-only scans. Suggested patch attached, should be backpatched to v14. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com From d98f4c4cb62b564e8f9a26ed4e8da80dadfbc55c Mon Sep 17 00:00:00 2001 From: Laurenz Albe Date: Mon, 6 Sep 2021 17:47:15 +0200 Subject: [PATCH] Document VACUUM tips for index-only scans Add hints for tuning autovacuum to get efficient index-only scans. This has become even more relevant than before, because the default "auto" option of "index_cleanup" introduced by commit 3499df0dee adds yet another thing to consider. --- doc/src/sgml/indices.sgml | 13 +++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/doc/src/sgml/indices.sgml b/doc/src/sgml/indices.sgml index 56fbd45178..4257615b85 100644 --- a/doc/src/sgml/indices.sgml +++ b/doc/src/sgml/indices.sgml @@ -1134,8 +1134,9 @@ SELECT x FROM tab WHERE x = 'key' AND z 42; problem. PostgreSQL tracks, for each page in a table's heap, whether all rows stored in that page are old enough to be visible to all current and future transactions. This information is - stored in a bit in the table's visibility map. An - index-only scan, after finding a candidate index entry, checks the + stored in a bit in the table's + visibility map. + An index-only scan, after finding a candidate index entry, checks the visibility map bit for the corresponding heap page. If it's set, the row is known visible and so the data can be returned with no further work. If it's not set, the heap entry must be visited to find out whether it's @@ -1155,6 +1156,14 @@ SELECT x FROM tab WHERE x = 'key' AND z 42; make this type of scan very useful in practice. + + To make sure that index-only scans are efficient, it can be a good idea + to see that the table is VACUUMed often enough. This + can be done by lowering + on that table and setting + to auto. + + INCLUDE -- 2.31.1
Re: Behavior change in PostgreSQL 14Beta3 or bug?
>It's a non-hot update, and so there is a single dead index tuple. You're >seeing the new optimization that makes vacuum skip indexes in >marginal cases. >Try running vacuum with index cleanup = on. Thank you, Peter
Re: Behavior change in PostgreSQL 14Beta3 or bug?
It's a non-hot update, and so there is a single dead index tuple. You're seeing the new optimization that makes vacuum skip indexes in marginal cases. Try running vacuum with index cleanup = on. Peter Geoghegan (Sent from my phone)