Re: Behavior change in PostgreSQL 14Beta3 or bug?

2021-09-07 Thread Laurenz Albe
On Mon, 2021-09-06 at 12:11 -0700, Peter Geoghegan wrote:
> On Mon, Sep 6, 2021 at 9:21 AM Laurenz Albe  wrote:
> > #define BYPASS_THRESHOLD_PAGES  0.02    /* i.e. 2% of rel_pages */
> > 
> > So up to an additional 2% of all pages can have the all-visible bit
> > unset with "index_cleanup = auto".
> > 
> > That is probably not worth worrying, right?
> 
> I don't think it's worth worrying about.
> 
> The bypass-index-vacuuming feature may have had a bit of a messaging
> problem. It was something we usually talked about as being about
> skipping index vacuuming, because that's what it actually does.
> However, the feature isn't really about doing less work during VACUUM.
> It's actually about doing *more* work during VACUUM -- more useful
> work. Especially setting visibility map bits. But also freezing. Now
> you can very aggressively tune VACUUM to do these things more often,
> with no fear of that being way too expensive because of index
> vacuuming that has only marginal value.

That makes sense; thanks for the detailed explanation.

Yours,
Laurenz Albe





Re: Behavior change in PostgreSQL 14Beta3 or bug?

2021-09-06 Thread Peter Geoghegan
On Mon, Sep 6, 2021 at 7:52 AM Daniel Westermann (DWE)
 wrote:
> >Try running vacuum with index cleanup = on.
>
> Thank you, Peter

Thanks for testing!

-- 
Peter Geoghegan




Re: Behavior change in PostgreSQL 14Beta3 or bug?

2021-09-06 Thread Peter Geoghegan
On Mon, Sep 6, 2021 at 8:59 AM Tom Lane  wrote:
> Assuming that that choice was made appropriately, I think the advice you
> propose here will just cause people to waste lots of cycles on VACUUM
> runs that have only marginal effects.

Right. The advice that they should receive (if any) is to tune
autovacuum aggressively, and enable autovacuum log output. The log
output reports on whether or not the implementation applied the
optimization in each case.

As I pointed out to Laurenz just now, users that care about index-only
scans are actually the big beneficiaries here. Now they can set
autovacuum_vacuum_insert_threshold very aggressively, without doing a
useless round of index vacuuming just because one inserting
transaction out of a million aborted. Once indexes are removed from
the equation (to the extent that that makes sense), each round of
vacuuming by autovacuum only needs to do work that is proportional to
the number of unset-in-vm heap pages.

I believe that that trade-off makes a lot of sense. Autovacuum has
little chance of keeping anything like 100% of all pages set in the VM
anyway. But it can get a lot closer to it in some cases now.

-- 
Peter Geoghegan




Re: Behavior change in PostgreSQL 14Beta3 or bug?

2021-09-06 Thread Peter Geoghegan
On Mon, Sep 6, 2021 at 9:21 AM Laurenz Albe  wrote:
> #define BYPASS_THRESHOLD_PAGES  0.02/* i.e. 2% of rel_pages */
>
> So up to an additional 2% of all pages can have the all-visible bit
> unset with "index_cleanup = auto".
>
> That is probably not worth worrying, right?

I don't think it's worth worrying about. I would say that, since I
chose the exact threshold myself. The threshold was a bit arbitrary,
of course.

Note that Daniel's example had a non-HOT update, even though it's the
kind of update that we imagine can use HOT (because it didn't modify
an indexed column). He could have ensured a HOT update by lowering
heap fill factor, but why should that be necessary if updates are rare
anyway?

The bypass-index-vacuuming feature may have had a bit of a messaging
problem. It was something we usually talked about as being about
skipping index vacuuming, because that's what it actually does.
However, the feature isn't really about doing less work during VACUUM.
It's actually about doing *more* work during VACUUM -- more useful
work. Especially setting visibility map bits. But also freezing. Now
you can very aggressively tune VACUUM to do these things more often,
with no fear of that being way too expensive because of index
vacuuming that has only marginal value.

The threshold is not so much about any one VACUUM operation -- you
have to think about the aggregate effect on the table over time. Most
individual tables will never have the new optimization kick in even
once, because the workload just couldn't possibly allow it -- the 2%
threshold is vastly exceeded every single time. The cases that it
actually applies to are pretty much insert-only tables, perhaps with
some HOT updates. 100% clean inserts are probably very rare in the
real world. I believe that it's *vastly* more likely that such a table
will have pages that are ~98%+ free of LP_DEAD line pointers in heap
pages (i.e., the thing that BYPASS_THRESHOLD_PAGES applies to). To get
to 100% you cannot allow even one single insert transaction to abort
since the last VACUUM.

If you assume that BYPASS_THRESHOLD_PAGES is actually too low for your
workload (which is the opposite problem), then it doesn't matter very
much. The feature as coded should still have the desired effect of
skipping index vacuuming in *most* cases where it's unnecessary
(however you happen to define "unnecessary") -- the number of pages
with LP_DEAD items will naturally increase over time without index
vacuuming, until the threshold is crossed. Maybe still-unnecessary
index vacuuming will still take place in 1 out of 5 cases with the
feature. This is still much better than 5 out of 5. More importantly,
you can now aggressively tune vacuuming without noticeably increasing
the number of individual vacuums that still have the problem of
unnecessary index vacuuming. So if you go from 5 vacuums per day to 20
through tuning alone, the number of vacuum operations that do
unnecessary index vacuuming doesn't increase at all (except perhaps
due to rounding effects).

-- 
Peter Geoghegan




Re: Behavior change in PostgreSQL 14Beta3 or bug?

2021-09-06 Thread Laurenz Albe
On Mon, 2021-09-06 at 11:59 -0400, Tom Lane wrote:
> Laurenz Albe  writes:
> > It is not an incompatibility that warrants a mention in the release notes,
> > but perhaps somthing in
> > https://www.postgresql.org/docs/14/indexes-index-only-scans.html
> > and/or
> > https://www.postgresql.org/docs/14/routine-vacuuming.html#VACUUM-FOR-VISIBILITY-MAP
> > could be added that recommends that people should consider frequent
> > VACUUM with "index_cleanup = on" for best performance with index-only scans.
> 
> If enough pages would change their all-visible state to make a significant
> difference in index-only scan performance, VACUUM should not be skipping
> the cleanup.  If it is, the threshold for that is too aggressive.
> 
> Assuming that that choice was made appropriately, I think the advice you
> propose here will just cause people to waste lots of cycles on VACUUM
> runs that have only marginal effects.

#define BYPASS_THRESHOLD_PAGES  0.02/* i.e. 2% of rel_pages */

So up to an additional 2% of all pages can have the all-visible bit
unset with "index_cleanup = auto".

That is probably not worth worrying, right?

Yours,
Laurenz Albe





Re: Behavior change in PostgreSQL 14Beta3 or bug?

2021-09-06 Thread Tom Lane
Laurenz Albe  writes:
> It is not an incompatibility that warrants a mention in the release notes,
> but perhaps somthing in
> https://www.postgresql.org/docs/14/indexes-index-only-scans.html
> and/or
> https://www.postgresql.org/docs/14/routine-vacuuming.html#VACUUM-FOR-VISIBILITY-MAP
> could be added that recommends that people should consider frequent
> VACUUM with "index_cleanup = on" for best performance with index-only scans.

If enough pages would change their all-visible state to make a significant
difference in index-only scan performance, VACUUM should not be skipping
the cleanup.  If it is, the threshold for that is too aggressive.

Assuming that that choice was made appropriately, I think the advice you
propose here will just cause people to waste lots of cycles on VACUUM
runs that have only marginal effects.

regards, tom lane




Re: Behavior change in PostgreSQL 14Beta3 or bug?

2021-09-06 Thread Laurenz Albe
On Mon, 2021-09-06 at 07:46 -0700, Peter Geoghegan wrote:
> It's a non-hot update, and so there is a single dead index tuple. You're 
> seeing
> the new optimization that makes vacuum skip indexes in marginal cases. 
> 
> Try running vacuum with index cleanup = on. 

It occurs to me that this new default "auto" setting for "index_cleanup"
may cause a performance regression for people who VACUUM tables frequently
in order to get fast index-only scans.

That is not a bug, but it would be good to alert the users.

It is not an incompatibility that warrants a mention in the release notes,
but perhaps somthing in
https://www.postgresql.org/docs/14/indexes-index-only-scans.html
and/or
https://www.postgresql.org/docs/14/routine-vacuuming.html#VACUUM-FOR-VISIBILITY-MAP
could be added that recommends that people should consider frequent
VACUUM with "index_cleanup = on" for best performance with index-only scans.

Suggested patch attached, should be backpatched to v14.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com
From d98f4c4cb62b564e8f9a26ed4e8da80dadfbc55c Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Mon, 6 Sep 2021 17:47:15 +0200
Subject: [PATCH] Document VACUUM tips for index-only scans

Add hints for tuning autovacuum to get efficient
index-only scans.  This has become even more relevant
than before, because the default "auto" option of
"index_cleanup" introduced by commit 3499df0dee
adds yet another thing to consider.
---
 doc/src/sgml/indices.sgml | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/indices.sgml b/doc/src/sgml/indices.sgml
index 56fbd45178..4257615b85 100644
--- a/doc/src/sgml/indices.sgml
+++ b/doc/src/sgml/indices.sgml
@@ -1134,8 +1134,9 @@ SELECT x FROM tab WHERE x = 'key' AND z  42;
problem.  PostgreSQL tracks, for each page in
a table's heap, whether all rows stored in that page are old enough to be
visible to all current and future transactions.  This information is
-   stored in a bit in the table's visibility map.  An
-   index-only scan, after finding a candidate index entry, checks the
+   stored in a bit in the table's
+   visibility map.
+   An index-only scan, after finding a candidate index entry, checks the
visibility map bit for the corresponding heap page.  If it's set, the row
is known visible and so the data can be returned with no further work.
If it's not set, the heap entry must be visited to find out whether it's
@@ -1155,6 +1156,14 @@ SELECT x FROM tab WHERE x = 'key' AND z  42;
make this type of scan very useful in practice.
   
 
+  
+   To make sure that index-only scans are efficient, it can be a good idea
+   to see that the table is VACUUMed often enough.  This
+   can be done by lowering 
+   on that table and setting 
+   to auto.
+  
+
   

 INCLUDE
-- 
2.31.1



Re: Behavior change in PostgreSQL 14Beta3 or bug?

2021-09-06 Thread Daniel Westermann (DWE)
>It's a non-hot update, and so there is a single dead index tuple. You're 
>seeing the new optimization that makes vacuum skip indexes in >marginal cases.

>Try running vacuum with index cleanup = on.

Thank you, Peter


Re: Behavior change in PostgreSQL 14Beta3 or bug?

2021-09-06 Thread Peter Geoghegan
It's a non-hot update, and so there is a single dead index tuple. You're
seeing the new optimization that makes vacuum skip indexes in marginal
cases.

Try running vacuum with index cleanup = on.

Peter Geoghegan
(Sent from my phone)