Re: New GUC autovacuum_max_threshold ?

2024-05-13 Thread Robert Haas
On Mon, May 13, 2024 at 11:14 AM Frédéric Yhuel wrote: > FWIW, I do agree with your math. I found your demonstration convincing. > 50 was selected with the wet finger. Good to know. > Using the formula I suggested earlier: > > vacthresh = Min(vac_base_thresh + vac_scale_factor * reltuples,

Re: New GUC autovacuum_max_threshold ?

2024-05-13 Thread Frédéric Yhuel
Le 09/05/2024 à 16:58, Robert Haas a écrit : As I see it, a lot of the lack of agreement up until now is people just not understanding the math. Since I think I've got the right idea about the math, I attribute this to other people being confused about what is going to happen and would tend

Re: New GUC autovacuum_max_threshold ?

2024-05-09 Thread Robert Haas
On Wed, May 8, 2024 at 1:30 PM Imseih (AWS), Sami wrote: > There is also an alternative of making this GUC -1 by default, which > means it has not effect and any value larger will be used in the threshold > calculation of autovacuunm. A user will have to be careful not to set it too > low, > but

Re: New GUC autovacuum_max_threshold ?

2024-05-08 Thread Imseih (AWS), Sami
> This is about how I feel, too. In any case, I +1'd a higher default > because I think we need to be pretty conservative with these changes, at > least until we have a better prioritization strategy. While folks may opt > to set this value super low, I think that's more likely to lead to some >

Re: New GUC autovacuum_max_threshold ?

2024-05-07 Thread Nathan Bossart
On Tue, May 07, 2024 at 10:31:00AM -0400, Robert Haas wrote: > On Wed, May 1, 2024 at 10:03 PM David Rowley wrote: >> I think we need at least 1a) before we can give autovacuum more work >> to do, especially if we do something like multiply its workload by >> 1024x, per your comment above. > > I

Re: New GUC autovacuum_max_threshold ?

2024-05-07 Thread Robert Haas
On Wed, May 1, 2024 at 10:03 PM David Rowley wrote: > Here are some of the problems that I know about: > > 1. Autovacuum has exactly zero forward vision and operates reactively > rather than proactively. This "blind operating" causes tables to > either not need vacuumed or suddenly need vacuumed

Re: New GUC autovacuum_max_threshold ?

2024-05-02 Thread Imseih (AWS), Sami
> And as far as that goes, I'd like you - and others - to spell out more > precisely why you think 100 or 200 million tuples is too much. It > might be, or maybe it is in some cases but not in others. To me, > that's not a terribly large amount of data. Unless your tuples are > very wide, it's a

Re: New GUC autovacuum_max_threshold ?

2024-05-02 Thread Frédéric Yhuel
Le 01/05/2024 à 20:50, Robert Haas a écrit : Possibly what we need here is something other than a cap, where, say, we vacuum a 10GB table twice as often as now, a 100GB table four times as often, and a 1TB table eight times as often. Or whatever the right answer is. IMO, it would make more

Re: New GUC autovacuum_max_threshold ?

2024-05-01 Thread David Rowley
On Sat, 27 Apr 2024 at 02:13, Robert Haas wrote: > Let's compare the current situation to the situation post-patch with a > cap of 500k. Consider a table 1024 times larger than the one I > mentioned above, so pgbench scale factor 25600, size on disk 320GB. > Currently, that table will be vacuumed

Re: New GUC autovacuum_max_threshold ?

2024-05-01 Thread Robert Haas
On Wed, May 1, 2024 at 2:19 PM Imseih (AWS), Sami wrote: > > Unless I'm missing something major, that's completely bonkers. It > > might be true that it would be a good idea to vacuum such a table more > > often than we do at present, but there's no shot that we want to do it > > that much more

Re: New GUC autovacuum_max_threshold ?

2024-05-01 Thread Imseih (AWS), Sami
I've been following this discussion and would like to add my 2 cents. > Unless I'm missing something major, that's completely bonkers. It > might be true that it would be a good idea to vacuum such a table more > often than we do at present, but there's no shot that we want to do it > that much

Re: New GUC autovacuum_max_threshold ?

2024-04-26 Thread Robert Haas
On Fri, Apr 26, 2024 at 9:40 AM Joe Conway wrote: > > Can you elaborate on why you think that? I mean, to me, that's almost > > equivalent to removing autovacuum_vacuum_scale_factor entirely, > > because only for very small tables will that calculation produce a > > value lower than 500k. > > If

Re: New GUC autovacuum_max_threshold ?

2024-04-26 Thread Joe Conway
On 4/26/24 09:31, Robert Haas wrote: On Fri, Apr 26, 2024 at 9:22 AM Joe Conway wrote: Although I don't think 50 is necessarily too small. In my view, having autovac run very quickly, even if more frequently, provides an overall better user experience. Can you elaborate on why you think

Re: New GUC autovacuum_max_threshold ?

2024-04-26 Thread Robert Haas
On Fri, Apr 26, 2024 at 4:43 AM Michael Banck wrote: > > I believe that the defaults should work well in moderately sized databases > > with moderate usage characteristics. If you have large tables or a high > > number of transactions per second, you can be expected to make the effort > > and

Re: New GUC autovacuum_max_threshold ?

2024-04-26 Thread Robert Haas
On Fri, Apr 26, 2024 at 9:22 AM Joe Conway wrote: > Although I don't think 50 is necessarily too small. In my view, > having autovac run very quickly, even if more frequently, provides an > overall better user experience. Can you elaborate on why you think that? I mean, to me, that's almost

Re: New GUC autovacuum_max_threshold ?

2024-04-26 Thread Robert Haas
On Thu, Apr 25, 2024 at 10:24 PM Laurenz Albe wrote: > I don't find that convincing. Why are 2TB of wasted space in a 10TB > table worse than 2TB of wasted space in 100 tables of 100GB each? It's not worse, but it's more avoidable. No matter what you do, any table that suffers a reasonable

Re: New GUC autovacuum_max_threshold ?

2024-04-26 Thread Joe Conway
On 4/26/24 04:43, Michael Banck wrote: So this proposal (probably along with a higher default threshold than 50, but IMO less than what Robert and Nathan suggested) sounds like a stop forward to me. DBAs can set the threshold lower if they want, or maybe we can just turn it off by default if

Re: New GUC autovacuum_max_threshold ?

2024-04-26 Thread Michael Banck
Hi, On Fri, Apr 26, 2024 at 10:18:00AM +0200, Laurenz Albe wrote: > On Fri, 2024-04-26 at 09:35 +0200, Frédéric Yhuel wrote: > > Le 26/04/2024 à 04:24, Laurenz Albe a écrit : > > > On Thu, 2024-04-25 at 14:33 -0400, Robert Haas wrote: > > > > I believe that the underlying problem here can be

Re: New GUC autovacuum_max_threshold ?

2024-04-26 Thread Laurenz Albe
On Fri, 2024-04-26 at 09:35 +0200, Frédéric Yhuel wrote: > > Le 26/04/2024 à 04:24, Laurenz Albe a écrit : > > On Thu, 2024-04-25 at 14:33 -0400, Robert Haas wrote: > > > I believe that the underlying problem here can be summarized in this > > > way: just because I'm OK with 2MB of bloat in my

Re: New GUC autovacuum_max_threshold ?

2024-04-26 Thread Frédéric Yhuel
Le 25/04/2024 à 22:21, Robert Haas a écrit : The analyze case, I feel, is really murky. autovacuum_analyze_scale_factor stands for the proposition that as the table becomes larger, analyze doesn't need to be done as often. If what you're concerned about is the frequency estimates, that's

Re: New GUC autovacuum_max_threshold ?

2024-04-26 Thread Michael Banck
Hi, On Fri, Apr 26, 2024 at 04:24:45AM +0200, Laurenz Albe wrote: > On Thu, 2024-04-25 at 14:33 -0400, Robert Haas wrote: > > Another reason, at least in existing releases, is that at some > > point index vacuuming hits a wall because we run out of space for dead > > tuples. We *most definitely*

Re: New GUC autovacuum_max_threshold ?

2024-04-26 Thread Frédéric Yhuel
Le 26/04/2024 à 04:24, Laurenz Albe a écrit : On Thu, 2024-04-25 at 14:33 -0400, Robert Haas wrote: I believe that the underlying problem here can be summarized in this way: just because I'm OK with 2MB of bloat in my 10MB table doesn't mean that I'm OK with 2TB of bloat in my 10TB table.

Re: New GUC autovacuum_max_threshold ?

2024-04-25 Thread Laurenz Albe
On Thu, 2024-04-25 at 14:33 -0400, Robert Haas wrote: > I believe that the underlying problem here can be summarized in this > way: just because I'm OK with 2MB of bloat in my 10MB table doesn't > mean that I'm OK with 2TB of bloat in my 10TB table. One reason for > this is simply that I can

Re: New GUC autovacuum_max_threshold ?

2024-04-25 Thread Robert Haas
On Thu, Apr 25, 2024 at 4:57 PM Frédéric Yhuel wrote: > Now I have just read Robert's new message, and I understand his point. > But is there a real problem with triggering analyze after every 50 > (or more) modifications in the table anyway? It depends on the situation, but even on a

Re: New GUC autovacuum_max_threshold ?

2024-04-25 Thread Frédéric Yhuel
Le 25/04/2024 à 18:51, Melanie Plageman a écrit : I'm not too sure I understand. What are the reasons it might by skipped? I can think of a concurrent index creation on the same table, or anything holding a SHARE UPDATE EXCLUSIVE lock or above. Is this the sort of thing you are talking about?

Re: New GUC autovacuum_max_threshold ?

2024-04-25 Thread Frédéric Yhuel
Le 25/04/2024 à 21:21, Nathan Bossart a écrit : On Thu, Apr 25, 2024 at 02:33:05PM -0400, Robert Haas wrote: What does surprise me is that Frédéric suggests a default value of 500,000. If half a million tuples (proposed default) is 20% of your table (default value of

Re: New GUC autovacuum_max_threshold ?

2024-04-25 Thread Robert Haas
On Thu, Apr 25, 2024 at 3:21 PM Nathan Bossart wrote: > Agreed, the default should probably be on the order of 100-200M minimum. > > The original proposal also seems to introduce one parameter that would > affect all three of autovacuum_vacuum_threshold, > autovacuum_vacuum_insert_threshold, and

Re: New GUC autovacuum_max_threshold ?

2024-04-25 Thread Nathan Bossart
On Thu, Apr 25, 2024 at 02:33:05PM -0400, Robert Haas wrote: > What does surprise me is that Frédéric suggests a default value of > 500,000. If half a million tuples (proposed default) is 20% of your > table (default value of autovacuum_vacuum_scale_factor) then your > table has 2.5 million

Re: New GUC autovacuum_max_threshold ?

2024-04-25 Thread Robert Haas
On Wed, Apr 24, 2024 at 3:57 PM Nathan Bossart wrote: > Yeah, I'm having trouble following the proposed mechanics for this new GUC, > and it's difficult to understand how users would choose a value. If we > just want to cap the number of tuples required before autovacuum takes > action, perhaps

Re: New GUC autovacuum_max_threshold ?

2024-04-25 Thread Melanie Plageman
On Thu, Apr 25, 2024 at 2:52 AM Frédéric Yhuel wrote: > > Le 24/04/2024 à 21:10, Melanie Plageman a écrit : > > On Wed, Apr 24, 2024 at 8:08 AM Frédéric Yhuel > > wrote: > >> > >> Hello, > >> > >> I would like to suggest a new parameter, autovacuum_max_threshold, which > >> would set an upper

Re: New GUC autovacuum_max_threshold ?

2024-04-25 Thread Nathan Bossart
On Thu, Apr 25, 2024 at 09:13:07AM +0200, Frédéric Yhuel wrote: > Le 24/04/2024 à 21:57, Nathan Bossart a écrit : >> Yeah, I'm having trouble following the proposed mechanics for this new GUC, >> and it's difficult to understand how users would choose a value. If we >> just want to cap the number

Re: New GUC autovacuum_max_threshold ?

2024-04-25 Thread Frédéric Yhuel
Hi Nathan, thanks for your review. Le 24/04/2024 à 21:57, Nathan Bossart a écrit : Yeah, I'm having trouble following the proposed mechanics for this new GUC, and it's difficult to understand how users would choose a value. If we just want to cap the number of tuples required before autovacuum

Re: New GUC autovacuum_max_threshold ?

2024-04-25 Thread Frédéric Yhuel
Le 24/04/2024 à 21:10, Melanie Plageman a écrit : On Wed, Apr 24, 2024 at 8:08 AM Frédéric Yhuel wrote: Hello, I would like to suggest a new parameter, autovacuum_max_threshold, which would set an upper limit on the number of tuples to delete/update/insert prior to vacuum/analyze. Hi

Re: New GUC autovacuum_max_threshold ?

2024-04-24 Thread Nathan Bossart
On Wed, Apr 24, 2024 at 03:10:27PM -0400, Melanie Plageman wrote: > On Wed, Apr 24, 2024 at 8:08 AM Frédéric Yhuel > wrote: >> I would like to suggest a new parameter, autovacuum_max_threshold, which >> would set an upper limit on the number of tuples to delete/update/insert >> prior to

Re: New GUC autovacuum_max_threshold ?

2024-04-24 Thread Melanie Plageman
On Wed, Apr 24, 2024 at 8:08 AM Frédéric Yhuel wrote: > > Hello, > > I would like to suggest a new parameter, autovacuum_max_threshold, which > would set an upper limit on the number of tuples to delete/update/insert > prior to vacuum/analyze. Hi Frédéric, thanks for the proposal! You are

New GUC autovacuum_max_threshold ?

2024-04-24 Thread Frédéric Yhuel
9027d857e3426f327a2a5f61aec11a7604bb48a9 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Fr=C3=A9d=C3=A9ric=20Yhuel?= Date: Fri, 19 Apr 2024 14:05:37 +0200 Subject: [PATCH] Add new GUC autovacuum_max_threshold --- src/backend/access/common/reloptions.c | 11 +++ src/backend/postmaster/autovacuum.c