Re: Should we remove vacuum_defer_cleanup_age?

2023-04-24 Thread Andres Freund
Hi,

On 2023-04-24 14:36:36 +0200, Alvaro Herrera wrote:
> On 2023-Apr-22, Andres Freund wrote:
> > I'm afraid we'll need TransactionIdRetreatSafely() again, when we convert 
> > more
> > things to 64bit xids (lest they end up with the same bug as fixed by
> > be504a3e974), so it's perhaps worth thinking about how to make it less
> > confusing.
> 
> The one thing that IMO makes it less confusing is to have it return the
> value rather than modifying it in place.

Partially I made it that way because you otherwise end up repeating long
variable names multiple times within one statement, yielding long repetitive
lines...  Not sure that's a good enough reason, but ...



> > >
> > > Replication slots overcome these disadvantages by retaining only the 
> > > number
> > > of segments known to be needed.
> > > On the other hand, replication slots can retain so
> > > many WAL segments that they fill up the space allocated
> > > for pg_wal;
> > >  limits the size of WAL 
> > > files
> > > retained by replication slots.
> > >
> > 
> > It seems a bit confusing now, because "by retaining only the number of
> > segments ..." now also should cover hs_feedback (due to merging), but 
> > doesn't.
> 
> Hah, ok.
> 

> > I think I'll push the version I had. Then we can separately word-smith the
> > section? Unless somebody protests I'm gonna do that soon.
> 
> No objection.

Cool. Pushed now.




RE: Should we remove vacuum_defer_cleanup_age?

2023-04-24 Thread Phil Florent


Hi,
Not very convenient but if autovacuum is enabled isn't vacuum_defer_cleanup_age 
the way to make extensions like pg_dirtyread more effective for temporal 
queries to quickly correct human DML mistakes without the need of a complete 
restore, even if no standby is in use ? vacuum_defer_cleanup_age+pg_dirtyread 
give PostgreSQL something like "flashback query" in Oracle.
Best regards,
Phil


De : Andres Freund 
Envoyé : dimanche 23 avril 2023 00:47
À : Alvaro Herrera 
Cc : Justin Pryzby ; pgsql-hack...@postgresql.org 
; Amit Kapila 
Objet : Re: Should we remove vacuum_defer_cleanup_age?

Hi,

On 2023-04-13 13:18:38 +0200, Alvaro Herrera wrote:
> On 2023-Apr-11, Andres Freund wrote:
>
> > Updated patch attached. I think we should either apply something like that
> > patch, or at least add a  to the docs.
>
> I gave this patch a look.  The only code change is that
> ComputeXidHorizons() and GetSnapshotData() no longer handle the case
> where vacuum_defer_cleanup_age is different from zero.  It looks good.
> The TransactionIdRetreatSafely() code being removed is pretty weird (I
> spent a good dozen minutes writing a complaint that your rewrite was
> faulty, but it turns out I had misunderstood the function), so I'm glad
> it's being retired.

My rewrite of what? The creation of TransactionIdRetreatSafely() in
be504a3e974?

I'm afraid we'll need TransactionIdRetreatSafely() again, when we convert more
things to 64bit xids (lest they end up with the same bug as fixed by
be504a3e974), so it's perhaps worth thinking about how to make it less
confusing.


> > 
> > -Similarly, 
> > -and  provide protection 
> > against
> > -relevant rows being removed by vacuum, but the former provides no
> > -protection during any time period when the standby is not connected,
> > -and the latter often needs to be set to a high value to provide 
> > adequate
> > -protection.  Replication slots overcome these disadvantages.
> > +Similarly,  on its own, 
> > without
> > +also using a replication slot, provides protection against relevant 
> > rows
> > +being removed by vacuum, but provides no protection during any time 
> > period
> > +when the standby is not connected.  Replication slots overcome these
> > +disadvantages.
>
> I think it made sense to have this paragraph be separate from the
> previous one when it was talking about two separate variables, but now
> that it's just one, it looks a bit isolated.  I would merge it with the
> one above, which is talking about pretty much the same thing, and
> reorder the whole thing approximately like this
>
>
> In lieu of using replication slots, it is possible to prevent the removal
> of old WAL segments using , or by
> storing the segments in an archive using
>  or  linkend="guc-archive-library"/>.
> However, these methods often result in retaining more WAL segments than
> required.
> Similarly,  without
> a replication slot provides protection against relevant rows
> being removed by vacuum, but provides no protection during any time period
> when the standby is not connected.
>
>
> Replication slots overcome these disadvantages by retaining only the 
> number
> of segments known to be needed.
> On the other hand, replication slots can retain so
> many WAL segments that they fill up the space allocated
> for pg_wal;
>  limits the size of WAL files
> retained by replication slots.
>

It seems a bit confusing now, because "by retaining only the number of
segments ..." now also should cover hs_feedback (due to merging), but doesn't.


> Though the "However," looks a poor fit; I would do this:

I agree, I don't like the however.


I think I'll push the version I had. Then we can separately word-smith the
section? Unless somebody protests I'm gonna do that soon.

Greetings,

Andres Freund




Re: Should we remove vacuum_defer_cleanup_age?

2023-04-24 Thread Robert Haas
On Mon, Apr 24, 2023 at 8:36 AM Alvaro Herrera  wrote:
> The one thing that IMO makes it less confusing is to have it return the
> value rather than modifying it in place.

Yeah, I don't understand why we have these functions that modify the
value in place. Those are probably convenient here and there, but
overall they seem to make things more confusing.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Should we remove vacuum_defer_cleanup_age?

2023-04-24 Thread Alvaro Herrera
On 2023-Apr-22, Andres Freund wrote:

> On 2023-04-13 13:18:38 +0200, Alvaro Herrera wrote:
> > 
> > > Updated patch attached. I think we should either apply something like that
> > > patch, or at least add a  to the docs.
> > 
> > I gave this patch a look.  The only code change is that
> > ComputeXidHorizons() and GetSnapshotData() no longer handle the case
> > where vacuum_defer_cleanup_age is different from zero.  It looks good.
> > The TransactionIdRetreatSafely() code being removed is pretty weird (I
> > spent a good dozen minutes writing a complaint that your rewrite was
> > faulty, but it turns out I had misunderstood the function), so I'm glad
> > it's being retired.
> 
> My rewrite of what? The creation of TransactionIdRetreatSafely() in
> be504a3e974?

I meant the code that used to call TransactionIdRetreatSafely() and that
you're changing in the proposed patch.

> I'm afraid we'll need TransactionIdRetreatSafely() again, when we convert more
> things to 64bit xids (lest they end up with the same bug as fixed by
> be504a3e974), so it's perhaps worth thinking about how to make it less
> confusing.

The one thing that IMO makes it less confusing is to have it return the
value rather than modifying it in place.

> >
> > Replication slots overcome these disadvantages by retaining only the 
> > number
> > of segments known to be needed.
> > On the other hand, replication slots can retain so
> > many WAL segments that they fill up the space allocated
> > for pg_wal;
> >  limits the size of WAL 
> > files
> > retained by replication slots.
> >
> 
> It seems a bit confusing now, because "by retaining only the number of
> segments ..." now also should cover hs_feedback (due to merging), but doesn't.

Hah, ok.

> I think I'll push the version I had. Then we can separately word-smith the
> section? Unless somebody protests I'm gonna do that soon.

No objection.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/




Re: Should we remove vacuum_defer_cleanup_age?

2023-04-22 Thread Andres Freund
Hi,

On 2023-04-13 13:18:38 +0200, Alvaro Herrera wrote:
> On 2023-Apr-11, Andres Freund wrote:
> 
> > Updated patch attached. I think we should either apply something like that
> > patch, or at least add a  to the docs.
> 
> I gave this patch a look.  The only code change is that
> ComputeXidHorizons() and GetSnapshotData() no longer handle the case
> where vacuum_defer_cleanup_age is different from zero.  It looks good.
> The TransactionIdRetreatSafely() code being removed is pretty weird (I
> spent a good dozen minutes writing a complaint that your rewrite was
> faulty, but it turns out I had misunderstood the function), so I'm glad
> it's being retired.

My rewrite of what? The creation of TransactionIdRetreatSafely() in
be504a3e974?

I'm afraid we'll need TransactionIdRetreatSafely() again, when we convert more
things to 64bit xids (lest they end up with the same bug as fixed by
be504a3e974), so it's perhaps worth thinking about how to make it less
confusing.


> > 
> > -Similarly, 
> > -and  provide protection 
> > against
> > -relevant rows being removed by vacuum, but the former provides no
> > -protection during any time period when the standby is not connected,
> > -and the latter often needs to be set to a high value to provide 
> > adequate
> > -protection.  Replication slots overcome these disadvantages.
> > +Similarly,  on its own, 
> > without
> > +also using a replication slot, provides protection against relevant 
> > rows
> > +being removed by vacuum, but provides no protection during any time 
> > period
> > +when the standby is not connected.  Replication slots overcome these
> > +disadvantages.
> 
> I think it made sense to have this paragraph be separate from the
> previous one when it was talking about two separate variables, but now
> that it's just one, it looks a bit isolated.  I would merge it with the
> one above, which is talking about pretty much the same thing, and
> reorder the whole thing approximately like this
> 
>
> In lieu of using replication slots, it is possible to prevent the removal
> of old WAL segments using , or by
> storing the segments in an archive using
>  or  linkend="guc-archive-library"/>.
> However, these methods often result in retaining more WAL segments than
> required.
> Similarly,  without
> a replication slot provides protection against relevant rows
> being removed by vacuum, but provides no protection during any time period
> when the standby is not connected.
>
>
> Replication slots overcome these disadvantages by retaining only the 
> number
> of segments known to be needed.
> On the other hand, replication slots can retain so
> many WAL segments that they fill up the space allocated
> for pg_wal;
>  limits the size of WAL files
> retained by replication slots.
>

It seems a bit confusing now, because "by retaining only the number of
segments ..." now also should cover hs_feedback (due to merging), but doesn't.


> Though the "However," looks a poor fit; I would do this:

I agree, I don't like the however.


I think I'll push the version I had. Then we can separately word-smith the
section? Unless somebody protests I'm gonna do that soon.

Greetings,

Andres Freund




Re: Should we remove vacuum_defer_cleanup_age?

2023-04-20 Thread Nathan Bossart
On Fri, Apr 14, 2023 at 03:07:37PM -0400, Jonathan S. Katz wrote:
> +1.

+1.  I agree with the upthread discussion and support removing
vacuum_defer_cleanup_age.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Should we remove vacuum_defer_cleanup_age?

2023-04-14 Thread Jonathan S. Katz

On 4/14/23 1:15 PM, Laurenz Albe wrote:


Let's remove vacuum_defer_cleanup_age, and put a note in the release notes
that recommends using statement_timeout and hot_standby_feedback = on
on the standby instead.
That should have pretty much the same effect, and it is measured in
time and not in the number of transactions.


+1.

Jonathan



OpenPGP_signature
Description: OpenPGP digital signature


Re: Should we remove vacuum_defer_cleanup_age?

2023-04-14 Thread Greg Stark
On Fri, 14 Apr 2023 at 13:15, Laurenz Albe  wrote:
>
> On Fri, 2023-04-14 at 18:43 +0200, Alvaro Herrera wrote:
> > On 2023-Apr-14, Greg Stark wrote:
> > > I assume people would use hot_standby_feedback if they have streaming
> > > replication.
> >
> > Yes, either that or a replication slot.
>
> A replication slot doesn't do anything against snapshot conflicts,
> which is what we are discussing here.  Or are we not?

They're related -- the replication slot holds the feedback xmin so
that if your standby disconnects it can reconnect later and not have
lost data in the meantime. At least I think that's what I think it
does -- I don't know if I'm just assuming that, but xmin is indeed in
pg_replication_slots.

-- 
greg




Re: Should we remove vacuum_defer_cleanup_age?

2023-04-14 Thread Laurenz Albe
On Fri, 2023-04-14 at 18:43 +0200, Alvaro Herrera wrote:
> On 2023-Apr-14, Greg Stark wrote:
> > I assume people would use hot_standby_feedback if they have streaming
> > replication. 
> 
> Yes, either that or a replication slot.

A replication slot doesn't do anything against snapshot conflicts,
which is what we are discussing here.  Or are we not?

> 
> I find it very hard to believe that people are doing stuff with
> vacuum_defer_cleanup_age that cannot be done with either of the other
> newer mechanisms, which have also seen much wider usage and so bugs
> fixed, etc.

vacuum_defer_cleanup_age offers a more fine-grained approach.
With hot_standby_feedback you can only say "don't ever remove any dead
tuples that the standby still needs".

But perhaps you'd prefer "don't remove dead tuples unless they are
quite old", so that you can get your shorter queries on the standby
to complete, without delaying replay and without the danger that a
long running query on the standby bloats your primary.

How about this:
Let's remove vacuum_defer_cleanup_age, and put a note in the release notes
that recommends using statement_timeout and hot_standby_feedback = on
on the standby instead.
That should have pretty much the same effect, and it is measured in
time and not in the number of transactions.

Yours,
Laurenz Albe




Re: Should we remove vacuum_defer_cleanup_age?

2023-04-14 Thread Alvaro Herrera
On 2023-Apr-14, Greg Stark wrote:

> On Fri, 14 Apr 2023 at 09:47, Jonathan S. Katz  wrote:
> >
> > Let me restate [1] in a different way.
> >
> > Using a large enough dataset, I did qualitatively look at overall usage
> > of both "vacuum_defer_cleanup_age" and compared to
> > "hot_standby_feedback", given you can use both to accomplish similar
> > outcomes.
> 
> I assume people would use hot_standby_feedback if they have streaming
> replication. 

Yes, either that or a replication slot.

vacuum_defer_cleanup_age was added in commit efc16ea52067 (2009-12-19),
for Postgres 9.0.  hot_standby_feedback is a bit newer
(bca8b7f16a3e, 2011-02-16), and replication slots are newer still
(858ec11858a9, 2014-01-31).

> The main use cases for vacuum_defer_cleanup_age would be if you're
> replaying WAL files.  That may sound archaic but there are plenty of
> circumstances where your standby may not have network access to your
> primary at all or not want to be replaying continuously.

True, those cases exist.  However, it sounds to me like in those cases
vacuum_defer_cleanup_age doesn't really help you either; you'd just want
to pause WAL replay depending on your queries or whatever.  After all,
you'd have to feed the WAL files "manually" to replay, so you're in
control anyway without having to touch the primary.

I find it very hard to believe that people are doing stuff with
vacuum_defer_cleanup_age that cannot be done with either of the other
newer mechanisms, which have also seen much wider usage and so bugs
fixed, etc.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Oh, great altar of passive entertainment, bestow upon me thy discordant images
at such speed as to render linear thought impossible" (Calvin a la TV)




Re: Should we remove vacuum_defer_cleanup_age?

2023-04-14 Thread Greg Stark
On Fri, 14 Apr 2023 at 09:47, Jonathan S. Katz  wrote:
>
> Let me restate [1] in a different way.
>
> Using a large enough dataset, I did qualitatively look at overall usage
> of both "vacuum_defer_cleanup_age" and compared to
> "hot_standby_feedback", given you can use both to accomplish similar
> outcomes.

I assume people would use hot_standby_feedback if they have streaming
replication. The main use cases for vacuum_defer_cleanup_age would be
if you're replaying WAL files. That may sound archaic but there are
plenty of circumstances where your standby may not have network access
to your primary at all or not want to be replaying continuously.

I wonder whether your dataset is self-selecting sites that have
streaming replication. That does seem like the more common usage
pattern.

Systems using wal files are more likely to be things like data
warehouses, offline analytics systems, etc. They may not even be well
known in the same organization that runs the online operations -- in
my experience they're often run by marketing or sales organizations or
in some cases infosec teams and consume data from lots of sources. The
main reason to use wal archive replay is often to provide the
isolation so that the operations team don't need to worry about the
impact on production and that makes it easy to forget these even
exist.

-- 
greg




Re: Should we remove vacuum_defer_cleanup_age?

2023-04-14 Thread Jonathan S. Katz

On 4/14/23 8:30 AM, Robert Haas wrote:

On Thu, Apr 13, 2023 at 11:06 PM Laurenz Albe  wrote:

I am not against this in principle, but I know that there are people using
this parameter; see the discussion linked in

https://postgr.es/m/e1jkzxe-0006dw...@gemulon.postgresql.org

I can't say if they have a good use case for that parameter or not.


Yeah, I feel similarly. Actually, personally I have no evidence, not
even an anecdote, suggesting that this parameter is in use, but I'm a
bit skeptical of any consensus of the form "no one is using X,"
because there sure are a lot of people running PostgreSQL and they
sure do a lot of things. Some more justifiably than others, but often
people have surprisingly good excuses for doing stuff that sounds
bizarre when you first hear about it, and it doesn't seem totally
impossible that somebody could have found a way to get value out of
this.


Let me restate [1] in a different way.

Using a large enough dataset, I did qualitatively look at overall usage 
of both "vacuum_defer_cleanup_age" and compared to 
"hot_standby_feedback", given you can use both to accomplish similar 
outcomes. The usage of "vacuum_defer_cleanup_age" was really minimal, in 
fact approaching "0", whereas "hot_standby_feedback" had significant 
adoption.


I'm not saying that "we should remove a setting just because it's not 
used" or that it may not have utility -- I'm saying that we can remove 
the setting given:


1. We know that using this setting incorrectly (which can be done fairly 
easily) can cause significant issues
2. There's another setting that can accomplish similar goals that's much 
safer

3. The setting itself is not widely used

It's the combination of all 3 that led to my conclusion. If it were just 
(1), I'd lean more strongly towards trying to fix it first.



However, I suspect that there aren't many such people, and I think the
setting is a kludge, so I support removing it. Maybe we'll find out
that we ought to add something else instead, like a limited delimited
in time rather than in XIDs. Or maybe the existing facilities are good
enough. But as Peter rightly says, XID age is likely a poor proxy for
whatever people really care about, so I don't think continuing to have
a setting that works like that is a good plan.


That seems like a good eventual outcome.

Thanks,

Jonathan

[1] 
https://www.postgresql.org/message-id/bf42784f-4d57-0a3d-1a06-ffac1a09318c%40postgresql.org




OpenPGP_signature
Description: OpenPGP digital signature


Re: Should we remove vacuum_defer_cleanup_age?

2023-04-14 Thread Daniel Gustafsson
> On 14 Apr 2023, at 14:30, Robert Haas  wrote:

> ..as Peter rightly says, XID age is likely a poor proxy for
> whatever people really care about, so I don't think continuing to have
> a setting that works like that is a good plan.

Agreed, and removing it is likely to be a good vehicle for figuring out what we
should have instead (if anything).

--
Daniel Gustafsson





Re: Should we remove vacuum_defer_cleanup_age?

2023-04-14 Thread Robert Haas
On Thu, Apr 13, 2023 at 11:06 PM Laurenz Albe  wrote:
> I am not against this in principle, but I know that there are people using
> this parameter; see the discussion linked in
>
> https://postgr.es/m/e1jkzxe-0006dw...@gemulon.postgresql.org
>
> I can't say if they have a good use case for that parameter or not.

Yeah, I feel similarly. Actually, personally I have no evidence, not
even an anecdote, suggesting that this parameter is in use, but I'm a
bit skeptical of any consensus of the form "no one is using X,"
because there sure are a lot of people running PostgreSQL and they
sure do a lot of things. Some more justifiably than others, but often
people have surprisingly good excuses for doing stuff that sounds
bizarre when you first hear about it, and it doesn't seem totally
impossible that somebody could have found a way to get value out of
this.

However, I suspect that there aren't many such people, and I think the
setting is a kludge, so I support removing it. Maybe we'll find out
that we ought to add something else instead, like a limited delimited
in time rather than in XIDs. Or maybe the existing facilities are good
enough. But as Peter rightly says, XID age is likely a poor proxy for
whatever people really care about, so I don't think continuing to have
a setting that works like that is a good plan.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Should we remove vacuum_defer_cleanup_age?

2023-04-13 Thread Laurenz Albe
On Thu, 2023-04-13 at 12:16 -0400, Jonathan S. Katz wrote:
> On 4/13/23 11:32 AM, Jonathan S. Katz wrote:
> > On 4/12/23 11:34 PM, Amit Kapila wrote:
> > > On Tue, Apr 11, 2023 at 11:50 PM Andres Freund  
> 
> > > +1 to do one of the above. I think there is a good chance that
> > > somebody might be doing more harm by using it so removing this
> > > shouldn't be a problem. Personally, I have not heard of people using
> > > it but OTOH it is difficult to predict so giving some time is also not
> > > a bad idea.
> > > 
> > > Do others have any opinion/suggestion on this matter?
> > 
> > I need a bit more time to study this before formulating an opinion on 
> > whether we should remove it for v16. In any case, I'm not against 
> > documentation.
> 
> [RMT hat]
> 
> +1 for removing.

I am not against this in principle, but I know that there are people using
this parameter; see the discussion linked in

https://postgr.es/m/e1jkzxe-0006dw...@gemulon.postgresql.org

I can't say if they have a good use case for that parameter or not.

Yours,
Laurenz Albe




Re: Should we remove vacuum_defer_cleanup_age?

2023-04-13 Thread Jonathan S. Katz

On 4/13/23 11:32 AM, Jonathan S. Katz wrote:

On 4/12/23 11:34 PM, Amit Kapila wrote:
On Tue, Apr 11, 2023 at 11:50 PM Andres Freund  



+1 to do one of the above. I think there is a good chance that
somebody might be doing more harm by using it so removing this
shouldn't be a problem. Personally, I have not heard of people using
it but OTOH it is difficult to predict so giving some time is also not
a bad idea.

Do others have any opinion/suggestion on this matter?


I need a bit more time to study this before formulating an opinion on 
whether we should remove it for v16. In any case, I'm not against 
documentation.


(didn't need too much more time).

[RMT hat]

+1 for removing.

I looked at some data and it doesn't seem like vacuum_defer_cleanup_age 
is used in any significant way, whereas hot_standby_feedback is much 
more widely used. Given this, and all the problems + arguments made in 
the thread, we should just get rid of it for v16.


There are cases where we should deprecate before removing, but I don't 
think this one based upon usage and having a better alternative.


Per [1] it does sound like we can make some improvements to 
hot_standby_feedback, but those can wait to v17.


We should probably set $DATE to finish this, too. I don't think it's a 
rush, but we should give enough time before Beta 1.


Jonathan

[1] 
https://www.postgresql.org/message-id/20230317230930.nhsgk3qfk7f4axls%40awork3.anarazel.de


OpenPGP_signature
Description: OpenPGP digital signature


Re: Should we remove vacuum_defer_cleanup_age?

2023-04-13 Thread Jonathan S. Katz

On 4/12/23 11:34 PM, Amit Kapila wrote:

On Tue, Apr 11, 2023 at 11:50 PM Andres Freund  wrote:


On 2023-04-11 11:33:01 -0500, Justin Pryzby wrote:

On Wed, Mar 22, 2023 at 10:00:48AM -0700, Andres Freund wrote:

I don't know whether others think we should apply it this release, given the
"late submission", but I tend to think it's not worth caring the complication
of vacuum_defer_cleanup_age forward.


I don't see any utility in waiting; it just makes the process of
removing it take longer for no reason.

As long as it's done before the betas, it seems completely reasonable to
remove it for v16.


Added the RMT.

We really should have a rmt@pg.o alias...


(I had thought something as much -- will reach out to pginfra about options)


Updated patch attached. I think we should either apply something like that
patch, or at least add a  to the docs.




+1 to do one of the above. I think there is a good chance that
somebody might be doing more harm by using it so removing this
shouldn't be a problem. Personally, I have not heard of people using
it but OTOH it is difficult to predict so giving some time is also not
a bad idea.

Do others have any opinion/suggestion on this matter?


I need a bit more time to study this before formulating an opinion on 
whether we should remove it for v16. In any case, I'm not against 
documentation.


Jonathan


OpenPGP_signature
Description: OpenPGP digital signature


Re: Should we remove vacuum_defer_cleanup_age?

2023-04-13 Thread Alvaro Herrera
On 2023-Apr-11, Andres Freund wrote:

> Updated patch attached. I think we should either apply something like that
> patch, or at least add a  to the docs.

I gave this patch a look.  The only code change is that
ComputeXidHorizons() and GetSnapshotData() no longer handle the case
where vacuum_defer_cleanup_age is different from zero.  It looks good.
The TransactionIdRetreatSafely() code being removed is pretty weird (I
spent a good dozen minutes writing a complaint that your rewrite was
faulty, but it turns out I had misunderstood the function), so I'm glad
it's being retired.


> 
> -Similarly, 
> -and  provide protection 
> against
> -relevant rows being removed by vacuum, but the former provides no
> -protection during any time period when the standby is not connected,
> -and the latter often needs to be set to a high value to provide adequate
> -protection.  Replication slots overcome these disadvantages.
> +Similarly,  on its own, without
> +also using a replication slot, provides protection against relevant rows
> +being removed by vacuum, but provides no protection during any time 
> period
> +when the standby is not connected.  Replication slots overcome these
> +disadvantages.

I think it made sense to have this paragraph be separate from the
previous one when it was talking about two separate variables, but now
that it's just one, it looks a bit isolated.  I would merge it with the
one above, which is talking about pretty much the same thing, and
reorder the whole thing approximately like this

   
In lieu of using replication slots, it is possible to prevent the removal
of old WAL segments using , or by
storing the segments in an archive using
 or .
However, these methods often result in retaining more WAL segments than
required.
Similarly,  without
a replication slot provides protection against relevant rows
being removed by vacuum, but provides no protection during any time period
when the standby is not connected.
   
   
Replication slots overcome these disadvantages by retaining only the number
of segments known to be needed.
On the other hand, replication slots can retain so
many WAL segments that they fill up the space allocated
for pg_wal;
 limits the size of WAL files
retained by replication slots.
   

Though the "However," looks a poor fit; I would do this:

   
In lieu of using replication slots, it is possible to prevent the removal
of old WAL segments using , or by
storing the segments in an archive using
 or .
A disadvantage of these methods is that they often result in retaining
more WAL segments than required.
Similarly,  without
a replication slot provides protection against relevant rows
being removed by vacuum, but provides no protection during any time period
when the standby is not connected.
   
   
Replication slots overcome these disadvantages by retaining only the number
of segments known to be needed.
On the other hand, replication slots can retain so
many WAL segments that they fill up the space allocated
for pg_wal;
 limits the size of WAL files
retained by replication slots.
   

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
Tom: There seems to be something broken here.
Teodor: I'm in sackcloth and ashes...  Fixed.
   http://postgr.es/m/482d1632.8010...@sigaev.ru




Re: Should we remove vacuum_defer_cleanup_age?

2023-04-12 Thread Amit Kapila
On Tue, Apr 11, 2023 at 11:50 PM Andres Freund  wrote:
>
> On 2023-04-11 11:33:01 -0500, Justin Pryzby wrote:
> > On Wed, Mar 22, 2023 at 10:00:48AM -0700, Andres Freund wrote:
> > > I don't know whether others think we should apply it this release, given 
> > > the
> > > "late submission", but I tend to think it's not worth caring the 
> > > complication
> > > of vacuum_defer_cleanup_age forward.
> >
> > I don't see any utility in waiting; it just makes the process of
> > removing it take longer for no reason.
> >
> > As long as it's done before the betas, it seems completely reasonable to
> > remove it for v16.
>
> Added the RMT.
>
> We really should have a rmt@pg.o alias...
>
> Updated patch attached. I think we should either apply something like that
> patch, or at least add a  to the docs.
>

+1 to do one of the above. I think there is a good chance that
somebody might be doing more harm by using it so removing this
shouldn't be a problem. Personally, I have not heard of people using
it but OTOH it is difficult to predict so giving some time is also not
a bad idea.

Do others have any opinion/suggestion on this matter?

-- 
With Regards,
Amit Kapila.




Re: Should we remove vacuum_defer_cleanup_age?

2023-04-11 Thread Andres Freund
Hi,

On 2023-04-11 11:33:01 -0500, Justin Pryzby wrote:
> On Wed, Mar 22, 2023 at 10:00:48AM -0700, Andres Freund wrote:
> > I don't know whether others think we should apply it this release, given the
> > "late submission", but I tend to think it's not worth caring the 
> > complication
> > of vacuum_defer_cleanup_age forward.
>
> I don't see any utility in waiting; it just makes the process of
> removing it take longer for no reason.
>
> As long as it's done before the betas, it seems completely reasonable to
> remove it for v16.

Added the RMT.

We really should have a rmt@pg.o alias...

Updated patch attached. I think we should either apply something like that
patch, or at least add a  to the docs.

Greetings,

Andres Freund
>From c51c9e450b1b1342c0f6ff32e7e360677ccbc2c6 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Tue, 11 Apr 2023 10:21:36 -0700
Subject: [PATCH v2] Remove vacuum_defer_cleanup_age

This commit removes TransactionIdRetreatSafely(), as there are no users
anymore. There might be potential future users, hence noting that here.

Reviewed-by: Daniel Gustafsson 
Reviewed-by: Justin Pryzby 
Discussion: https://postgr.es/m/20230317230930.nhsgk3qfk7f4a...@awork3.anarazel.de
---
 src/include/storage/standby.h |   1 -
 src/backend/storage/ipc/procarray.c   | 105 ++
 src/backend/storage/ipc/standby.c |   1 -
 src/backend/utils/misc/guc_tables.c   |   9 --
 src/backend/utils/misc/postgresql.conf.sample |   1 -
 src/bin/pg_upgrade/server.c   |   5 +-
 doc/src/sgml/config.sgml  |  35 --
 doc/src/sgml/high-availability.sgml   |  28 +
 8 files changed, 15 insertions(+), 170 deletions(-)

diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index 41f4dc372e6..e8f50569491 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -21,7 +21,6 @@
 #include "storage/standbydefs.h"
 
 /* User-settable GUC parameters */
-extern PGDLLIMPORT int vacuum_defer_cleanup_age;
 extern PGDLLIMPORT int max_standby_archive_delay;
 extern PGDLLIMPORT int max_standby_streaming_delay;
 extern PGDLLIMPORT bool log_recovery_conflict_waits;
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index ea91ce355f2..106b184a3e6 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -367,9 +367,6 @@ static inline void ProcArrayEndTransactionInternal(PGPROC *proc, TransactionId l
 static void ProcArrayGroupClearXid(PGPROC *proc, TransactionId latestXid);
 static void MaintainLatestCompletedXid(TransactionId latestXid);
 static void MaintainLatestCompletedXidRecovery(TransactionId latestXid);
-static void TransactionIdRetreatSafely(TransactionId *xid,
-	   int retreat_by,
-	   FullTransactionId rel);
 
 static inline FullTransactionId FullXidRelativeTo(FullTransactionId rel,
   TransactionId xid);
@@ -1709,10 +1706,7 @@ TransactionIdIsActive(TransactionId xid)
  * do about that --- data is only protected if the walsender runs continuously
  * while queries are executed on the standby.  (The Hot Standby code deals
  * with such cases by failing standby queries that needed to access
- * already-removed data, so there's no integrity bug.)  The computed values
- * are also adjusted with vacuum_defer_cleanup_age, so increasing that setting
- * on the fly is another easy way to make horizons move backwards, with no
- * consequences for data integrity.
+ * already-removed data, so there's no integrity bug.)
  *
  * Note: the approximate horizons (see definition of GlobalVisState) are
  * updated by the computations done here. That's currently required for
@@ -1877,50 +1871,11 @@ ComputeXidHorizons(ComputeXidHorizonsResult *h)
 			TransactionIdOlder(h->data_oldest_nonremovable, kaxmin);
 		/* temp relations cannot be accessed in recovery */
 	}
-	else
-	{
-		/*
-		 * Compute the cutoff XID by subtracting vacuum_defer_cleanup_age.
-		 *
-		 * vacuum_defer_cleanup_age provides some additional "slop" for the
-		 * benefit of hot standby queries on standby servers.  This is quick
-		 * and dirty, and perhaps not all that useful unless the primary has a
-		 * predictable transaction rate, but it offers some protection when
-		 * there's no walsender connection.  Note that we are assuming
-		 * vacuum_defer_cleanup_age isn't large enough to cause wraparound ---
-		 * so guc.c should limit it to no more than the xidStopLimit threshold
-		 * in varsup.c.  Also note that we intentionally don't apply
-		 * vacuum_defer_cleanup_age on standby servers.
-		 *
-		 * Need to use TransactionIdRetreatSafely() instead of open-coding the
-		 * subtraction, to prevent creating an xid before
-		 * FirstNormalTransactionId.
-		 */
-		Assert(TransactionIdPrecedesOrEquals(h->oldest_considered_running,
-			 h->shared_oldest_nonremovable));
-		

Re: Should we remove vacuum_defer_cleanup_age?

2023-04-11 Thread Justin Pryzby
On Wed, Mar 22, 2023 at 10:00:48AM -0700, Andres Freund wrote:
> I don't know whether others think we should apply it this release, given the
> "late submission", but I tend to think it's not worth caring the complication
> of vacuum_defer_cleanup_age forward.

I don't see any utility in waiting; it just makes the process of
removing it take longer for no reason.

As long as it's done before the betas, it seems completely reasonable to
remove it for v16. 

-- 
Justin




Re: Should we remove vacuum_defer_cleanup_age?

2023-03-24 Thread Peter Geoghegan
On Sat, Mar 18, 2023 at 2:34 AM Alvaro Herrera  wrote:
>
> On 2023-Mar-17, Andres Freund wrote:
>
> > I started writing a test for vacuum_defer_cleanup_age while working on the 
> > fix
> > referenced above, but now I am wondering if said energy would be better 
> > spent
> > removing vacuum_defer_cleanup_age alltogether.
>
> +1  I agree it's not useful anymore.

+1.

I am suspicious of most of the GUCs whose value is an XID age. It
strikes me as something that is convenient to the implementation, but
not to the user, since there are so many ways that XID age might be a
poor proxy for whatever it is that you really care about in each case.

A theoretical advantage of vacuum_defer_cleanup_age is that it allows
the user to control things in terms of the impact on the primary --
whereas hot_standby_feedback is a mechanism that controls things in
terms of the needs of the standby. In practice this is pretty useless,
but it seems like it might be possible to come up with some other new
mechanism that somehow does this in a way that's truly useful.
Something that allows the user to constrain how far we hold back
conflicts/vacuuming in terms of the *impact* on the primary.

It might be helpful to permit opportunistic cleanup by pruning and
index deletion at some point, but to throttle it when we know it would
violate some soft limit related to hot_standby_feedback. Maybe the
system could prevent the first few attempts at pruning when it
violates the soft limit, or make pruning prune somewhat less
aggressively where there is little advantage to it in terms of
space/tuples freed -- decide on what to do at the very last minute,
based on all available information at that late stage, with the full
context available. The system could be taught to be very patient at
first, when relatively few pruning operations have been attempted,
when the cost is basically still acceptable. But as more pruning
operations ran and clearly didn't free space that really should be
freed, we'd quickly lose patience.

The big idea here is to delay committing to any course of action for
as long as possible, so we wouldn't kill queries on standbys for very
little benefit on the primary, while at the same time avoiding ever
really failing to kill queries on standbys when the cost proved too
high on the primary. For this to have any chance of working it needs
to focus on the actual costs on the primary, and not some extremely
noisy proxy for that cost. The standby will have its query killed by
just one prune record affecting just one heap page, and delaying that
specific prune record is likely no big deal. It's preventing pruning
of tens of thousands of heap pages that we need to worry about.

-- 
Peter Geoghegan




Re: Should we remove vacuum_defer_cleanup_age?

2023-03-24 Thread Daniel Gustafsson
> On 24 Mar 2023, at 21:27, Andres Freund  wrote:
> On 2023-03-23 10:18:35 +0100, Daniel Gustafsson wrote:
>>> On 22 Mar 2023, at 18:00, Andres Freund  wrote:
>> 
>>> It wasn't actually that much work to write a patch to remove
>>> vacuum_defer_cleanup_age, see the attached.
>> 
>> -and  provide protection 
>> against
>> +provides protection against
>> relevant rows being removed by vacuum, but the former provides no
>> protection during any time period when the standby is not connected,
>> and the latter often needs to be set to a high value to provide adequate
>> 
>> Isn't "the latter" in the kept part of the sentence referring to the guc 
>> we're
>> removing here?
> 
> You're right. That paragraph generally seems a bit off. In HEAD:
> 
> ...
> 
> Replication slots alone don't prevent row removal, that requires
> hot_standby_feedback to be used as well.
> 
> A minimal rephrasing would be:
>   
>Similarly,  on its own, without
>also using a replication slot, provides protection against relevant rows
>being removed by vacuum, but provides no protection during any time period
>when the standby is not connected.  Replication slots overcome these
>disadvantages.
>   

+1, that's definitely an improvement.

--
Daniel Gustafsson





Re: Should we remove vacuum_defer_cleanup_age?

2023-03-24 Thread Andres Freund
Hi,

On 2023-03-23 10:18:35 +0100, Daniel Gustafsson wrote:
> > On 22 Mar 2023, at 18:00, Andres Freund  wrote:
> 
> > It wasn't actually that much work to write a patch to remove
> > vacuum_defer_cleanup_age, see the attached.
> 
> -and  provide protection 
> against
> +provides protection against
>  relevant rows being removed by vacuum, but the former provides no
>  protection during any time period when the standby is not connected,
>  and the latter often needs to be set to a high value to provide adequate
> 
> Isn't "the latter" in the kept part of the sentence referring to the guc we're
> removing here?

You're right. That paragraph generally seems a bit off. In HEAD:

   
In lieu of using replication slots, it is possible to prevent the removal
of old WAL segments using , or by
storing the segments in an archive using
 or .
However, these methods often result in retaining more WAL segments than
required, whereas replication slots retain only the number of segments
known to be needed.  On the other hand, replication slots can retain so
many WAL segments that they fill up the space allocated
for pg_wal;
 limits the size of WAL files
retained by replication slots.
   
   
Similarly, 
and  provide protection 
against
relevant rows being removed by vacuum, but the former provides no
protection during any time period when the standby is not connected,
and the latter often needs to be set to a high value to provide adequate
protection.  Replication slots overcome these disadvantages.
   

Replication slots alone don't prevent row removal, that requires
hot_standby_feedback to be used as well.

A minimal rephrasing would be:
   
Similarly,  on its own, without
also using a replication slot, provides protection against relevant rows
being removed by vacuum, but provides no protection during any time period
when the standby is not connected.  Replication slots overcome these
disadvantages.
   



> -  * It's possible that slots / vacuum_defer_cleanup_age backed up the
> -  * horizons further than oldest_considered_running. Fix.
> +  * It's possible that slots backed up the horizons further than
> +  * oldest_considered_running. Fix.
> 
> While not the fault of this patch, can't we use the opportunity to expand
> "Fix." to a short "Fix this by ..." sentence?  Or remove "Fix." perhaps, 
> either
> of those would improve the comment IMHO.

Perhaps unsurprisingly, given that I wrote that comment, I don't see a problem
with the "Fix."...

Greetings,

Andres Freund




Re: Should we remove vacuum_defer_cleanup_age?

2023-03-23 Thread Daniel Gustafsson
> On 22 Mar 2023, at 18:00, Andres Freund  wrote:

> It wasn't actually that much work to write a patch to remove
> vacuum_defer_cleanup_age, see the attached.

-and  provide protection 
against
+provides protection against
 relevant rows being removed by vacuum, but the former provides no
 protection during any time period when the standby is not connected,
 and the latter often needs to be set to a high value to provide adequate

Isn't "the latter" in the kept part of the sentence referring to the guc we're
removing here?

-* It's possible that slots / vacuum_defer_cleanup_age backed up the
-* horizons further than oldest_considered_running. Fix.
+* It's possible that slots backed up the horizons further than
+* oldest_considered_running. Fix.

While not the fault of this patch, can't we use the opportunity to expand
"Fix." to a short "Fix this by ..." sentence?  Or remove "Fix." perhaps, either
of those would improve the comment IMHO.

> I don't know whether others think we should apply it this release, given the
> "late submission", but I tend to think it's not worth caring the complication
> of vacuum_defer_cleanup_age forward.

It might be late in the cycle, but as it's not adding something that might
break but rather removing something that's known for being problematic (and not
useful) I think it's Ok.

--
Daniel Gustafsson





Re: Should we remove vacuum_defer_cleanup_age?

2023-03-22 Thread Andres Freund
Hi,

On 2023-03-22 11:44:20 -0500, Justin Pryzby wrote:
> On Sat, Mar 18, 2023 at 10:33:57AM +0100, Alvaro Herrera wrote:
> > On 2023-Mar-17, Andres Freund wrote:
> > 
> > > I started writing a test for vacuum_defer_cleanup_age while working on 
> > > the fix
> > > referenced above, but now I am wondering if said energy would be better 
> > > spent
> > > removing vacuum_defer_cleanup_age alltogether.
> > 
> > +1  I agree it's not useful anymore.
> > 
> > > I don't think I have the cycles to push this through in the next weeks, 
> > > but if
> > > we agree removing vacuum_defer_cleanup_age is a good idea, it seems like a
> > > good idea to mark it as deprecated in 16?
> > 
> > Hmm, for the time being, can we just "disable" it by disallowing to set
> > the GUC to a value different from 0?  Then we can remove the code later
> > in the cycle at leisure.
> 
> It can be useful to do a "rolling transition", and it's something I do
> often.
> 
> But I can't see why that would be useful here?  It seems like something
> that could be done after the feature freeze.  It's removing a feature,
> not adding one.

It wasn't actually that much work to write a patch to remove
vacuum_defer_cleanup_age, see the attached.

I don't know whether others think we should apply it this release, given the
"late submission", but I tend to think it's not worth caring the complication
of vacuum_defer_cleanup_age forward.

Greetings,

Andres Freund
>From 24b519015bb1922a9746eda2ebea8702ccdd486f Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Wed, 22 Mar 2023 09:56:39 -0700
Subject: [PATCH v1] Remove vacuum_defer_cleanup_age

Discussion: https://postgr.es/m/20230317230930.nhsgk3qfk7f4a...@awork3.anarazel.de
---
 src/include/storage/standby.h |   1 -
 src/backend/storage/ipc/procarray.c   | 105 ++
 src/backend/storage/ipc/standby.c |   1 -
 src/backend/utils/misc/guc_tables.c   |   9 --
 src/backend/utils/misc/postgresql.conf.sample |   1 -
 src/bin/pg_upgrade/server.c   |   5 +-
 doc/src/sgml/config.sgml  |  35 --
 doc/src/sgml/high-availability.sgml   |  19 +---
 8 files changed, 11 insertions(+), 165 deletions(-)

diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index 2effdea126f..4bc23adf516 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -21,7 +21,6 @@
 #include "storage/standbydefs.h"
 
 /* User-settable GUC parameters */
-extern PGDLLIMPORT int vacuum_defer_cleanup_age;
 extern PGDLLIMPORT int max_standby_archive_delay;
 extern PGDLLIMPORT int max_standby_streaming_delay;
 extern PGDLLIMPORT bool log_recovery_conflict_waits;
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index ea91ce355f2..106b184a3e6 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -367,9 +367,6 @@ static inline void ProcArrayEndTransactionInternal(PGPROC *proc, TransactionId l
 static void ProcArrayGroupClearXid(PGPROC *proc, TransactionId latestXid);
 static void MaintainLatestCompletedXid(TransactionId latestXid);
 static void MaintainLatestCompletedXidRecovery(TransactionId latestXid);
-static void TransactionIdRetreatSafely(TransactionId *xid,
-	   int retreat_by,
-	   FullTransactionId rel);
 
 static inline FullTransactionId FullXidRelativeTo(FullTransactionId rel,
   TransactionId xid);
@@ -1709,10 +1706,7 @@ TransactionIdIsActive(TransactionId xid)
  * do about that --- data is only protected if the walsender runs continuously
  * while queries are executed on the standby.  (The Hot Standby code deals
  * with such cases by failing standby queries that needed to access
- * already-removed data, so there's no integrity bug.)  The computed values
- * are also adjusted with vacuum_defer_cleanup_age, so increasing that setting
- * on the fly is another easy way to make horizons move backwards, with no
- * consequences for data integrity.
+ * already-removed data, so there's no integrity bug.)
  *
  * Note: the approximate horizons (see definition of GlobalVisState) are
  * updated by the computations done here. That's currently required for
@@ -1877,50 +1871,11 @@ ComputeXidHorizons(ComputeXidHorizonsResult *h)
 			TransactionIdOlder(h->data_oldest_nonremovable, kaxmin);
 		/* temp relations cannot be accessed in recovery */
 	}
-	else
-	{
-		/*
-		 * Compute the cutoff XID by subtracting vacuum_defer_cleanup_age.
-		 *
-		 * vacuum_defer_cleanup_age provides some additional "slop" for the
-		 * benefit of hot standby queries on standby servers.  This is quick
-		 * and dirty, and perhaps not all that useful unless the primary has a
-		 * predictable transaction rate, but it offers some protection when
-		 * there's no walsender connection.  Note that we are assuming
-		 * vacuum_defer_cleanup_age isn't large enough to cause wraparound ---
-		 * so guc.c should limit it to no 

Re: Should we remove vacuum_defer_cleanup_age?

2023-03-22 Thread Justin Pryzby
On Sat, Mar 18, 2023 at 10:33:57AM +0100, Alvaro Herrera wrote:
> On 2023-Mar-17, Andres Freund wrote:
> 
> > I started writing a test for vacuum_defer_cleanup_age while working on the 
> > fix
> > referenced above, but now I am wondering if said energy would be better 
> > spent
> > removing vacuum_defer_cleanup_age alltogether.
> 
> +1  I agree it's not useful anymore.
> 
> > I don't think I have the cycles to push this through in the next weeks, but 
> > if
> > we agree removing vacuum_defer_cleanup_age is a good idea, it seems like a
> > good idea to mark it as deprecated in 16?
> 
> Hmm, for the time being, can we just "disable" it by disallowing to set
> the GUC to a value different from 0?  Then we can remove the code later
> in the cycle at leisure.

It can be useful to do a "rolling transition", and it's something I do
often.

But I can't see why that would be useful here?  It seems like something
that could be done after the feature freeze.  It's removing a feature,
not adding one.

-- 
Justin




Re: Should we remove vacuum_defer_cleanup_age?

2023-03-18 Thread Alvaro Herrera
On 2023-Mar-17, Andres Freund wrote:

> I started writing a test for vacuum_defer_cleanup_age while working on the fix
> referenced above, but now I am wondering if said energy would be better spent
> removing vacuum_defer_cleanup_age alltogether.

+1  I agree it's not useful anymore.

> I don't think I have the cycles to push this through in the next weeks, but if
> we agree removing vacuum_defer_cleanup_age is a good idea, it seems like a
> good idea to mark it as deprecated in 16?

Hmm, for the time being, can we just "disable" it by disallowing to set
the GUC to a value different from 0?  Then we can remove the code later
in the cycle at leisure.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"La gente vulgar sólo piensa en pasar el tiempo;
el que tiene talento, en aprovecharlo"




Re: Should we remove vacuum_defer_cleanup_age?

2023-03-17 Thread Magnus Hagander
On Sat, Mar 18, 2023 at 12:09 AM Andres Freund  wrote:

> Hi,
>
> As evidenced by the bug fixed in be504a3e974, vacuum_defer_cleanup_age is
> not
> heavily used - the bug was trivial to hit as soon as
> vacuum_defer_cleanup_age
> is set to a non-toy value. It complicates thinking about visibility
> horizons
> substantially, as vacuum_defer_cleanup_age can make them go backward
> substantially. Obviously it's also severely undertested.
>
> I started writing a test for vacuum_defer_cleanup_age while working on the
> fix
> referenced above, but now I am wondering if said energy would be better
> spent
> removing vacuum_defer_cleanup_age alltogether.
>
> vacuum_defer_cleanup_age was added as part of hot standby. Back then we did
> not yet have hot_standby_feedback. Now that that exists,
> vacuum_defer_cleanup_age doesn't seem like a good idea anymore. It's
> pessimisistic, i.e. always retains rows, even if none of the standbys has
> an
> old enough snapshot.
>
> The only benefit of vacuum_defer_cleanup_age over hot_standby_feedback is
> that
> it provides a limit of some sort. But transactionids aren't producing dead
> rows in a uniform manner, so limiting via xid isn't particularly useful.
> And
> even if there are use cases, it seems those would be served better by
> introducing a cap on how much hot_standby_feedback can hold the horizon
> back.
>
> I don't think I have the cycles to push this through in the next weeks,
> but if
> we agree removing vacuum_defer_cleanup_age is a good idea, it seems like a
> good idea to mark it as deprecated in 16?
>

+1. I haven't seen any (correct) use of this in many many years on my end
at least.

And yes, having a cap on hot_standby_feedback would also be great.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Should we remove vacuum_defer_cleanup_age?

2023-03-17 Thread Andres Freund
Hi,

As evidenced by the bug fixed in be504a3e974, vacuum_defer_cleanup_age is not
heavily used - the bug was trivial to hit as soon as vacuum_defer_cleanup_age
is set to a non-toy value. It complicates thinking about visibility horizons
substantially, as vacuum_defer_cleanup_age can make them go backward
substantially. Obviously it's also severely undertested.

I started writing a test for vacuum_defer_cleanup_age while working on the fix
referenced above, but now I am wondering if said energy would be better spent
removing vacuum_defer_cleanup_age alltogether.

vacuum_defer_cleanup_age was added as part of hot standby. Back then we did
not yet have hot_standby_feedback. Now that that exists,
vacuum_defer_cleanup_age doesn't seem like a good idea anymore. It's
pessimisistic, i.e. always retains rows, even if none of the standbys has an
old enough snapshot.

The only benefit of vacuum_defer_cleanup_age over hot_standby_feedback is that
it provides a limit of some sort. But transactionids aren't producing dead
rows in a uniform manner, so limiting via xid isn't particularly useful. And
even if there are use cases, it seems those would be served better by
introducing a cap on how much hot_standby_feedback can hold the horizon back.

I don't think I have the cycles to push this through in the next weeks, but if
we agree removing vacuum_defer_cleanup_age is a good idea, it seems like a
good idea to mark it as deprecated in 16?

Greetings,

Andres Freund