Re: Question regarding "Make archiver process an auxiliary process. commit"

2023-01-03 Thread Sravan Kumar
I have added the thread to the commitfest: https://commitfest.postgresql.org/42/
Did you get a chance to review the patch? Please let me know if you
need anything from my end.

Thanks & Regards,
Sravan Velagandula
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

On Wed, Dec 7, 2022 at 11:49 AM Kyotaro Horiguchi
 wrote:
>
> At Wed, 7 Dec 2022 11:28:23 +0530, Sravan Kumar  
> wrote in
> > On Tue, Dec 6, 2022 at 5:24 PM Bharath Rupireddy <
> > bharath.rupireddyforpostg...@gmail.com> wrote:
> >
> > > On Tue, Dec 6, 2022 at 4:57 PM Sravan Kumar 
> > > wrote:
> > > >
> > > > Thank you for the feedback.
> > > >
> > > > I'll be glad to help with the fix. Here's the patch for review.
> > >
> > > Thanks. +1 for fixing this.
> > >> I would like to quote recent discussions on reducing the useless
> > >> wakeups or increasing the sleep/hibernation times in various processes
> > >> to reduce the power savings [1] [2] [3] [4] [5]. With that in context,
> > >> does the archiver need to wake up every 60 sec at all when its latch
> > >> gets set (PgArchWakeup()) whenever the server switches to a new WAL
> > >> file? What happens if we get rid of PGARCH_AUTOWAKE_INTERVAL and rely
> > >> on its latch being set? If required, we can spread PgArchWakeup() to
> > >> more places, no?
> > >
> > >
> > I like the idea of not having to wake up intermittently and probably we
> > should start a discussion about it.
> >
> > I see the following comment in PgArchWakeup().
> >
> >  /*
> > * We don't acquire ProcArrayLock here.  It's actually fine because
> > * procLatch isn't ever freed, so we just can potentially set the wrong
> > * process' (or no process') latch.  Even in that case the archiver will
> > * be relaunched shortly and will start archiving.
> > */
> >
> > While I do not fully understand the comment, it gives me an impression that
> > the SetLatch() done here is counting on the timeout to wake up the archiver
> > in some cases where the latch is wrongly set.
>
> It is telling about the first iteration of archive process, not
> periodical wakeups.  So I think it is doable by considering how to
> handle incomplete archiving iterations.
>
> > The proposed idea is a behaviour change while this thread intends to clean
> > up some code that's
> > a result of the mentioned commit. So probably the proposed idea can be
> > discussed as a separate thread.
>
> Agreed.
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center



--
Thanks & Regards,
Sravan Velagandula
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Question regarding "Make archiver process an auxiliary process. commit"

2023-01-20 Thread Nathan Bossart
On Wed, Jan 04, 2023 at 11:35:33AM +0530, Sravan Kumar wrote:
> I have added the thread to the commitfest: 
> https://commitfest.postgresql.org/42/
> Did you get a chance to review the patch? Please let me know if you
> need anything from my end.

This seems like worthwhile simplification to me.  Ultimately, your patch
shouldn't result in any sort of signficant behavior change, and I don't see
any reason to further complicate the timeout calculation.  The copy loop
will run any time the archiver's latch is set, and it'll wait up to 60
seconds otherwise.  As discussed upthread, it might be possible to remove
the timeout completely, but that probably deserves its own thread.

I noticed that time.h is no longer needed by the archiver, so I removed
that and fixed an indentation nitpick in the attached v2.  I'm going to set
the commitfest entry to ready-for-committer shortly after sending this
message.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From a06609e839f039b7e7806456eaf4ee113cfabc3c Mon Sep 17 00:00:00 2001
From: Sravan Velagandula 
Date: Tue, 6 Dec 2022 06:21:38 -0500
Subject: [PATCH v2 1/1] simplify wait loop in the archiver

---
 src/backend/postmaster/pgarch.c | 29 +
 1 file changed, 9 insertions(+), 20 deletions(-)

diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index 8ecdb9ca23..6e28067596 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -25,7 +25,6 @@
  */
 #include "postgres.h"
 
-#include 
 #include 
 #include 
 
@@ -297,7 +296,6 @@ pgarch_waken_stop(SIGNAL_ARGS)
 static void
 pgarch_MainLoop(void)
 {
-	pg_time_t	last_copy_time = 0;
 	bool		time_to_stop;
 
 	/*
@@ -335,30 +333,21 @@ pgarch_MainLoop(void)
 
 		/* Do what we're here for */
 		pgarch_ArchiverCopyLoop();
-		last_copy_time = time(NULL);
 
 		/*
 		 * Sleep until a signal is received, or until a poll is forced by
-		 * PGARCH_AUTOWAKE_INTERVAL having passed since last_copy_time, or
-		 * until postmaster dies.
+		 * PGARCH_AUTOWAKE_INTERVAL, or until postmaster dies.
 		 */
 		if (!time_to_stop)		/* Don't wait during last iteration */
 		{
-			pg_time_t	curtime = (pg_time_t) time(NULL);
-			int			timeout;
-
-			timeout = PGARCH_AUTOWAKE_INTERVAL - (curtime - last_copy_time);
-			if (timeout > 0)
-			{
-int			rc;
-
-rc = WaitLatch(MyLatch,
-			   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
-			   timeout * 1000L,
-			   WAIT_EVENT_ARCHIVER_MAIN);
-if (rc & WL_POSTMASTER_DEATH)
-	time_to_stop = true;
-			}
+			int			rc;
+
+			rc = WaitLatch(MyLatch,
+		   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+		   PGARCH_AUTOWAKE_INTERVAL * 1000L,
+		   WAIT_EVENT_ARCHIVER_MAIN);
+			if (rc & WL_POSTMASTER_DEATH)
+time_to_stop = true;
 		}
 
 		/*
-- 
2.25.1



Re: Question regarding "Make archiver process an auxiliary process. commit"

2023-01-31 Thread Nathan Bossart
On Fri, Jan 20, 2023 at 11:39:56AM -0800, Nathan Bossart wrote:
> I noticed that time.h is no longer needed by the archiver, so I removed
> that and fixed an indentation nitpick in the attached v2.  I'm going to set
> the commitfest entry to ready-for-committer shortly after sending this
> message.

I'm not sure why I thought time.h was no longer needed.  time() is clearly
used elsewhere in this file.  Here's a new version with that added back.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 3313989a30b0821ed7014527b0342e7c63b59169 Mon Sep 17 00:00:00 2001
From: Sravan Velagandula 
Date: Tue, 6 Dec 2022 06:21:38 -0500
Subject: [PATCH v3 1/1] simplify wait loop in the archiver

---
 src/backend/postmaster/pgarch.c | 28 +---
 1 file changed, 9 insertions(+), 19 deletions(-)

diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index 8ecdb9ca23..51d882c17a 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -297,7 +297,6 @@ pgarch_waken_stop(SIGNAL_ARGS)
 static void
 pgarch_MainLoop(void)
 {
-	pg_time_t	last_copy_time = 0;
 	bool		time_to_stop;
 
 	/*
@@ -335,30 +334,21 @@ pgarch_MainLoop(void)
 
 		/* Do what we're here for */
 		pgarch_ArchiverCopyLoop();
-		last_copy_time = time(NULL);
 
 		/*
 		 * Sleep until a signal is received, or until a poll is forced by
-		 * PGARCH_AUTOWAKE_INTERVAL having passed since last_copy_time, or
-		 * until postmaster dies.
+		 * PGARCH_AUTOWAKE_INTERVAL, or until postmaster dies.
 		 */
 		if (!time_to_stop)		/* Don't wait during last iteration */
 		{
-			pg_time_t	curtime = (pg_time_t) time(NULL);
-			int			timeout;
-
-			timeout = PGARCH_AUTOWAKE_INTERVAL - (curtime - last_copy_time);
-			if (timeout > 0)
-			{
-int			rc;
-
-rc = WaitLatch(MyLatch,
-			   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
-			   timeout * 1000L,
-			   WAIT_EVENT_ARCHIVER_MAIN);
-if (rc & WL_POSTMASTER_DEATH)
-	time_to_stop = true;
-			}
+			int			rc;
+
+			rc = WaitLatch(MyLatch,
+		   WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
+		   PGARCH_AUTOWAKE_INTERVAL * 1000L,
+		   WAIT_EVENT_ARCHIVER_MAIN);
+			if (rc & WL_POSTMASTER_DEATH)
+time_to_stop = true;
 		}
 
 		/*
-- 
2.25.1



Re: Question regarding "Make archiver process an auxiliary process. commit"

2023-01-31 Thread Michael Paquier
On Tue, Jan 31, 2023 at 08:30:13PM -0800, Nathan Bossart wrote:
> I'm not sure why I thought time.h was no longer needed.  time() is clearly
> used elsewhere in this file.  Here's a new version with that added back.

Ah, I see.  The key point is that curtime and last_copy_time will most
likely be the same value as time() is second-based, so timeout is
basically always PGARCH_AUTOWAKE_INTERVAL.  There is no need to care
about time_to_stop, as we just go through and exit if it happens to be
switched to true.  Applied v3, keeping time_to_stop as it is in v2 and
v3 so as we don't loop again on a postmaster death.
--
Michael


signature.asc
Description: PGP signature


Re: Question regarding "Make archiver process an auxiliary process. commit"

2022-12-06 Thread Kyotaro Horiguchi
At Mon, 5 Dec 2022 12:06:11 +0530, Sravan Kumar  wrote 
in 
> timeout = PGARCH_AUTOWAKE_INTERVAL - (curtime - last_copy_time);
> It so happens that last_copy_time and curtime are always set at the same
> time which always makes timeout equal (actually roughly equal) to
> PGARCH_AUTOWAKE_INTERVAL.

Oooo *^^*.

> This behaviour was different before the commit:
> d75288fb27b8fe0a926aaab7d75816f091ecdc27,
> in which the archiver keeps track of how much time has elapsed since
> last_copy_time
> in case there was a signal, and it results in a smaller subsequent value of
> timeout, until timeout is zero. This also avoids calling
> pgarch_ArchiverCopyLoop
> before PGARCH_AUTOWAKE_INTERVAL in case there's an intermittent signal.

Yes, WaitLatch() (I believe) no longer makes a spurious wakeup.

> With the current changes it may be okay to always sleep for
> PGARCH_AUTOWAKE_INTERVAL,
> but that means curtime and last_copy_time are no more needed.

I think you're right.

> I would like to validate if my understanding is correct, and which of the
> behaviours we would like to retain.

As my understanding the patch didn't change the copying behavior of
the function. I think we should simplify the loop by removing
last_copy_time and curtime in the "if (!time_to_stop)" block. Then we
can remove the variable "timeout" and the "if (timeout > 0)"
branch. Are you willing to work on this?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Question regarding "Make archiver process an auxiliary process. commit"

2022-12-06 Thread Sravan Kumar
Thank you for the feedback.

I'll be glad to help with the fix. Here's the patch for review.


On Tue, Dec 6, 2022 at 1:54 PM Kyotaro Horiguchi 
wrote:

> At Mon, 5 Dec 2022 12:06:11 +0530, Sravan Kumar 
> wrote in
> > timeout = PGARCH_AUTOWAKE_INTERVAL - (curtime - last_copy_time);
> > It so happens that last_copy_time and curtime are always set at the same
> > time which always makes timeout equal (actually roughly equal) to
> > PGARCH_AUTOWAKE_INTERVAL.
>
> Oooo *^^*.
>
> > This behaviour was different before the commit:
> > d75288fb27b8fe0a926aaab7d75816f091ecdc27,
> > in which the archiver keeps track of how much time has elapsed since
> > last_copy_time
> > in case there was a signal, and it results in a smaller subsequent value
> of
> > timeout, until timeout is zero. This also avoids calling
> > pgarch_ArchiverCopyLoop
> > before PGARCH_AUTOWAKE_INTERVAL in case there's an intermittent signal.
>
> Yes, WaitLatch() (I believe) no longer makes a spurious wakeup.
>
> > With the current changes it may be okay to always sleep for
> > PGARCH_AUTOWAKE_INTERVAL,
> > but that means curtime and last_copy_time are no more needed.
>
> I think you're right.
>
> > I would like to validate if my understanding is correct, and which of the
> > behaviours we would like to retain.
>
> As my understanding the patch didn't change the copying behavior of
> the function. I think we should simplify the loop by removing
> last_copy_time and curtime in the "if (!time_to_stop)" block. Then we
> can remove the variable "timeout" and the "if (timeout > 0)"
> branch. Are you willing to work on this?
>
> regards.
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center
>


-- 
Thanks And Regards,
Sravan

Take life one day at a time.


v1-0001-simplify-wait-loop-in-the-archiver.patch
Description: Binary data


Re: Question regarding "Make archiver process an auxiliary process. commit"

2022-12-06 Thread Bharath Rupireddy
On Tue, Dec 6, 2022 at 4:57 PM Sravan Kumar  wrote:
>
> Thank you for the feedback.
>
> I'll be glad to help with the fix. Here's the patch for review.

Thanks. +1 for fixing this.

I would like to quote recent discussions on reducing the useless
wakeups or increasing the sleep/hibernation times in various processes
to reduce the power savings [1] [2] [3] [4] [5]. With that in context,
does the archiver need to wake up every 60 sec at all when its latch
gets set (PgArchWakeup()) whenever the server switches to a new WAL
file? What happens if we get rid of PGARCH_AUTOWAKE_INTERVAL and rely
on its latch being set? If required, we can spread PgArchWakeup() to
more places, no?

Before even answering the above questions, I think we need to see if
there're any cases where a process can miss SetLatch() calls (I don't
have an answer for that).

Or do we want to stick to what the below comment says?

/*
 * There shouldn't be anything for the archiver to do except to wait for a
 * signal ... however, the archiver exists to protect our data, so she
 * wakes up occasionally to allow herself to be proactive.
 */

[1] 
https://www.postgresql.org/message-id/CA%2BhUKGJCbcv8AtujLw3kEO2wRB7Ffzo1fmwaGG-tQLuMOjf6qQ%40mail.gmail.com
[2]
commit cd4329d9393f84dce34f0bd2dd936adc8ffaa213
Author: Thomas Munro 
Date:   Tue Nov 29 11:28:08 2022 +1300

Remove promote_trigger_file.

Previously, an idle startup (recovery) process would wake up every 5
seconds to have a chance to poll for promote_trigger_file, even if that
GUC was not configured.  That promotion triggering mechanism was
effectively superseded by pg_ctl promote and pg_promote() a long time
ago.  There probably aren't many users left and it's very easy to change
to the modern mechanisms, so we agreed to remove the feature.

This is part of a campaign to reduce wakeups on idle systems.

[3] https://commitfest.postgresql.org/41/4035/
[4] https://commitfest.postgresql.org/41/4020/
[5]
commit 05a7be93558c614ab89c794cb1d301ea9ff33199
Author: Thomas Munro 
Date:   Tue Nov 8 20:36:36 2022 +1300

Suppress useless wakeups in walreceiver.

Instead of waking up 10 times per second to check for various timeout
conditions, keep track of when we next have periodic work to do.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Question regarding "Make archiver process an auxiliary process. commit"

2022-12-06 Thread Sravan Kumar
On Tue, Dec 6, 2022 at 5:24 PM Bharath Rupireddy <
bharath.rupireddyforpostg...@gmail.com> wrote:

> On Tue, Dec 6, 2022 at 4:57 PM Sravan Kumar 
> wrote:
> >
> > Thank you for the feedback.
> >
> > I'll be glad to help with the fix. Here's the patch for review.
>
> Thanks. +1 for fixing this.
>> I would like to quote recent discussions on reducing the useless
>> wakeups or increasing the sleep/hibernation times in various processes
>> to reduce the power savings [1] [2] [3] [4] [5]. With that in context,
>> does the archiver need to wake up every 60 sec at all when its latch
>> gets set (PgArchWakeup()) whenever the server switches to a new WAL
>> file? What happens if we get rid of PGARCH_AUTOWAKE_INTERVAL and rely
>> on its latch being set? If required, we can spread PgArchWakeup() to
>> more places, no?
>
>
I like the idea of not having to wake up intermittently and probably we
should start a discussion about it.

I see the following comment in PgArchWakeup().

 /*
* We don't acquire ProcArrayLock here.  It's actually fine because
* procLatch isn't ever freed, so we just can potentially set the wrong
* process' (or no process') latch.  Even in that case the archiver will
* be relaunched shortly and will start archiving.
*/

While I do not fully understand the comment, it gives me an impression that
the SetLatch() done here is counting on the timeout to wake up the archiver
in some cases where the latch is wrongly set.

The proposed idea is a behaviour change while this thread intends to clean
up some code that's
a result of the mentioned commit. So probably the proposed idea can be
discussed as a separate thread.


Before even answering the above questions, I think we need to see if
> there're any cases where a process can miss SetLatch() calls (I don't
> have an answer for that).
>
> Or do we want to stick to what the below comment says?
>
> /*
>  * There shouldn't be anything for the archiver to do except to wait
> for a
>  * signal ... however, the archiver exists to protect our data, so she
>  * wakes up occasionally to allow herself to be proactive.
>  */
>
> [1]
> https://www.postgresql.org/message-id/CA%2BhUKGJCbcv8AtujLw3kEO2wRB7Ffzo1fmwaGG-tQLuMOjf6qQ%40mail.gmail.com
> [2]
> commit cd4329d9393f84dce34f0bd2dd936adc8ffaa213
> Author: Thomas Munro 
> Date:   Tue Nov 29 11:28:08 2022 +1300
>
> Remove promote_trigger_file.
>
> Previously, an idle startup (recovery) process would wake up every 5
> seconds to have a chance to poll for promote_trigger_file, even if that
> GUC was not configured.  That promotion triggering mechanism was
> effectively superseded by pg_ctl promote and pg_promote() a long time
> ago.  There probably aren't many users left and it's very easy to
> change
> to the modern mechanisms, so we agreed to remove the feature.
>
> This is part of a campaign to reduce wakeups on idle systems.
>
> [3] https://commitfest.postgresql.org/41/4035/
> [4] https://commitfest.postgresql.org/41/4020/
> [5]
> commit 05a7be93558c614ab89c794cb1d301ea9ff33199
> Author: Thomas Munro 
> Date:   Tue Nov 8 20:36:36 2022 +1300
>
> Suppress useless wakeups in walreceiver.
>
> Instead of waking up 10 times per second to check for various timeout
> conditions, keep track of when we next have periodic work to do.
>
> --
> Bharath Rupireddy
> PostgreSQL Contributors Team
> RDS Open Source Databases
> Amazon Web Services: https://aws.amazon.com
>


-- 
Thanks And Regards,
Sravan

Take life one day at a time.


Re: Question regarding "Make archiver process an auxiliary process. commit"

2022-12-06 Thread Kyotaro Horiguchi
At Tue, 6 Dec 2022 17:23:50 +0530, Bharath Rupireddy 
 wrote in 
> Thanks. +1 for fixing this.
> 
> I would like to quote recent discussions on reducing the useless
> wakeups or increasing the sleep/hibernation times in various processes
> to reduce the power savings [1] [2] [3] [4] [5]. With that in context,
> does the archiver need to wake up every 60 sec at all when its latch
> gets set (PgArchWakeup()) whenever the server switches to a new WAL
> file? What happens if we get rid of PGARCH_AUTOWAKE_INTERVAL and rely
> on its latch being set? If required, we can spread PgArchWakeup() to
> more places, no?

I thought so first, but archiving may be interrupted for various
reasons (disk full I think is the most common one). So, only
intentional wakeups aren't sufficient.

> Before even answering the above questions, I think we need to see if
> there're any cases where a process can miss SetLatch() calls (I don't
> have an answer for that).

I read a recent Thomas' mail that says something like "should we
consider the case latch sets are missed?".  It is triggered by SIGURG
or SetEvent().  I'm not sure but I believe the former is now reliable
and the latter was born reliable.

> Or do we want to stick to what the below comment says?
> 
> /*
>  * There shouldn't be anything for the archiver to do except to wait for a
>  * signal ... however, the archiver exists to protect our data, so she
>  * wakes up occasionally to allow herself to be proactive.
>  */

So I think this is still valid.  If we want to eliminate useless
wakeups, archiver needs to remember whether the last iteration was
fully done or not. But it seems to be a race condition is involved.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Question regarding "Make archiver process an auxiliary process. commit"

2022-12-06 Thread Kyotaro Horiguchi
At Wed, 7 Dec 2022 11:28:23 +0530, Sravan Kumar  wrote 
in 
> On Tue, Dec 6, 2022 at 5:24 PM Bharath Rupireddy <
> bharath.rupireddyforpostg...@gmail.com> wrote:
> 
> > On Tue, Dec 6, 2022 at 4:57 PM Sravan Kumar 
> > wrote:
> > >
> > > Thank you for the feedback.
> > >
> > > I'll be glad to help with the fix. Here's the patch for review.
> >
> > Thanks. +1 for fixing this.
> >> I would like to quote recent discussions on reducing the useless
> >> wakeups or increasing the sleep/hibernation times in various processes
> >> to reduce the power savings [1] [2] [3] [4] [5]. With that in context,
> >> does the archiver need to wake up every 60 sec at all when its latch
> >> gets set (PgArchWakeup()) whenever the server switches to a new WAL
> >> file? What happens if we get rid of PGARCH_AUTOWAKE_INTERVAL and rely
> >> on its latch being set? If required, we can spread PgArchWakeup() to
> >> more places, no?
> >
> >
> I like the idea of not having to wake up intermittently and probably we
> should start a discussion about it.
> 
> I see the following comment in PgArchWakeup().
> 
>  /*
> * We don't acquire ProcArrayLock here.  It's actually fine because
> * procLatch isn't ever freed, so we just can potentially set the wrong
> * process' (or no process') latch.  Even in that case the archiver will
> * be relaunched shortly and will start archiving.
> */
> 
> While I do not fully understand the comment, it gives me an impression that
> the SetLatch() done here is counting on the timeout to wake up the archiver
> in some cases where the latch is wrongly set.

It is telling about the first iteration of archive process, not
periodical wakeups.  So I think it is doable by considering how to
handle incomplete archiving iterations.

> The proposed idea is a behaviour change while this thread intends to clean
> up some code that's
> a result of the mentioned commit. So probably the proposed idea can be
> discussed as a separate thread.

Agreed.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center