Re: [HACKERS] Re: pg_ctl wait exit code (was Re: [COMMITTERS] pgsql: Additional tests for subtransactions in recovery)

2017-07-05 Thread Michael Paquier
On Thu, Jul 6, 2017 at 2:41 AM, Peter Eisentraut wrote:
> On 7/2/17 20:28, Michael Paquier wrote:
>>> I was going to hold this back for PG11, but since we're now doing some
>>> other tweaks in pg_ctl, it might be useful to add this too.  Thoughts?
>>
>> The use of 0 as exit code for the new promote -w if timeout is reached
>> looks like an open item to me. Cleaning up the poll queries after
>> promotion would be nice to see as well.
>
> committed

Thanks for finishing the cleanup.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: pg_ctl wait exit code (was Re: [COMMITTERS] pgsql: Additional tests for subtransactions in recovery)

2017-07-05 Thread Peter Eisentraut
On 7/2/17 20:28, Michael Paquier wrote:
>> I was going to hold this back for PG11, but since we're now doing some
>> other tweaks in pg_ctl, it might be useful to add this too.  Thoughts?
> 
> The use of 0 as exit code for the new promote -w if timeout is reached
> looks like an open item to me. Cleaning up the poll queries after
> promotion would be nice to see as well.

committed

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




[HACKERS] Re: pg_ctl wait exit code (was Re: [COMMITTERS] pgsql: Additional tests for subtransactions in recovery)

2017-07-02 Thread Michael Paquier
On Sat, Jul 1, 2017 at 4:47 AM, Peter Eisentraut wrote:
> On 5/1/17 12:19, Peter Eisentraut wrote:
>> However: Failure to complete promotion within the waiting time does not
>> lead to an error exit, so you will not get a failure if the promotion
>> does not finish.  This is probably a mistake.  Looking around pg_ctl, I
>> found that this was handled seemingly inconsistently in do_start(), but
>> do_stop() errors when it does not complete.

This inconsistency could be treated like a bug, though changing such
an old behavior in back-branches would be risky. So +1 for only HEAD
with such a change, and pg_ctl promote -w is new in 10.

>> Possible patches for this attached.
>>
>> Perhaps we need a separate exit code in pg_ctl to distinguish general
>> errors from did not finish within timeout?

I would treat that as a separate item for 11, but that's as far as my
opinion goes. Per this link in pg_ctl.c the error code ought to be 4:
https://refspecs.linuxbase.org/LSB_3.1.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html

> I was going to hold this back for PG11, but since we're now doing some
> other tweaks in pg_ctl, it might be useful to add this too.  Thoughts?

The use of 0 as exit code for the new promote -w if timeout is reached
looks like an open item to me. Cleaning up the poll queries after
promotion would be nice to see as well.
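
To illustrate why the exit code matters to callers, here is a minimal
sketch (not part of the patch) of a wrapper that treats any nonzero exit
from pg_ctl as "promotion not confirmed". The helper names
run_and_check and promote_command, the data directory path and the
timeout value are assumptions made up for this example; only the -D, -w
and -t options and the promote mode come from pg_ctl itself. The demo
under __main__ uses stand-in commands so it runs without a PostgreSQL
installation.

```python
import subprocess
import sys
from typing import List, Sequence


def promote_command(data_dir: str, timeout_s: int = 60) -> List[str]:
    """Build a pg_ctl promote command line (hypothetical helper).

    data_dir and timeout_s are placeholders chosen for illustration.
    """
    return ["pg_ctl", "-D", data_dir, "-w", "-t", str(timeout_s), "promote"]


def run_and_check(cmd: Sequence[str]) -> bool:
    """Run a command and report whether it exited with status 0.

    If a timeout were reported with exit status 0, this check could not
    tell "promoted" apart from "gave up waiting".
    """
    result = subprocess.run(cmd)
    return result.returncode == 0


if __name__ == "__main__":
    # Stand-ins for a successful run and for a run that exits nonzero.
    succeeded = [sys.executable, "-c", "raise SystemExit(0)"]
    failed = [sys.executable, "-c", "raise SystemExit(1)"]
    print(run_and_check(succeeded))
    print(run_and_check(failed))
    print(promote_command("/path/to/standby", 30))
```

With the proposed patch a timeout and a general error would both exit
with status 1, so a caller like this still cannot distinguish the two
cases; that is the separate exit code question raised upthread.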
-- 
Michael




[HACKERS] Re: pg_ctl wait exit code (was Re: [COMMITTERS] pgsql: Additional tests for subtransactions in recovery)

2017-06-30 Thread Peter Eisentraut
On 5/1/17 12:19, Peter Eisentraut wrote:
> On 4/27/17 08:41, Michael Paquier wrote:
>> +$node_slave->promote;
>> +$node_slave->poll_query_until('postgres',
>> +   "SELECT NOT pg_is_in_recovery()")
>> +  or die "Timed out while waiting for promotion of standby";
>>
>> This reminds me that we should really switch PostgresNode::promote to
>> use the wait mode of pg_ctl promote, and remove all those polling
>> queries...
> 
> I was going to say: This should all be obsolete already, because pg_ctl
> promote waits by default.
> 
> However: Failure to complete promotion within the waiting time does not
> lead to an error exit, so you will not get a failure if the promotion
> does not finish.  This is probably a mistake.  Looking around pg_ctl, I
> found that this was handled seemingly inconsistently in do_start(), but
> do_stop() errors when it does not complete.
> 
> Possible patches for this attached.
> 
> Perhaps we need a separate exit code in pg_ctl to distinguish general
> errors from did not finish within timeout?

I was going to hold this back for PG11, but since we're now doing some
other tweaks in pg_ctl, it might be useful to add this too.  Thoughts?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 67707d541a2d9e088109385c8fa1eced8af83d54 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 1 May 2017 12:10:17 -0400
Subject: [PATCH v2 1/2] pg_ctl: Make failure to complete operation a nonzero
 exit

If an operation being waited for does not complete within the timeout,
then exit with a nonzero exit status.  This was previously handled
inconsistently.
---
 doc/src/sgml/ref/pg_ctl-ref.sgml | 7 +++
 src/bin/pg_ctl/pg_ctl.c  | 8 ++--
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/pg_ctl-ref.sgml b/doc/src/sgml/ref/pg_ctl-ref.sgml
index 71e52c4c35..12fa011c4e 100644
--- a/doc/src/sgml/ref/pg_ctl-ref.sgml
+++ b/doc/src/sgml/ref/pg_ctl-ref.sgml
@@ -412,6 +412,13 @@ Options
 pg_ctl returns an exit code based on the
 success of the startup or shutdown.

+
+   
+If the operation does not complete within the timeout (see
+option -t), then pg_ctl exits with
+a nonzero exit status.  But note that the operation might continue in
+the background and eventually succeed.
+   
   
  
 
diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
index 0c65196bda..4e02c4cea1 100644
--- a/src/bin/pg_ctl/pg_ctl.c
+++ b/src/bin/pg_ctl/pg_ctl.c
@@ -840,7 +840,9 @@ do_start(void)
break;
case POSTMASTER_STILL_STARTING:
print_msg(_(" stopped waiting\n"));
-   print_msg(_("server is still starting up\n"));
+   write_stderr(_("%s: server did not start in time\n"),
+progname);
+   exit(1);
break;
case POSTMASTER_FAILED:
print_msg(_(" stopped waiting\n"));
@@ -1166,7 +1168,9 @@ do_promote(void)
else
{
print_msg(_(" stopped waiting\n"));
-   print_msg(_("server is still promoting\n"));
+   write_stderr(_("%s: server did not promote in time\n"),
+progname);
+   exit(1);
}
}
else
-- 
2.13.1

From b30b7d96161a2e27d80cc96073b44c5266c2b751 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 1 May 2017 12:11:25 -0400
Subject: [PATCH v2 2/2] Remove unnecessary pg_is_in_recovery calls in tests

Since pg_ctl promote already waits for recovery to end, these calls are
obsolete.
---
 src/test/modules/commit_ts/t/003_standby_2.pl | 1 -
 src/test/recovery/t/008_fsm_truncation.pl | 2 --
 src/test/recovery/t/009_twophase.pl   | 6 --
 src/test/recovery/t/010_logical_decoding_timelines.pl | 3 ---
 src/test/recovery/t/012_subtransactions.pl| 6 --
 5 files changed, 18 deletions(-)

diff --git a/src/test/modules/commit_ts/t/003_standby_2.pl b/src/test/modules/commit_ts/t/003_standby_2.pl
index 2fd561115c..c3000f5b4c 100644
--- a/src/test/modules/commit_ts/t/003_standby_2.pl
+++ b/src/test/modules/commit_ts/t/003_standby_2.pl
@@ -55,7 +55,6 @@
 $master->restart;
 
 system_or_bail('pg_ctl', '-D', $standby->data_dir, 'promote');
-$standby->poll_query_until('postgres', "SELECT pg_is_in_recovery() <> true");
 
 $standby->safe_psql('postgres', "create table t11()");
 my $standby_ts = $standby->safe_psql('postgres',
diff --git a/src/test/recovery/t/008_fsm_truncation.pl b/src/test/recovery/t/008_fsm_truncation.pl
index 56eecf722c..ddab464a97