Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2016-03-01 Thread Alvaro Herrera
Michael Paquier wrote:
> On Tue, Mar 1, 2016 at 5:13 PM, Kyotaro HORIGUCHI
>  wrote:

> +   return wantarray ? ($stdout, $stderr) : $stdout;
> So you are willing to extend that so as you could perform conparison
> tests on the error strings returned. Why no, it looks useful, though
> now there is no test in need of it I think. So without a proper need I
> think that we could live without.

Does this change let us implement psql_ok and psql_fail?  I think I've
seen a few places already, both in committed code and in submitted
patches, that test for some kind of failure from psql.

> > 0002-Prefix-test-numbers-to-node-
> >
> >   This is rather a example usage of 0001- patch (except for
> >   stderr stuff). 00n_xxx test leaves temporary directories with
> >   the names of 00n_(master|standby)_ on failure. If this is
> >   considered reasonable, I'll make same patches for the other
> >   /t/nnn_*.pl tests.
> 
> -my $node_master = get_new_node('master');
> +my $node_master = get_new_node('001_master');
> I am not a fan of appending the test number in the node name. For one,
> this complicates the log file name associated with a node by
> duplicating the test number in its name. Also, it is possible to
> easily get the name of the data folder for a node by looking at the
> logs.

Why don't we use something similar to what's in $test_logfile in
TestLib?

> Also, it is possible to easily get the name of the data folder for a
> node by looking at the logs.

No disagreement on it being possible, but "easily" seems a bad
description for that.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2016-03-01 Thread Craig Ringer
On 1 March 2016 at 21:05, Craig Ringer  wrote:

> On 1 March 2016 at 20:45, Michael Paquier 
> wrote:
>
>> On Tue, Mar 1, 2016 at 5:13 PM, Kyotaro HORIGUCHI
>>  wrote:
>> > 0001-Change-behavior-...
>> >
>> >   Changes of PostmasterNode.pm and TestLib.pm to add some
>> >   features and change a behavior.
>>
>> +   # Preserve temporary directory for this test if failure
>> +   $File::Temp::KEEP_ALL = 1 unless Test::More->builder->is_passing;
>> +1. Having the data folders being removed even in case of a failure is
>> really annoying.
>>
>
> I agree on all points re your review. Keeping tempdirs is really needed,
> the tempdir name change is good, the  the rest I'm not keen on.
>
> I've addressed the need to get stderr from psql in a patch I'll submit
> separately, which provides a thinner wrapper around IPC::Run for more
> complex needs, then uses that for the existing 'psql' function. It also
> provides a 'psql_check' that dies on error.
>
> I'll incorporate the wanted changes into that patch.
>

OK, done.

https://commitfest.postgresql.org/9/569/#

Michael, I'd value your thoughts on the patches. I think the psql one is
important, I found the existing 'psql' function too limiting and I think
defaulting to ignoring errors is sub-optimal.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2016-03-01 Thread Craig Ringer
On 1 March 2016 at 20:45, Michael Paquier  wrote:

> On Tue, Mar 1, 2016 at 5:13 PM, Kyotaro HORIGUCHI
>  wrote:
> > 0001-Change-behavior-...
> >
> >   Changes of PostmasterNode.pm and TestLib.pm to add some
> >   features and change a behavior.
>
> +   # Preserve temporary directory for this test if failure
> +   $File::Temp::KEEP_ALL = 1 unless Test::More->builder->is_passing;
> +1. Having the data folders being removed even in case of a failure is
> really annoying.
>

I agree on all points re your review. Keeping tempdirs is really needed,
the tempdir name change is good, the  the rest I'm not keen on.

I've addressed the need to get stderr from psql in a patch I'll submit
separately, which provides a thinner wrapper around IPC::Run for more
complex needs, then uses that for the existing 'psql' function. It also
provides a 'psql_check' that dies on error.

I'll incorporate the wanted changes into that patch.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2016-03-01 Thread Michael Paquier
On Tue, Mar 1, 2016 at 5:13 PM, Kyotaro HORIGUCHI
 wrote:
> 0001-Change-behavior-...
>
>   Changes of PostmasterNode.pm and TestLib.pm to add some
>   features and change a behavior.

+   # Preserve temporary directory for this test if failure
+   $File::Temp::KEEP_ALL = 1 unless Test::More->builder->is_passing;
+1. Having the data folders being removed even in case of a failure is
really annoying.

+   # Preserve temporary directory for this test if failure
+   $File::Temp::KEEP_ALL = 1 unless Test::More->builder->is_passing;
s/if failure/in the event of a failure/?

+   return wantarray ? ($stdout, $stderr) : $stdout;
So you are willing to extend that so as you could perform conparison
tests on the error strings returned. Why no, it looks useful, though
now there is no test in need of it I think. So without a proper need I
think that we could live without.

> 0002-Prefix-test-numbers-to-node-
>
>   This is rather a example usage of 0001- patch (except for
>   stderr stuff). 00n_xxx test leaves temporary directories with
>   the names of 00n_(master|standby)_ on failure. If this is
>   considered reasonable, I'll make same patches for the other
>   /t/nnn_*.pl tests.

-my $node_master = get_new_node('master');
+my $node_master = get_new_node('001_master');
I am not a fan of appending the test number in the node name. For one,
this complicates the log file name associated with a node by
duplicating the test number in its name. Also, it is possible to
easily get the name of the data folder for a node by looking at the
logs.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2016-03-01 Thread Kyotaro HORIGUCHI
Hello,

At Fri, 26 Feb 2016 15:43:14 -0300, Alvaro Herrera  
wrote in <20160226184314.GA205945@alvherre.pgsql>
> Kyotaro HORIGUCHI wrote:
> 
> > So, I'd like to propose four (or five) changes to this harness.
> > 
> >  - prove_check to remove all in tmp_check
> > 
> >  - TestLib to preserve temporary directories/files if the current
> >test fails.
> > 
> >  - PostgresNode::get_new_node to create data directory with
> >meaningful basenames.
> > 
> >  - PostgresNode::psql to return a list of ($stdout, $stderr) if
> >requested. (The previous behavior is not changed)
> > 
> >  - (recovery/t/00x_* gives test number to node name)
> > 
> > As a POC, the attached diff will appliy on the 0001 and (fixed)
> > 0003 patches.
> 
> These changes all seem very reasonable to me.  I'm not so sure about the
> last one.  Perhaps the framework ought to generate an appropriate subdir
> name using the test file name plus the node name, so that instead of
> tmp_ it becomes tmp_001_master_ or something like that?  Having
> be a coding convention doesn't look real nice to me.

Thank you for mentioning this.

Sorry, the last one accidentally contained garbage code to
intentionally raise an error. The last one names the nodes as
such like '001_master_24f8'. Maybe prefixing "tmp_" would be
better.

> I didn't try to apply your patch but I'm fairly certain it would
> conflict with what's here now; can you please rebase and resend?

This was a very small patch made on the old, uncommited
patches. I remade the patch and split it into two parts.

0001-Change-behavior-...

  Changes of PostmasterNode.pm and TestLib.pm to add some
  features and change a behavior.

0002-Prefix-test-numbers-to-node-

  This is rather a example usage of 0001- patch (except for
  stderr stuff). 00n_xxx test leaves temporary directories with
  the names of 00n_(master|standby)_ on failure. If this is
  considered reasonable, I'll make same patches for the other
  /t/nnn_*.pl tests.


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 4e2c5123e249ef3278da8d5260bbe1e4799988b2 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Tue, 1 Mar 2016 16:09:25 +0900
Subject: [PATCH 1/2] Change behavior of PostgresNode.pm and TestLib.pm

Modify PostgresNode.new to use a temporary directory with the name
after the node name instead of tmp_dir. Modify PostgresNode.psql
to allow callers to get stderr output. Make TestLib to preserve the
temporary directory if a test has been failed. Modify TestLib.tempdir
to allow callers to prefix the tempdir name with an arbitrary prefix.
---
 src/test/perl/PostgresNode.pm |  9 +++--
 src/test/perl/TestLib.pm  | 10 +-
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index a8e6f0c..ad27361 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -115,7 +115,7 @@ sub new
 	my $self   = {
 		_port => $pgport,
 		_host => $pghost,
-		_basedir  => TestLib::tempdir,
+		_basedir  => TestLib::tempdir($name),
 		_name => $name,
 		_logfile  => "$TestLib::log_path/${testname}_${name}.log" };
 
@@ -805,10 +805,15 @@ sub psql
 		print " Begin standard error\n";
 		print $stderr;
 		print " End standard error\n";
+		if (wantarray)
+		{
+			chomp $stderr;
+			$stderr =~ s/\r//g if $Config{osname} eq 'msys';
+		}
 	}
 	chomp $stdout;
 	$stdout =~ s/\r//g if $Config{osname} eq 'msys';
-	return $stdout;
+	return wantarray ? ($stdout, $stderr) : $stdout;
 }
 
 =pod
diff --git a/src/test/perl/TestLib.pm b/src/test/perl/TestLib.pm
index 3d11cbb..da0e617 100644
--- a/src/test/perl/TestLib.pm
+++ b/src/test/perl/TestLib.pm
@@ -107,13 +107,21 @@ INIT
 	autoflush TESTLOG 1;
 }
 
+END
+{
+	# Preserve temporary directory for this test if failure
+	$File::Temp::KEEP_ALL = 1 unless Test::More->builder->is_passing;
+}
+
 #
 # Helper functions
 #
 sub tempdir
 {
+	my ($prefix) = @_;
+	$prefix = "tmp_test" if (!$prefix);
 	return File::Temp::tempdir(
-		'tmp_test',
+		$prefix.'_',
 		DIR => $tmp_check,
 		CLEANUP => 1);
 }
-- 
1.8.3.1

>From fab2f28637a4392ef7e0bfbdf9bcac51f5fb5fd7 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Tue, 1 Mar 2016 16:50:24 +0900
Subject: [PATCH 2/2] Prefix test numbers to node directories

Add prefix numbers to node names to make the replication tests leave
directories with reasonable names on failure.
---
 src/test/recovery/t/001_stream_rep.pl   | 6 +++---
 src/test/recovery/t/002_archiving.pl| 4 ++--
 src/test/recovery/t/003_recovery_targets.pl | 4 ++--
 src/test/recovery/t/004_timeline_switch.pl  | 6 +++---
 src/test/recovery/t/005_replay_delay.pl | 4 ++--
 5 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 7dcca65..6f2d13b 100644

Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2016-02-26 Thread Alvaro Herrera
Kyotaro HORIGUCHI wrote:

> So, I'd like to propose four (or five) changes to this harness.
> 
>  - prove_check to remove all in tmp_check
> 
>  - TestLib to preserve temporary directories/files if the current
>test fails.
> 
>  - PostgresNode::get_new_node to create data directory with
>meaningful basenames.
> 
>  - PostgresNode::psql to return a list of ($stdout, $stderr) if
>requested. (The previous behavior is not changed)
> 
>  - (recovery/t/00x_* gives test number to node name)
> 
> As a POC, the attached diff will appliy on the 0001 and (fixed)
> 0003 patches.

These changes all seem very reasonable to me.  I'm not so sure about the
last one.  Perhaps the framework ought to generate an appropriate subdir
name using the test file name plus the node name, so that instead of
tmp_ it becomes tmp_001_master_ or something like that?  Having
be a coding convention doesn't look real nice to me.

I didn't try to apply your patch but I'm fairly certain it would
conflict with what's here now; can you please rebase and resend?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2016-02-04 Thread Noah Misch
On Thu, Feb 04, 2016 at 09:13:46PM +0300, Victor Wagner wrote:
> On Thu, 4 Feb 2016 18:33:27 +0300 Michael Paquier  
> wrote:

> > > Really, it is not so hard to add configure checks for perl modules.
> > > And we need to test not only for IPC::Run, but for Test::More too,
> > > because some Linux distributions put modules which come with perl
> > > into separate package.  
> > 
> > The last time we discussed about that on this list we concluded that
> > it was not really necessary to have such checks, for one it makes the
> > code more simple, and because this is leveraged by the presence of
> > --enable-tap-tests, tests which can get actually costly with
> > check-world. But this is digressing the subject of this thread, which
> > deals with the fact of having recovery tests integrated in core...
> 
> Of course, such configure tests should be run only if
> --enable-tap-tests is passed to the configure script
> 
> It would look  like
> 
> if test "$enable_tap_tests" = "yes"; then
>   AX_PROG_PERL_MODULES( Test::More, , AC_MSG_ERROR([Test::More is
>   necessary to run TAP Tests])
>   AX_PROG_PERL_MODULES( IPC::Run, , AC_MSG_ERROR([IPC::Run is
>   necessary to run TAP Tests])
> fi
> 
> in the configure.in
> 
> May be it is not strictly necessary, but it is really useful to see
> such problems as clear error message during configure stage, rather
> than successfully configure, compile, run tests and only then find out,
> that something is forgotten.
> 
> I don't see why having such tests in the configure.in, makes code more
> complex. It just prevents configure to finish successfully if
> --enable-tap-tests is specified and required modules are not available.

Even if detecting missing modules at "configure" time is the right thing, it
belongs in a distinct patch, discussed on a distinct thread.  The absence of
IPC::Run affects the proposed replication tests in the same way it affects
current TAP suites, so this thread has no business revisiting it.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2016-02-04 Thread Kyotaro HORIGUCHI
Hello, I'm studying this.

Two hunks in 0003 needed a fix but the other part applied cleanly
on master.

At Fri, 22 Jan 2016 15:17:51 +0900, Michael Paquier  
wrote in 

Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2016-01-21 Thread Michael Paquier
On Mon, Dec 21, 2015 at 4:45 PM, Michael Paquier
 wrote:
> As this thread is stalling a bit, please find attached a series of
> patch gathering all the pending issues for this thread:
> - 0001, fix config_default.pl for MSVC builds to take into account TAP tests
> - 0002, append a node name in get_new_node (per Noah's request)
> - 0003, the actual recovery test suite
> Hopefully this facilitates future reviews.

Patch 2 has been pushed as c8642d9 (thanks Alvaro). The remaining two
patches still apply and pass cleanly.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-20 Thread Michael Paquier
On Sat, Dec 12, 2015 at 8:29 PM, Michael Paquier
 wrote:
> On Sat, Dec 12, 2015 at 11:37 AM, Noah Misch  wrote:
>> On Fri, Dec 11, 2015 at 09:34:34PM +0900, Michael Paquier wrote:
>>> On Fri, Dec 11, 2015 at 8:48 PM, Alvaro Herrera
>>>  wrote:
>>> > Michael Paquier wrote:
>>> >> On Fri, Dec 11, 2015 at 5:35 AM, Alvaro Herrera
>>> >>  wrote:
>>> >> I guess that to complete your idea we could allow PostgresNode to get
>>> >> a custom name for its log file through an optional parameter like
>>> >> logfile => 'myname' or similar. And if nothing is defined, process
>>> >> falls back to applname. So this would give the following:
>>> >> ${testname}_${logfile}.log
>>> >
>>> > Sure. I don't think we should the name only for the log file, though,
>>> > but also for things like the "## " informative messages we print here
>>> > and there.  That would make the log file simpler to follow.  Also, I'm
>>> > not sure about having it be optional.  (TBH I'm not sure about applname
>>> > either; why do we keep that one?)
>>>
>>> OK, so let's do this: the node name is a mandatory argument of
>>> get_new_node, which is passed to "new PostgresNode" like the port and
>>> the host, and it is then used in the log file name as well as in the
>>> information messages you are mentioning. That's a patch simple enough.
>>> Are you fine with this approach?
>>
>> Sounds reasonable so far.
>
> OK, done so.
>
>>> Regarding the application name, I still think it is useful to have it
>>> though. pg_rewind should actually use it, and the other patch adding
>>> the recovery routines will use it.
>>
>> Using the application_name connection parameter is fine, but I can't think of
>> a reason to set it to "node_".$node->port instead of $node->name.  And I 
>> can't
>> think of a use for the $node->applname field once you have $node->name.  What
>> use case would benefit?
>
> I have the applname stuff, and updated the log messages to use the
> node name for clarity.
>
> The patch to address those points is attached.

As this thread is stalling a bit, please find attached a series of
patch gathering all the pending issues for this thread:
- 0001, fix config_default.pl for MSVC builds to take into account TAP tests
- 0002, append a node name in get_new_node (per Noah's request)
- 0003, the actual recovery test suite
Hopefully this facilitates future reviews.
Regards,
-- 
Michael


0001-Fix-default-configuration-of-MSVC-builds-ignoring-TA.patch
Description: binary/octet-stream


0002-Assign-node-name-to-TAP-tests.patch
Description: binary/octet-stream


0003-Add-recovery-test-suite.patch
Description: binary/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-12 Thread Michael Paquier
On Sat, Dec 12, 2015 at 11:37 AM, Noah Misch  wrote:
> On Fri, Dec 11, 2015 at 09:34:34PM +0900, Michael Paquier wrote:
>> On Fri, Dec 11, 2015 at 8:48 PM, Alvaro Herrera
>>  wrote:
>> > Michael Paquier wrote:
>> >> On Fri, Dec 11, 2015 at 5:35 AM, Alvaro Herrera
>> >>  wrote:
>> >> I guess that to complete your idea we could allow PostgresNode to get
>> >> a custom name for its log file through an optional parameter like
>> >> logfile => 'myname' or similar. And if nothing is defined, process
>> >> falls back to applname. So this would give the following:
>> >> ${testname}_${logfile}.log
>> >
>> > Sure. I don't think we should the name only for the log file, though,
>> > but also for things like the "## " informative messages we print here
>> > and there.  That would make the log file simpler to follow.  Also, I'm
>> > not sure about having it be optional.  (TBH I'm not sure about applname
>> > either; why do we keep that one?)
>>
>> OK, so let's do this: the node name is a mandatory argument of
>> get_new_node, which is passed to "new PostgresNode" like the port and
>> the host, and it is then used in the log file name as well as in the
>> information messages you are mentioning. That's a patch simple enough.
>> Are you fine with this approach?
>
> Sounds reasonable so far.

OK, done so.

>> Regarding the application name, I still think it is useful to have it
>> though. pg_rewind should actually use it, and the other patch adding
>> the recovery routines will use it.
>
> Using the application_name connection parameter is fine, but I can't think of
> a reason to set it to "node_".$node->port instead of $node->name.  And I can't
> think of a use for the $node->applname field once you have $node->name.  What
> use case would benefit?

I have the applname stuff, and updated the log messages to use the
node name for clarity.

The patch to address those points is attached.
Regards,
-- 
Michael


20151212_tap_node_name.patch
Description: binary/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-11 Thread Alvaro Herrera
Michael Paquier wrote:
> On Fri, Dec 11, 2015 at 5:35 AM, Alvaro Herrera
>  wrote:

> > Since we now have the node name in the log file name, perhaps we no
> > longer need the port number in there
> 
> There is no node name in the log file name as of now, they are built
> using the port number, and the information of a node is dumped into
> the central log file when created (see dump_info).

Yeah, I realized this after posting.  What I thought was the node name,
based on some of the files I had laying around, was actually the test
name.

> I guess that to complete your idea we could allow PostgresNode to get
> a custom name for its log file through an optional parameter like
> logfile => 'myname' or similar. And if nothing is defined, process
> falls back to applname. So this would give the following:
> ${testname}_${logfile}.log

Sure. I don't think we should the name only for the log file, though,
but also for things like the "## " informative messages we print here
and there.  That would make the log file simpler to follow.  Also, I'm
not sure about having it be optional.  (TBH I'm not sure about applname
either; why do we keep that one?)

> It seems that we had better keep the test name as a prefix of the log
> file name though, to avoid an overlap with any other test in the same
> series. Thoughts?

Yes, agreed on that.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-11 Thread Michael Paquier
On Fri, Dec 11, 2015 at 8:48 PM, Alvaro Herrera
 wrote:
> Michael Paquier wrote:
>> On Fri, Dec 11, 2015 at 5:35 AM, Alvaro Herrera
>>  wrote:
>> I guess that to complete your idea we could allow PostgresNode to get
>> a custom name for its log file through an optional parameter like
>> logfile => 'myname' or similar. And if nothing is defined, process
>> falls back to applname. So this would give the following:
>> ${testname}_${logfile}.log
>
> Sure. I don't think we should the name only for the log file, though,
> but also for things like the "## " informative messages we print here
> and there.  That would make the log file simpler to follow.  Also, I'm
> not sure about having it be optional.  (TBH I'm not sure about applname
> either; why do we keep that one?)

OK, so let's do this: the node name is a mandatory argument of
get_new_node, which is passed to "new PostgresNode" like the port and
the host, and it is then used in the log file name as well as in the
information messages you are mentioning. That's a patch simple enough.
Are you fine with this approach?

Regarding the application name, I still think it is useful to have it
though. pg_rewind should actually use it, and the other patch adding
the recovery routines will use it.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-11 Thread Noah Misch
On Fri, Dec 11, 2015 at 09:34:34PM +0900, Michael Paquier wrote:
> On Fri, Dec 11, 2015 at 8:48 PM, Alvaro Herrera
>  wrote:
> > Michael Paquier wrote:
> >> On Fri, Dec 11, 2015 at 5:35 AM, Alvaro Herrera
> >>  wrote:
> >> I guess that to complete your idea we could allow PostgresNode to get
> >> a custom name for its log file through an optional parameter like
> >> logfile => 'myname' or similar. And if nothing is defined, process
> >> falls back to applname. So this would give the following:
> >> ${testname}_${logfile}.log
> >
> > Sure. I don't think we should the name only for the log file, though,
> > but also for things like the "## " informative messages we print here
> > and there.  That would make the log file simpler to follow.  Also, I'm
> > not sure about having it be optional.  (TBH I'm not sure about applname
> > either; why do we keep that one?)
> 
> OK, so let's do this: the node name is a mandatory argument of
> get_new_node, which is passed to "new PostgresNode" like the port and
> the host, and it is then used in the log file name as well as in the
> information messages you are mentioning. That's a patch simple enough.
> Are you fine with this approach?

Sounds reasonable so far.

> Regarding the application name, I still think it is useful to have it
> though. pg_rewind should actually use it, and the other patch adding
> the recovery routines will use it.

Using the application_name connection parameter is fine, but I can't think of
a reason to set it to "node_".$node->port instead of $node->name.  And I can't
think of a use for the $node->applname field once you have $node->name.  What
use case would benefit?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-10 Thread Alvaro Herrera
Noah Misch wrote:
> On Mon, Dec 07, 2015 at 02:34:39PM +0900, Michael Paquier wrote:

> > I don't directly see any limitation with the use of kill on Windows..
> > http://perldoc.perl.org/functions/kill.html
> > But indeed using directly pg_ctl kill seems like a better fit for the
> > PG infrastructure.
> 
> From http://perldoc.perl.org/perlwin32.html, "Using signals under this port
> should currently be considered unsupported."  Windows applications cannot
> handle SIGQUIT: https://msdn.microsoft.com/en-us/library/xdkz3x12.aspx.  The
> PostgreSQL backend does not generate or expect Windows signals; see its
> signal.c emulation facility.

Makes sense.  What we're doing now is what you suggested, so we should
be fine.

> > > Postmaster log file names became less informative.  Before the commit:
> > > Should nodes get a name, so we instead see master_57834.log?
> > 
> > I am not sure that this is necessary.
> 
> In general, you'd need to cross-reference the main log file to determine which
> postmaster log corresponds to which action within the test.  I did plenty of
> "grep $PATTERN src/bin/pg_rewind/tmp_check/log/master.log" while debugging
> that test.  I'd like to be able to use /*master*.log, not rely on timestamps
> or on scraping regress_log_002_databases to determine which logs are master
> logs.  Feel free to skip this point if I'm the only one minding, though.

Since we now have the node name in the log file name, perhaps we no
longer need the port number in there.  In fact, I find having the file
name change on every run (based on the port number) is slightly
annoying.  I vote we change it back to using the node name without the
port number.  (Also, some PostgresNode messages refer to the instance by
datadir and port number, which is unnecessary: it would be clearer to
use the name instead.)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-10 Thread Michael Paquier
On Fri, Dec 11, 2015 at 5:35 AM, Alvaro Herrera
 wrote:
> Noah Misch wrote:
>> On Mon, Dec 07, 2015 at 02:34:39PM +0900, Michael Paquier wrote:
>> > > Postmaster log file names became less informative.  Before the commit:
>> > > Should nodes get a name, so we instead see master_57834.log?
>> >
>> > I am not sure that this is necessary.
>>
>> In general, you'd need to cross-reference the main log file to determine 
>> which
>> postmaster log corresponds to which action within the test.  I did plenty of
>> "grep $PATTERN src/bin/pg_rewind/tmp_check/log/master.log" while debugging
>> that test.  I'd like to be able to use /*master*.log, not rely on timestamps
>> or on scraping regress_log_002_databases to determine which logs are master
>> logs.  Feel free to skip this point if I'm the only one minding, though.
>
> Since we now have the node name in the log file name, perhaps we no
> longer need the port number in there

There is no node name in the log file name as of now, they are built
using the port number, and the information of a node is dumped into
the central log file when created (see dump_info).

> In fact, I find having the file
> name change on every run (based on the port number) is slightly
> annoying.  I vote we change it back to using the node name without the
> port number.  (Also, some PostgresNode messages refer to the instance by
> datadir and port number, which is unnecessary: it would be clearer to
> use the name instead.)

OK, so... What we have now as log file for a specific node is that:
${testname}_node_${port}.log
which is equivalent to that:
${testname}_${applname}.log

I guess that to complete your idea we could allow PostgresNode to get
a custom name for its log file through an optional parameter like
logfile => 'myname' or similar. And if nothing is defined, process
falls back to applname. So this would give the following:
${testname}_${logfile}.log

It seems that we had better keep the test name as a prefix of the log
file name though, to avoid an overlap with any other test in the same
series. Thoughts?
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-09 Thread Alvaro Herrera
I've been giving RecoveryTest.pm a look. I wonder if we really need that
as a separate package.  My first thought was that we could have another
class that inherits from PostgresNode (say RecoveryNode).  But later it
occured to me that we could have the new functions just be part of
PostgresNode itself directly; so we would have some new PostgresNode
methods:
$node->enable_streaming
$node->enable_restoring
$node->enable_archiving
$node->wait (your RecoveryTest::wait_for_node; better name for this?)

and some additional constructors:
make_master
make_stream_standby
make_archive_standby

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-09 Thread Michael Paquier
On Thu, Dec 10, 2015 at 6:46 AM, Alvaro Herrera
 wrote:
> I've been giving RecoveryTest.pm a look. I wonder if we really need that
> as a separate package.  My first thought was that we could have another
> class that inherits from PostgresNode (say RecoveryNode).  But later it
> occured to me that we could have the new functions just be part of
> PostgresNode itself directly; so we would have some new PostgresNode
> methods:
> $node->enable_streaming
> $node->enable_restoring
> $node->enable_archiving

Sure.

> $node->wait (your RecoveryTest::wait_for_node; better name for this?)

wait_for_access?

> and some additional constructors:
> make_master
> make_stream_standby
> make_archive_standby

I have done that a little bit differently. Those are completely
remove, then init() and init_from_backup() are extended with a new set
of parameters to enable archive, streaming or restore on a node.

Which gives the patch attached.
-- 
Michael


20151210_recovery_suite.patch
Description: binary/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-09 Thread Noah Misch
On Mon, Dec 07, 2015 at 02:34:39PM +0900, Michael Paquier wrote:
> On Mon, Dec 7, 2015 at 12:06 PM, Noah Misch  wrote:
> > On Wed, Dec 02, 2015 at 06:59:09PM -0300, Alvaro Herrera wrote:
> >> +sub DESTROY
> >> +{
> >> + my $self = shift;
> >> + return if not defined $self->{_pid};
> >> + print "# signalling QUIT to $self->{_pid}\n";
> >> + kill 'QUIT', $self->{_pid};
> >
> > On Windows, that kill() is the wrong thing.  I suspect "pg_ctl kill" will be
> > the right thing, but that warrants specific testing.
> 
> I don't directly see any limitation with the use of kill on Windows..
> http://perldoc.perl.org/functions/kill.html
> But indeed using directly pg_ctl kill seems like a better fit for the
> PG infrastructure.

>From http://perldoc.perl.org/perlwin32.html, "Using signals under this port
should currently be considered unsupported."  Windows applications cannot
handle SIGQUIT: https://msdn.microsoft.com/en-us/library/xdkz3x12.aspx.  The
PostgreSQL backend does not generate or expect Windows signals; see its
signal.c emulation facility.

> > Postmaster log file names became less informative.  Before the commit:
> > Should nodes get a name, so we instead see master_57834.log?
> 
> I am not sure that this is necessary.

In general, you'd need to cross-reference the main log file to determine which
postmaster log corresponds to which action within the test.  I did plenty of
"grep $PATTERN src/bin/pg_rewind/tmp_check/log/master.log" while debugging
that test.  I'd like to be able to use /*master*.log, not rely on timestamps
or on scraping regress_log_002_databases to determine which logs are master
logs.  Feel free to skip this point if I'm the only one minding, though.

> There is definitely a limitation
> regarding the fact that log files of nodes get overwritten after each
> test, hence I would tend with just appending the test name in front of
> the node_* files similarly to the other files.

They got appended, not overwritten.  I like how you changed that to not
happen, but it doesn't address what I was reporting.

nm


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-07 Thread Alvaro Herrera
Michael Paquier wrote:

> > If that's not a hard-coded PG version number then I don't know
> > what it is.  Maybe it would be better to use random() instead,
> > but surely this isn't good as-is.
> 
> We would definitely want something within the ephemeral port range, so
> we are up to that:
> rand() * 16384 + 49152;

Yes, this seems to produce the correct range.

Thanks Noah and Tom for the review, and thanks Michael for the patch.  I
pushed it.  A slight fix was to change the chomp() call; it was always
returning 1 (number of elements chomped) so it tried to kill init.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-07 Thread Alvaro Herrera
Michael Paquier wrote:
> On Thu, Dec 3, 2015 at 6:59 AM, Alvaro Herrera wrote:
> > I didn't push the changed for config_default you requested a few
> > messages upthread; it's not clear to me how setting it to undef affects
> > the whole thing.  If setting it to undef makes the MSVC toolchain run
> > the tap tests in the default config, then I can do it; let's be clear
> > about what branch to backpatch this to.  Also the "1;" at the end of
> > RewindTest.
> 
> Setting it to undef will prevent the tests to run, per vcregress.pl:
> die "Tap tests not enabled in configuration"
>   unless $config->{tap_tests};
> Also, setting it to undef will match the existing behavior on
> platforms where ./configure is used because the switch
> --enable-tap-tests needs to be used there. And I would believe that in
> most cases Windows environments are not going to have IPC::Run
> deployed.

But if I don't set it to anything, then it will be "initialized" as
undef, so it has the same effect.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-07 Thread Tom Lane
Alvaro Herrera  writes:
> Michael Paquier wrote:
>> We would definitely want something within the ephemeral port range, so
>> we are up to that:
>> rand() * 16384 + 49152;

> Yes, this seems to produce the correct range.

Speaking of ephemeral port range ... if get_new_node() were to run
past port 65535, which is certainly possible with this new code,
it would fail altogether.  Seems it needs to wrap around properly
within that port range.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-07 Thread Alvaro Herrera
Tom Lane wrote:
> Alvaro Herrera  writes:
> > Michael Paquier wrote:
> >> We would definitely want something within the ephemeral port range, so
> >> we are up to that:
> >> rand() * 16384 + 49152;
> 
> > Yes, this seems to produce the correct range.
> 
> Speaking of ephemeral port range ... if get_new_node() were to run
> past port 65535, which is certainly possible with this new code,
> it would fail altogether.  Seems it needs to wrap around properly
> within that port range.

Oh, of course.  Pushed fix.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-07 Thread Michael Paquier
On Tue, Dec 8, 2015 at 7:46 AM, Alvaro Herrera 
wrote:

> Michael Paquier wrote:
> > On Thu, Dec 3, 2015 at 6:59 AM, Alvaro Herrera wrote:
> > > I didn't push the changed for config_default you requested a few
> > > messages upthread; it's not clear to me how setting it to undef affects
> > > the whole thing.  If setting it to undef makes the MSVC toolchain run
> > > the tap tests in the default config, then I can do it; let's be clear
> > > about what branch to backpatch this to.  Also the "1;" at the end of
> > > RewindTest.
> >
> > Setting it to undef will prevent the tests to run, per vcregress.pl:
> > die "Tap tests not enabled in configuration"
> >   unless $config->{tap_tests};
> > Also, setting it to undef will match the existing behavior on
> > platforms where ./configure is used because the switch
> > --enable-tap-tests needs to be used there. And I would believe that in
> > most cases Windows environments are not going to have IPC::Run
> > deployed.
>
> But if I don't set it to anything, then it will be "initialized" as
> undef, so it has the same effect.
>

Yes, it will have the same effect. Still it is a problem to not list it in
default_config.pl as the other options, no? And that's as well the case
with GetFakeConfigure, which should list it I think for consistency with
the rest. See the attached for example.
-- 
Michael


20151208_tap_msvc_fix.patch
Description: binary/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-06 Thread Noah Misch
On Wed, Dec 02, 2015 at 06:59:09PM -0300, Alvaro Herrera wrote:
> It seemed better to me to have the import list be empty, i.e. "use
> TestLib ()" and then qualify the routine names inside PostgresNode,
> instead of having to list the names of the routines to import, so I
> pushed it that way after running the tests a few more times.

I inspected commit 1caef31:

> --- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
> +++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl

> -like($recovery_conf, qr/^standby_mode = 'on[']$/m, 'recovery.conf sets 
> standby_mode');
> -like($recovery_conf, qr/^primary_conninfo = '.*port=$ENV{PGPORT}.*'$/m, 
> 'recovery.conf sets primary_conninfo');
> -
> -command_ok([ 'pg_basebackup', '-D', "$tempdir/backupxf", '-X', 'fetch' ],
> +like(
> + $recovery_conf,
> + qr/^standby_mode = 'on[']$/m,
> + 'recovery.conf sets standby_mode');
> +like(
> + $recovery_conf,
> + qr/^primary_conninfo = '.*port=$ENV{PGPORT}.*'$/m,
> + 'recovery.conf sets primary_conninfo');

This now elicits a diagnostic:

  Use of uninitialized value $ENV{"PGPORT"} in regexp compilation at 
t/010_pg_basebackup.pl line 175, <> line 1.

> --- a/src/bin/pg_controldata/t/001_pg_controldata.pl
> +++ b/src/bin/pg_controldata/t/001_pg_controldata.pl

> -standard_initdb "$tempdir/data";
> -command_like([ 'pg_controldata', "$tempdir/data" ],
> +
> +my $node = get_new_node();
> +$node->init;
> +$node->start;

Before the commit, this test did not start a server.

> --- /dev/null
> +++ b/src/test/perl/PostgresNode.pm

> +sub _update_pid
> +{
> + my $self = shift;
> +
> + # If we can open the PID file, read its first line and that's the PID we
> + # want.  If the file cannot be opened, presumably the server is not
> + # running; don't be noisy in that case.
> + open my $pidfile, $self->data_dir . "/postmaster.pid";
> + if (not defined $pidfile)

$ grep -h 'at /.*line' src/bin/*/tmp_check/log/* |sort|uniq -c
  1 cannot remove directory for 
/data/nmisch/src/pg/postgresql/src/bin/scripts/tmp_testNNCZ: Directory not 
empty at /home/nmisch/sw/cpan/lib/perl5/File/Temp.pm line 784
  1 cannot remove directory for 
/data/nmisch/src/pg/postgresql/src/bin/scripts/tmp_testNNCZ/pgdata/base/13264: 
Directory not empty at /home/nmisch/sw/cpan/lib/perl5/File/Temp.pm line 784
  1 cannot remove directory for 
/data/nmisch/src/pg/postgresql/src/bin/scripts/tmp_testNNCZ/pgdata/base: 
Directory not empty at /home/nmisch/sw/cpan/lib/perl5/File/Temp.pm line 784
  1 cannot remove directory for 
/data/nmisch/src/pg/postgresql/src/bin/scripts/tmp_testNNCZ/pgdata: Directory 
not empty at /home/nmisch/sw/cpan/lib/perl5/File/Temp.pm line 784
 28 readline() on closed filehandle $pidfile at 
/data/nmisch/src/pg/postgresql/src/bin/pg_rewind/../../../src/test/perl/PostgresNode.pm
 line 308.
 28 Use of uninitialized value in concatenation (.) or string at 
/data/nmisch/src/pg/postgresql/src/bin/pg_rewind/../../../src/test/perl/PostgresNode.pm
 line 309.

$pidfile is always defined.  Test the open() return value.

> + {
> + $self->{_pid} = undef;
> + print "# No postmaster PID\n";
> + return;
> + }
> +
> + $self->{_pid} = <$pidfile>;

chomp() that value to remove its trailing newline.

> + print "# Postmaster PID is $self->{_pid}\n";
> + close $pidfile;
> +}

> + my $devnull = $TestLib::windows_os ? "nul" : "/dev/null";

Unused variable.

> +sub DESTROY
> +{
> + my $self = shift;
> + return if not defined $self->{_pid};
> + print "# signalling QUIT to $self->{_pid}\n";
> + kill 'QUIT', $self->{_pid};

On Windows, that kill() is the wrong thing.  I suspect "pg_ctl kill" will be
the right thing, but that warrants specific testing.


Postmaster log file names became less informative.  Before the commit:

-rw---. 1 nmisch nmisch 211265 2015-12-06 22:35:59.931114215 + 
regress_log_001_basic
-rw---. 1 nmisch nmisch 268887 2015-12-06 22:36:21.871165951 + 
regress_log_002_databases
-rw---. 1 nmisch nmisch 206808 2015-12-06 22:36:41.861213736 + 
regress_log_003_extrafiles
-rw---. 1 nmisch nmisch   7464 2015-12-06 22:37:00.371256795 + 
master.log
-rw---. 1 nmisch nmisch   6648 2015-12-06 22:37:01.381259211 + 
standby.log
-rw---. 1 nmisch nmisch 208561 2015-12-06 22:37:02.381261374 + 
regress_log_004_pg_xlog_symlink

After:

-rw---. 1 nmisch nmisch 219581 2015-12-06 23:00:50.504643175 + 
regress_log_001_basic
-rw---. 1 nmisch nmisch 276315 2015-12-06 23:01:12.924697085 + 
regress_log_002_databases
-rw---. 1 nmisch nmisch 213940 2015-12-06 23:01:33.574747195 + 
regress_log_003_extrafiles
-rw---. 1 nmisch nmisch   4114 2015-12-06 23:01:40.914764850 + 
node_57834.log
-rw---. 1 nmisch nmisch   6166 2015-12-06 23:01:43.184770615 + 
node_57833.log
-rw---. 1 nmisch nmisch   5550 2015-12-06 23:01:52.504792997 + 
node_57835.log

Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-06 Thread Michael Paquier
On Mon, Dec 7, 2015 at 12:06 PM, Noah Misch  wrote:
> On Wed, Dec 02, 2015 at 06:59:09PM -0300, Alvaro Herrera wrote:
>> --- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
>> +++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
>
>> -like($recovery_conf, qr/^standby_mode = 'on[']$/m, 'recovery.conf sets 
>> standby_mode');
>> -like($recovery_conf, qr/^primary_conninfo = '.*port=$ENV{PGPORT}.*'$/m, 
>> 'recovery.conf sets primary_conninfo');
>> -
>> -command_ok([ 'pg_basebackup', '-D', "$tempdir/backupxf", '-X', 'fetch' ],
>> +like(
>> + $recovery_conf,
>> + qr/^standby_mode = 'on[']$/m,
>> + 'recovery.conf sets standby_mode');
>> +like(
>> + $recovery_conf,
>> + qr/^primary_conninfo = '.*port=$ENV{PGPORT}.*'$/m,
>> + 'recovery.conf sets primary_conninfo');
>
> This now elicits a diagnostic:
>
>   Use of uninitialized value $ENV{"PGPORT"} in regexp compilation at 
> t/010_pg_basebackup.pl line 175, <> line 1.

Fixed.

>> --- a/src/bin/pg_controldata/t/001_pg_controldata.pl
>> +++ b/src/bin/pg_controldata/t/001_pg_controldata.pl
>
>> -standard_initdb "$tempdir/data";
>> -command_like([ 'pg_controldata', "$tempdir/data" ],
>> +
>> +my $node = get_new_node();
>> +$node->init;
>> +$node->start;
>
> Before the commit, this test did not start a server.

Fixed.

>> --- /dev/null
>> +++ b/src/test/perl/PostgresNode.pm
>
>> +sub _update_pid
>> +{
>> + my $self = shift;
>> +
>> + # If we can open the PID file, read its first line and that's the PID 
>> we
>> + # want.  If the file cannot be opened, presumably the server is not
>> + # running; don't be noisy in that case.
>> + open my $pidfile, $self->data_dir . "/postmaster.pid";
>> + if (not defined $pidfile)
>
> $ grep -h 'at /.*line' src/bin/*/tmp_check/log/* |sort|uniq -c
>   1 cannot remove directory for 
> /data/nmisch/src/pg/postgresql/src/bin/scripts/tmp_testNNCZ: Directory not 
> empty at /home/nmisch/sw/cpan/lib/perl5/File/Temp.pm line 784
>   1 cannot remove directory for 
> /data/nmisch/src/pg/postgresql/src/bin/scripts/tmp_testNNCZ/pgdata/base/13264:
>  Directory not empty at /home/nmisch/sw/cpan/lib/perl5/File/Temp.pm line 784
>   1 cannot remove directory for 
> /data/nmisch/src/pg/postgresql/src/bin/scripts/tmp_testNNCZ/pgdata/base: 
> Directory not empty at /home/nmisch/sw/cpan/lib/perl5/File/Temp.pm line 784
>   1 cannot remove directory for 
> /data/nmisch/src/pg/postgresql/src/bin/scripts/tmp_testNNCZ/pgdata: Directory 
> not empty at /home/nmisch/sw/cpan/lib/perl5/File/Temp.pm line 784
>  28 readline() on closed filehandle $pidfile at 
> /data/nmisch/src/pg/postgresql/src/bin/pg_rewind/../../../src/test/perl/PostgresNode.pm
>  line 308.
>  28 Use of uninitialized value in concatenation (.) or string at 
> /data/nmisch/src/pg/postgresql/src/bin/pg_rewind/../../../src/test/perl/PostgresNode.pm
>  line 309.
>
> $pidfile is always defined.  Test the open() return value.

That's something I have addressed here:
http://www.postgresql.org/message-id/cab7npqtop28zxv_sxfo2axgjoesfvllmo6syddafv0duvsf...@mail.gmail.com
I am including the fix as well here to do a group shot.

>> + {
>> + $self->{_pid} = undef;
>> + print "# No postmaster PID\n";
>> + return;
>> + }
>> +
>> + $self->{_pid} = <$pidfile>;
>
> chomp() that value to remove its trailing newline.

Right.

>> + print "# Postmaster PID is $self->{_pid}\n";
>> + close $pidfile;
>> +}
>
>> + my $devnull = $TestLib::windows_os ? "nul" : "/dev/null";
>
> Unused variable.

Removed.

>> +sub DESTROY
>> +{
>> + my $self = shift;
>> + return if not defined $self->{_pid};
>> + print "# signalling QUIT to $self->{_pid}\n";
>> + kill 'QUIT', $self->{_pid};
>
> On Windows, that kill() is the wrong thing.  I suspect "pg_ctl kill" will be
> the right thing, but that warrants specific testing.

I don't directly see any limitation with the use of kill on Windows..
http://perldoc.perl.org/functions/kill.html
But indeed using directly pg_ctl kill seems like a better fit for the
PG infrastructure.

> Postmaster log file names became less informative.  Before the commit:
> Should nodes get a name, so we instead see master_57834.log?

I am not sure that this is necessary. There is definitely a limitation
regarding the fact that log files of nodes get overwritten after each
test, hence I would tend with just appending the test name in front of
the node_* files similarly to the other files.

> See also the things Tom Lane identified in <27550.1449342...@sss.pgh.pa.us>.

Yep, I marked this email as something to address when it was sent.

On Sun, Dec 6, 2015 at 4:09 AM, Tom Lane  wrote:
> BTW, if we consider that gospel, why has PostgresNode.pm got this in its
> BEGIN block?
>
> # PGHOST is set once and for all through a single series of tests when
> # this module is loaded.
> $test_pghost =
>   

Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-05 Thread Tom Lane
Andrew Dunstan  writes:
> On 11/29/2015 04:28 PM, Noah Misch wrote:
>> Never mutate the filesystem in a BEGIN block, because "perl -c" runs BEGIN
>> blocks.  (Likewise for the BEGIN block this patch adds to TestLib.)

> Yeah, those two lines might belong in an INIT block. "perldoc perlmod" 
> for details.

BTW, if we consider that gospel, why has PostgresNode.pm got this in its
BEGIN block?

# PGHOST is set once and for all through a single series of tests when
# this module is loaded.
$test_pghost =
  $TestLib::windows_os ? "127.0.0.1" : TestLib::tempdir_short;
$ENV{PGHOST} = $test_pghost;

On non-Windows machines, the call of tempdir_short will result in
filesystem activity, ie creation of a directory.  Offhand it looks like
all of the activity in this BEGIN block could go to an INIT block instead.

I'm also bemused by why there was any question about this being wrong:

# XXX: Should this use PG_VERSION_NUM?
$last_port_assigned = 90600 % 16384 + 49152;

If that's not a hard-coded PG version number then I don't know
what it is.  Maybe it would be better to use random() instead,
but surely this isn't good as-is.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-05 Thread Michael Paquier
On Sat, Dec 5, 2015 at 1:11 AM, Alvaro Herrera  wrote:
> Michael Paquier wrote:
>
>> By the way, if there are no objections, I am going to mark this patch
>> as committed in the CF app. Putting in the infrastructure is already a
>> good step forward, and I will add an entry in next CF to track the
>> patch to add tests for recovery itself. Alvaro, what do you think?
>
> Feel free to do that, but I'm likely to look into it before the next CF
> anyway.  The thread about this has been open since 2013, so that doesn't
> seem unfair.

Let's see then. For the time being I have marked this patch as
committed, and created a new entry for the set of tests regarding
standbys:
https://commitfest.postgresql.org/8/438/
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-04 Thread Alvaro Herrera
Michael Paquier wrote:

> By the way, if there are no objections, I am going to mark this patch
> as committed in the CF app. Putting in the infrastructure is already a
> good step forward, and I will add an entry in next CF to track the
> patch to add tests for recovery itself. Alvaro, what do you think?

Feel free to do that, but I'm likely to look into it before the next CF
anyway.  The thread about this has been open since 2013, so that doesn't
seem unfair.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-04 Thread Alvaro Herrera
Noah Misch wrote:

>   git checkout 807b9e0
>   (find src -name \*.pl -o -name \*.pm ) | sort -u | xargs perltidy 
> --profile=src/tools/pgindent/perltidyrc
> 
> perltidy v20090616 leaves the working directory clean, but perltidy v20150815
> introduces diffs:
> 
>  src/backend/catalog/genbki.pl| 15 ---
>  src/bin/pg_basebackup/t/010_pg_basebackup.pl |  2 +-
>  src/tools/msvc/Install.pm|  6 +++---
>  src/tools/msvc/Mkvcbuild.pm  |  2 +-
>  src/tools/msvc/Project.pm|  2 +-
>  src/tools/msvc/Solution.pm   |  5 ++---
>  src/tools/msvc/gendef.pl |  4 ++--
>  7 files changed, 18 insertions(+), 18 deletions(-)
> 
> You see a different result?

Oh, you're right -- on that commit, I get the same results as you with
v20090616 (no changes).  I don't have v20150815; the version packaged by
Debian is v20140328, and Install.pm is not changed by that one (so I
only get 15 lines changed, not 18).

I think my confusion stems from code that was introduced after the last
pgindent run.  I guess I could have tidied the original files prior to
patching.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-03 Thread Michael Paquier
On Thu, Dec 3, 2015 at 3:44 PM, Michael Paquier
 wrote:
> On Thu, Dec 3, 2015 at 6:59 AM, Alvaro Herrera wrote:
>> I didn't push the changed for config_default you requested a few
>> messages upthread; it's not clear to me how setting it to undef affects
>> the whole thing.  If setting it to undef makes the MSVC toolchain run
>> the tap tests in the default config, then I can do it; let's be clear
>> about what branch to backpatch this to.  Also the "1;" at the end of
>> RewindTest.
>
>
> Setting it to undef will prevent the tests to run, per vcregress.pl:
> die "Tap tests not enabled in configuration"
>   unless $config->{tap_tests};
> Also, setting it to undef will match the existing behavior on
> platforms where ./configure is used because the switch
> --enable-tap-tests needs to be used there. And I would believe that in
> most cases Windows environments are not going to have IPC::Run
> deployed.
>
> I have also rebased the recovery test suite as the attached, using the
> infrastructure that has been committed lately.

By the way, if there are no objections, I am going to mark this patch
as committed in the CF app. Putting in the infrastructure is already a
good step forward, and I will add an entry in next CF to track the
patch to add tests for recovery itself. Alvaro, what do you think?
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-03 Thread Noah Misch
On Fri, Dec 04, 2015 at 02:47:36PM +0900, Michael Paquier wrote:
> On Fri, Dec 4, 2015 at 2:43 PM, Noah Misch  wrote:
> > On Wed, Dec 02, 2015 at 04:33:50PM -0300, Alvaro Herrera wrote:
> >> How did you figure
> >> that that was the version used, anyway?
> >
> > I asked Bruce at one point.
> 
> So we are trying to use the same version over the years to keep code
> consistent across back-branches?

No, I recall no policy discussion concerning this.

> Do you think we should try to use a
> newer version instead with each pgindent run? That would induce a
> rebasing cost when back-patching, but we cannot stay with the same
> version of perltidy forever either...

No.  I expect we'll implicitly change perltidy versions when Bruce upgrades
his OS.  Assuming future perltidy changes affect us not much more than past
changes (six years of perltidy development changed eighteen PostgreSQL source
lines), that's just fine.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-03 Thread Michael Paquier
On Fri, Dec 4, 2015 at 2:43 PM, Noah Misch  wrote:
> On Wed, Dec 02, 2015 at 04:33:50PM -0300, Alvaro Herrera wrote:
>> How did you figure
>> that that was the version used, anyway?
>
> I asked Bruce at one point.

So we are trying to use the same version over the years to keep code
consistent across back-branches? Do you think we should try to use a
newer version instead with each pgindent run? That would induce a
rebasing cost when back-patching, but we cannot stay with the same
version of perltidy forever either...
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-03 Thread Noah Misch
On Wed, Dec 02, 2015 at 04:33:50PM -0300, Alvaro Herrera wrote:
> Noah Misch wrote:
> > On Tue, Dec 01, 2015 at 08:11:21PM -0300, Alvaro Herrera wrote:
> > > Finally, I ran perltidy on all the files, which strangely changed stuff
> > > that I didn't expect it to change.  I wonder if this is related to the
> > > perltidy version.
> > 
> > The last pgindent run (commit 807b9e0) used perltidy v20090616, and perltidy
> > behavior has changed slightly over time.  Install that version to do your 
> > own
> > perltidy runs.
> 
> I tried that version, but it seems to emit the same.

  git checkout 807b9e0
  (find src -name \*.pl -o -name \*.pm ) | sort -u | xargs perltidy 
--profile=src/tools/pgindent/perltidyrc

perltidy v20090616 leaves the working directory clean, but perltidy v20150815
introduces diffs:

 src/backend/catalog/genbki.pl| 15 ---
 src/bin/pg_basebackup/t/010_pg_basebackup.pl |  2 +-
 src/tools/msvc/Install.pm|  6 +++---
 src/tools/msvc/Mkvcbuild.pm  |  2 +-
 src/tools/msvc/Project.pm|  2 +-
 src/tools/msvc/Solution.pm   |  5 ++---
 src/tools/msvc/gendef.pl |  4 ++--
 7 files changed, 18 insertions(+), 18 deletions(-)

You see a different result?

> How did you figure
> that that was the version used, anyway?

I asked Bruce at one point.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-03 Thread Michael Paquier
On Thu, Dec 3, 2015 at 6:59 AM, Alvaro Herrera  wrote:
> I didn't push the changes for config_default you requested a few
> messages upthread; it's not clear to me how setting it to undef affects
> the whole thing.  If setting it to undef makes the MSVC toolchain run
> the tap tests in the default config, then I can do it; let's be clear
> about what branch to backpatch this to.  Also the "1;" at the end of
> RewindTest.

Setting it to undef will prevent the tests to run, per vcregress.pl:
sub tap_check
{
die "Tap tests not enabled in configuration"
  unless $config->{tap_tests};
Also, setting it to undef will match the existing behavior on
platforms where ./configure is used because the switch
--enable-tap-tests needs to be used there. And I would believe that in
most cases Windows environments are not going to have IPC::Run
deployed.

I have also rebased the recovery test suite as the attached, using the
infrastructure that has been committed lately.
Regards,
-- 
Michael
From de2121eeb50c5ae49b29a2ac21a16579eae2de98 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Thu, 3 Dec 2015 13:48:48 +0900
Subject: [PATCH 1/2] Fix tap_test configuration in MSVC builds

---
 src/tools/msvc/config_default.pl | 27 ++-
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/src/tools/msvc/config_default.pl b/src/tools/msvc/config_default.pl
index b9f2ff4..e50be7e 100644
--- a/src/tools/msvc/config_default.pl
+++ b/src/tools/msvc/config_default.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 
 our $config = {
-	asserts => 0,# --enable-cassert
+	asserts  => 0,# --enable-cassert
 	  # integer_datetimes=>1,   # --enable-integer-datetimes - on is now default
 	  # float4byval=>1, # --disable-float4-byval, on by default
 
@@ -13,18 +13,19 @@ our $config = {
 	# blocksize => 8, # --with-blocksize, 8kB by default
 	# wal_blocksize => 8, # --with-wal-blocksize, 8kB by default
 	# wal_segsize => 16,  # --with-wal-segsize, 16MB by default
-	ldap => 1,# --with-ldap
-	extraver => undef,# --with-extra-version=
-	nls  => undef,# --enable-nls=
-	tcl  => undef,# --with-tls=
-	perl => undef,# --with-perl
-	python   => undef,# --with-python=
-	openssl  => undef,# --with-openssl=
-	uuid => undef,# --with-ossp-uuid
-	xml  => undef,# --with-libxml=
-	xslt => undef,# --with-libxslt=
-	iconv=> undef,# (not in configure, path to iconv)
-	zlib => undef # --with-zlib=
+	ldap  => 1,# --with-ldap
+	extraver  => undef,# --with-extra-version=
+	nls   => undef,# --enable-nls=
+	tap_tests => undef,# --enable-tap-tests
+	tcl   => undef,# --with-tls=
+	perl  => undef,# --with-perl
+	python=> undef,# --with-python=
+	openssl   => undef,# --with-openssl=
+	uuid  => undef,# --with-ossp-uuid
+	xml   => undef,# --with-libxml=
+	xslt  => undef,# --with-libxslt=
+	iconv => undef,# (not in configure, path to iconv)
+	zlib  => undef # --with-zlib=
 };
 
 1;
-- 
2.6.3

From 4ee2a99db2a414f85e961110279f4b97309cd927 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Thu, 3 Dec 2015 15:41:12 +0900
Subject: [PATCH 2/2] Add recovery test suite

This includes basic tests maipulating standbys, be they archiving or
streaming nodes, and some basic sanity checks around them.
---
 src/test/Makefile   |   2 +-
 src/test/perl/RecoveryTest.pm   | 167 
 src/test/recovery/.gitignore|   3 +
 src/test/recovery/Makefile  |  17 +++
 src/test/recovery/README|  19 
 src/test/recovery/t/001_stream_rep.pl   |  58 ++
 src/test/recovery/t/002_archiving.pl|  43 +++
 src/test/recovery/t/003_recovery_targets.pl | 123 
 src/test/recovery/t/004_timeline_switch.pl  |  66 +++
 src/test/recovery/t/005_replay_delay.pl |  41 +++
 10 files changed, 538 insertions(+), 1 deletion(-)
 create mode 100644 src/test/perl/RecoveryTest.pm
 create mode 100644 src/test/recovery/.gitignore
 create mode 100644 src/test/recovery/Makefile
 create mode 100644 src/test/recovery/README
 create mode 100644 src/test/recovery/t/001_stream_rep.pl
 create mode 100644 src/test/recovery/t/002_archiving.pl
 create mode 100644 src/test/recovery/t/003_recovery_targets.pl
 create mode 100644 src/test/recovery/t/004_timeline_switch.pl
 create mode 100644 src/test/recovery/t/005_replay_delay.pl

diff --git a/src/test/Makefile b/src/test/Makefile
index b713c2c..7f7754f 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,7 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = regress isolation modules
+SUBDIRS = regress isolation modules recovery
 
 # We don't build or 

Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-02 Thread Alvaro Herrera
Noah Misch wrote:
> On Tue, Dec 01, 2015 at 08:11:21PM -0300, Alvaro Herrera wrote:
> > Finally, I ran perltidy on all the files, which strangely changed stuff
> > that I didn't expect it to change.  I wonder if this is related to the
> > perltidy version.
> 
> The last pgindent run (commit 807b9e0) used perltidy v20090616, and perltidy
> behavior has changed slightly over time.  Install that version to do your own
> perltidy runs.

I tried that version, but it seems to emit the same.  How did you figure
that that was the version used, anyway?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-02 Thread Michael Paquier
On Thu, Dec 3, 2015 at 6:59 AM, Alvaro Herrera wrote:
> I didn't push the changed for config_default you requested a few
> messages upthread; it's not clear to me how setting it to undef affects
> the whole thing.  If setting it to undef makes the MSVC toolchain run
> the tap tests in the default config, then I can do it; let's be clear
> about what branch to backpatch this to.  Also the "1;" at the end of
> RewindTest.


Setting it to undef will prevent the tests to run, per vcregress.pl:
die "Tap tests not enabled in configuration"
  unless $config->{tap_tests};
Also, setting it to undef will match the existing behavior on
platforms where ./configure is used because the switch
--enable-tap-tests needs to be used there. And I would believe that in
most cases Windows environments are not going to have IPC::Run
deployed.

I have also rebased the recovery test suite as the attached, using the
infrastructure that has been committed lately.
Regards,
--
Michael
From de2121eeb50c5ae49b29a2ac21a16579eae2de98 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Thu, 3 Dec 2015 13:48:48 +0900
Subject: [PATCH 1/2] Fix tap_test configuration in MSVC builds

---
 src/tools/msvc/config_default.pl | 27 ++-
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/src/tools/msvc/config_default.pl b/src/tools/msvc/config_default.pl
index b9f2ff4..e50be7e 100644
--- a/src/tools/msvc/config_default.pl
+++ b/src/tools/msvc/config_default.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 
 our $config = {
-	asserts => 0,# --enable-cassert
+	asserts  => 0,# --enable-cassert
 	  # integer_datetimes=>1,   # --enable-integer-datetimes - on is now default
 	  # float4byval=>1, # --disable-float4-byval, on by default
 
@@ -13,18 +13,19 @@ our $config = {
 	# blocksize => 8, # --with-blocksize, 8kB by default
 	# wal_blocksize => 8, # --with-wal-blocksize, 8kB by default
 	# wal_segsize => 16,  # --with-wal-segsize, 16MB by default
-	ldap => 1,# --with-ldap
-	extraver => undef,# --with-extra-version=
-	nls  => undef,# --enable-nls=
-	tcl  => undef,# --with-tls=
-	perl => undef,# --with-perl
-	python   => undef,# --with-python=
-	openssl  => undef,# --with-openssl=
-	uuid => undef,# --with-ossp-uuid
-	xml  => undef,# --with-libxml=
-	xslt => undef,# --with-libxslt=
-	iconv=> undef,# (not in configure, path to iconv)
-	zlib => undef # --with-zlib=
+	ldap  => 1,# --with-ldap
+	extraver  => undef,# --with-extra-version=
+	nls   => undef,# --enable-nls=
+	tap_tests => undef,# --enable-tap-tests
+	tcl   => undef,# --with-tls=
+	perl  => undef,# --with-perl
+	python=> undef,# --with-python=
+	openssl   => undef,# --with-openssl=
+	uuid  => undef,# --with-ossp-uuid
+	xml   => undef,# --with-libxml=
+	xslt  => undef,# --with-libxslt=
+	iconv => undef,# (not in configure, path to iconv)
+	zlib  => undef # --with-zlib=
 };
 
 1;
-- 
2.6.3

From 4ee2a99db2a414f85e961110279f4b97309cd927 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Thu, 3 Dec 2015 15:41:12 +0900
Subject: [PATCH 2/2] Add recovery test suite

This includes basic tests maipulating standbys, be they archiving or
streaming nodes, and some basic sanity checks around them.
---
 src/test/Makefile   |   2 +-
 src/test/perl/RecoveryTest.pm   | 167 
 src/test/recovery/.gitignore|   3 +
 src/test/recovery/Makefile  |  17 +++
 src/test/recovery/README|  19 
 src/test/recovery/t/001_stream_rep.pl   |  58 ++
 src/test/recovery/t/002_archiving.pl|  43 +++
 src/test/recovery/t/003_recovery_targets.pl | 123 
 src/test/recovery/t/004_timeline_switch.pl  |  66 +++
 src/test/recovery/t/005_replay_delay.pl |  41 +++
 10 files changed, 538 insertions(+), 1 deletion(-)
 create mode 100644 src/test/perl/RecoveryTest.pm
 create mode 100644 src/test/recovery/.gitignore
 create mode 100644 src/test/recovery/Makefile
 create mode 100644 src/test/recovery/README
 create mode 100644 src/test/recovery/t/001_stream_rep.pl
 create mode 100644 src/test/recovery/t/002_archiving.pl
 create mode 100644 src/test/recovery/t/003_recovery_targets.pl
 create mode 100644 src/test/recovery/t/004_timeline_switch.pl
 create mode 100644 src/test/recovery/t/005_replay_delay.pl

diff --git a/src/test/Makefile b/src/test/Makefile
index b713c2c..7f7754f 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,7 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = regress isolation modules
+SUBDIRS = regress isolation modules recovery
 
 # We don't build or execute examples/, locale/, or thread/ by 

Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-02 Thread Alvaro Herrera
Michael Paquier wrote:

> Well, Alvaro has whispered me a more elegant method by using TestLib()
> to only import a portion of the routines and avoid the redefinition
> errors. Hence, patch 0001 attached creates equivalents of command_*
> for PostgresNode and tests use it without setting PGPORT. Patch 0002
> is a run of perltidy on the whole.

It seemed better to me to have the import list be empty, i.e. "use
TestLib ()" and then qualify the routine names inside PostgresNode,
instead of having to list the names of the routines to import, so I
pushed it that way after running the tests a few more times.

(Another option would be to have those routines be in EXPORT_OK instead
of EXPORT, but then every other user of TestLib would have to
explicitely declare that it wants those routines imported.  Maybe this
is a good change; if anyone wants to push for that, patches welcome.)

I didn't push the changed for config_default you requested a few
messages upthread; it's not clear to me how setting it to undef affects
the whole thing.  If setting it to undef makes the MSVC toolchain run
the tap tests in the default config, then I can do it; let's be clear
about what branch to backpatch this to.  Also the "1;" at the end of
RewindTest.

Thanks for the patch.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-01 Thread Alvaro Herrera
Michael Paquier wrote:

> OK... I have merged TestLib and PostgresNode of the previous patch
> into PostgresNode into the way suggested by Noah. TestBase has been
> renamed back to TestLib, and includes as well the base test functions
> like command_ok.

Great, thanks.  Here's one more version, hopefully the last one.

- I discovered that not setting PGPORT was causing some of the tests
  that fail (using command_fails) to fail to test what was being tested.
  The problem is that the command was failing with "could not connect to
  server" instead of failing because of trying to cluster a nonexistant
  table, etc.  Unfortunately the only way to verify this is by looking
  at the regress_log_xxx_foobar file.  Not ideal, but not this patch's
  fault.

- I changed the routines moved to PostgresNode so that they are instance
  methods, i.e. $node->poll_until_query; also psql and
  issues_query_like.  The latter also sets "local $PGPORT" so that the
  command is run with the correct port.

- It would be nice to have command_ok and command_fails in PostgresNode
  too; that would remove the need for setting $ENV{PGPORT} but it's
  possible to run commands outside a node too, so we'd need duplicates,
  which would be worse.

- I changed start/stop/restart so that they keep track of the postmaster
  PID; also added a DESTROY sub to PostgresNode that sends SIGQUIT.
  This means that when the test finishes, the server gets an immediate
  stop signal.  We were getting a lot of errors in the server log about
  failing to write to the stats file otherwise, until the node noticed
  that the datadir was gone.

- I removed the @active_nodes array, which is now unnecessary (per
  above).

- I moved the "delete $ENV{PGxxx}" back to a BEGIN block in TestLib.
  I did it because it's possible to write test scripts without
  PostgresNode, and it's not nice to have those pick up the environment.
  This still affects anything using PostgresNode because that one uses
  TestLib.

Finally, I ran perltidy on all the files, which strangely changed stuff
that I didn't expect it to change.  I wonder if this is related to the
perltidy version.  Do you see further changes if you run perltidy on the
patched tree?  This is my version:

$ perltidy --version
This is perltidy, v20140328 

Copyright 2000-2014, Steve Hancock

Perltidy is free software and may be copied under the terms of the GNU
General Public License, which is included in the distribution files.

Complete documentation for perltidy can be found using 'man perltidy'
or on the internet at http://perltidy.sourceforge.net.


-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/bin/initdb/t/001_initdb.pl b/src/bin/initdb/t/001_initdb.pl
index 299dcf5..f64186d 100644
--- a/src/bin/initdb/t/001_initdb.pl
+++ b/src/bin/initdb/t/001_initdb.pl
@@ -4,6 +4,7 @@
 
 use strict;
 use warnings;
+use PostgresNode;
 use TestLib;
 use Test::More tests => 14;
 
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index dc96bbf..86cc14e 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -2,6 +2,7 @@ use strict;
 use warnings;
 use Cwd;
 use Config;
+use PostgresNode;
 use TestLib;
 use Test::More tests => 51;
 
@@ -9,8 +10,15 @@ program_help_ok('pg_basebackup');
 program_version_ok('pg_basebackup');
 program_options_handling_ok('pg_basebackup');
 
-my $tempdir = tempdir;
-start_test_server $tempdir;
+my $tempdir = TestLib::tempdir;
+
+my $node = get_new_node();
+# Initialize node without replication settings
+$node->init(hba_permit_replication => 0);
+$node->start;
+my $pgdata = $node->data_dir;
+
+$ENV{PGPORT} = $node->port;
 
 command_fails(['pg_basebackup'],
 	'pg_basebackup needs target directory specified');
@@ -26,19 +34,19 @@ if (open BADCHARS, ">>$tempdir/pgdata/FOO\xe0\xe0\xe0BAR")
 	close BADCHARS;
 }
 
-configure_hba_for_replication "$tempdir/pgdata";
-system_or_bail 'pg_ctl', '-D', "$tempdir/pgdata", 'reload';
+$node->set_replication_conf();
+system_or_bail 'pg_ctl', '-D', $pgdata, 'reload';
 
 command_fails(
 	[ 'pg_basebackup', '-D', "$tempdir/backup" ],
 	'pg_basebackup fails because of WAL configuration');
 
-open CONF, ">>$tempdir/pgdata/postgresql.conf";
+open CONF, ">>$pgdata/postgresql.conf";
 print CONF "max_replication_slots = 10\n";
 print CONF "max_wal_senders = 10\n";
 print CONF "wal_level = archive\n";
 close CONF;
-restart_test_server;
+$node->restart;
 
 command_ok([ 'pg_basebackup', '-D', "$tempdir/backup" ],
 	'pg_basebackup runs');
@@ -81,13 +89,13 @@ command_fails(
 
 # Tar format doesn't support filenames longer than 100 bytes.
 my $superlongname = "superlongname_" . ("x" x 100);
-my $superlongpath = "$tempdir/pgdata/$superlongname";
+my $superlongpath = "$pgdata/$superlongname";
 
 open FILE, ">$superlongpath" or die "unable to create file $superlongpath";
 

Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-01 Thread Michael Paquier
On Wed, Dec 2, 2015 at 8:11 AM, Alvaro Herrera  wrote:
> - I discovered that not setting PGPORT was causing some of the tests
>   that fail (using command_fails) to fail to test what was being tested.
>   The problem is that the command was failing with "could not connect to
>   server" instead of failing because of trying to cluster a nonexistant
>   table, etc.  Unfortunately the only way to verify this is by looking
>   at the regress_log_xxx_foobar file.  Not ideal, but not this patch's
>   fault.

Nice catch.

> - I changed the routines moved to PostgresNode so that they are instance
>   methods, i.e. $node->poll_until_query; also psql and
>   issues_query_like.  The latter also sets "local $PGPORT" so that the
>   command is run with the correct port.

OK.

> - It would be nice to have command_ok and command_fails in PostgresNode
>   too; that would remove the need for setting $ENV{PGPORT} but it's
>   possible to run commands outside a node too, so we'd need duplicates,
>   which would be worse.

I am fine to let it the way your patch does it. There are already many changes.

> - I removed the @active_nodes array, which is now unnecessary (per
>   above).

So that's basically replaced by @all_nodes.

> - I moved the "delete $ENV{PGxxx}" back to a BEGIN block in TestLib.
>   I did it because it's possible to write test scripts without
>   PostgresNode, and it's not nice to have those pick up the environment.
>   This still affects anything using PostgresNode because that one uses
>   TestLib.

OK.

> Finally, I ran perltidy on all the files, which strangely changed stuff
> that I didn't expect it to change.  I wonder if this is related to the
> perltidy version.  Do you see further changes if you run perltidy on the
> patched tree?

SimpleTee.pm shows some diffs, though it doesn't seem that this patch
should care about that. The rest is showing no diffs. And I have used
perltidy v20140711.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-01 Thread Alvaro Herrera
Michael Paquier wrote:
> On Wed, Dec 2, 2015 at 8:11 AM, Alvaro Herrera  
> wrote:

> > - It would be nice to have command_ok and command_fails in PostgresNode
> >   too; that would remove the need for setting $ENV{PGPORT} but it's
> >   possible to run commands outside a node too, so we'd need duplicates,
> >   which would be worse.
> 
> I am fine to let it the way your patch does it. There are already many 
> changes.

Idea: we can have a bare command_ok exported by TestLib just as
currently, and instance method PostgresNode->command_ok that first sets
local $ENV{PGPORT} and then calls the other one.

> > - I removed the @active_nodes array, which is now unnecessary (per
> >   above).
> 
> So that's basically replaced by @all_nodes.

@all_nodes is only used to look for unused port numbers.

> > Finally, I ran perltidy on all the files, which strangely changed stuff
> > that I didn't expect it to change.  I wonder if this is related to the
> > perltidy version.  Do you see further changes if you run perltidy on the
> > patched tree?
> 
> SimpleTee.pm shows some diffs, though it doesn't seem that this patch
> should care about that. The rest is showing no diffs. And I have used
> perltidy v20140711.

Yes, the patch doesn't modify SimpleTee -- I just used "find" to indent
perl files.  What I don't understand is why doesn't my perltidy run
match what was in master currently.  It should be a no-op ...

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-01 Thread Michael Paquier
On Wed, Dec 2, 2015 at 1:04 PM, Michael Paquier
 wrote:
> On Wed, Dec 2, 2015 at 12:01 PM, Alvaro Herrera
>  wrote:
>> Michael Paquier wrote:
>>> On Wed, Dec 2, 2015 at 8:11 AM, Alvaro Herrera  
>>> wrote:
>>
>>> > - It would be nice to have command_ok and command_fails in PostgresNode
>>> >   too; that would remove the need for setting $ENV{PGPORT} but it's
>>> >   possible to run commands outside a node too, so we'd need duplicates,
>>> >   which would be worse.
>>>
>>> I am fine to let it the way your patch does it. There are already many 
>>> changes.
>>
>> Idea: we can have a bare command_ok exported by TestLib just as
>> currently, and instance method PostgresNode->command_ok that first sets
>> local $ENV{PGPORT} and then calls the other one.
>
> Hm. That would be cleaner and make the code more consistent. Now as
> TestLib exports command_ok, command_like and command_fails, we would
> get redefinition errors when compiling the code if those routines are
> not named differently in PostgresNode. If you want to have the names
> consistent, then I guess that the only way would be to remove those
> routines from the export list of TestLib and call them directly as for
> example TestLib::command_ok(). See for example the patch attached that
> applies on top on your patch 2 that adds a set of routines in
> PostgresNode with a slightly different name.

Well, Alvaro has whispered me a more elegant method by using TestLib()
to only import a portion of the routines and avoid the redefinition
errors. Hence, patch 0001 attached creates equivalents of command_*
for PostgresNode and tests use it without setting PGPORT. Patch 0002
is a run of perltidy on the whole.
-- 
Michael
From 970423c6e5c3c30f6ecd82d8691437a2aa4acc87 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Wed, 2 Dec 2015 15:12:03 +0900
Subject: [PATCH 1/2] Rework TAP test infrastructure to ease node management

This is aimed to being used extensively by modules and features to
test features that require a bit more than a standalone node.
---
 src/bin/initdb/t/001_initdb.pl |   1 +
 src/bin/pg_basebackup/t/010_pg_basebackup.pl   | 104 +++---
 src/bin/pg_controldata/t/001_pg_controldata.pl |  11 +-
 src/bin/pg_ctl/t/001_start_stop.pl |   2 +
 src/bin/pg_ctl/t/002_status.pl |  13 +-
 src/bin/pg_rewind/RewindTest.pm| 191 +++---
 src/bin/pg_rewind/t/003_extrafiles.pl  |   3 +-
 src/bin/pg_rewind/t/004_pg_xlog_symlink.pl |   4 +-
 src/bin/scripts/t/010_clusterdb.pl |  23 +-
 src/bin/scripts/t/011_clusterdb_all.pl |  13 +-
 src/bin/scripts/t/020_createdb.pl  |  13 +-
 src/bin/scripts/t/030_createlang.pl|  21 +-
 src/bin/scripts/t/040_createuser.pl|  17 +-
 src/bin/scripts/t/050_dropdb.pl|  13 +-
 src/bin/scripts/t/060_droplang.pl  |  11 +-
 src/bin/scripts/t/070_dropuser.pl  |  13 +-
 src/bin/scripts/t/080_pg_isready.pl|   9 +-
 src/bin/scripts/t/090_reindexdb.pl |  23 +-
 src/bin/scripts/t/091_reindexdb_all.pl |  10 +-
 src/bin/scripts/t/100_vacuumdb.pl  |  17 +-
 src/bin/scripts/t/101_vacuumdb_all.pl  |  10 +-
 src/bin/scripts/t/102_vacuumdb_stages.pl   |  13 +-
 src/test/perl/PostgresNode.pm  | 471 +
 src/test/perl/RecursiveCopy.pm |  42 +++
 src/test/perl/TestLib.pm   | 317 ++---
 src/test/ssl/ServerSetup.pm|  34 +-
 src/test/ssl/t/001_ssltests.pl |  32 +-
 27 files changed, 911 insertions(+), 520 deletions(-)
 create mode 100644 src/test/perl/PostgresNode.pm
 create mode 100644 src/test/perl/RecursiveCopy.pm

diff --git a/src/bin/initdb/t/001_initdb.pl b/src/bin/initdb/t/001_initdb.pl
index 299dcf5..f64186d 100644
--- a/src/bin/initdb/t/001_initdb.pl
+++ b/src/bin/initdb/t/001_initdb.pl
@@ -4,6 +4,7 @@
 
 use strict;
 use warnings;
+use PostgresNode;
 use TestLib;
 use Test::More tests => 14;
 
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index dc96bbf..46c8a71 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -2,6 +2,7 @@ use strict;
 use warnings;
 use Cwd;
 use Config;
+use PostgresNode;
 use TestLib;
 use Test::More tests => 51;
 
@@ -9,12 +10,17 @@ program_help_ok('pg_basebackup');
 program_version_ok('pg_basebackup');
 program_options_handling_ok('pg_basebackup');
 
-my $tempdir = tempdir;
-start_test_server $tempdir;
+my $tempdir = TestLib::tempdir;
 
-command_fails(['pg_basebackup'],
+my $node = get_new_node();
+# Initialize node without replication settings
+$node->init(hba_permit_replication => 0);
+$node->start;
+my $pgdata = $node->data_dir;
+

Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-01 Thread Noah Misch
On Tue, Dec 01, 2015 at 08:11:21PM -0300, Alvaro Herrera wrote:
> Finally, I ran perltidy on all the files, which strangely changed stuff
> that I didn't expect it to change.  I wonder if this is related to the
> perltidy version.

The last pgindent run (commit 807b9e0) used perltidy v20090616, and perltidy
behavior has changed slightly over time.  Install that version to do your own
perltidy runs.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-12-01 Thread Michael Paquier
On Wed, Dec 2, 2015 at 12:01 PM, Alvaro Herrera
 wrote:
> Michael Paquier wrote:
>> On Wed, Dec 2, 2015 at 8:11 AM, Alvaro Herrera  
>> wrote:
>
>> > - It would be nice to have command_ok and command_fails in PostgresNode
>> >   too; that would remove the need for setting $ENV{PGPORT} but it's
>> >   possible to run commands outside a node too, so we'd need duplicates,
>> >   which would be worse.
>>
>> I am fine to let it the way your patch does it. There are already many 
>> changes.
>
> Idea: we can have a bare command_ok exported by TestLib just as
> currently, and instance method PostgresNode->command_ok that first sets
> local $ENV{PGPORT} and then calls the other one.

Hm. That would be cleaner and make the code more consistent. Now as
TestLib exports command_ok, command_like and command_fails, we would
get redefinition errors when compiling the code if those routines are
not named differently in PostgresNode. If you want to have the names
consistent, then I guess that the only way would be to remove those
routines from the export list of TestLib and call them directly as for
example TestLib::command_ok(). See for example the patch attached that
applies on top on your patch 2 that adds a set of routines in
PostgresNode with a slightly different name.

>> > Finally, I ran perltidy on all the files, which strangely changed stuff
>> > that I didn't expect it to change.  I wonder if this is related to the
>> > perltidy version.  Do you see further changes if you run perltidy on the
>> > patched tree?
>>
>> SimpleTee.pm shows some diffs, though it doesn't seem that this patch
>> should care about that. The rest is showing no diffs. And I have used
>> perltidy v20140711.
>
> Yes, the patch doesn't modify SimpleTee -- I just used "find" to indent
> perl files.  What I don't understand is why doesn't my perltidy run
> match what was in master currently.  It should be a no-op ...

Well I don't get it either :)
I did a run on top of your two patches and saw no differences.
-- 
Michael
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index bc4afce..96dd103 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -19,11 +19,10 @@ $node->init(hba_permit_replication => 0);
 $node->start;
 my $pgdata = $node->data_dir;
 
-$ENV{PGPORT} = $node->port;
-
-command_fails(['pg_basebackup'],
+$node->node_command_fails(
+	['pg_basebackup'],
 	'pg_basebackup needs target directory specified');
-command_fails(
+$node->node_command_fails(
 	[ 'pg_basebackup', '-D', "$tempdir/backup" ],
 	'pg_basebackup fails because of hba');
 
@@ -38,7 +37,7 @@ if (open BADCHARS, ">>$tempdir/pgdata/FOO\xe0\xe0\xe0BAR")
 $node->set_replication_conf();
 system_or_bail 'pg_ctl', '-D', $pgdata, 'reload';
 
-command_fails(
+$node->node_command_fails(
 	[ 'pg_basebackup', '-D', "$tempdir/backup" ],
 	'pg_basebackup fails because of WAL configuration');
 
@@ -49,7 +48,7 @@ print CONF "wal_level = archive\n";
 close CONF;
 $node->restart;
 
-command_ok([ 'pg_basebackup', '-D', "$tempdir/backup" ],
+$node->node_command_ok([ 'pg_basebackup', '-D', "$tempdir/backup" ],
 	'pg_basebackup runs');
 ok(-f "$tempdir/backup/PG_VERSION", 'backup was created');
 
@@ -58,34 +57,34 @@ is_deeply(
 	[ sort qw(. .. archive_status) ],
 	'no WAL files copied');
 
-command_ok(
+$node->node_command_ok(
 	[   'pg_basebackup', '-D', "$tempdir/backup2", '--xlogdir',
 		"$tempdir/xlog2" ],
 	'separate xlog directory');
 ok(-f "$tempdir/backup2/PG_VERSION", 'backup was created');
 ok(-d "$tempdir/xlog2/", 'xlog directory was created');
 
-command_ok([ 'pg_basebackup', '-D', "$tempdir/tarbackup", '-Ft' ],
+$node->node_command_ok([ 'pg_basebackup', '-D', "$tempdir/tarbackup", '-Ft' ],
 	'tar format');
 ok(-f "$tempdir/tarbackup/base.tar", 'backup tar was created');
 
-command_fails(
+$node->node_command_fails(
 	[ 'pg_basebackup', '-D', "$tempdir/backup_foo", '-Fp', "-T=/foo" ],
 	'-T with empty old directory fails');
-command_fails(
+$node->node_command_fails(
 	[ 'pg_basebackup', '-D', "$tempdir/backup_foo", '-Fp', "-T/foo=" ],
 	'-T with empty new directory fails');
-command_fails(
+$node->node_command_fails(
 	[   'pg_basebackup', '-D', "$tempdir/backup_foo", '-Fp',
 		"-T/foo=/bar=/baz" ],
 	'-T with multiple = fails');
-command_fails(
+$node->node_command_fails(
 	[ 'pg_basebackup', '-D', "$tempdir/backup_foo", '-Fp', "-Tfoo=/bar" ],
 	'-T with old directory not absolute fails');
-command_fails(
+$node->node_command_fails(
 	[ 'pg_basebackup', '-D', "$tempdir/backup_foo", '-Fp', "-T/foo=bar" ],
 	'-T with new directory not absolute fails');
-command_fails(
+$node->node_command_fails(
 	[ 'pg_basebackup', '-D', "$tempdir/backup_foo", '-Fp', "-Tfoo" ],
 	'-T with invalid format fails');
 
@@ -95,7 +94,7 @@ my $superlongpath = "$pgdata/$superlongname";
 
 open FILE, ">$superlongpath" or die "unable 

Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-30 Thread Michael Paquier
On Mon, Nov 30, 2015 at 11:51 PM, Alvaro Herrera
 wrote:
> Michael Paquier wrote:
>> On Mon, Nov 30, 2015 at 6:28 AM, Noah Misch  wrote:
>
>> > The proposed code is short on guidance about when to put a function in 
>> > TestLib
>> > versus TestBase.  TestLib has no header comment.  The TestBase header 
>> > comment
>> > would permit, for example, command_ok() in that module.  I would try 
>> > instead
>> > keeping TestLib as the base module and moving into PostgresNode the 
>> > functions
>> > that deal with PostgreSQL clusters (get_new_node teardown_node psql
>> > poll_query_until issues_sql_like).
>>
>> PostgresNode is wanted to be a base representation of how of node is,
>> not of how to operate on it. The ways to perform the tests, which
>> works on a node, is wanted as a higher-level operation.
>>
>> Logging and base configuration of a test set is a lower level of
>> operations than PostgresNode, because cluster nodes need actually to
>> perform system calls, some of those system calls like run_log allowing
>> to log in the centralized log file. I have tried to make the headers
>> of those modules more verbose, please see attached.
>
> I'm not terribly convinced by this argument TBH.  Perhaps we can have
> PostgresNode be one package, and the logging routines be another
> package, and we create a higher-level package whose @ISA=(PostgresNode,
> LoggingWhatever) and then we move the routines suggested by Noah into
> that new package.  Then the tests use that instead of PostgresNode
> directly.

OK... I have merged TestLib and PostgresNode of the previous patch
into PostgresNode into the way suggested by Noah. TestBase has been
renamed back to TestLib, and includes as well the base test functions
like command_ok.
Regards,
-- 
Michael
diff --git a/src/bin/initdb/t/001_initdb.pl b/src/bin/initdb/t/001_initdb.pl
index 299dcf5..f64186d 100644
--- a/src/bin/initdb/t/001_initdb.pl
+++ b/src/bin/initdb/t/001_initdb.pl
@@ -4,6 +4,7 @@
 
 use strict;
 use warnings;
+use PostgresNode;
 use TestLib;
 use Test::More tests => 14;
 
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index dc96bbf..c7eb250 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -2,6 +2,7 @@ use strict;
 use warnings;
 use Cwd;
 use Config;
+use PostgresNode;
 use TestLib;
 use Test::More tests => 51;
 
@@ -9,8 +10,15 @@ program_help_ok('pg_basebackup');
 program_version_ok('pg_basebackup');
 program_options_handling_ok('pg_basebackup');
 
-my $tempdir = tempdir;
-start_test_server $tempdir;
+my $tempdir = TestLib::tempdir;
+
+my $node = get_new_node();
+# Initialize node without replication settings
+$node->init(hba_permit_replication => 0);
+$node->start;
+my $pgdata = $node->data_dir;
+
+$ENV{PGPORT} = $node->port;
 
 command_fails(['pg_basebackup'],
 	'pg_basebackup needs target directory specified');
@@ -26,19 +34,19 @@ if (open BADCHARS, ">>$tempdir/pgdata/FOO\xe0\xe0\xe0BAR")
 	close BADCHARS;
 }
 
-configure_hba_for_replication "$tempdir/pgdata";
-system_or_bail 'pg_ctl', '-D', "$tempdir/pgdata", 'reload';
+$node->set_replication_conf();
+system_or_bail 'pg_ctl', '-D', $pgdata, 'reload';
 
 command_fails(
 	[ 'pg_basebackup', '-D', "$tempdir/backup" ],
 	'pg_basebackup fails because of WAL configuration');
 
-open CONF, ">>$tempdir/pgdata/postgresql.conf";
+open CONF, ">>$pgdata/postgresql.conf";
 print CONF "max_replication_slots = 10\n";
 print CONF "max_wal_senders = 10\n";
 print CONF "wal_level = archive\n";
 close CONF;
-restart_test_server;
+$node->restart;
 
 command_ok([ 'pg_basebackup', '-D', "$tempdir/backup" ],
 	'pg_basebackup runs');
@@ -81,13 +89,13 @@ command_fails(
 
 # Tar format doesn't support filenames longer than 100 bytes.
 my $superlongname = "superlongname_" . ("x" x 100);
-my $superlongpath = "$tempdir/pgdata/$superlongname";
+my $superlongpath = "$pgdata/$superlongname";
 
 open FILE, ">$superlongpath" or die "unable to create file $superlongpath";
 close FILE;
 command_fails([ 'pg_basebackup', '-D', "$tempdir/tarbackup_l1", '-Ft' ],
 	'pg_basebackup tar with long name fails');
-unlink "$tempdir/pgdata/$superlongname";
+unlink "$pgdata/$superlongname";
 
 # The following tests test symlinks. Windows doesn't have symlinks, so
 # skip on Windows.
@@ -98,7 +106,7 @@ SKIP: {
 	# to our physical temp location.  That way we can use shorter names
 	# for the tablespace directories, which hopefully won't run afoul of
 	# the 99 character length limit.
-	my $shorter_tempdir = tempdir_short . "/tempdir";
+	my $shorter_tempdir = TestLib::tempdir_short . "/tempdir";
 	symlink "$tempdir", $shorter_tempdir;
 
 	mkdir "$tempdir/tblspc1";
@@ -120,7 +128,7 @@ SKIP: {
 			"-T$shorter_tempdir/tblspc1=$tempdir/tbackup/tblspc1" ],
 		'plain format with tablespaces succeeds with tablespace mapping');
 	ok(-d "$tempdir/tbackup/tblspc1", 'tablespace was relocated');
-	

Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-30 Thread Alvaro Herrera
Michael Paquier wrote:
> On Mon, Nov 30, 2015 at 6:28 AM, Noah Misch  wrote:

> > The proposed code is short on guidance about when to put a function in 
> > TestLib
> > versus TestBase.  TestLib has no header comment.  The TestBase header 
> > comment
> > would permit, for example, command_ok() in that module.  I would try instead
> > keeping TestLib as the base module and moving into PostgresNode the 
> > functions
> > that deal with PostgreSQL clusters (get_new_node teardown_node psql
> > poll_query_until issues_sql_like).
> 
> PostgresNode is wanted to be a base representation of how of node is,
> not of how to operate on it. The ways to perform the tests, which
> works on a node, is wanted as a higher-level operation.
> 
> Logging and base configuration of a test set is a lower level of
> operations than PostgresNode, because cluster nodes need actually to
> perform system calls, some of those system calls like run_log allowing
> to log in the centralized log file. I have tried to make the headers
> of those modules more verbose, please see attached.

I'm not terribly convinced by this argument TBH.  Perhaps we can have
PostgresNode be one package, and the logging routines be another
package, and we create a higher-level package whose @ISA=(PostgresNode,
LoggingWhatever) and then we move the routines suggested by Noah into
that new package.  Then the tests use that instead of PostgresNode
directly.


> > > --- a/src/bin/pg_rewind/t/001_basic.pl
> > > +++ b/src/bin/pg_rewind/t/001_basic.pl
> > > @@ -1,9 +1,11 @@
> > > +# Basic pg_rewind test.
> > > +
> > >  use strict;
> > >  use warnings;
> > > -use TestLib;
> > > -use Test::More tests => 8;
> > >
> > >  use RewindTest;
> > > +use TestLib;
> > > +use Test::More tests => 8;
> >
> > Revert all changes to this file.  Audit the rest of the patch for whitespace
> > change unrelated to the subject.
> 
> Done.

I perltidied several files, though not consistently.  Regarding this
particular hunk, what is going on here is that I moved "use strict;use
warnings" as one stanza, followed by all the other "use" lines as
another stanza, alphabetically.  It was previously a bit messy, with
@EXPORTS and other stuff in between "use" lines.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-29 Thread Andrew Dunstan



On 11/29/2015 04:28 PM, Noah Misch wrote:

+BEGIN
+{
+   $windows_os = $Config{osname} eq 'MSWin32' || $Config{osname} eq 'msys';
+
+   # Determine output directories, and create them.  The base path is the
+   # TESTDIR environment variable, which is normally set by the invoking
+   # Makefile.
+   $tmp_check = $ENV{TESTDIR} ? "$ENV{TESTDIR}/tmp_check" : "tmp_check";
+   $log_path = "$tmp_check/log";
+
+   mkdir $tmp_check;
+   mkdir $log_path;
Never mutate the filesystem in a BEGIN block, because "perl -c" runs BEGIN
blocks.  (Likewise for the BEGIN block this patch adds to TestLib.)



Yeah, those two lines might belong in an INIT block. "perldoc perlmod" 
for details.



cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-29 Thread Noah Misch
On Fri, Nov 27, 2015 at 07:53:10PM -0300, Alvaro Herrera wrote:
> Michael Paquier wrote:
> > The result of a couple of hours of hacking is attached:
> > - 0001 is the refactoring adding PostgresNode and RecursiveCopy. I have
> > also found that it is quite advantageous to move some of the routines that
> > are synonyms of system() and the stuff used for logging into another
> > low-level library that PostgresNode depends on, that I called TestBase in
> > this patch.

> Here's another version of this.  I changed the packages a bit more.  For
> starters, I moved the routines around a bit; some of your choices seemed
> more about keeping stuff where it was originally rather than moving it
> to where it made sense.  These are the routines in each module:
> 
> TestBase:  system_or_bail system_log run_log slurp_dir slurp_file
> append_to_file
> 
> TestLib:get_new_node teardown_node psql poll_query_until command_ok
> command_fails command_exit_is program_help_ok program_version_ok
> program_options_handling_ok command_like issues_sql_like

The proposed code is short on guidance about when to put a function in TestLib
versus TestBase.  TestLib has no header comment.  The TestBase header comment
would permit, for example, command_ok() in that module.  I would try instead
keeping TestLib as the base module and moving into PostgresNode the functions
that deal with PostgreSQL clusters (get_new_node teardown_node psql
poll_query_until issues_sql_like).

> I tried to get rid of teardown_node by having a DESTROY method for
> PostgresNode; that method would call "pg_ctl stop -m immediate".  That
> would have been much cleaner.  However, when a test fails this doesn't
> work sanely because the END block for File::Temp runs earlier than that
> DESTROY block, which means the datadir is already gone by the time
> pg_ctl stop runs, so the node stop doesn't work at all.  (Perhaps we
> could fix this by noting postmaster's PID at start time, and then
> sending a signal directly instead of relying on pg_ctl).

You could disable File::Temp cleanup and handle cleanup yourself at the
desired time.  (I haven't reviewed whether the goal of removing teardown_node
is otherwise good.)

> +my $node = get_new_node();
> +# Initialize node without replication settings
> +$node->initNode(0);
> +$node->startNode();
> +my $pgdata = $node->getDataDir();
> +
> +$ENV{PGPORT} = $node->getPort();

Starting a value retrieval method name with "get" is not Perlish.  The TAP
suites currently follow "man perlstyle" in using underscored_lower_case method
names.  No PostgreSQL Perl code uses lowerFirstCamelCase, though some uses
CamelCase.  The word "Node" is redundant.  Use this style:

  $node->init(0);
  $node->start;
  my $pgdata = $node->data_dir;
  $ENV{PGPORT} = $node->port;

As a matter of opinion, I recommend giving "init" key/value arguments instead
of the single Boolean argument.  The method could easily need more options in
the future, and this makes the call site self-documenting:

  $node->init(hba_permit_replication => 0);

> - 'pg_controldata with nonexistent directory fails');
> +   'pg_controldata with nonexistent directory fails');

perltidy will undo this whitespace-only change.

> --- a/src/bin/pg_rewind/t/001_basic.pl
> +++ b/src/bin/pg_rewind/t/001_basic.pl
> @@ -1,9 +1,11 @@
> +# Basic pg_rewind test.
> +
>  use strict;
>  use warnings;
> -use TestLib;
> -use Test::More tests => 8;
>  
>  use RewindTest;
> +use TestLib;
> +use Test::More tests => 8;

Revert all changes to this file.  Audit the rest of the patch for whitespace
change unrelated to the subject.

> - 'fails with nonexistent table');
> +   'fails with nonexistent table');

> -'CREATE TABLE test1 (a int); CREATE INDEX test1x ON test1 (a); CLUSTER test1 
> USING test1x';
> + 'CREATE TABLE test1 (a int); CREATE INDEX test1x ON test1 (a); CLUSTER 
> test1 USING test1x';

perltidy will undo these whitespace-only changes.

> +# cluster -a is not compatible with -d, hence enforce environment variables

s/cluster -a/clusterdb -a/

> -issues_sql_like(
> +$ENV{PGPORT} = $node->getPort();
> +
> +issues_sql_like($node,

perltidy will move $node to its own line.

> -command_fails([ 'createuser', 'user1' ], 'fails if role already exists');
> +command_fails([ 'createuser', 'user1' ],
> +   'fails if role already exists');

perltidy will undo this whitespace-only change.

> @@ -0,0 +1,252 @@
> +# PostgresNode, simple node representation for regression tests.
> +#
> +# Regression tests should use this basic class infrastructure to define nodes
> +# that need used in the test modules/scripts.
> +package PostgresNode;

Consider just saying, "Class representing a data directory and postmaster."

> + my $self   = {
> + _port => undef,
> + _host => undef,
> + _basedir  => undef,
> + _applname => undef,
> + _logfile  => undef };
> +
> + # Set 

Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-29 Thread Michael Paquier
On Sun, Nov 29, 2015 at 12:13 AM, Alvaro Herrera wrote:
> Here's your recovery test patch rebased, for your (and others'!)
> perusal.  It passes for me.  (Test 003 is unchanged.)

Are you planning to push that as well? It does not have much coverage
but I guess that's quite good for a first shot, and that can serve as
example for future tests.

Still, the first patch adds enough infrastructure to allow any other
module to have more complex regression test scenarios, the first two
targets coming immediately to my mind being the quorum syncrep patch
and pg_rewind and its timeline switch manipulation. So that's more
than welcome!
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-29 Thread Michael Paquier
On Mon, Nov 30, 2015 at 6:28 AM, Noah Misch  wrote:
>
> On Fri, Nov 27, 2015 at 07:53:10PM -0300, Alvaro Herrera wrote:
> > Michael Paquier wrote:
> > > The result of a couple of hours of hacking is attached:
> > > - 0001 is the refactoring adding PostgresNode and RecursiveCopy. I have
> > > also found that it is quite advantageous to move some of the routines that
> > > are synonyms of system() and the stuff used for logging into another
> > > low-level library that PostgresNode depends on, that I called TestBase in
> > > this patch.
>
> > Here's another version of this.  I changed the packages a bit more.  For
> > starters, I moved the routines around a bit; some of your choices seemed
> > more about keeping stuff where it was originally rather than moving it
> > to where it made sense.  These are the routines in each module:
> >
> > TestBase:  system_or_bail system_log run_log slurp_dir slurp_file
> > append_to_file
> >
> > TestLib:get_new_node teardown_node psql poll_query_until command_ok
> > command_fails command_exit_is program_help_ok program_version_ok
> > program_options_handling_ok command_like issues_sql_like
>
> The proposed code is short on guidance about when to put a function in TestLib
> versus TestBase.  TestLib has no header comment.  The TestBase header comment
> would permit, for example, command_ok() in that module.  I would try instead
> keeping TestLib as the base module and moving into PostgresNode the functions
> that deal with PostgreSQL clusters (get_new_node teardown_node psql
> poll_query_until issues_sql_like).

PostgresNode is wanted to be a base representation of how of node is,
not of how to operate on it. The ways to perform the tests, which
works on a node, is wanted as a higher-level operation.

Logging and base configuration of a test set is a lower level of
operations than PostgresNode, because cluster nodes need actually to
perform system calls, some of those system calls like run_log allowing
to log in the centralized log file. I have tried to make the headers
of those modules more verbose, please see attached.

>
> > +my $node = get_new_node();
> > +# Initialize node without replication settings
> > +$node->initNode(0);
> > +$node->startNode();
> > +my $pgdata = $node->getDataDir();
> > +
> > +$ENV{PGPORT} = $node->getPort();
>
> Starting a value retrieval method name with "get" is not Perlish.  The TAP
> suites currently follow "man perlstyle" in using underscored_lower_case method
> names.  No PostgreSQL Perl code uses lowerFirstCamelCase, though some uses
> CamelCase.  The word "Node" is redundant.  Use this style:
>
>   $node->init(0);
>   $node->start;
>   my $pgdata = $node->data_dir;
>   $ENV{PGPORT} = $node->port;

I have switched the style this way.

> As a matter of opinion, I recommend giving "init" key/value arguments instead
> of the single Boolean argument.  The method could easily need more options in
> the future, and this makes the call site self-documenting:
>
>   $node->init(hba_permit_replication => 0);

Done.

>
> > - 'pg_controldata with nonexistent directory fails');
> > +   'pg_controldata with nonexistent directory fails');
>
> perltidy will undo this whitespace-only change.

Cleaned up.

>
> > --- a/src/bin/pg_rewind/t/001_basic.pl
> > +++ b/src/bin/pg_rewind/t/001_basic.pl
> > @@ -1,9 +1,11 @@
> > +# Basic pg_rewind test.
> > +
> >  use strict;
> >  use warnings;
> > -use TestLib;
> > -use Test::More tests => 8;
> >
> >  use RewindTest;
> > +use TestLib;
> > +use Test::More tests => 8;
>
> Revert all changes to this file.  Audit the rest of the patch for whitespace
> change unrelated to the subject.

Done.

>
>
> > - 'fails with nonexistent table');
> > +   'fails with nonexistent table');
>
> > -'CREATE TABLE test1 (a int); CREATE INDEX test1x ON test1 (a); CLUSTER 
> > test1 USING test1x';
> > + 'CREATE TABLE test1 (a int); CREATE INDEX test1x ON test1 (a); 
> > CLUSTER test1 USING test1x';
>
> perltidy will undo these whitespace-only changes.

Cleaned up.

>
> > +# cluster -a is not compatible with -d, hence enforce environment variables
>
> s/cluster -a/clusterdb -a/

Fixed.

>
> > -command_fails([ 'createuser', 'user1' ], 'fails if role already exists');
> > +command_fails([ 'createuser', 'user1' ],
> > +   'fails if role already exists');
>
> perltidy will undo this whitespace-only change.
>
> > @@ -0,0 +1,252 @@
> > +# PostgresNode, simple node representation for regression tests.
> > +#
> > +# Regression tests should use this basic class infrastructure to define 
> > nodes
> > +# that need used in the test modules/scripts.
> > +package PostgresNode;
>
> Consider just saying, "Class representing a data directory and postmaster."

OK, I have changed this description:
+# PostgresNode, class representing a data directory and postmaster.
+#
+# This contains a basic set of routines able to work on a PostgreSQL node,
+# allowing to start, 

Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-28 Thread Michael Paquier
On Sat, Nov 28, 2015 at 7:53 AM, Alvaro Herrera wrote:
> I moved all the initialization code (deleting stuff from environment,
> detecting Windows, opening SimpleTie filedescs etc) into BEGIN blocks,
> which run earlier than any other code.

Ah, OK. Thanks. That makes visibly the whole set of modules more consistent.

> Hmm. I just noticed RewindTest sets $ENV{PGDATABASE} outside BEGIN.  Not
> sure what to think of that.  Could instead pass the database name in
> $node->getConnStr() calls, like run_pg_rewind() is already doing.

Yes, let's remove that and pass the database name to getConnStr().

> I tried all the t/ tests we have and all of them pass for me.  If I'm
> able, I will push this on my Sunday late evening, so that I can fix
> whatever gets red on Monday first thing ...

I have done as well additional tests on Windows and this patch is
showing a green status.

A separate issue, but as long as we are working on this set of tests:
I have noticed that config_default.pl is missing the flag tap_tests in
its list. See the patch attached. Could you apply that as well and
backpatch?

I have as well noticed that RewindTest.pm is missing "1;" on its last
line. When this is loaded this would lead to compilation errors.
-- 
Michael
diff --git a/src/tools/msvc/config_default.pl b/src/tools/msvc/config_default.pl
index b9f2ff4..e50be7e 100644
--- a/src/tools/msvc/config_default.pl
+++ b/src/tools/msvc/config_default.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 
 our $config = {
-	asserts => 0,# --enable-cassert
+	asserts  => 0,# --enable-cassert
 	  # integer_datetimes=>1,   # --enable-integer-datetimes - on is now default
 	  # float4byval=>1, # --disable-float4-byval, on by default
 
@@ -13,18 +13,19 @@ our $config = {
 	# blocksize => 8, # --with-blocksize, 8kB by default
 	# wal_blocksize => 8, # --with-wal-blocksize, 8kB by default
 	# wal_segsize => 16,  # --with-wal-segsize, 16MB by default
-	ldap => 1,# --with-ldap
-	extraver => undef,# --with-extra-version=
-	nls  => undef,# --enable-nls=
-	tcl  => undef,# --with-tls=
-	perl => undef,# --with-perl
-	python   => undef,# --with-python=
-	openssl  => undef,# --with-openssl=
-	uuid => undef,# --with-ossp-uuid
-	xml  => undef,# --with-libxml=
-	xslt => undef,# --with-libxslt=
-	iconv=> undef,# (not in configure, path to iconv)
-	zlib => undef # --with-zlib=
+	ldap  => 1,# --with-ldap
+	extraver  => undef,# --with-extra-version=
+	nls   => undef,# --enable-nls=
+	tap_tests => undef,# --enable-tap-tests
+	tcl   => undef,# --with-tls=
+	perl  => undef,# --with-perl
+	python=> undef,# --with-python=
+	openssl   => undef,# --with-openssl=
+	uuid  => undef,# --with-ossp-uuid
+	xml   => undef,# --with-libxml=
+	xslt  => undef,# --with-libxslt=
+	iconv => undef,# (not in configure, path to iconv)
+	zlib  => undef # --with-zlib=
 };
 
 1;

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-28 Thread Alvaro Herrera
Michael Paquier wrote:
> On Sat, Nov 28, 2015 at 7:53 AM, Alvaro Herrera wrote:

> > Hmm. I just noticed RewindTest sets $ENV{PGDATABASE} outside BEGIN.  Not
> > sure what to think of that.  Could instead pass the database name in
> > $node->getConnStr() calls, like run_pg_rewind() is already doing.
> 
> Yes, let's remove that and pass the database name to getConnStr().

Ok.


> A separate issue, but as long as we are working on this set of tests:
> I have noticed that config_default.pl is missing the flag tap_tests in
> its list. See the patch attached. Could you apply that as well and
> backpatch?

> I have as well noticed that RewindTest.pm is missing "1;" on its last
> line. When this is loaded this would lead to compilation errors.

Sure.


> > I tried all the t/ tests we have and all of them pass for me.  If I'm
> > able, I will push this on my Sunday late evening, so that I can fix
> > whatever gets red on Monday first thing ...
> 
> I have done as well additional tests on Windows and this patch is
> showing a green status.

Great.

Here's your recovery test patch rebased, for your (and others'!)
perusal.  It passes for me.  (Test 003 is unchanged.)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/test/Makefile b/src/test/Makefile
index b713c2c..7f7754f 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,7 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = regress isolation modules
+SUBDIRS = regress isolation modules recovery
 
 # We don't build or execute examples/, locale/, or thread/ by default,
 # but we do want "make clean" etc to recurse into them.  Likewise for ssl/,
diff --git a/src/test/recovery/.gitignore b/src/test/recovery/.gitignore
new file mode 100644
index 000..499fa7d
--- /dev/null
+++ b/src/test/recovery/.gitignore
@@ -0,0 +1,3 @@
+# Generated by test suite
+/regress_log/
+/tmp_check/
diff --git a/src/test/recovery/Makefile b/src/test/recovery/Makefile
new file mode 100644
index 000..16c063a
--- /dev/null
+++ b/src/test/recovery/Makefile
@@ -0,0 +1,17 @@
+#-
+#
+# Makefile for src/test/recovery
+#
+# Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/recovery/Makefile
+#
+#-
+
+subdir = src/test/recovery
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+	$(prove_check)
diff --git a/src/test/recovery/README b/src/test/recovery/README
new file mode 100644
index 000..20b98e0
--- /dev/null
+++ b/src/test/recovery/README
@@ -0,0 +1,19 @@
+src/test/recovery/README
+
+Regression tests for recovery and replication
+=
+
+This directory contains a test suite for recovery and replication,
+testing mainly the interactions of recovery.conf with cluster
+instances by providing a simple set of routines that can be used
+to define a custom cluster for a test, including backup, archiving,
+and streaming configuration.
+
+Running the tests
+=
+
+make check
+
+NOTE: This creates a temporary installation, and some tests may
+create one or multiple nodes, be they master or standby(s) for the
+purpose of the tests.
diff --git a/src/test/recovery/RecoveryTest.pm b/src/test/recovery/RecoveryTest.pm
new file mode 100644
index 000..3706847
--- /dev/null
+++ b/src/test/recovery/RecoveryTest.pm
@@ -0,0 +1,177 @@
+# Set of common routines for recovery regression tests for a PostgreSQL
+# cluster. This includes methods that can be used by the various set of
+# tests present to set up cluster nodes and configure them according to
+# the test scenario wanted.
+#
+# This module makes use of PostgresNode for node manipulation, performing
+# higher-level operations to create standby nodes or setting them up
+# for archiving and replication.
+#
+# Nodes are identified by their port number and have one allocated when
+# created, hence it is unique for each node of the cluster as it is run
+# locally. PGHOST is equally set to a unique value for the duration of
+# each test.
+
+package RecoveryTest;
+
+use strict;
+use warnings;
+
+use Cwd;
+use Exporter 'import';
+use IPC::Run qw(run start);
+use PostgresNode;
+use RecursiveCopy;
+use TestBase;
+use TestLib;
+use Test::More;
+
+our @EXPORT = qw(
+	enable_archiving
+	enable_restoring
+	enable_streaming
+	make_master
+	make_archive_standby
+	make_stream_standby
+);
+
+# Set of handy routines able to set up a node with different characteristics
+# Enable streaming replication
+sub enable_streaming
+{
+	my $node_root = shift; # Instance to link to
+	my $node_standby = shift;
+	my $root_connstr = $node_root->getConnStr();
+	my $applname = 

Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-27 Thread Alvaro Herrera
Michael Paquier wrote:

> The result of a couple of hours of hacking is attached:
> - 0001 is the refactoring adding PostgresNode and RecursiveCopy. I have
> also found that it is quite advantageous to move some of the routines that
> are synonyms of system() and the stuff used for logging into another
> low-level library that PostgresNode depends on, that I called TestBase in
> this patch. This way, all the infrastructure depends on the same logging
> management. Existing tests have been refactored to fit into the new code,
> and this leads to a couple of simplifications particularly in pg_rewind
> tests because there is no more need to have there routines for environment
> cleanup and logging. I have done tests on OSX and Windows using it and
> tests are passing. I have as well tested that ssl tests were working.

Here's another version of this.  I changed the packages a bit more.  For
starters, I moved the routines around a bit; some of your choices seemed
more about keeping stuff where it was originally rather than moving it
to where it made sense.  These are the routines in each module:

TestBase:  system_or_bail system_log run_log slurp_dir slurp_file
append_to_file

TestLib:get_new_node teardown_node psql poll_query_until command_ok
command_fails command_exit_is program_help_ok program_version_ok
program_options_handling_ok command_like issues_sql_like

I tried to get rid of teardown_node by having a DESTROY method for
PostgresNode; that method would call "pg_ctl stop -m immediate".  That
would have been much cleaner.  However, when a test fails this doesn't
work sanely because the END block for File::Temp runs earlier than that
DESTROY block, which means the datadir is already gone by the time
pg_ctl stop runs, so the node stop doesn't work at all.  (Perhaps we
could fix this by noting postmaster's PID at start time, and then
sending a signal directly instead of relying on pg_ctl).

I moved all the initialization code (deleting stuff from environment,
detecting Windows, opening SimpleTie filedescs etc) into BEGIN blocks,
which run earlier than any other code.

I perltidy'ed PostgresNode (and all the other files actually), to have
the style match the rest of our code.  I also updated some code to be
more Perlish.

I added a lot of error checking in RecursiveCopy.

You had a "cat" call somewhere, which I replaced with slurp_file.


I considered updating RewindTest so that it didn't have to export the
node global variables, but decided not to, not because of the huge code
churn for the t/*.pl files but because of the problem with the DESTROY
method above: it didn't actually buy anything.

Hmm. I just noticed RewindTest sets $ENV{PGDATABASE} outside BEGIN.  Not
sure what to think of that.  Could instead pass the database name in
$node->getConnStr() calls, like run_pg_rewind() is already doing.

I tried all the t/ tests we have and all of them pass for me.  If I'm
able, I will push this on my Sunday late evening, so that I can fix
whatever gets red on Monday first thing ...

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/bin/initdb/t/001_initdb.pl b/src/bin/initdb/t/001_initdb.pl
index 299dcf5..3b5d7af 100644
--- a/src/bin/initdb/t/001_initdb.pl
+++ b/src/bin/initdb/t/001_initdb.pl
@@ -4,10 +4,11 @@
 
 use strict;
 use warnings;
+use TestBase;
 use TestLib;
 use Test::More tests => 14;
 
-my $tempdir = TestLib::tempdir;
+my $tempdir = TestBase::tempdir;
 my $xlogdir = "$tempdir/pgxlog";
 my $datadir = "$tempdir/data";
 
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index dc96bbf..65ed4de 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -2,6 +2,8 @@ use strict;
 use warnings;
 use Cwd;
 use Config;
+use PostgresNode;
+use TestBase;
 use TestLib;
 use Test::More tests => 51;
 
@@ -9,8 +11,15 @@ program_help_ok('pg_basebackup');
 program_version_ok('pg_basebackup');
 program_options_handling_ok('pg_basebackup');
 
-my $tempdir = tempdir;
-start_test_server $tempdir;
+my $tempdir = TestBase::tempdir;
+
+my $node = get_new_node();
+# Initialize node without replication settings
+$node->initNode(0);
+$node->startNode();
+my $pgdata = $node->getDataDir();
+
+$ENV{PGPORT} = $node->getPort();
 
 command_fails(['pg_basebackup'],
 	'pg_basebackup needs target directory specified');
@@ -26,19 +35,19 @@ if (open BADCHARS, ">>$tempdir/pgdata/FOO\xe0\xe0\xe0BAR")
 	close BADCHARS;
 }
 
-configure_hba_for_replication "$tempdir/pgdata";
-system_or_bail 'pg_ctl', '-D', "$tempdir/pgdata", 'reload';
+$node->setReplicationConf();
+system_or_bail 'pg_ctl', '-D', $pgdata, 'reload';
 
 command_fails(
 	[ 'pg_basebackup', '-D', "$tempdir/backup" ],
 	'pg_basebackup fails because of WAL configuration');
 
-open CONF, ">>$tempdir/pgdata/postgresql.conf";
+open CONF, ">>$pgdata/postgresql.conf";
 print 

Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-25 Thread Michael Paquier
On Wed, Nov 25, 2015 at 11:00 AM, Michael Paquier  wrote:

> On Wed, Nov 25, 2015 at 10:55 AM, Alvaro Herrera  > wrote:
>
>> Michael Paquier wrote:
>> > On Wed, Nov 25, 2015 at 6:22 AM, Alvaro Herrera <
>> alvhe...@2ndquadrant.com>
>> > wrote:
>> >
>> > > Michael Paquier wrote:
>>
>> > > This looks great as a starting point.  I think we should make TestLib
>> > > depend on PostgresNode instead of the other way around.  I will have a
>> > > look at that (I realize this means messing with the existing tests).
>> >
>> > Makes sense. My thoughts following that is that we should keep a track
>> of
>> > the nodes started as an array which is part of TestLib, with PGHOST set
>> > once at startup using tempdir_short. That's surely an refactoring patch
>> > somewhat independent of the recovery test suite. I would not mind
>> writing
>> > something among those lines if needed.
>>
>> OK, please do.
>>
>> We can split this up in two patches: one introducing PostgresNode
>> (+ RecursiveCopy) together with the refactoring of existing test code,
>> and a subsequent one introducing RecoveryTest and the corresponding
>> subdir.  Sounds good?
>>
>
> Yeah, that matches my line of thoughts. Will do so.
>

The result of a couple of hours of hacking is attached:
- 0001 is the refactoring adding PostgresNode and RecursiveCopy. I have
also found that it is quite advantageous to move some of the routines that
are synonyms of system() and the stuff used for logging into another
low-level library that PostgresNode depends on, that I called TestBase in
this patch. This way, all the infrastructure depends on the same logging
management. Existing tests have been refactored to fit into the new code,
and this leads to a couple of simplifications particularly in pg_rewind
tests because there is no more need to have there routines for environment
cleanup and logging. I have done tests on OSX and Windows using it and
tests are passing. I have as well tested that ssl tests were working.
- 0002 adds the recovery tests with RecoveryTest.pm now located in
src/test/recovery.
Regards,
-- 
Michael
From 686a58ae7edabc4d48767115283d2390c24c305a Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Wed, 25 Nov 2015 21:39:31 +0900
Subject: [PATCH 1/2] Refactor TAP tests to use common node management system

All the existing TAP tests now use a new module called PostgresNode that
centralizes logging, backup and definitions of a Postgres node used in
the tests. Some low-level content of TestLib is split into a second
module called TestBase which contains routines used by PostgresNode for
logging and running commands.

This is in preparation for a more advanced facility dedicated at testing
recovery scenarios directly in core.
---
 src/bin/initdb/t/001_initdb.pl |   3 +-
 src/bin/pg_basebackup/t/010_pg_basebackup.pl   |  29 ++-
 src/bin/pg_controldata/t/001_pg_controldata.pl |  13 +-
 src/bin/pg_ctl/t/001_start_stop.pl |   5 +-
 src/bin/pg_ctl/t/002_status.pl |  17 +-
 src/bin/pg_rewind/RewindTest.pm| 122 +---
 src/bin/pg_rewind/t/003_extrafiles.pl  |   4 +-
 src/bin/pg_rewind/t/004_pg_xlog_symlink.pl |   3 +
 src/bin/scripts/t/010_clusterdb.pl |  23 ++-
 src/bin/scripts/t/011_clusterdb_all.pl |  13 +-
 src/bin/scripts/t/020_createdb.pl  |  12 +-
 src/bin/scripts/t/030_createlang.pl|  19 +-
 src/bin/scripts/t/040_createuser.pl|  20 +-
 src/bin/scripts/t/050_dropdb.pl|  12 +-
 src/bin/scripts/t/060_droplang.pl  |  10 +-
 src/bin/scripts/t/070_dropuser.pl  |  10 +-
 src/bin/scripts/t/080_pg_isready.pl|   8 +-
 src/bin/scripts/t/090_reindexdb.pl |  19 +-
 src/bin/scripts/t/091_reindexdb_all.pl |   9 +-
 src/bin/scripts/t/100_vacuumdb.pl  |  18 +-
 src/bin/scripts/t/101_vacuumdb_all.pl  |  10 +-
 src/bin/scripts/t/102_vacuumdb_stages.pl   |  12 +-
 src/test/perl/PostgresNode.pm  | 233 +++
 src/test/perl/RecursiveCopy.pm |  42 
 src/test/perl/TestBase.pm  | 109 +++
 src/test/perl/TestLib.pm   | 254 +++--
 src/test/ssl/ServerSetup.pm|  24 +--
 src/test/ssl/t/001_ssltests.pl |  32 ++--
 28 files changed, 702 insertions(+), 383 deletions(-)
 create mode 100644 src/test/perl/PostgresNode.pm
 create mode 100644 src/test/perl/RecursiveCopy.pm
 create mode 100644 src/test/perl/TestBase.pm

diff --git a/src/bin/initdb/t/001_initdb.pl b/src/bin/initdb/t/001_initdb.pl
index 299dcf5..3b5d7af 100644
--- a/src/bin/initdb/t/001_initdb.pl
+++ b/src/bin/initdb/t/001_initdb.pl
@@ -4,10 +4,11 @@
 
 use strict;
 use warnings;
+use TestBase;
 use TestLib;
 use Test::More tests => 14;
 
-my $tempdir = TestLib::tempdir;

Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-24 Thread Alvaro Herrera
Michael Paquier wrote:

> - Manage node information using package/class PostgresNode.pm and have
> RecoveryTest use it. I have actually made PostgresNode bare-bone and simple
> on purpose: one can initialize the node, append configuration parameters to
> it and manage it through start/stop/restart (we may want to add reload and
> promote actually if needed).

This looks great as a starting point.  I think we should make TestLib
depend on PostgresNode instead of the other way around.  I will have a
look at that (I realize this means messing with the existing tests).

> I have also arrived at the conclusion that it is not really worth
> adding a node status flag in PostgresNode because the port number
> saved there is sufficient when doing free port lookup, and the list of
> nodes used in a recovery test are saved in an array.

I don't disagree with this in principle, but I think the design that you
get a new PostgresNode object by calling get_free_port is strange.  I
think the port lookup code should be part of either TestLib or
PostgresNode, not RecoveryTest.

> - Add new module RecursiveCopy to be used for base backups. This removes
> the dependency with Archive::Tar. PostgresNode makes use of that when
> initializing a node from a backup.

Great.

> - Tests have been updated to use the PostgresNode objects instead of the
> port number as identifier. That's more portable.

Makes sense.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-24 Thread Michael Paquier
On Wed, Nov 25, 2015 at 6:22 AM, Alvaro Herrera 
wrote:

> Michael Paquier wrote:
>
> > - Manage node information using package/class PostgresNode.pm and have
> > RecoveryTest use it. I have actually made PostgresNode bare-bone and
> simple
> > on purpose: one can initialize the node, append configuration parameters
> to
> > it and manage it through start/stop/restart (we may want to add reload
> and
> > promote actually if needed).
>
> This looks great as a starting point.  I think we should make TestLib
> depend on PostgresNode instead of the other way around.  I will have a
> look at that (I realize this means messing with the existing tests).
>

Makes sense. My thoughts following that is that we should keep a track of
the nodes started as an array which is part of TestLib, with PGHOST set
once at startup using tempdir_short. That's surely an refactoring patch
somewhat independent of the recovery test suite. I would not mind writing
something among those lines if needed.


> > I have also arrived at the conclusion that it is not really worth
> > adding a node status flag in PostgresNode because the port number
> > saved there is sufficient when doing free port lookup, and the list of
> > nodes used in a recovery test are saved in an array.
>
> I don't disagree with this in principle, but I think the design that you
> get a new PostgresNode object by calling get_free_port is strange.  I
> think the port lookup code should be part of either TestLib or
> PostgresNode, not RecoveryTest.
>

I'd vote for TestLib. I have written PostgresNode this way to allow users
to set up arbitrary port numbers if they'd like to do so. That's more
flexible.
-- 
Michael


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-24 Thread Alvaro Herrera
Michael Paquier wrote:
> On Wed, Nov 25, 2015 at 6:22 AM, Alvaro Herrera 
> wrote:
> 
> > Michael Paquier wrote:

> > This looks great as a starting point.  I think we should make TestLib
> > depend on PostgresNode instead of the other way around.  I will have a
> > look at that (I realize this means messing with the existing tests).
> 
> Makes sense. My thoughts following that is that we should keep a track of
> the nodes started as an array which is part of TestLib, with PGHOST set
> once at startup using tempdir_short. That's surely an refactoring patch
> somewhat independent of the recovery test suite. I would not mind writing
> something among those lines if needed.

OK, please do.

We can split this up in two patches: one introducing PostgresNode
(+ RecursiveCopy) together with the refactoring of existing test code,
and a subsequent one introducing RecoveryTest and the corresponding
subdir.  Sounds good?

> > > I have also arrived at the conclusion that it is not really worth
> > > adding a node status flag in PostgresNode because the port number
> > > saved there is sufficient when doing free port lookup, and the list of
> > > nodes used in a recovery test are saved in an array.
> >
> > I don't disagree with this in principle, but I think the design that you
> > get a new PostgresNode object by calling get_free_port is strange.  I
> > think the port lookup code should be part of either TestLib or
> > PostgresNode, not RecoveryTest.
> 
> I'd vote for TestLib. I have written PostgresNode this way to allow users
> to set up arbitrary port numbers if they'd like to do so. That's more
> flexible.

That works for me.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-24 Thread Michael Paquier
On Wed, Nov 25, 2015 at 10:55 AM, Alvaro Herrera 
wrote:

> Michael Paquier wrote:
> > On Wed, Nov 25, 2015 at 6:22 AM, Alvaro Herrera <
> alvhe...@2ndquadrant.com>
> > wrote:
> >
> > > Michael Paquier wrote:
>
> > > This looks great as a starting point.  I think we should make TestLib
> > > depend on PostgresNode instead of the other way around.  I will have a
> > > look at that (I realize this means messing with the existing tests).
> >
> > Makes sense. My thoughts following that is that we should keep a track of
> > the nodes started as an array which is part of TestLib, with PGHOST set
> > once at startup using tempdir_short. That's surely an refactoring patch
> > somewhat independent of the recovery test suite. I would not mind writing
> > something among those lines if needed.
>
> OK, please do.
>
> We can split this up in two patches: one introducing PostgresNode
> (+ RecursiveCopy) together with the refactoring of existing test code,
> and a subsequent one introducing RecoveryTest and the corresponding
> subdir.  Sounds good?
>

Yeah, that matches my line of thoughts. Will do so.
-- 
Michael


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-24 Thread Michael Paquier
On Tue, Nov 24, 2015 at 2:14 PM, Michael Paquier
wrote:

> I'll rework this patch and will update a new version soon.
>

So, attached is a new patch addressing all the comments received. The new
version has the following changes:
- Print more verbosely stderr output in case of error in psql
- Add recovery test suite to SUBDIRS in src/test/Makefile
- Add strict and warnings to what is used in the new modules of this patch
- Manage node information using package/class PostgresNode.pm and have
RecoveryTest use it. I have actually made PostgresNode bare-bone and simple
on purpose: one can initialize the node, append configuration parameters to
it and manage it through start/stop/restart (we may want to add reload and
promote actually if needed). However, more complex configuration is left to
RecoveryTest.pm, which is in charge of appending the configuration
dedicated to streaming, archiving, etc though a set of routines working on
PostgresNode objects. I have also arrived at the conclusion that it is not
really worth adding a node status flag in PostgresNode because the port
number saved there is sufficient when doing free port lookup, and the list
of nodes used in a recovery test are saved in an array.
- Add new module RecursiveCopy to be used for base backups. This removes
the dependency with Archive::Tar. PostgresNode makes use of that when
initializing a node from a backup.
- Tests have been updated to use the PostgresNode objects instead of the
port number as identifier. That's more portable.

Hopefully I have missed nothing.
Regards,
-- 
Michael
diff --git a/src/bin/pg_rewind/RewindTest.pm b/src/bin/pg_rewind/RewindTest.pm
index a4c1737..ea219d7 100644
--- a/src/bin/pg_rewind/RewindTest.pm
+++ b/src/bin/pg_rewind/RewindTest.pm
@@ -125,38 +125,6 @@ sub check_query
 	}
 }
 
-# Run a query once a second, until it returns 't' (i.e. SQL boolean true).
-sub poll_query_until
-{
-	my ($query, $connstr) = @_;
-
-	my $max_attempts = 30;
-	my $attempts = 0;
-	my ($stdout, $stderr);
-
-	while ($attempts < $max_attempts)
-	{
-		my $cmd = [ 'psql', '-At', '-c', "$query", '-d', "$connstr" ];
-		my $result = run $cmd, '>', \$stdout, '2>', \$stderr;
-
-		chomp($stdout);
-		$stdout =~ s/\r//g if $Config{osname} eq 'msys';
-		if ($stdout eq "t")
-		{
-			return 1;
-		}
-
-		# Wait a second before retrying.
-		sleep 1;
-		$attempts++;
-	}
-
-	# The query result didn't change in 30 seconds. Give up. Print the stderr
-	# from the last attempt, hopefully that's useful for debugging.
-	diag $stderr;
-	return 0;
-}
-
 sub append_to_file
 {
 	my ($filename, $str) = @_;
diff --git a/src/test/Makefile b/src/test/Makefile
index b713c2c..7f7754f 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,7 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = regress isolation modules
+SUBDIRS = regress isolation modules recovery
 
 # We don't build or execute examples/, locale/, or thread/ by default,
 # but we do want "make clean" etc to recurse into them.  Likewise for ssl/,
diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
new file mode 100644
index 000..90ad3bd
--- /dev/null
+++ b/src/test/perl/PostgresNode.pm
@@ -0,0 +1,162 @@
+# PostgresNode, simple node representation for regression tests
+#
+# Regression tests should use this basic class infrastructure to define
+# nodes that need to be used in the complex scenarios. This object is wanted
+# simple with only a basic set of routines able to configure, initialize
+# and manage a node.
+
+package PostgresNode;
+
+use strict;
+use warnings;
+
+use RecursiveCopy;
+use TestLib;
+
+sub new {
+	my $class = shift;
+	my $pghost = shift;
+	my $pgport = shift;
+	my $self = {
+		_port => undef,
+		_host => undef,
+		_basedir => undef,
+		_connstr => undef,
+		_applname => undef
+			};
+
+	# Set up each field
+	$self->{_port} = $pgport;
+	$self->{_host} = $pghost;
+	$self->{_basedir} = TestLib::tempdir;
+	$self->{_connstr} = "port=$pgport host=$pghost";
+	$self->{_applname} = "node_$pgport";
+	bless $self, $class;
+	return $self;
+}
+
+# Get routines for various variables
+sub getPort {
+	my( $self ) = @_;
+	return $self->{_port};
+}
+sub getHost {
+	my( $self ) = @_;
+	return $self->{_host};
+}
+sub getConnStr {
+	my( $self ) = @_;
+	return $self->{_connstr};
+}
+sub getDataDir {
+	my ( $self ) = @_;
+	return $self->{_basedir} . '/pgdata';
+}
+sub getApplName {
+	my ( $self ) = @_;
+	return $self->{_applname};
+}
+sub getArchiveDir {
+	my ( $self ) = @_;
+	return $self->{_basedir} . '/archives';
+}
+sub getBackupDir {
+	my ( $self ) = @_;
+	return $self->{_basedir} . '/backup';
+}
+
+# Dump node information
+sub dumpNodeInfo {
+	my ( $self ) = @_;
+	print 'Data directory: ' . $self->getDataDir() . "\n";
+	print 'Backup directory: ' . $self->getBackupDir() . "\n";
+	print 'Archive directory: ' . $self->getArchiveDir() . "\n";
+	print 'Connection string: ' . 

Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-23 Thread Michael Paquier
Thanks for the review.

On Tue, Nov 24, 2015 at 6:15 AM, Alvaro Herrera 
wrote:

> I just noticed that RecoveryTest.pm is lacking "use strict; use
> warnings;".  With those added, there's a number of problems reported:
>
Most of them are easily fixable by adding the correct "my" lines; but at
> least @array and $current_dir require more code to be written.
>

Oops.


> TBH all that business with arrays that are kept in sync looks too
> contrived to me.  Could we have a Perl object representing each node
> instead?
>

Not really to be honest.


> That would require a "PostgresNode" package (or similar).  The
> RecoveryTest.pm would have a single %nodes hash.  Also, you don't need
> @active_nodes, just a flag in PostgresNode, and have the stop routine do
> nothing if node is not marked active.  Also: if you pass the "root node"
> when creating a node that will become a standby, you don't need to pass
> it when calling, say, enable_streaming; the root node becomes an
> instance variable.  (Hmm, actually, if we do that, I wonder what if in
> the future we want to test node promotion and a standby is repointed to
> a new master.  Maybe we don't want to have this knowledge in the Perl
> code at all.)
>

I think I'll get the idea. In short all the parametrization will just
happen at object level, as well as basic actions on the nodes like start,
stop, restart etc.


> In get_free_port, isn't it easier to use pg_isready rather than psql?
>

Will switch.


> I've been messing with 003 because I think it's a bit too repetitive.
> Will finish it after you post a fixed version of RecoveryTest.pm.
>

Sure, thanks.

I'll rework this patch and will update a new version soon.

Thanks again for the review.
-- 
Michael


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-23 Thread Michael Paquier
On Tue, Nov 24, 2015 at 6:27 AM, Alvaro Herrera
 wrote:
> Michael Paquier wrote:
>> On Thu, Nov 19, 2015 at 12:21 AM, Alvaro Herrera
>>  wrote:
>
>> > Hi, I just started looking this over a bit.  The first thing I noticed
>> > is that it adds a dependency on Archive::Tar which isn't already used
>> > anywhere else.  Did anybody check whether this exists back in 5.8
>> > installations?
>>
>> Actually I didn't and that's a good point, we have decided to support
>> TAP down to 5.8.9. The only reason why I introduced this dependency is
>> that there is no easy native way to copy an entire folder in perl, and
>> that's for handling base backups. There are things like File::NCopy of
>> File::Copy::Recursive however it does not seem like a good idea to
>> depend on other modules that IPC::Run. Would it be better to have an
>> in-core module dedicated to that similar to SimpleTee.pm? Or are you
>> guys fine to accept a dependency with another module?
>
> It would be a lot better to not have to rely on another module existing
> everywhere.  I'd rather have another simple module, following
> SimpleTee's example.  Since this doesn't have to be terribly generic, it
> should be reasonably short, I hope.

Sure, that would be a simple function that does directory lookup and
recursive calls. I'll move ahead with that then and reuse it in the
recovery logic.

>> > Why is "recovery" added to ALWAYS_SUBDIRS in src/test/Makefile instead
>> > of to SUBDIRS?  Seems a strange choice.
>>
>> Because I thought that it should not be part of the main regression
>> suite, like ssl/. Feel free to correct me if my feeling is wrong.
>
> As I understand, the problem with "ssl" is that it messes with
> system-wide settings, which is not the case here.  I'm inclined to move
> it to SUBDIRS.  As an example, "modules" is not part of the main
> regression suite either.

OK, I'll move it back to it then.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-23 Thread Alvaro Herrera
I just noticed that RecoveryTest.pm is lacking "use strict; use
warnings;".  With those added, there's a number of problems reported:

Global symbol "%datadir_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 66.
Global symbol "%backup_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 67.
Global symbol "%archive_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 68.
Global symbol "%connstr_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 69.
Global symbol "%applname_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 70.
Global symbol "%datadir_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 92.
Global symbol "%connstr_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 93.
Global symbol "%applname_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 93.
Global symbol "%archive_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 104.
Global symbol "%datadir_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 111.
Global symbol "%archive_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 121.
Global symbol "%datadir_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 130.
Global symbol "%datadir_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 185.
Global symbol "%datadir_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 197.
Global symbol "@array" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 220.
Global symbol "%backup_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 243.
Global symbol "%archive_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 244.
Global symbol "%datadir_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 246.
Global symbol "%datadir_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 257.
Global symbol "%backup_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 258.
Global symbol "%archive_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 259.
Global symbol "%connstr_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 260.
Global symbol "%applname_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 261.
Global symbol "%backup_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 272.
Global symbol "%datadir_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 287.
Global symbol "%backup_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 288.
Global symbol "%archive_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 289.
Global symbol "%datadir_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 292.
Global symbol "$current_dir" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 294.
Global symbol "%datadir_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 302.
Global symbol "%datadir_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 313.
Global symbol "%datadir_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 320.
Global symbol "%datadir_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 367.
Global symbol "%backup_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 377.
Global symbol "%datadir_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 390.
Global symbol "%backup_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 391.
Global symbol "%archive_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 392.
Global symbol "%connstr_nodes" requires explicit package name at 
/pgsql/source/master/src/test/perl/RecoveryTest.pm line 393.
Global symbol "%applname_nodes" requires explicit package 

Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-23 Thread Alvaro Herrera
Michael Paquier wrote:
> On Thu, Nov 19, 2015 at 12:21 AM, Alvaro Herrera
>  wrote:

> > Hi, I just started looking this over a bit.  The first thing I noticed
> > is that it adds a dependency on Archive::Tar which isn't already used
> > anywhere else.  Did anybody check whether this exists back in 5.8
> > installations?
> 
> Actually I didn't and that's a good point, we have decided to support
> TAP down to 5.8.9. The only reason why I introduced this dependency is
> that there is no easy native way to copy an entire folder in perl, and
> that's for handling base backups. There are things like File::NCopy of
> File::Copy::Recursive however it does not seem like a good idea to
> depend on other modules that IPC::Run. Would it be better to have an
> in-core module dedicated to that similar to SimpleTee.pm? Or are you
> guys fine to accept a dependency with another module?

It would be a lot better to not have to rely on another module existing
everywhere.  I'd rather have another simple module, following
SimpleTee's example.  Since this doesn't have to be terribly generic, it
should be reasonably short, I hope.

> > Why is "recovery" added to ALWAYS_SUBDIRS in src/test/Makefile instead
> > of to SUBDIRS?  Seems a strange choice.
> 
> Because I thought that it should not be part of the main regression
> suite, like ssl/. Feel free to correct me if my feeling is wrong.

As I understand, the problem with "ssl" is that it messes with
system-wide settings, which is not the case here.  I'm inclined to move
it to SUBDIRS.  As an example, "modules" is not part of the main
regression suite either.

> > In my days of Perl, it was starting to become frowned upon to call
> > subroutines without parenthesizing arguments.  Is that no longer the
> > case?  Because I notice there are many places in this patch and pre-
> > existing that call psql with an argument list without parens.  And it's
> > a bit odd because I couldn't find any other subroutine that we're using
> > in that way.
> 
> Hm, yeah. If we decide about a perl coding policy I would be happy to
> follow it. Personally I prefer usually using parenthesis however if we
> decide to make the calls consistent we had better address that as a
> separate patch.

Some votes against, some votes for.  Ultimately, it seems that this
depends on the committer.  I don't really care all that much about this
TBH.

> > In 005_replay_delay there's a 2s delay configured; then we test whether
> > something is replayed in 1s.  I hate tests that run for a long time, but
> > is 2s good enough considering that some of our test animals in buildfarm
> > are really slow?
> 
> A call to poll_query_until ensures that we wait for the standby to
> replay once the minimum replay threshold is reached. Even with a slow
> machine the first query would still see only 10 rows at the first try,
> and then wait for the standby to replay before checking if 20 rows are
> visible. Or I am not following your point.

Ah, I see.  Maybe it's fine then, or else I'm not following your point
;-)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-19 Thread David Steele
On 11/19/15 11:05 AM, Robert Haas wrote:
> On Wed, Nov 18, 2015 at 10:21 AM, Alvaro Herrera
>  wrote:
>> In my days of Perl, it was starting to become frowned upon to call
>> subroutines without parenthesizing arguments.  Is that no longer the
>> case?  Because I notice there are many places in this patch and pre-
>> existing that call psql with an argument list without parens.  And it's
>> a bit odd because I couldn't find any other subroutine that we're using
>> in that way.
> 
> I've been coding in Perl for more than 20 years and have never heard
> of such a rule.

I follow the convention of using parentheses for all function calls in
Perl, though this stems more from my greater familiarity with languages
that require them than any adherence to vague Perl conventions.

I do think it makes the code [more] readable.

-- 
-David
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-19 Thread Robert Haas
On Wed, Nov 18, 2015 at 10:21 AM, Alvaro Herrera
 wrote:
> In my days of Perl, it was starting to become frowned upon to call
> subroutines without parenthesizing arguments.  Is that no longer the
> case?  Because I notice there are many places in this patch and pre-
> existing that call psql with an argument list without parens.  And it's
> a bit odd because I couldn't find any other subroutine that we're using
> in that way.

I've been coding in Perl for more than 20 years and have never heard
of such a rule.

Maybe I am not part of the "in" crowd.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-19 Thread Mike Blackwell
On Thu, Nov 19, 2015 at 10:05 AM, Robert Haas  wrote:

> On Wed, Nov 18, 2015 at 10:21 AM, Alvaro Herrera
>  wrote:
> > In my days of Perl, it was starting to become frowned upon to call
> > subroutines without parenthesizing arguments.  Is that no longer the
> > case?
>

​As I understand it, there are several reasons not to make function calls
in Perl without parenthesis.  Whether they are good reasons is a question
for the user.  Modern Perl  chapter
5 covers most of them.

__
*Mike Blackwell | Technical Analyst, Distribution Services/Rollout
Management | RR Donnelley*
1750 Wallace Ave | St Charles, IL 60174-3401
Office: 630.313.7818
mike.blackw...@rrd.com
http://www.rrdonnelley.com



* *
​


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-18 Thread Alvaro Herrera
Hi, I just started looking this over a bit.  The first thing I noticed
is that it adds a dependency on Archive::Tar which isn't already used
anywhere else.  Did anybody check whether this exists back in 5.8
installations?

Why is "recovery" added to ALWAYS_SUBDIRS in src/test/Makefile instead
of to SUBDIRS?  Seems a strange choice.

Instead of adding 
print "# Error output: $stderr\n" if $stderr ne "";
to sub psql, I think it would be better to add line separators, which
would be clearer if the error output ever turns into a multiline error
messages.  It would still show as empty if no stderr is produced; so I
think something like
if ($stderr ne '')
{
print " Begin standard error\n"
print $stderr;
print " End standard error\n";
}
or something like that.

In my days of Perl, it was starting to become frowned upon to call
subroutines without parenthesizing arguments.  Is that no longer the
case?  Because I notice there are many places in this patch and pre-
existing that call psql with an argument list without parens.  And it's
a bit odd because I couldn't find any other subroutine that we're using
in that way.

In 005_replay_delay there's a 2s delay configured; then we test whether
something is replayed in 1s.  I hate tests that run for a long time, but
is 2s good enough considering that some of our test animals in buildfarm
are really slow?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-18 Thread Erik Rijkers

On 2015-11-18 16:21, Alvaro Herrera wrote:

Hi, I just started looking this over a bit.  The first thing I noticed
is that it adds a dependency on Archive::Tar which isn't already used
anywhere else.  Did anybody check whether this exists back in 5.8
installations?


Apparently it did not yet exist in core then, Module::CoreList says 
5.9.3:


$ perl -MModule::CoreList -e ' print 
Module::CoreList->first_release('Archive::Tar'), "\n";'

5.009003





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-18 Thread Noah Misch
On Wed, Nov 18, 2015 at 01:21:45PM -0200, Alvaro Herrera wrote:
> In my days of Perl, it was starting to become frowned upon to call
> subroutines without parenthesizing arguments.  Is that no longer the
> case?

I've not witnessed those frowns.

> Because I notice there are many places in this patch and pre-
> existing that call psql with an argument list without parens.  And it's
> a bit odd because I couldn't find any other subroutine that we're using
> in that way.

TestLib.pm has unparenthesized calls to "standard_initdb", "start" and "run".
070_dropuser.pl has such calls to "start_test_server" and "psql".

> In 005_replay_delay there's a 2s delay configured; then we test whether
> something is replayed in 1s.  I hate tests that run for a long time, but
> is 2s good enough considering that some of our test animals in buildfarm
> are really slow?

That test will be unreliable, agreed.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-11-18 Thread Michael Paquier
On Thu, Nov 19, 2015 at 12:21 AM, Alvaro Herrera
 wrote:
>
> Hi, I just started looking this over a bit.  The first thing I noticed
> is that it adds a dependency on Archive::Tar which isn't already used
> anywhere else.  Did anybody check whether this exists back in 5.8
> installations?

Actually I didn't and that's a good point, we have decided to support
TAP down to 5.8.9. The only reason why I introduced this dependency is
that there is no easy native way to copy an entire folder in perl, and
that's for handling base backups. There are things like File::NCopy of
File::Copy::Recursive however it does not seem like a good idea to
depend on other modules that IPC::Run. Would it be better to have an
in-core module dedicated to that similar to SimpleTee.pm? Or are you
guys fine to accept a dependency with another module?

> Why is "recovery" added to ALWAYS_SUBDIRS in src/test/Makefile instead
> of to SUBDIRS?  Seems a strange choice.

Because I thought that it should not be part of the main regression
suite, like ssl/. Feel free to correct me if my feeling is wrong.

> Instead of adding
> print "# Error output: $stderr\n" if $stderr ne "";
> to sub psql, I think it would be better to add line separators, which
> would be clearer if the error output ever turns into a multiline error
> messages.  It would still show as empty if no stderr is produced; so I
> think something like
> if ($stderr ne '')
> {
> print " Begin standard error\n"
> print $stderr;
> print " End standard error\n";
> }
> or something like that.

Yes, that would be better.

> In my days of Perl, it was starting to become frowned upon to call
> subroutines without parenthesizing arguments.  Is that no longer the
> case?  Because I notice there are many places in this patch and pre-
> existing that call psql with an argument list without parens.  And it's
> a bit odd because I couldn't find any other subroutine that we're using
> in that way.

Hm, yeah. If we decide about a perl coding policy I would be happy to
follow it. Personally I prefer usually using parenthesis however if we
decide to make the calls consistent we had better address that as a
separate patch.

> In 005_replay_delay there's a 2s delay configured; then we test whether
> something is replayed in 1s.  I hate tests that run for a long time, but
> is 2s good enough considering that some of our test animals in buildfarm
> are really slow?

A call to poll_query_until ensures that we wait for the standby to
replay once the minimum replay threshold is reached. Even with a slow
machine the first query would still see only 10 rows at the first try,
and then wait for the standby to replay before checking if 20 rows are
visible. Or I am not following your point.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-11 Thread Amir Rohan
On 10/11/2015 01:19 PM, Michael Paquier wrote:
> On Sun, Oct 11, 2015 at 4:44 PM, Amir Rohan wrote:
>> On 10/11/2015 02:47 AM, Michael Paquier wrote:
>> I apologize -- that didn't came out right.
>> What I meant to suggest was "open an issue" to track
>> any works that needs to be done. But I guess that's
>> not the PG way.
> 
> No problem. I was not clear either. We could create a new item in the
> TODO list (https://wiki.postgresql.org/wiki/Todo) and link it to
> dedicated page on the wiki where all the potential tests would be
> listed.
> 

It couldn't hurt but also may be just a waste of your time.
I'm just realizing how central an issue tracker is to how I work and
how much not having one irritates me. Tough luck for me I guess.

Regards,
Amir



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-11 Thread Amir Rohan
On 10/11/2015 02:47 AM, Michael Paquier wrote:
> On Sat, Oct 10, 2015 at 10:52 PM, Amir Rohan  wrote:
>> On 10/10/2015 04:32 PM, Michael Paquier wrote:
>> I was arguing that it's an on-going task that would do
>> better if it had a TODO list, instead of "ideas for tests"
>> being scattered across 50-100 messages spanning a year or
>> more in one thread or another. You may disagree.
> 
> Let's be clear. I am fully in line with your point.
> 

I apologize -- that didn't came out right.
What I meant to suggest was "open an issue" to track
any works that needs to be done. But I guess that's
not the PG way.

Amir




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-11 Thread Michael Paquier
On Sun, Oct 11, 2015 at 4:44 PM, Amir Rohan wrote:
> On 10/11/2015 02:47 AM, Michael Paquier wrote:
> I apologize -- that didn't came out right.
> What I meant to suggest was "open an issue" to track
> any works that needs to be done. But I guess that's
> not the PG way.

No problem. I was not clear either. We could create a new item in the
TODO list (https://wiki.postgresql.org/wiki/Todo) and link it to
dedicated page on the wiki where all the potential tests would be
listed.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-10 Thread Amir Rohan
On 10/10/2015 04:32 PM, Michael Paquier wrote:
> On Sat, Oct 10, 2015 at 9:04 PM, Amir Rohan wrote:
>> Now that v9 fixes the problem, here's a summary from going over the
>> entire thread one last time:
> 
> Thanks a lot for the summary of the events.
> 
>> # Windows and TAP sets
>> Noah (2015-03) mentioned TAP doesn't work on windows, and hoped
>> this would include some work on that.
>> IIUC, the facilities and tests do run on windows, but focus was there
>> and not the preexisting TAP suite.
> 
> They do work on Windows, see 13d856e.
> 

Thanks, I did not know that.

>> # Test coverage (in the future)
>> Andres wanted a test for xid/multixid wraparound which also raises
>> the question of the tests that will need to be written in the future.
> 
> I recall that this would have needed extra functions on the backend...
> 
>> The patch focuses on providing facilities, while providing new coverage
>> for several features. There should be a TODO list on the wiki (bug
>> tracker, actually), where the list of tests to be written can be managed.
>> Some were mentioned in the thread (multi/xid wraparound
>> hot_standby_feedback, max_standby_archive_delay and
>> max_standby_streaming_delay? recovery_target_action? some in your
>> original list?), but threads
>> are precisely where these things get lost in the cracks.
> 
> Sure, that's an on-going task.
> 
>> # Directory structure
>> I suggested keeping backup/log/PGDATA per instance, rejected.
> 
> I guess that I am still flexible on this one, the node information
> (own PGDATA, connection string, port, etc.) is logged as well so this
> is not a big deal to me...
> 
>> # Parallel tests and port collisions
>> Lots about this. Final result is no port races are possible because
>> dedicated dirs are used per test, per instance. And because tcp
>> isn't used for connections on any platform (can you confirm that's
>> true on windows as well? I'm not familiar with sspi and what OSI
>> layer it lives on)
> 
> On Windows you remain with the problem that all nodes initialized
> using TestLib.pm will listen to 127.0.0.1, sspi being used to ensure
> that the connection at user level is secure (additional entries in
> pg_hba.conf are added).
> 
>> # decouple cleanup from node shutdown
>> Added (in latest patches?)
> 
> Yes this was added.
> 
>> Michael, is there anything else to do here or shall I mark this for
>> committer review?
> 
> I have nothing else. Thanks a lot!
> 

Ok, marked for committer, I hope I'm following "correct" cf procedure.

Regards,
Amir




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-10 Thread Michael Paquier
On Sat, Oct 10, 2015 at 10:52 PM, Amir Rohan  wrote:
> On 10/10/2015 04:32 PM, Michael Paquier wrote:
> I was arguing that it's an on-going task that would do
> better if it had a TODO list, instead of "ideas for tests"
> being scattered across 50-100 messages spanning a year or
> more in one thread or another. You may disagree.

Let's be clear. I am fully in line with your point.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-10 Thread Amir Rohan
On 10/10/2015 02:43 PM, Michael Paquier wrote:
> On Fri, Oct 9, 2015 at 8:53 PM, Michael Paquier
>  wrote:
>> On Fri, Oct 9, 2015 at 8:47 PM, Amir Rohan wrote:
>>> Ok, I've put myself down as reviewer in cfapp. I don't think I can
>>> provide any more useful feedback that would actually result in changes
>>> at this point, but I'll read through the entire discussion once last
>>> time and write down final comments/notes. After that I have no problem
>>> marking this for a committer to look at.
>>
>> OK. If you have any comments or remarks, please do not hesitate at all!
> 
> So, to let everybody know the issue, Amir has reported me offlist a
> bug in one of the tests that can be reproduced more easily on a slow
> machine:
> 

Yeah, I usually stick to the list for discussion, but I ran an earlier
version without issues and thought this might be a problem with my
system as I've changed things a bit this week.

Now that v9 fixes the probkem, here's a summary from going over the
entire thread one last time:

# Windows and TAP sets
Noah (2015-03) mentioned TAP doesn't work on windows, and hoped
this would include some work on that.

IIUC, the facilities and tests do run on windows, but focus was there
and not the preexisting TAP suite.

# Test coverage (in the future)
Andres wanted a test for xid/multixid wraparound which also raises
the question of the tests that will need to be written in the future.

The patch focuses on providing facilities, while providing new coverage
for several features. There should be a TODO list on the wiki (bug
tracker, actually), where the list of tests to be written can be managed.

Some were mentioned in the thread (multi/xid wraparound
hot_standby_feedback, max_standby_archive_delay and
max_standby_streaming_delay? recovery_target_action? some in your
original list?), but threads
are precisely where these things get lost in the cracks.

# Interactive use vs. TAP tests

Early on the goal was also to provide something for interactive use
in order to test scenarios. The shift has focused to the TAP tests
and some of the choices in the API reflect that. Interactive use
is possible, but wasn't a central requirement.

# Directory structure

I suggested keeping backup/log/PGDATA per instance, rejected.

# Parallel tests and port collisions

Lots about this. Final result is no port races are possible because
dedicated dirs are used per test, per instance. And because tcp
isn't used for connections on any platform (can you confirm that's
true on windows as well? I'm not familiar with sspi and what OSI
layer it lives on)

# Allow test to specify shutdown mode

Added

# decouple cleanup from node shutdown

Added (in latest patches?)

# Conveniences for test writing vs. running

My suggestions weren't picked up, but for one thing setting CLEANUP=0
in the lib (which means editing it...) can be useful for writers.

# blocking until server ready

pg_isready wrapper added.

# Multiple masters

back and forth, but supported in latest version.

That's it. I've ran the latest (v9) tests works and passed on my system
(fedora 64bit) and also under docker with --cpu-quota=1, which
simulates a slow machine.

Michael, is there anything else to do here or shall I mark this for
committer review?

Regards,
Amir









-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-10 Thread Michael Paquier
On Fri, Oct 9, 2015 at 8:53 PM, Michael Paquier
 wrote:
> On Fri, Oct 9, 2015 at 8:47 PM, Amir Rohan wrote:
>> Ok, I've put myself down as reviewer in cfapp. I don't think I can
>> provide any more useful feedback that would actually result in changes
>> at this point, but I'll read through the entire discussion once last
>> time and write down final comments/notes. After that I have no problem
>> marking this for a committer to look at.
>
> OK. If you have any comments or remarks, please do not hesitate at all!

So, to let everybody know the issue, Amir has reported me offlist a
bug in one of the tests that can be reproduced more easily on a slow
machine:

> Amir wrote:
> Before posting the summary, I ran the latest v8 patch on today's git
> master (9c42727) and got some errors:
> t/004_timeline_switch.pl ...
> 1..1
> # ERROR:  invalid input syntax for type pg_lsn: ""
> # LINE 1: SELECT ''::pg_lsn <= pg_last_xlog_replay_location()
> #^
> # No tests run!

And here is my reply:
This is a timing issue and can happen when standby1, the promoted
standby which standby2 reconnects to to check that recovery works with
a timeline jump, is still in recovery after being restarted. There is
a small windows where this is possible, and this gets easier to
reproduce on slow machines (did so on a VM). So the issue was in test
004. I have updated the script to check pg_is_in_recovery() to be sure
that the node exits recovery before querying it with
pg_current_xlog_location.

It is worth noticing that the following change has saved me a lot of pain:
--- a/src/test/perl/TestLib.pm
+++ b/src/test/perl/TestLib.pm
@@ -259,6 +259,7 @@ sub psql
my ($stdout, $stderr);
print("# Running SQL command: $sql\n");
run [ 'psql', '-X', '-A', '-t', '-q', '-d', $dbname, '-f',
'-'], '<', \$sql, '>', \$stdout, '2>', \$stderr or die;
+   print "# Error output: $stderr\n" if $stderr ne "";
Perhaps we should consider backpatching it, it helped me find out the
issue I faced.

Attached is an updated patch fixing 004.
Regards,
-- 
Michael
diff --git a/src/bin/pg_rewind/RewindTest.pm b/src/bin/pg_rewind/RewindTest.pm
index a4c1737..ea219d7 100644
--- a/src/bin/pg_rewind/RewindTest.pm
+++ b/src/bin/pg_rewind/RewindTest.pm
@@ -125,38 +125,6 @@ sub check_query
 	}
 }
 
-# Run a query once a second, until it returns 't' (i.e. SQL boolean true).
-sub poll_query_until
-{
-	my ($query, $connstr) = @_;
-
-	my $max_attempts = 30;
-	my $attempts = 0;
-	my ($stdout, $stderr);
-
-	while ($attempts < $max_attempts)
-	{
-		my $cmd = [ 'psql', '-At', '-c', "$query", '-d', "$connstr" ];
-		my $result = run $cmd, '>', \$stdout, '2>', \$stderr;
-
-		chomp($stdout);
-		$stdout =~ s/\r//g if $Config{osname} eq 'msys';
-		if ($stdout eq "t")
-		{
-			return 1;
-		}
-
-		# Wait a second before retrying.
-		sleep 1;
-		$attempts++;
-	}
-
-	# The query result didn't change in 30 seconds. Give up. Print the stderr
-	# from the last attempt, hopefully that's useful for debugging.
-	diag $stderr;
-	return 0;
-}
-
 sub append_to_file
 {
 	my ($filename, $str) = @_;
diff --git a/src/test/Makefile b/src/test/Makefile
index b713c2c..d6e51eb 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -17,7 +17,7 @@ SUBDIRS = regress isolation modules
 # We don't build or execute examples/, locale/, or thread/ by default,
 # but we do want "make clean" etc to recurse into them.  Likewise for ssl/,
 # because the SSL test suite is not secure to run on a multi-user system.
-ALWAYS_SUBDIRS = examples locale thread ssl
+ALWAYS_SUBDIRS = examples locale thread ssl recovery
 
 # We want to recurse to all subdirs for all standard targets, except that
 # installcheck and install should not recurse into the subdirectory "modules".
diff --git a/src/test/perl/RecoveryTest.pm b/src/test/perl/RecoveryTest.pm
new file mode 100644
index 000..b60bf5c
--- /dev/null
+++ b/src/test/perl/RecoveryTest.pm
@@ -0,0 +1,412 @@
+package RecoveryTest;
+
+# Set of common routines for recovery regression tests for a PostgreSQL
+# cluster. This includes global variables and methods that can be used
+# by the various set of tests present to set up cluster nodes and
+# configure them according to the test scenario wanted.
+#
+# Cluster nodes can be freely created using initdb or using the existing
+# base backup of another node, with minimum configuration done when the
+# node is created for the first time like having a proper port number.
+# It is then up to the test to decide what to do with the newly-created
+# node.
+#
+# Environment configuration of each node is available through a set
+# of global variables provided by this package, hashed depending on the
+# port number of a node:
+# - connstr_nodes connection string to connect to this node
+# - datadir_nodes to get the data folder of a given node
+# - archive_nodes for the location of the WAL archives of a node
+# - backup_nodes for the location of base backups of a node
+# - 

Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-10 Thread Michael Paquier
On Sat, Oct 10, 2015 at 9:04 PM, Amir Rohan wrote:
> Now that v9 fixes the probkem, here's a summary from going over the
> entire thread one last time:

Thanks a lot for the summary of the events.

> # Windows and TAP sets
> Noah (2015-03) mentioned TAP doesn't work on windows, and hoped
> this would include some work on that.
> IIUC, the facilities and tests do run on windows, but focus was there
> and not the preexisting TAP suite.

They do work on Windows, see 13d856e.

> # Test coverage (in the future)
> Andres wanted a test for xid/multixid wraparound which also raises
> the question of the tests that will need to be written in the future.

I recall that this would have needed extra functions on the backend...

> The patch focuses on providing facilities, while providing new coverage
> for several features. There should be a TODO list on the wiki (bug
> tracker, actually), where the list of tests to be written can be managed.
> Some were mentioned in the thread (multi/xid wraparound
> hot_standby_feedback, max_standby_archive_delay and
> max_standby_streaming_delay? recovery_target_action? some in your
> original list?), but threads
> are precisely where these things get lost in the cracks.

Sure, that's an on-going task.

> # Directory structure
> I suggested keeping backup/log/PGDATA per instance, rejected.

I guess that I am still flexible on this one, the node information
(own PGDATA, connection string, port, etc.) is logged as well so this
is not a big deal to me...

> # Parallel tests and port collisions
> Lots about this. Final result is no port races are possible because
> dedicated dirs are used per test, per instance. And because tcp
> isn't used for connections on any platform (can you confirm that's
> true on windows as well? I'm not familiar with sspi and what OSI
> layer it lives on)

On Windows you remain with the problem that all nodes initialized
using TestLib.pm will listen to 127.0.0.1, sspi being used to ensure
that the connection at user level is secure (additional entries in
pg_hba.conf are added).

> # decouple cleanup from node shutdown
> Added (in latest patches?)

Yes this was added.

> Michael, is there anything else to do here or shall I mark this for
> committer review?

I have nothing else. Thanks a lot!
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-10 Thread Amir Rohan
On 10/10/2015 04:32 PM, Michael Paquier wrote:
> On Sat, Oct 10, 2015 at 9:04 PM, Amir Rohan wrote:
>> The patch focuses on providing facilities, while providing new coverage
>> for several features. There should be a TODO list on the wiki (bug
>> tracker, actually), where the list of tests to be written can be managed.
>> Some were mentioned in the thread (multi/xid wraparound
>> hot_standby_feedback, max_standby_archive_delay and
>> max_standby_streaming_delay? recovery_target_action? some in your
>> original list?), but threads
>> are precisely where these things get lost in the cracks.
> 
> Sure, that's an on-going task.
>  

I was arguing that it's an on-going task that would do
better if it had a TODO list, instead of "ideas for tests"
being scattered across 50-100 messages spanning a year or
more in one thread or another. You may disagree.

Amir



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-09 Thread Michael Paquier
On Fri, Oct 9, 2015 at 8:25 AM, Michael Paquier
 wrote:
> On Thu, Oct 8, 2015 at 11:28 PM, Amir Rohan wrote:
>> Wouldn't this work?
>> 1) start timer
>> 2) Grab pg_stat_replication.sent_location from master
>> 3) pg_switch_xlog() # I /think/ we want this, could be wrong
>
> For a warm standby, you would want that, but this depends on two factors:
> - The moment master completes archiving of this segment
> - The moment standby restores it.
> On slow machines, those two things become by far the bottleneck,
> imagine a PI restricted on I/O with a low-class SD card in the worst
> case (I maintain one, with a good card, still the I/O is a
> bottleneck).
>
>> 4) Poll slave's pg_last_xlog_replay_location() until LSN shows up
>> 5) stop timer
>
> That's not really solid, there is an interval of time between the
> moment the LSN position is taken from the master and the standby. An
> accurate method is to log/store on master when a given WAL position
> has been flushed to disk, and do the same on slave at replay for this
> LSN position. In any case this is doing to flood badly the logs of
> both nodes, and as the backend cares about the performance of
> operations in this code path we won't want to do that anyway.
>
> To make it short, it seems to me that simply waiting until the LSN a
> test is waiting for has been replayed is just but fine for this set of
> tests to ensure their run consistency, let's not forget that this is
> the goal here.

In terms of features, it seems that this patch has everything it needs
to allow one to design tests to work on both Linux and Windows, and it
is careful regarding CVE-2014-0067. Thoughts about moving that as
"Ready for committer"?
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-09 Thread Amir Rohan
On 10/09/2015 02:12 PM, Michael Paquier wrote:
> On Fri, Oct 9, 2015 at 8:25 AM, Michael Paquier
>  wrote:
>> On Thu, Oct 8, 2015 at 11:28 PM, Amir Rohan wrote:
>>> Wouldn't this work?
>>> 1) start timer
>>> 2) Grab pg_stat_replication.sent_location from master
>>> 3) pg_switch_xlog() # I /think/ we want this, could be wrong
>>
>> For a warm standby, you would want that, but this depends on two factors:
>> - The moment master completes archiving of this segment
>> - The moment standby restores it.
>> On slow machines, those two things become by far the bottleneck,
>> imagine a PI restricted on I/O with a low-class SD card in the worst
>> case (I maintain one, with a good card, still the I/O is a
>> bottleneck).
>>
>>> 4) Poll slave's pg_last_xlog_replay_location() until LSN shows up
>>> 5) stop timer
>>
>> That's not really solid, there is an interval of time between the
>> moment the LSN position is taken from the master and the standby. An
>> accurate method is to log/store on master when a given WAL position
>> has been flushed to disk, and do the same on slave at replay for this
>> LSN position. In any case this is doing to flood badly the logs of
>> both nodes, and as the backend cares about the performance of
>> operations in this code path we won't want to do that anyway.
>>
>> To make it short, it seems to me that simply waiting until the LSN a
>> test is waiting for has been replayed is just but fine for this set of
>> tests to ensure their run consistency, let's not forget that this is
>> the goal here.
> 
> In terms of features, it seems that this patch has everything it needs
> to allow one to design tests to work on both Linux and Windows, and it
> is careful regarding CVE-2014-0067. Thoughts about moving that as
> "Ready for committer"?
> 

Ok, I've put myself down as reviewer in cfapp. I don't think I can
provide any more useful feedback that would actually result in changes
at this point, but I'll read through the entire discussion once last
time and write down final comments/notes. After that I have no problem
marking this for a committer to look at.

Amir



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-09 Thread Michael Paquier
On Fri, Oct 9, 2015 at 8:47 PM, Amir Rohan wrote:
> Ok, I've put myself down as reviewer in cfapp. I don't think I can
> provide any more useful feedback that would actually result in changes
> at this point, but I'll read through the entire discussion once last
> time and write down final comments/notes. After that I have no problem
> marking this for a committer to look at.

OK. If you have any comments or remarks, please do not hesitate at all!
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-08 Thread Amir Rohan
On 10/08/2015 08:19 AM, Michael Paquier wrote:
> On Wed, Oct 7, 2015 at 5:44 PM, Amir Rohan wrote:
>> On 10/07/2015 10:29 AM, Michael Paquier wrote:
>>> On Wed, Oct 7, 2015 at 4:16 PM, Amir Rohan wrote:
 Also, the removal of poll_query_until from pg_rewind looks suspiciously
 like a copy-paste gone bad. Pardon if I'm missing something.
>>>
>>> Perhaps. Do you have a suggestion regarding that? It seems to me that
>>> this is more useful in TestLib.pm as-is.
>>>
>>
>> My mistake, the patch only shows some internal function being deleted
>> but RewindTest.pm (obviously) imports TestLib. You're right, TestLib is
>> a better place for it.
> 
> OK. Here is a new patch version. I have removed the restriction
> preventing to call make_master multiple times in the same script (one
> may actually want to test some stuff related to logical decoding or
> FDW for example, who knows...), forcing PGHOST to always use the same
> value after it has been initialized. I have added a sanity check
> though, it is not possible to create a node based on a base backup if
> no master are defined. This looks like a cheap insurance... I also
> refactored a bit the code, using the new init_node_info to fill in the
> fields of a newly-initialized node, and I removed get_free_port,
> init_node, init_node_from_backup, enable_restoring and
> enable_streaming from the list of routines exposed to the users, those
> can be used directly with make_master, make_warm_standby and
> make_hot_standby. We could add them again if need be, somebody may
> want to be able to get a free port, set up a node without those
> generic routines, just that it does not seem necessary now.
> Regards,
> 

If you'd like, I can write up some tests for cascading replication which
are currently missing.

Someone mentioned a daisy chain setup which sounds fun. Anything else in
particular? Also, it would be nice to have some canned way to measure
end-to-end replication latency for variable number of nodes.
What about going back through the commit log and writing some regression
tests for the real stinkers, if someone care to volunteer some candidate
bugs

Amir





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-08 Thread Michael Paquier
On Thu, Oct 8, 2015 at 3:59 PM, Amir Rohan  wrote:
> On 10/08/2015 08:19 AM, Michael Paquier wrote:
>> On Wed, Oct 7, 2015 at 5:44 PM, Amir Rohan wrote:
>>> On 10/07/2015 10:29 AM, Michael Paquier wrote:
 On Wed, Oct 7, 2015 at 4:16 PM, Amir Rohan wrote:
> Also, the removal of poll_query_until from pg_rewind looks suspiciously
> like a copy-paste gone bad. Pardon if I'm missing something.

 Perhaps. Do you have a suggestion regarding that? It seems to me that
 this is more useful in TestLib.pm as-is.

>>>
>>> My mistake, the patch only shows some internal function being deleted
>>> but RewindTest.pm (obviously) imports TestLib. You're right, TestLib is
>>> a better place for it.
>>
>> OK. Here is a new patch version. I have removed the restriction
>> preventing to call make_master multiple times in the same script (one
>> may actually want to test some stuff related to logical decoding or
>> FDW for example, who knows...), forcing PGHOST to always use the same
>> value after it has been initialized. I have added a sanity check
>> though, it is not possible to create a node based on a base backup if
>> no master are defined. This looks like a cheap insurance... I also
>> refactored a bit the code, using the new init_node_info to fill in the
>> fields of a newly-initialized node, and I removed get_free_port,
>> init_node, init_node_from_backup, enable_restoring and
>> enable_streaming from the list of routines exposed to the users, those
>> can be used directly with make_master, make_warm_standby and
>> make_hot_standby. We could add them again if need be, somebody may
>> want to be able to get a free port, set up a node without those
>> generic routines, just that it does not seem necessary now.
>> Regards,
>>
>
> If you'd like, I can write up some tests for cascading replication which
> are currently missing.

001 is testing cascading, like that node1 -> node2 -> node3.

> Someone mentioned a daisy chain setup which sounds fun. Anything else in
> particular? Also, it would be nice to have some canned way to measure
> end-to-end replication latency for variable number of nodes.

Hm. Do you mean comparing the LSN position between two nodes even if
both nodes are not connected to each other? What would you use it for?

> What about going back through the commit log and writing some regression
> tests for the real stinkers, if someone care to volunteer some candidate
> bugs

I have drafted a list with a couple of items upthread:
http://www.postgresql.org/message-id/cab7npqsgffsphocrhfoasdanipvn6wsh2nykf1kayrm+9_m...@mail.gmail.com
So based on the existing patch the list becomes as follows:
- wal_retrieve_retry_interval with a high value, say setting to for
example 2/3s and loop until it is applied by checking it is it has
been received by the standby every second.
- recovery_target_action
- archive_cleanup_command
- recovery_end_command
- pg_xlog_replay_pause and pg_xlog_replay_resume
In the list of things that could have a test, I recall that we should
test as well 2PC with the recovery delay, look at a1105c3d. This could
be included in 005.
The advantage of implementing that now is that we could see if the
existing routines are solid enough or not. Still, looking at what the
patch has now I think that we had better get a committer look at it,
and if the core portion gets integrated we could already use it for
the patch implementing quorum synchronous replication and in doing
more advanced tests with pg_rewind regarding the timeline handling
(both patches of this CF). I don't mind adding more now, though I
think that the set of sample tests included in this version is enough
as a base implementation of the facility and shows what it can do.
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-08 Thread Amir Rohan
On 10/08/2015 10:39 AM, Michael Paquier wrote:
> On Thu, Oct 8, 2015 at 3:59 PM, Amir Rohan  wrote:
>> On 10/08/2015 08:19 AM, Michael Paquier wrote:
>>> On Wed, Oct 7, 2015 at 5:44 PM, Amir Rohan wrote:
 On 10/07/2015 10:29 AM, Michael Paquier wrote:
> On Wed, Oct 7, 2015 at 4:16 PM, Amir Rohan wrote:
>> Also, the removal of poll_query_until from pg_rewind looks suspiciously
>> like a copy-paste gone bad. Pardon if I'm missing something.
>
> Perhaps. Do you have a suggestion regarding that? It seems to me that
> this is more useful in TestLib.pm as-is.
>

 My mistake, the patch only shows some internal function being deleted
 but RewindTest.pm (obviously) imports TestLib. You're right, TestLib is
 a better place for it.
>>>
>>> OK. Here is a new patch version. I have removed the restriction
>>> preventing to call make_master multiple times in the same script (one
>>> may actually want to test some stuff related to logical decoding or
>>> FDW for example, who knows...), forcing PGHOST to always use the same
>>> value after it has been initialized. I have added a sanity check
>>> though, it is not possible to create a node based on a base backup if
>>> no master are defined. This looks like a cheap insurance... I also
>>> refactored a bit the code, using the new init_node_info to fill in the
>>> fields of a newly-initialized node, and I removed get_free_port,
>>> init_node, init_node_from_backup, enable_restoring and
>>> enable_streaming from the list of routines exposed to the users, those
>>> can be used directly with make_master, make_warm_standby and
>>> make_hot_standby. We could add them again if need be, somebody may
>>> want to be able to get a free port, set up a node without those
>>> generic routines, just that it does not seem necessary now.
>>> Regards,
>>>
>>
>> If you'd like, I can write up some tests for cascading replication which
>> are currently missing.
> 
> 001 is testing cascading, like that node1 -> node2 -> node3.
> 
>> Someone mentioned a daisy chain setup which sounds fun. Anything else in
>> particular? Also, it would be nice to have some canned way to measure
>> end-to-end replication latency for variable number of nodes.
> 
> Hm. Do you mean comparing the LSN position between two nodes even if
> both nodes are not connected to each other? What would you use it for?
> 

In a cascading replication setup, the typical _time_ it takes for a
COMMIT on master to reach the slave (assuming constant WAL generation
rate) is an important operational metric.

It would be useful to catch future regressions for that metric,
which may happen even when a patch doesn't outright break cascading
replication. Just automating the measurement could be useful if
there's no pg facility that tracks performance over time in
a regimented fashion. I've seen multiple projects which consider
a "benchmark suite" to be part of its testing strategy.


As for the "daisy chain" thing, it was (IIRC) mentioned in a josh berkus
talk I caught on youtube. It's possible to setup cascading replication,
take down the master, and then reinsert it as replicating slave, so that
you end up with *all* servers replicating from the
ancestor in the chain, and no master. I think it was more
a fun hack then anything, but also an interesting corner case to
investigate.

>> What about going back through the commit log and writing some regression
>> tests for the real stinkers, if someone care to volunteer some candidate
>> bugs
> 
> I have drafted a list with a couple of items upthread:
> http://www.postgresql.org/message-id/cab7npqsgffsphocrhfoasdanipvn6wsh2nykf1kayrm+9_m...@mail.gmail.com
> So based on the existing patch the list becomes as follows:
> - wal_retrieve_retry_interval with a high value, say setting to for
> example 2/3s and loop until it is applied by checking it is it has
> been received by the standby every second.
> - recovery_target_action
> - archive_cleanup_command
> - recovery_end_command
> - pg_xlog_replay_pause and pg_xlog_replay_resume
> In the list of things that could have a test, I recall that we should
> test as well 2PC with the recovery delay, look at a1105c3d. This could
> be included in 005.

a1105c3 Mar 23 Fix copy & paste error in 4f1b890b137.  Andres Freund
4f1b890 Mar 15 Merge the various forms of transaction commit & abort
records.  Andres Freund

Is that the right commit?

> The advantage of implementing that now is that we could see if the
> existing routines are solid enough or not. 

I can do this if you point me at a self-contained thread/#issue.

> Still, looking at what the
> patch has now I think that we had better get a committer look at it,
> and if the core portion gets integrated we could already use it for
> the patch implementing quorum synchronous replication and in doing
> more advanced tests with pg_rewind regarding the timeline handling
> (both patches of this CF). 
>
> I don't mind adding more now, though I
> 

Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-08 Thread Michael Paquier
On Thu, Oct 8, 2015 at 6:03 PM, Amir Rohan wrote:
> On 10/08/2015 10:39 AM, Michael Paquier wrote:
>>> Someone mentioned a daisy chain setup which sounds fun. Anything else in
>>> particular? Also, it would be nice to have some canned way to measure
>>> end-to-end replication latency for variable number of nodes.
>>
>> Hm. Do you mean comparing the LSN position between two nodes even if
>> both nodes are not connected to each other? What would you use it for?
>>
>
> In a cascading replication setup, the typical _time_ it takes for a
> COMMIT on master to reach the slave (assuming constant WAL generation
> rate) is an important operational metric.

Hm. You mean the exact amount of time it gets to be sure that a given
WAL position has been flushed on a cascading standby, be it across
multiple layers. Er, that's a bit tough without patching the backend
where I guess we would need to keep a track of when a LSN position has
been flushed. And calls of gettimeofday are expensive, so that does
not sound like a plausible alternative here to me...

> It would be useful to catch future regressions for that metric,
> which may happen even when a patch doesn't outright break cascading
> replication. Just automating the measurement could be useful if
> there's no pg facility that tracks performance over time in
> a regimented fashion. I've seen multiple projects which consider
> a "benchmark suite" to be part of its testing strategy.

Ah, OK. I see. That's a bit out of scope of this patch, and that's
really OS-dependent, but as long as the comparisons can be done on the
same OS it would make sense.

> As for the "daisy chain" thing, it was (IIRC) mentioned in a josh berkus
> talk I caught on youtube. It's possible to setup cascading replication,
> take down the master, and then reinsert it as replicating slave, so that
> you end up with *all* servers replicating from the
> ancestor in the chain, and no master. I think it was more
> a fun hack then anything, but also an interesting corner case to
> investigate.

Ah, yes. I recall this one. I am sure it made the audience smile. All
the nodes link to each other in closed circle.

>>> What about going back through the commit log and writing some regression
>>> tests for the real stinkers, if someone care to volunteer some candidate
>>> bugs
>>
>> I have drafted a list with a couple of items upthread:
>> http://www.postgresql.org/message-id/cab7npqsgffsphocrhfoasdanipvn6wsh2nykf1kayrm+9_m...@mail.gmail.com
>> So based on the existing patch the list becomes as follows:
>> - wal_retrieve_retry_interval with a high value, say setting to for
>> example 2/3s and loop until it is applied by checking it is it has
>> been received by the standby every second.
>> - recovery_target_action
>> - archive_cleanup_command
>> - recovery_end_command
>> - pg_xlog_replay_pause and pg_xlog_replay_resume
>> In the list of things that could have a test, I recall that we should
>> test as well 2PC with the recovery delay, look at a1105c3d. This could
>> be included in 005.
>
> a1105c3 Mar 23 Fix copy & paste error in 4f1b890b137.  Andres Freund
> 4f1b890 Mar 15 Merge the various forms of transaction commit & abort
> records.  Andres Freund
>
> Is that the right commit?

That's this one. a1105c3 was actually rather tricky... The idea is to
simply check the WAL replay delay with COMMIT PREPARED.

>> The advantage of implementing that now is that we could see if the
>> existing routines are solid enough or not.
>
> I can do this if you point me at a self-contained thread/#issue.

Hm. This patch is already 900 lines, perhaps it would be wiser not to
make it more complicated for now..
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-08 Thread Amir Rohan
On 10/08/2015 04:47 PM, Michael Paquier wrote:
> On Thu, Oct 8, 2015 at 6:03 PM, Amir Rohan wrote:
>> On 10/08/2015 10:39 AM, Michael Paquier wrote:
 Someone mentioned a daisy chain setup which sounds fun. Anything else in
 particular? Also, it would be nice to have some canned way to measure
 end-to-end replication latency for variable number of nodes.
>>>
>>> Hm. Do you mean comparing the LSN position between two nodes even if
>>> both nodes are not connected to each other? What would you use it for?
>>>
>>
>> In a cascading replication setup, the typical _time_ it takes for a
>> COMMIT on master to reach the slave (assuming constant WAL generation
>> rate) is an important operational metric.
> 
> Hm. You mean the exact amount of time it gets to be sure that a given
> WAL position has been flushed on a cascading standby, be it across
> multiple layers. Er, that's a bit tough without patching the backend
> where I guess we would need to keep a track of when a LSN position has
> been flushed. And calls of gettimeofday are expensive, so that does
> not sound like a plausible alternative here to me...
> 

Wouldn't this work?

1) start timer
2) Grab pg_stat_replication.sent_location from master
3) pg_switch_xlog() # I /think/ we want this, could be wrong
4) Poll slave's pg_last_xlog_replay_location() until LSN shows up
5) stop timer

Amir





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-08 Thread Michael Paquier
On Thu, Oct 8, 2015 at 11:28 PM, Amir Rohan wrote:
> Wouldn't this work?
> 1) start timer
> 2) Grab pg_stat_replication.sent_location from master
> 3) pg_switch_xlog() # I /think/ we want this, could be wrong

For a warm standby, you would want that, but this depends on two factors:
- The moment master completes archiving of this segment
- The moment standby restores it.
On slow machines, those two things become by far the bottleneck,
imagine a PI restricted on I/O with a low-class SD card in the worst
case (I maintain one, with a good card, still the I/O is a
bottleneck).

> 4) Poll slave's pg_last_xlog_replay_location() until LSN shows up
> 5) stop timer

That's not really solid, there is an interval of time between the
moment the LSN position is taken from the master and the standby. An
accurate method is to log/store on master when a given WAL position
has been flushed to disk, and do the same on slave at replay for this
LSN position. In any case this is doing to flood badly the logs of
both nodes, and as the backend cares about the performance of
operations in this code path we won't want to do that anyway.

To make it short, it seems to me that simply waiting until the LSN a
test is waiting for has been replayed is just but fine for this set of
tests to ensure their run consistency, let's not forget that this is
the goal here.
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-07 Thread Amir Rohan
On 10/07/2015 09:27 AM, Michael Paquier wrote:
> On Wed, Oct 7, 2015 at 7:51 AM, Michael Paquier
>  wrote:
>> On Wed, Oct 7, 2015 at 7:43 AM, Michael Paquier
>>  wrote:
>>> On Wed, Oct 7, 2015 at 5:58 AM, Robert Haas wrote:
 On Sat, Oct 3, 2015 at 7:38 AM, Michael Paquier wrote:
 It seems that these days 'make check' creates a directory in /tmp
 called /tmp/pg_regress-RANDOMSTUFF.  Listening on TCP ports is
 disabled, and the socket goes inside this directory with a name like
 .s.PGSQL.PORT.  You can connect with psql -h
 /tmp/pg_regress-RANDOMSTUFF -p PORT, but not over TCP.  This basically
 removes the risk of TCP port number collisions, as well as the risk of
 your temporary instance being hijacked by a malicious user on the same
 machine.
>>>
>>> Right, that's for example /var/folders/ on OSX, and this is defined
>>> once per test run via $tempdir_short. PGHOST is set to that as well.
>>
>> Er, mistake here. That's actually once per standard_initdb, except
>> that all the tests I have included in my patch run it just once to
>> create a master node. It seems that it would be wiser to set one
>> socket dir per node then, remove the port assignment stuff, and use
>> tempdir_short as a key to define a node as well as in the connection
>> string to this node. I'll update the patch later today...
> 
> So, my conclusion regarding multiple calls of make_master is that we
> should not allow to do it. On Unix/Linux we could have a separate unix
> socket directory for each node, but not on Windows where
> listen_addresses is set to look after 127.0.0.1. On Unix/Linux, PGHOST
> is set by the master node to a tempdir once and for all. Hence, to
> make the code more consistent, I think that we should keep the port
> lookup machinery here. An updated patch is attached.
> 

If parallel tests are supported, get_free_port is still racy even
with last_port_found because it's:
1) process-local.
2) even if it were shared, there's the race window between the
available-check and the listen() I mentioned upthread.

If parallel tests are explicitly disallowed, a comment to that
effect (and a note on things known to break) might help someone
down the road.

Also, the removal of poll_query_until from pg_rewind looks suspiciously
like a copy-paste gone bad. Pardon if I'm missing something.

Amir



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-07 Thread Michael Paquier
On Wed, Oct 7, 2015 at 4:16 PM, Amir Rohan  wrote:
> On 10/07/2015 09:27 AM, Michael Paquier wrote:
>> On Wed, Oct 7, 2015 at 7:51 AM, Michael Paquier
>>  wrote:
>>> On Wed, Oct 7, 2015 at 7:43 AM, Michael Paquier
>>>  wrote:
 On Wed, Oct 7, 2015 at 5:58 AM, Robert Haas wrote:
> On Sat, Oct 3, 2015 at 7:38 AM, Michael Paquier wrote:
> It seems that these days 'make check' creates a directory in /tmp
> called /tmp/pg_regress-RANDOMSTUFF.  Listening on TCP ports is
> disabled, and the socket goes inside this directory with a name like
> .s.PGSQL.PORT.  You can connect with psql -h
> /tmp/pg_regress-RANDOMSTUFF -p PORT, but not over TCP.  This basically
> removes the risk of TCP port number collisions, as well as the risk of
> your temporary instance being hijacked by a malicious user on the same
> machine.

 Right, that's for example /var/folders/ on OSX, and this is defined
 once per test run via $tempdir_short. PGHOST is set to that as well.
>>>
>>> Er, mistake here. That's actually once per standard_initdb, except
>>> that all the tests I have included in my patch run it just once to
>>> create a master node. It seems that it would be wiser to set one
>>> socket dir per node then, remove the port assignment stuff, and use
>>> tempdir_short as a key to define a node as well as in the connection
>>> string to this node. I'll update the patch later today...
>>
>> So, my conclusion regarding multiple calls of make_master is that we
>> should not allow to do it. On Unix/Linux we could have a separate unix
>> socket directory for each node, but not on Windows where
>> listen_addresses is set to look after 127.0.0.1. On Unix/Linux, PGHOST
>> is set by the master node to a tempdir once and for all. Hence, to
>> make the code more consistent, I think that we should keep the port
>> lookup machinery here. An updated patch is attached.
>>
> If parallel tests are supported, get_free_port is still racy even
> with last_port_found because it's:
> 1) process-local.
> 2) even if it were shared, there's the race window between the
> available-check and the listen() I mentioned upthread.
>
> If parallel tests are explicitly disallowed, a comment to that
> effect (and a note on things known to break) might help someone
> down the road.

Actually, no, port lookup will not map and parallel tests would work
fine thinking more about it, each set of tests uses its own PGHOST to
a private unix socket directory so even if multiple tests use the same
port number they won't interact with each other because they connect
to different socket paths. MinGW is a problem though, and an existing
one in the perl test scripts, I recall that it can use make -j and
that's on Windows where PGHOST is mapping to 127.0.0.1 only.

> Also, the removal of poll_query_until from pg_rewind looks suspiciously
> like a copy-paste gone bad. Pardon if I'm missing something.

Perhaps. Do you have a suggestion regarding that? It seems to me that
this is more useful in TestLib.pm as-is.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-07 Thread Craig Ringer
On 25 September 2015 at 14:29, Michael Paquier
 wrote:

> I also arrived at the conclusion that it would be
> better to place the new package file in src/test/perl instead of
> src/test/recovery to allow any users of the TAP tests to have it in their
> PERL5LIB path and to be able to call the new routines to create and
> manipulate nodes.

While it's Python not Perl, you might find it interesting that support
for the replication protocol is being added to psycopg2, the Python
driver for PostgreSQL. I've been reviewing the patch at
https://github.com/psycopg/psycopg2/pull/322 .

I'm using it to write protocol validation for a logical decoding
plugin at the moment, so that the decoding plugin's output can be
validated in a consistent way for easily controlled inputs.

Perhaps it's worth teaching DBD::Pg to do this? Or adopting psycopg2
for some optional protocol tests...

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-07 Thread Michael Paquier
On Wed, Oct 7, 2015 at 7:51 AM, Michael Paquier
 wrote:
> On Wed, Oct 7, 2015 at 7:43 AM, Michael Paquier
>  wrote:
>> On Wed, Oct 7, 2015 at 5:58 AM, Robert Haas wrote:
>>> On Sat, Oct 3, 2015 at 7:38 AM, Michael Paquier wrote:
>>> It seems that these days 'make check' creates a directory in /tmp
>>> called /tmp/pg_regress-RANDOMSTUFF.  Listening on TCP ports is
>>> disabled, and the socket goes inside this directory with a name like
>>> .s.PGSQL.PORT.  You can connect with psql -h
>>> /tmp/pg_regress-RANDOMSTUFF -p PORT, but not over TCP.  This basically
>>> removes the risk of TCP port number collisions, as well as the risk of
>>> your temporary instance being hijacked by a malicious user on the same
>>> machine.
>>
>> Right, that's for example /var/folders/ on OSX, and this is defined
>> once per test run via $tempdir_short. PGHOST is set to that as well.
>
> Er, mistake here. That's actually once per standard_initdb, except
> that all the tests I have included in my patch run it just once to
> create a master node. It seems that it would be wiser to set one
> socket dir per node then, remove the port assignment stuff, and use
> tempdir_short as a key to define a node as well as in the connection
> string to this node. I'll update the patch later today...

So, my conclusion regarding multiple calls of make_master is that we
should not allow to do it. On Unix/Linux we could have a separate unix
socket directory for each node, but not on Windows where
listen_addresses is set to look after 127.0.0.1. On Unix/Linux, PGHOST
is set by the master node to a tempdir once and for all. Hence, to
make the code more consistent, I think that we should keep the port
lookup machinery here. An updated patch is attached.
-- 
Michael
diff --git a/src/bin/pg_rewind/RewindTest.pm b/src/bin/pg_rewind/RewindTest.pm
index a4c1737..ea219d7 100644
--- a/src/bin/pg_rewind/RewindTest.pm
+++ b/src/bin/pg_rewind/RewindTest.pm
@@ -125,38 +125,6 @@ sub check_query
 	}
 }
 
-# Run a query once a second, until it returns 't' (i.e. SQL boolean true).
-sub poll_query_until
-{
-	my ($query, $connstr) = @_;
-
-	my $max_attempts = 30;
-	my $attempts = 0;
-	my ($stdout, $stderr);
-
-	while ($attempts < $max_attempts)
-	{
-		my $cmd = [ 'psql', '-At', '-c', "$query", '-d', "$connstr" ];
-		my $result = run $cmd, '>', \$stdout, '2>', \$stderr;
-
-		chomp($stdout);
-		$stdout =~ s/\r//g if $Config{osname} eq 'msys';
-		if ($stdout eq "t")
-		{
-			return 1;
-		}
-
-		# Wait a second before retrying.
-		sleep 1;
-		$attempts++;
-	}
-
-	# The query result didn't change in 30 seconds. Give up. Print the stderr
-	# from the last attempt, hopefully that's useful for debugging.
-	diag $stderr;
-	return 0;
-}
-
 sub append_to_file
 {
 	my ($filename, $str) = @_;
diff --git a/src/test/Makefile b/src/test/Makefile
index b713c2c..d6e51eb 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -17,7 +17,7 @@ SUBDIRS = regress isolation modules
 # We don't build or execute examples/, locale/, or thread/ by default,
 # but we do want "make clean" etc to recurse into them.  Likewise for ssl/,
 # because the SSL test suite is not secure to run on a multi-user system.
-ALWAYS_SUBDIRS = examples locale thread ssl
+ALWAYS_SUBDIRS = examples locale thread ssl recovery
 
 # We want to recurse to all subdirs for all standard targets, except that
 # installcheck and install should not recurse into the subdirectory "modules".
diff --git a/src/test/perl/RecoveryTest.pm b/src/test/perl/RecoveryTest.pm
new file mode 100644
index 000..cbd2441
--- /dev/null
+++ b/src/test/perl/RecoveryTest.pm
@@ -0,0 +1,410 @@
+package RecoveryTest;
+
+# Set of common routines for recovery regression tests for a PostgreSQL
+# cluster. This includes global variables and methods that can be used
+# by the various set of tests present to set up cluster nodes and
+# configure them according to the test scenario wanted.
+#
+# Cluster nodes can be freely created using initdb or using the existing
+# base backup of another node, with minimum configuration done when the
+# node is created for the first time like having a proper port number.
+# It is then up to the test to decide what to do with the newly-created
+# node.
+#
+# Environment configuration of each node is available through a set
+# of global variables provided by this package, hashed depending on the
+# port number of a node:
+# - connstr_nodes connection string to connect to this node
+# - datadir_nodes to get the data folder of a given node
+# - archive_nodes for the location of the WAL archives of a node
+# - backup_nodes for the location of base backups of a node
+# - applname_nodes, application_name to use for a standby
+#
+# Nodes are identified by their port number, which should be unique
+# for each node of the cluster as it is run locally.
+
+use Cwd;
+use TestLib;
+use Test::More;
+
+use Archive::Tar;
+use IPC::Run qw(run start);
+
+use Exporter 'import';
+

Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-07 Thread Amir Rohan
On 10/07/2015 10:29 AM, Michael Paquier wrote:
> On Wed, Oct 7, 2015 at 4:16 PM, Amir Rohan  wrote:
>> On 10/07/2015 09:27 AM, Michael Paquier wrote:
>>> On Wed, Oct 7, 2015 at 7:51 AM, Michael Paquier
>>>  wrote:
 On Wed, Oct 7, 2015 at 7:43 AM, Michael Paquier
  wrote:
> On Wed, Oct 7, 2015 at 5:58 AM, Robert Haas wrote:
>> On Sat, Oct 3, 2015 at 7:38 AM, Michael Paquier wrote:
>> It seems that these days 'make check' creates a directory in /tmp
>> called /tmp/pg_regress-RANDOMSTUFF.  Listening on TCP ports is
>> disabled, and the socket goes inside this directory with a name like
>> .s.PGSQL.PORT.  You can connect with psql -h
>> /tmp/pg_regress-RANDOMSTUFF -p PORT, but not over TCP.  This basically
>> removes the risk of TCP port number collisions, as well as the risk of
>> your temporary instance being hijacked by a malicious user on the same
>> machine.
>
> Right, that's for example /var/folders/ on OSX, and this is defined
> once per test run via $tempdir_short. PGHOST is set to that as well.

 Er, mistake here. That's actually once per standard_initdb, except
 that all the tests I have included in my patch run it just once to
 create a master node. It seems that it would be wiser to set one
 socket dir per node then, remove the port assignment stuff, and use
 tempdir_short as a key to define a node as well as in the connection
 string to this node. I'll update the patch later today...
>>>
>>> So, my conclusion regarding multiple calls of make_master is that we
>>> should not allow to do it. On Unix/Linux we could have a separate unix
>>> socket directory for each node, but not on Windows where
>>> listen_addresses is set to look after 127.0.0.1. On Unix/Linux, PGHOST
>>> is set by the master node to a tempdir once and for all. Hence, to
>>> make the code more consistent, I think that we should keep the port
>>> lookup machinery here. An updated patch is attached.
>>>
>> If parallel tests are supported, get_free_port is still racy even
>> with last_port_found because it's:
>> 1) process-local.
>> 2) even if it were shared, there's the race window between the
>> available-check and the listen() I mentioned upthread.
>>
>> If parallel tests are explicitly disallowed, a comment to that
>> effect (and a note on things known to break) might help someone
>> down the road.
> 
> Actually, no, port lookup will not map and parallel tests would work
> fine thinking more about it, each set of tests uses its own PGHOST to
> a private unix socket directory so even if multiple tests use the same
> port number they won't interact with each other because they connect
> to different socket paths. MinGW is a problem though, and an existing
> one in the perl test scripts, I recall that it can use make -j and
> that's on Windows where PGHOST is mapping to 127.0.0.1 only.
> 

ah, the portnum is actually a real tcp port only on windows, and
the race is limited to that case as you say. Note that in the
tcp case, using psql to check is wrong:
$ nc -l 8001  # listen on 8001
$ psql -X -h lo -p 8001 postgres < /dev/null psql: could not connect to
server: Connection refused
Is the server running on host "lo" (127.0.0.1) and accepting
TCP/IP connections on port 8001?

The port isn't free, but psql is really only checking if pg is there
and reports that the port is available. That's a fairly mild issue, though.

>> Also, the removal of poll_query_until from pg_rewind looks suspiciously
>> like a copy-paste gone bad. Pardon if I'm missing something.
> 
> Perhaps. Do you have a suggestion regarding that? It seems to me that
> this is more useful in TestLib.pm as-is.
> 

My mistake, the patch only shows some internal function being deleted
but RewindTest.pm (obviously) imports TestLib. You're right, TestLib is
a better place for it.




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-07 Thread Michael Paquier
On Wed, Oct 7, 2015 at 5:44 PM, Amir Rohan wrote:
> On 10/07/2015 10:29 AM, Michael Paquier wrote:
>> On Wed, Oct 7, 2015 at 4:16 PM, Amir Rohan wrote:
>>> Also, the removal of poll_query_until from pg_rewind looks suspiciously
>>> like a copy-paste gone bad. Pardon if I'm missing something.
>>
>> Perhaps. Do you have a suggestion regarding that? It seems to me that
>> this is more useful in TestLib.pm as-is.
>>
>
> My mistake, the patch only shows some internal function being deleted
> but RewindTest.pm (obviously) imports TestLib. You're right, TestLib is
> a better place for it.

OK. Here is a new patch version. I have removed the restriction
preventing to call make_master multiple times in the same script (one
may actually want to test some stuff related to logical decoding or
FDW for example, who knows...), forcing PGHOST to always use the same
value after it has been initialized. I have added a sanity check
though, it is not possible to create a node based on a base backup if
no master are defined. This looks like a cheap insurance... I also
refactored a bit the code, using the new init_node_info to fill in the
fields of a newly-initialized node, and I removed get_free_port,
init_node, init_node_from_backup, enable_restoring and
enable_streaming from the list of routines exposed to the users, those
can be used directly with make_master, make_warm_standby and
make_hot_standby. We could add them again if need be, somebody may
want to be able to get a free port, set up a node without those
generic routines, just that it does not seem necessary now.
Regards,
-- 
Michael
diff --git a/src/bin/pg_rewind/RewindTest.pm b/src/bin/pg_rewind/RewindTest.pm
index a4c1737..ea219d7 100644
--- a/src/bin/pg_rewind/RewindTest.pm
+++ b/src/bin/pg_rewind/RewindTest.pm
@@ -125,38 +125,6 @@ sub check_query
 	}
 }
 
-# Run a query once a second, until it returns 't' (i.e. SQL boolean true).
-sub poll_query_until
-{
-	my ($query, $connstr) = @_;
-
-	my $max_attempts = 30;
-	my $attempts = 0;
-	my ($stdout, $stderr);
-
-	while ($attempts < $max_attempts)
-	{
-		my $cmd = [ 'psql', '-At', '-c', "$query", '-d', "$connstr" ];
-		my $result = run $cmd, '>', \$stdout, '2>', \$stderr;
-
-		chomp($stdout);
-		$stdout =~ s/\r//g if $Config{osname} eq 'msys';
-		if ($stdout eq "t")
-		{
-			return 1;
-		}
-
-		# Wait a second before retrying.
-		sleep 1;
-		$attempts++;
-	}
-
-	# The query result didn't change in 30 seconds. Give up. Print the stderr
-	# from the last attempt, hopefully that's useful for debugging.
-	diag $stderr;
-	return 0;
-}
-
 sub append_to_file
 {
 	my ($filename, $str) = @_;
diff --git a/src/test/Makefile b/src/test/Makefile
index b713c2c..d6e51eb 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -17,7 +17,7 @@ SUBDIRS = regress isolation modules
 # We don't build or execute examples/, locale/, or thread/ by default,
 # but we do want "make clean" etc to recurse into them.  Likewise for ssl/,
 # because the SSL test suite is not secure to run on a multi-user system.
-ALWAYS_SUBDIRS = examples locale thread ssl
+ALWAYS_SUBDIRS = examples locale thread ssl recovery
 
 # We want to recurse to all subdirs for all standard targets, except that
 # installcheck and install should not recurse into the subdirectory "modules".
diff --git a/src/test/perl/RecoveryTest.pm b/src/test/perl/RecoveryTest.pm
new file mode 100644
index 000..baa4d31
--- /dev/null
+++ b/src/test/perl/RecoveryTest.pm
@@ -0,0 +1,412 @@
+package RecoveryTest;
+
+# Set of common routines for recovery regression tests for a PostgreSQL
+# cluster. This includes global variables and methods that can be used
+# by the various set of tests present to set up cluster nodes and
+# configure them according to the test scenario wanted.
+#
+# Cluster nodes can be freely created using initdb or using the existing
+# base backup of another node, with minimum configuration done when the
+# node is created for the first time like having a proper port number.
+# It is then up to the test to decide what to do with the newly-created
+# node.
+#
+# Environment configuration of each node is available through a set
+# of global variables provided by this package, hashed depending on the
+# port number of a node:
+# - connstr_nodes connection string to connect to this node
+# - datadir_nodes to get the data folder of a given node
+# - archive_nodes for the location of the WAL archives of a node
+# - backup_nodes for the location of base backups of a node
+# - applname_nodes, application_name to use for a standby
+#
+# Nodes are identified by their port number, which should be unique
+# for each node of the cluster as it is run locally.
+
+use Cwd;
+use TestLib;
+use Test::More;
+
+use Archive::Tar;
+use IPC::Run qw(run start);
+
+use Exporter 'import';
+
+our @EXPORT = qw(
+	%connstr_nodes
+	%datadir_nodes
+	%backup_nodes
+	%archive_nodes
+	%applname_nodes
+
+	append_to_file
+	backup_node
+	disable_node
+	dump_node_info
+	enable_archiving
+	enable_node
+	

Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-06 Thread Robert Haas
On Sat, Oct 3, 2015 at 7:38 AM, Michael Paquier
 wrote:
>> Granted, you have to try fairly hard to shoot yourself in the leg,
>> but since the solution is so simple, why not? If we never reuse ports
>> within a single test, this goes away.
>
> Well, you can reuse the same port number in a test. Simply teardown
> the existing node and then recreate a new one. I think that port
> number assignment to a node should be transparent to the caller, in
> our case the perl test script holding a scenario.

It seems that these days 'make check' creates a directory in /tmp
called /tmp/pg_regress-RANDOMSTUFF.  Listening on TCP ports is
disabled, and the socket goes inside this directory with a name like
.s.PGSQL.PORT.  You can connect with psql -h
/tmp/pg_regress-RANDOMSTUFF -p PORT, but not over TCP.  This basically
removes the risk of TCP port number collisions, as well as the risk of
your temporary instance being hijacked by a malicious user on the same
machine.  I'm not sure what we do on Windows, though.

>> In particular, I was shutting down an archiving node and the archiving
>> was truncated. I *think* smart doesn't do this. But again, it's really
>> that the test writer can't easily override, not that the default is wrong.
>
> Ah, OK. Then fast is just fine. It shuts down the node correctly.
> "smart" would wait for all the current connections to finish but I am
> wondering if it currently matters here: I don't see yet a clear use
> case yet where it would make sense to have multi-threaded script... If
> somebody comes back with a clear idea here perhaps we could revisit
> that but it does not seem worth it now.

I don't have anything brilliant to say about this point, but here's a
perhaps-not-brilliant comment:

If there's a bug in one of smart and fast shutdown and the other works
great, it would be nice to catch that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-06 Thread Michael Paquier
On Wed, Oct 7, 2015 at 5:58 AM, Robert Haas wrote:
> On Sat, Oct 3, 2015 at 7:38 AM, Michael Paquier wrote:
> It seems that these days 'make check' creates a directory in /tmp
> called /tmp/pg_regress-RANDOMSTUFF.  Listening on TCP ports is
> disabled, and the socket goes inside this directory with a name like
> .s.PGSQL.PORT.  You can connect with psql -h
> /tmp/pg_regress-RANDOMSTUFF -p PORT, but not over TCP.  This basically
> removes the risk of TCP port number collisions, as well as the risk of
> your temporary instance being hijacked by a malicious user on the same
> machine.

Right, that's for example /var/folders/ on OSX, and this is defined
once per test run via $tempdir_short. PGHOST is set to that as well.

> I'm not sure what we do on Windows, though.

sspi with include_realm through 127.0.0.1.

>>> In particular, I was shutting down an archiving node and the archiving
>>> was truncated. I *think* smart doesn't do this. But again, it's really
>>> that the test writer can't easily override, not that the default is wrong.
>>
>> Ah, OK. Then fast is just fine. It shuts down the node correctly.
>> "smart" would wait for all the current connections to finish but I am
>> wondering if it currently matters here: I don't see yet a clear use
>> case yet where it would make sense to have multi-threaded script... If
>> somebody comes back with a clear idea here perhaps we could revisit
>> that but it does not seem worth it now.
>
> I don't have anything brilliant to say about this point, but here's a
> perhaps-not-brilliant comment:
>
> If there's a bug in one of smart and fast shutdown and the other works
> great, it would be nice to catch that.

Yes, sure. I extended the patch to support other stop modes than fast,
the default being kept to fast if none is defined.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-06 Thread Michael Paquier
On Wed, Oct 7, 2015 at 7:43 AM, Michael Paquier
 wrote:
> On Wed, Oct 7, 2015 at 5:58 AM, Robert Haas wrote:
>> On Sat, Oct 3, 2015 at 7:38 AM, Michael Paquier wrote:
>> It seems that these days 'make check' creates a directory in /tmp
>> called /tmp/pg_regress-RANDOMSTUFF.  Listening on TCP ports is
>> disabled, and the socket goes inside this directory with a name like
>> .s.PGSQL.PORT.  You can connect with psql -h
>> /tmp/pg_regress-RANDOMSTUFF -p PORT, but not over TCP.  This basically
>> removes the risk of TCP port number collisions, as well as the risk of
>> your temporary instance being hijacked by a malicious user on the same
>> machine.
>
> Right, that's for example /var/folders/ on OSX, and this is defined
> once per test run via $tempdir_short. PGHOST is set to that as well.

Er, mistake here. That's actually once per standard_initdb, except
that all the tests I have included in my patch run it just once to
create a master node. It seems that it would be wiser to set one
socket dir per node then, remove the port assignment stuff, and use
tempdir_short as a key to define a node as well as in the connection
string to this node. I'll update the patch later today...
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-04 Thread Michael Paquier
On Sat, Oct 3, 2015 at 10:47 PM, Michael Paquier wrote:
> On Sat, Oct 3, 2015 at 9:50 PM, Amir Rohan wrote:
 Block until recovery is finished, before testing. eliminate races, and
 avoid the stupid sleep(3) I used.
>>>
>>> TODO
>
> Well. I just recalled this item in the list of things you mentioned. I
> marked it but forgot to address it. It sounds right that we may want
> something using pg_isready in a loop as a node in recovery would
> reject connections.

I just hacked up an updated version with the following things:
- Optional argument for stop_node to define the stop mode of pg_ctl
- Addition of wait_for_node where pg_isready is used to wait until a
node is ready to accept queries
- Addition of a local lookup variable to track the last port assigned.
This accelerates get_free_port.
Regards,
-- 
Michael
diff --git a/src/bin/pg_rewind/RewindTest.pm b/src/bin/pg_rewind/RewindTest.pm
index a4c1737..ea219d7 100644
--- a/src/bin/pg_rewind/RewindTest.pm
+++ b/src/bin/pg_rewind/RewindTest.pm
@@ -125,38 +125,6 @@ sub check_query
 	}
 }
 
-# Run a query once a second, until it returns 't' (i.e. SQL boolean true).
-sub poll_query_until
-{
-	my ($query, $connstr) = @_;
-
-	my $max_attempts = 30;
-	my $attempts = 0;
-	my ($stdout, $stderr);
-
-	while ($attempts < $max_attempts)
-	{
-		my $cmd = [ 'psql', '-At', '-c', "$query", '-d', "$connstr" ];
-		my $result = run $cmd, '>', \$stdout, '2>', \$stderr;
-
-		chomp($stdout);
-		$stdout =~ s/\r//g if $Config{osname} eq 'msys';
-		if ($stdout eq "t")
-		{
-			return 1;
-		}
-
-		# Wait a second before retrying.
-		sleep 1;
-		$attempts++;
-	}
-
-	# The query result didn't change in 30 seconds. Give up. Print the stderr
-	# from the last attempt, hopefully that's useful for debugging.
-	diag $stderr;
-	return 0;
-}
-
 sub append_to_file
 {
 	my ($filename, $str) = @_;
diff --git a/src/test/Makefile b/src/test/Makefile
index b713c2c..d6e51eb 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -17,7 +17,7 @@ SUBDIRS = regress isolation modules
 # We don't build or execute examples/, locale/, or thread/ by default,
 # but we do want "make clean" etc to recurse into them.  Likewise for ssl/,
 # because the SSL test suite is not secure to run on a multi-user system.
-ALWAYS_SUBDIRS = examples locale thread ssl
+ALWAYS_SUBDIRS = examples locale thread ssl recovery
 
 # We want to recurse to all subdirs for all standard targets, except that
 # installcheck and install should not recurse into the subdirectory "modules".
diff --git a/src/test/perl/RecoveryTest.pm b/src/test/perl/RecoveryTest.pm
new file mode 100644
index 000..aa8998c
--- /dev/null
+++ b/src/test/perl/RecoveryTest.pm
@@ -0,0 +1,401 @@
+package RecoveryTest;
+
+# Set of common routines for recovery regression tests for a PostgreSQL
+# cluster. This includes global variables and methods that can be used
+# by the various set of tests present to set up cluster nodes and
+# configure them according to the test scenario wanted.
+#
+# Cluster nodes can be freely created using initdb or using the existing
+# base backup of another node, with minimum configuration done when the
+# node is created for the first time like having a proper port number.
+# It is then up to the test to decide what to do with the newly-created
+# node.
+#
+# Environment configuration of each node is available through a set
+# of global variables provided by this package, hashed depending on the
+# port number of a node:
+# - connstr_nodes connection string to connect to this node
+# - datadir_nodes to get the data folder of a given node
+# - archive_nodes for the location of the WAL archives of a node
+# - backup_nodes for the location of base backups of a node
+# - applname_nodes, application_name to use for a standby
+#
+# Nodes are identified by their port number, which should be unique
+# for each node of the cluster as it is run locally.
+
+use Cwd;
+use TestLib;
+use Test::More;
+
+use Archive::Tar;
+use IPC::Run qw(run start);
+
+use Exporter 'import';
+
+our @EXPORT = qw(
+	%connstr_nodes
+	%datadir_nodes
+	%backup_nodes
+	%archive_nodes
+	%applname_nodes
+
+	append_to_file
+	backup_node
+	disable_node
+	dump_node_info
+	enable_archiving
+	enable_node
+	enable_restoring
+	enable_streaming
+	get_free_port
+	init_node
+	init_node_from_backup
+	make_master
+	make_warm_standby
+	make_hot_standby
+	restart_node
+	start_node
+	stop_node
+	teardown_node
+);
+
+# Global variables for node data
+%datadir_nodes = {};	# PGDATA folders
+%backup_nodes = {};		# Backup base folder
+%archive_nodes = {};	# Archive base folder
+%connstr_nodes = {};	# Connection strings
+%applname_nodes = {};	# application_name used for standbys
+
+# Tracking of last port value assigned to accelerate free port lookup.
+# XXX: Should this part use PG_VERSION_NUM?
+my $last_port_assigned =  90600 % 16384 + 49152;
+
+# Database used for each connection attempt via psql
+$ENV{PGDATABASE} = "postgres";
+
+# Tracker of active nodes
+my @active_nodes = 

Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-04 Thread Andres Freund
On October 4, 2015 3:27:00 PM GMT+02:00, Amir Rohan  wrote:

>Perhaps it would help a little if you posted the latest patch here as
>well? So that at least the app picks it up again.

You can as additional threads in the cf app.

-- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

Andres Freund  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.

2015-10-04 Thread Amir Rohan
On 10/04/2015 04:29 PM, Andres Freund wrote:
> On October 4, 2015 3:27:00 PM GMT+02:00, Amir Rohan  
> wrote:
> 
>> Perhaps it would help a little if you posted the latest patch here as
>> well? So that at least the app picks it up again.
> 
> You can as additional threads in the cf app.
> 

Done, thank you.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


  1   2   >