Re: Addition of --no-sync to pg_upgrade for test speedup

2021-12-21 Thread Michael Paquier
On Mon, Dec 20, 2021 at 10:46:13AM -0300, Alvaro Herrera wrote:
> On 2021-Dec-16, Michael Paquier wrote:
>> In pg_upgrade, we let the flush happen with initdb --sync-only, based
>> on the binary path of the new cluster, so I think that we are not
>> going to miss any test coverage by skipping that.
> 
> There was one patch of mine with breakage that only manifested in the
> pg_upgrade test *because* of its lack of no-sync.  I'm afraid that this
> change would hide certain problems.
> https://postgr.es/m/20210130023011.n545o54j65t4k...@alap3.anarazel.de

Hmm.  This talks about fsync=on being a factor counting in detecting a
failure with the backend.  Why would the fsync done with initdb
--sync-only on the target cluster once pg_upgrade is done change
something here?

> I'm not 100% comfortable with this.  What can we do to preserve *some*
> testing that include syncing?  Maybe some option that a few buildfarm
> animals use?

If you object about this part, I am fine to revert the change in
test.sh until there is a better facility to enforce syncs across tests
in the buildfarm, though.  I can hack something to centralize all
that, of course, but I am not sure when I'll be able to do so in the
short term.  Could I keep that in MSVC's vcregress.pl at least for the
time being?
--
Michael


signature.asc
Description: PGP signature


Re: Addition of --no-sync to pg_upgrade for test speedup

2021-12-20 Thread Alvaro Herrera
On 2021-Dec-16, Michael Paquier wrote:

> In pg_upgrade, we let the flush happen with initdb --sync-only, based
> on the binary path of the new cluster, so I think that we are not
> going to miss any test coverage by skipping that.

There was one patch of mine with breakage that only manifested in the
pg_upgrade test *because* of its lack of no-sync.  I'm afraid that this
change would hide certain problems.
https://postgr.es/m/20210130023011.n545o54j65t4k...@alap3.anarazel.de

> Thoughts or opinions?

I'm not 100% comfortable with this.  What can we do to preserve *some*
testing that include syncing?  Maybe some option that a few buildfarm
animals use?

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"The Postgresql hackers have what I call a "NASA space shot" mentality.
 Quite refreshing in a world of "weekend drag racer" developers."
(Scott Marlowe)




Re: Addition of --no-sync to pg_upgrade for test speedup

2021-12-18 Thread Michael Paquier
On Fri, Dec 17, 2021 at 09:47:05AM -0500, Bruce Momjian wrote:
> On Fri, Dec 17, 2021 at 10:21:04AM +0100, Peter Eisentraut wrote:
>> I think that is reasonable.

Thanks.  I have applied that, as that really helped here.

>> Maybe we could have some global option, like some environment variable, that
>> enables the "sync" mode in all tests, so it's easy to test that once in a
>> while.  Not really a requirement for your patch, but an idea in case this is
>> a concern.
> 
> Yes, I think it would be good to see all the places we might want to
> pass the no-sync option.

The remaining places in src/bin/ that I can see are pg_resetwal, where
we would fsync() a WAL segment full of zeros, and pg_recvlogical
OutputFsync(), which does not point to much data, I guess.  The first
one may be worth it, but that's just 16MB we are talking about and
WriteEmptyXLOG() is not a code path taken currently by the tests.

We could introduce a new environment variable if one wishes to enforce
those flushes, say PG_TEST_SYNC, on top of patching any TAP test that
has a --no-sync to filter it out.
--
Michael


signature.asc
Description: PGP signature


Re: Addition of --no-sync to pg_upgrade for test speedup

2021-12-17 Thread Bruce Momjian
On Fri, Dec 17, 2021 at 10:21:04AM +0100, Peter Eisentraut wrote:
> On 16.12.21 07:50, Michael Paquier wrote:
> > As per $subject, avoiding the flush of the new cluster's data
> > directory shortens a bint the runtime of the test.  In some of my slow
> > VMs, aka Windows, this shaves a couple of seconds even if the bulk of
> > the time is still spent on the main regression test suite.
> > 
> > In pg_upgrade, we let the flush happen with initdb --sync-only, based
> > on the binary path of the new cluster, so I think that we are not
> > going to miss any test coverage by skipping that.
> 
> I think that is reasonable.
> 
> Maybe we could have some global option, like some environment variable, that
> enables the "sync" mode in all tests, so it's easy to test that once in a
> while.  Not really a requirement for your patch, but an idea in case this is
> a concern.

Yes, I think it would be good to see all the places we might want to
pass the no-sync option.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: Addition of --no-sync to pg_upgrade for test speedup

2021-12-17 Thread Peter Eisentraut

On 16.12.21 07:50, Michael Paquier wrote:

As per $subject, avoiding the flush of the new cluster's data
directory shortens a bint the runtime of the test.  In some of my slow
VMs, aka Windows, this shaves a couple of seconds even if the bulk of
the time is still spent on the main regression test suite.

In pg_upgrade, we let the flush happen with initdb --sync-only, based
on the binary path of the new cluster, so I think that we are not
going to miss any test coverage by skipping that.


I think that is reasonable.

Maybe we could have some global option, like some environment variable, 
that enables the "sync" mode in all tests, so it's easy to test that 
once in a while.  Not really a requirement for your patch, but an idea 
in case this is a concern.






Addition of --no-sync to pg_upgrade for test speedup

2021-12-15 Thread Michael Paquier
Hi all,

As per $subject, avoiding the flush of the new cluster's data
directory shortens a bint the runtime of the test.  In some of my slow
VMs, aka Windows, this shaves a couple of seconds even if the bulk of
the time is still spent on the main regression test suite.

In pg_upgrade, we let the flush happen with initdb --sync-only, based
on the binary path of the new cluster, so I think that we are not
going to miss any test coverage by skipping that.

Thoughts or opinions?
--
Michael
diff --git a/src/bin/pg_upgrade/option.c b/src/bin/pg_upgrade/option.c
index 64bbda5650..5c6b6d2411 100644
--- a/src/bin/pg_upgrade/option.c
+++ b/src/bin/pg_upgrade/option.c
@@ -43,6 +43,7 @@ parseCommandLine(int argc, char *argv[])
 		{"new-datadir", required_argument, NULL, 'D'},
 		{"old-bindir", required_argument, NULL, 'b'},
 		{"new-bindir", required_argument, NULL, 'B'},
+		{"no-sync", no_argument, NULL, 'N'},
 		{"old-options", required_argument, NULL, 'o'},
 		{"new-options", required_argument, NULL, 'O'},
 		{"old-port", required_argument, NULL, 'p'},
@@ -67,6 +68,7 @@ parseCommandLine(int argc, char *argv[])
 	time_t		run_time = time(NULL);
 
 	user_opts.transfer_mode = TRANSFER_MODE_COPY;
+	user_opts.do_sync = true;
 
 	os_info.progname = get_progname(argv[0]);
 
@@ -101,7 +103,7 @@ parseCommandLine(int argc, char *argv[])
 	if (os_user_effective_id == 0)
 		pg_fatal("%s: cannot be run as root\n", os_info.progname);
 
-	while ((option = getopt_long(argc, argv, "d:D:b:B:cj:ko:O:p:P:rs:U:v",
+	while ((option = getopt_long(argc, argv, "d:D:b:B:cj:kNo:O:p:P:rs:U:v",
  long_options, )) != -1)
 	{
 		switch (option)
@@ -134,6 +136,10 @@ parseCommandLine(int argc, char *argv[])
 user_opts.transfer_mode = TRANSFER_MODE_LINK;
 break;
 
+			case 'N':
+user_opts.do_sync = false;
+break;
+
 			case 'o':
 /* append option? */
 if (!old_cluster.pgopts)
@@ -297,6 +303,7 @@ usage(void)
 	printf(_("  -D, --new-datadir=DATADIR new cluster data directory\n"));
 	printf(_("  -j, --jobs=NUMnumber of simultaneous processes or threads to use\n"));
 	printf(_("  -k, --linklink instead of copying files to new cluster\n"));
+	printf(_("  -N, --no-sync do not wait for changes to be written safely to disk\n"));
 	printf(_("  -o, --old-options=OPTIONS old cluster options to pass to the server\n"));
 	printf(_("  -O, --new-options=OPTIONS new cluster options to pass to the server\n"));
 	printf(_("  -p, --old-port=PORT   old cluster port number (default %d)\n"), old_cluster.port);
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index 3628bd74a7..f85cb2e262 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -169,11 +169,14 @@ main(int argc, char **argv)
 			  new_cluster.pgdata);
 	check_ok();
 
-	prep_status("Sync data directory to disk");
-	exec_prog(UTILITY_LOG_FILE, NULL, true, true,
-			  "\"%s/initdb\" --sync-only \"%s\"", new_cluster.bindir,
-			  new_cluster.pgdata);
-	check_ok();
+	if (user_opts.do_sync)
+	{
+		prep_status("Sync data directory to disk");
+		exec_prog(UTILITY_LOG_FILE, NULL, true, true,
+  "\"%s/initdb\" --sync-only \"%s\"", new_cluster.bindir,
+  new_cluster.pgdata);
+		check_ok();
+	}
 
 	create_script_for_old_cluster_deletion(_script_file_name);
 
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index 235a770026..22169f1002 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -279,6 +279,7 @@ typedef struct
 {
 	bool		check;			/* true -> ask user for permission to make
  * changes */
+	bool		do_sync;		/* flush changes to disk */
 	transferMode transfer_mode; /* copy files or link them? */
 	int			jobs;			/* number of processes/threads to use */
 	char	   *socketdir;		/* directory to use for Unix sockets */
diff --git a/src/bin/pg_upgrade/test.sh b/src/bin/pg_upgrade/test.sh
index 32d186d897..d6a318367a 100644
--- a/src/bin/pg_upgrade/test.sh
+++ b/src/bin/pg_upgrade/test.sh
@@ -233,7 +233,7 @@ PGDATA="$BASE_PGDATA"
 
 standard_initdb 'initdb'
 
-pg_upgrade $PG_UPGRADE_OPTS -d "${PGDATA}.old" -D "$PGDATA" -b "$oldbindir" -p "$PGPORT" -P "$PGPORT"
+pg_upgrade $PG_UPGRADE_OPTS --no-sync -d "${PGDATA}.old" -D "$PGDATA" -b "$oldbindir" -p "$PGPORT" -P "$PGPORT"
 
 # make sure all directories and files have group permissions, on Unix hosts
 # Windows hosts don't support Unix-y permissions.
diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 1289123129..c5ce732ee9 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -130,6 +130,22 @@ PostgreSQL documentation
   cluster
  
 
+ 
+  -N
+  --no-sync
+  
+   
+By default, pg_upgrade will wait for all files
+of the upgraded cluster to be written safely to disk.  This option
+causes pg_upgrade to return without