[Combining replies to emails from different authors into one message]

On Wed, Sep 2, 2015 at 2:21 PM, Jaime Casanova <
jaime.casan...@2ndquadrant.com> wrote:

> On 1 September 2015 at 20:25, Thomas Munro <thomas.mu...@enterprisedb.com>
> wrote:

> > As a quick weekend learning exercise/hack I recently went looking into how
> > we could support $SUBJECT.  I discovered we already report the apply
> > progress back to the master, and the synchronous waiting facility seemed to
> > be all ready to support this.  In fact it seemed a little too easy so
> > something tells me it must be wrong!  But anyway, please see the attached
> > toy POC patch which does that.
>
> I haven't seen the patch, but it is probably as easy as you see it...
> IIRC, Simon proposed a patch for this a few years ago and this was
> actually contemplated from the beginning in the design of SR.
>

Ah, thanks, that certainly explains that.  The source code practically had
big arrows pointing to the place to type.  I don't want to step on anyone's
toes, so if Simon or anyone else is actively working on this, please let me
know, I'll happily cease and desist.


On Thu, Sep 3, 2015 at 12:35 AM, Robert Haas <robertmh...@gmail.com> wrote:

> On Tue, Sep 1, 2015 at 9:25 PM, Thomas Munro
> <thomas.mu...@enterprisedb.com> wrote:
> > The next problem is that the master can be waiting quite a long time for a
> > reply from the remote walreceiver containing the desired apply LSN: in the
> > best case it learns of apply progress from replies to subsequent unrelated
> > records (which might be very soon on a busy system but still involves
> > waiting for the next transaction's WAL flush), and in the worst case it
> > needs to wait for wal_receiver_status_interval (10 seconds by default),
> > which makes for a long COMMIT delay.  I was thinking that the solution to
> > that may be to teach StartupXLOG to signal the walreceiver after it updates
> > XLogCtl->lastReplayedEndRecPtr, which should cause walrcv_receive to be
> > interrupted and return early, and then walreceiver could send a reply if it
> > sees that lastReplayedEndRecPtr has moved.  Maybe that would generate an
> > unacceptably high frequency of signals, and maybe there is a better form of
> > IPC for this.
>
> Yeah, that could be a problem, as could reply volume. If you've got a
> bunch of heap inserts of narrow rows into some table, you don't really
> want to send a reply after each one.  That would be a lot of replies,
> and nobody can really care about them anyway, at least not for
> synchronous_commit purposes.  But what if you only sent a signal when
> the just-replayed record was a COMMIT record?  I suppose that could
> still be a lot of replies on something like a full-tilt pgbench
> workload, but even in that case it would help a lot.
>

Here's a version that does that.  It's still ugly POC code for now -- the
flow control in walreceiver.c probably needs a bit of refactoring so it
doesn't have to do the same work in two different places, and it needs some
thought about how it balances time spent writing WAL and sending replies.
But ... it seems to work for simple tests.
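
To illustrate the difference in signal volume, here is a minimal sketch (not
code from the patch) that counts how often the replay loop would wake the
walreceiver under the two policies discussed above.  The RecKind enum and
count_wakeups() are hypothetical names made up for this example; real WAL
records are far richer than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record kinds for illustration only. */
typedef enum
{
	REC_HEAP_INSERT,
	REC_HEAP_UPDATE,
	REC_COMMIT
} RecKind;

/*
 * Count how many times a replay loop would wake the walreceiver.
 *
 * commit_only = false: wake after every replayed record.
 * commit_only = true:  wake only after replaying a COMMIT record.
 */
size_t
count_wakeups(const RecKind *records, size_t nrecords, bool commit_only)
{
	size_t		wakeups = 0;
	size_t		i;

	for (i = 0; i < nrecords; ++i)
	{
		if (!commit_only || records[i] == REC_COMMIT)
			++wakeups;
	}
	return wakeups;
}
```

For a stream of three inserts, a commit, an update and a commit, the first
policy wakes the walreceiver six times and the second only twice.  On a
workload of many small transactions the two policies converge, which is the
pgbench concern raised above.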

I have also attached a test program.  Here are some numbers I measured with
master and standby running on my laptop using that program:

synchronous_commit  loops  Time   TPS
off                 10000  0.841s 11890
local               10000  1.869s  5350
remote_write        10000  3.123s  3202
on                  10000  3.085s  3241
apply               10000  3.361s  2975

If you run it with "--check" you can see that the changes are not always
immediately visible in anything below "apply" and are always visible in
"apply".  (I can't explain why "on" consistently beats "remote_write" on my
machine by a small margin...  Maybe something to do with being an assert
build.)
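
For anyone who prefers per-commit latency to TPS, the table can be converted
with trivial arithmetic.  The helper names below are made up for this
example; the inputs are simply the Time and loops columns above.

```c
#include <assert.h>

/* Transactions per second: loops divided by elapsed wall-clock seconds. */
double
tps(double seconds, int loops)
{
	return loops / seconds;
}

/* Mean elapsed time per commit, in milliseconds. */
double
ms_per_commit(double seconds, int loops)
{
	return seconds * 1000.0 / loops;
}
```

With the numbers above, "on" works out to about 0.31 ms per commit and
"apply" to about 0.34 ms, so waiting for apply costs roughly 0.03 ms extra
per commit in this single-laptop setup.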


On Thu, Sep 3, 2015 at 12:02 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
>
> One idea is to change the standby so that it manages the locations
> that the backends in "apply" mode are waiting for in the master,
> and to make the startup process wake the walreceiver up whenever
> the replay location reaches either of those locations. In this idea,
> walreceiver sends back the "apply" location to the master only when
> needed.
>

Hmm.  So maybe commit records could have a flag saying 'someone is waiting
for this commit to be applied', and the startup process's apply loop would
only bother to signal the walreceiver if it sees that flag.  I will try
that.
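
A rough sketch of what that check might look like, purely as an assumption
about the shape of the idea: DemoRecord, DEMO_COMMIT_APPLY_WAITER and both
functions are hypothetical names invented here, not PostgreSQL's actual
commit record layout or flag bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit: "a backend is waiting for this commit to apply". */
#define DEMO_COMMIT_APPLY_WAITER 0x01

/* Hypothetical, heavily simplified stand-in for a replayed WAL record. */
typedef struct DemoRecord
{
	bool		is_commit;
	uint8_t		flags;
} DemoRecord;

/* The apply loop would wake the walreceiver only for flagged commits. */
bool
should_wake_walreceiver(const DemoRecord *rec)
{
	return rec->is_commit && (rec->flags & DEMO_COMMIT_APPLY_WAITER) != 0;
}

/* Count the wakeups a stream of replayed records would generate. */
size_t
count_flagged_wakeups(const DemoRecord *recs, size_t nrecs)
{
	size_t		wakeups = 0;
	size_t		i;

	for (i = 0; i < nrecs; ++i)
	{
		if (should_wake_walreceiver(&recs[i]))
			++wakeups;
	}
	return wakeups;
}
```

Under this scheme, commits from sessions that are not using "apply" would
generate no extra signals or replies at all.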

-- 
Thomas Munro
http://www.enterprisedb.com

Attachment: synchronous-commit-apply-v2.patch
Description: Binary data

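/*
 * Test program: repeatedly updates a one-row table on the master and,
 * with --check, verifies that each update is immediately visible on the
 * standby.
 *
 * Options: --check, --level <synchronous_commit value>, --loops <count>
 */
#include <libpq-fe.h>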

#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int
main(int argc, char *argv[])
{
	PGconn *master;
	PGconn *standby;
	PGresult *result;
	int i;
	int loops = 10000;
	char buffer[1024];
	const char *level = "off";
	bool check_applied = false;

	for (i = 1; i != argc; ++i)
	{
		bool more = (i < argc - 1);

		if (strcmp(argv[i], "--check") == 0)
			check_applied = true;
		else if (strcmp(argv[i], "--level") == 0 && more)
			level = argv[++i];
		else if (strcmp(argv[i], "--loops") == 0 && more)
			loops = atoi(argv[++i]);
		else
		{
			fprintf(stderr, "bad argument\n");
			exit(1);
		}
	}

	master = PQconnectdb("dbname=postgres port=5432");
	assert(PQstatus(master) == CONNECTION_OK);

	standby = PQconnectdb("dbname=postgres port=5433");
	assert(PQstatus(standby) == CONNECTION_OK);

	snprintf(buffer, sizeof(buffer), "SET synchronous_commit = %s", level);
	result = PQexec(master, buffer);
	assert(PQresultStatus(result) == PGRES_COMMAND_OK);
	PQclear(result);

	/* Tolerate 42P07 (duplicate_table) so the program can be run repeatedly. */
	result = PQexec(master, "CREATE TABLE counter AS SELECT 0 AS n");
	assert(PQresultStatus(result) == PGRES_COMMAND_OK ||
		 strcmp(PQresultErrorField(result, PG_DIAG_SQLSTATE), "42P07") == 0);
	PQclear(result);

	for (i = 0; i < loops; ++i)
	{
		/*printf("Updating master...\n");*/
		snprintf(buffer, sizeof(buffer), "UPDATE counter SET n = %d", i);
		result = PQexec(master, buffer);
		assert(PQresultStatus(result) == PGRES_COMMAND_OK);
		PQclear(result);

		if (check_applied)
		{
			/*printf("Checking standby...\n");*/
			snprintf(buffer, sizeof(buffer), "SELECT n FROM counter");
			result = PQexec(standby, buffer);
			assert(PQresultStatus(result) == PGRES_TUPLES_OK);
			assert(PQntuples(result) == 1);
			assert(atoi(PQgetvalue(result, 0, 0)) == i);
			PQclear(result);
		}
	}
	/* Close both connections cleanly before exiting. */
	PQfinish(standby);
	PQfinish(master);
	return 0;
}
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)