Re: Sharing more infrastructure between walsenders and regular backends (was Re: [HACKERS] Switching timeline over streaming replication)
On Thursday, October 04, 2012 8:40 PM Heikki Linnakangas wrote: > On 03.10.2012 18:15, Amit Kapila wrote: > > 35.WalSenderMain(void) > > { > > .. > > +if (walsender_shutdown_requested) > > +ereport(FATAL, > > + > (errcode(ERRCODE_ADMIN_SHUTDOWN), > > + errmsg("terminating > > + replication > > connection due to administrator command"))); > > + > > +/* Tell the client that we are ready to receive > > + commands */ > > > > +ReadyForQuery(DestRemote); > > + > > .. > > > > +if (walsender_shutdown_requested) > > +ereport(FATAL, > > + > (errcode(ERRCODE_ADMIN_SHUTDOWN), > > + errmsg("terminating > > + replication > > connection due to administrator command"))); > > + > > > > is it necessary to check walsender_shutdown_requested 2 times in a > > loop, if yes, then can we write comment why it is important to check > > it again. > > The idea was to check for shutdown request before and after the > pq_getbyte() call, because that can block for a long time. > > Looking closer, we don't currently (ie. without this patch) make any > effort to react to SIGTERM in a walsender, while it's waiting for a > command from the client. After starting replication, it does check > walsender_shutdown_requested in the loop, and it's also checked during a > base backup (although only when switching to send next file, which seems > too seldom). This issue is orthogonal to handling timeline changes over > streaming replication, although that patch will make it more important > to handle SIGTERM quickly while waiting for a command, because you stay > in that mode for longer and more often. > > I think walsender needs to share more infrastructure with regular > backends to handle this better. 
When we first implemented streaming > replication in 9.0, it made sense to implement just the bare minimum > needed to accept the handshake commands before entering the Copy state, > but now that the replication command set has grown to cover base > backups, and fetching timelines with the patch being discussed, we > should bite the bullet and make the command loop more feature-complete > and robust. Certainly there are benefits to making the walsender and backend infrastructure similar, as mentioned by Simon, Andres and Tom. However, shall we not do this as a separate feature along with some other requirements, and let the switch-timeline patch get committed first, as this is altogether a different feature? With Regards, Amit Kapila. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Disable _FORTIFY_SOURCE with ICC
I wrote: > I've seen this warning in recent Fedora builds, though not on my F16 > production box. Some googling suggests the locution > #if __OPTIMIZE__ > #define _FORTIFY_SOURCE 2 > #endif > but I've not tested that. A bit later: testing on an F17 box (glibc-2.15-56.fc17.x86_64) confirms Peter G's complaint, and also confirms that putting the above into port/linux.h (instead of the template file) fixes the problem. Don't know how to disable it on ICC, but I suppose there's some way to test for ICC via an #if. regards, tom lane
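On the ICC question: ICC identifies itself with the __INTEL_COMPILER predefined macro, so one plausible form of the port/linux.h guard would combine Tom's __OPTIMIZE__ test with an ICC exclusion. This is a hypothetical, untested sketch, not a committed fix:

```c
/*
 * Hypothetical sketch for src/include/port/linux.h: enable fortification
 * only when the compiler is actually optimizing (avoiding the glibc
 * features.h warning), and skip it entirely under ICC, which defines
 * __INTEL_COMPILER.  This exact guard is an assumption, not tested.
 */
#if defined(__OPTIMIZE__) && !defined(__INTEL_COMPILER)
#define _FORTIFY_SOURCE 2
#endif
```

Since _FORTIFY_SOURCE only takes effect with optimization anyway, gating it on __OPTIMIZE__ loses nothing on -O0 builds.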
Re: [HACKERS] [COMMITTERS] pgsql: Disable _FORTIFY_SOURCE with ICC
Peter Eisentraut writes: > On Tue, 2012-10-02 at 17:05 +0100, Peter Geoghegan wrote: >> /usr/include/features.h:314:4: warning: #warning _FORTIFY_SOURCE >> requires compiling with optimization (-O) [-Wcpp] > Which glibc version is this? (Run /lib/libc.so) I've seen this warning in recent Fedora builds, though not on my F16 production box. Some googling suggests the locution #if __OPTIMIZE__ #define _FORTIFY_SOURCE 2 #endif but I've not tested that. regards, tom lane
Re: [HACKERS] Support for REINDEX CONCURRENTLY
On Fri, Oct 5, 2012 at 6:58 AM, Andres Freund wrote: > On Thursday, October 04, 2012 04:51:29 AM Tom Lane wrote: > I can understand hesitation around that.. I would like to make sure I > understand the problem correctly. When we get to the point where we switch > indexes we should be in the following state: > - both indexes are indisready > - old should be invalid > - new index should be valid > - have the same indcheckxmin > - be locked by us preventing anybody else from making changes > Looks like a good presentation of the problem. I am not sure if marking the new index as valid is necessary though. As long as it is done inside the same transaction as the swap there are no problems, no? > Lets assume we have index a_old(relfilenode 1) as the old index and a > rebuilt > index a_new (relfilenode 2) as the one we just built. If we do it properly > nobody will have 'a' open for querying, just for modifications (its > indisready) > as we had waited for everyone that could have seen a as valid to finish. > > As far as I understand the code a session using a_new will also have built > a > relcache entry for a_old. > Two problems: > * relying on the relcache to be built for both indexes seems hinky > * As the relcache is built with SnapshotNow it could read the old > definition > for a_new and the new one for a_old (or the reverse) and thus end up with > both > pointing to the same relfilenode. Which would be ungood. > OK, so the problem here is that the relcache, as the syscache, are relying on SnapshotNow which cannot be used safely as the false index definition could be read by other backends. So this looks to bring back the discussion to the point where a higher lock level is necessary to perform a safe switch of the indexes. I assume that the switch phase is not the longest phase of the concurrent operation, as you also need to build and validate the new index at prior steps. 
I am just wondering if it is acceptable to you guys to take a stronger lock only during this switch phase. This would not make the reindex concurrent for the whole operation, but it would avoid any visibility issues, and the index-switch processing would be more consistent with the existing implementation, as it could rely on the same relfilenode-swapping mechanism as a normal REINDEX. -- Michael Paquier http://michael.otacoo.com
Re: [HACKERS] Support for REINDEX CONCURRENTLY
On Thursday, October 04, 2012 04:51:29 AM Tom Lane wrote: > Greg Stark writes: > > I'm a bit puzzled why we're so afraid of swapping the relfilenodes > > when that's what the current REINDEX does. > > Swapping the relfilenodes is fine *as long as you have exclusive lock*. > The trick is to make it safe without that. It will definitely not work > to do that without exclusive lock, because at the instant you would try > it, people will be accessing the new index (by OID). I can understand hesitation around that.. I would like to make sure I understand the problem correctly. When we get to the point where we switch indexes we should be in the following state: - both indexes are indisready - old should be invalid - new index should be valid - have the same indcheckxmin - be locked by us preventing anybody else from making changes Lets assume we have index a_old(relfilenode 1) as the old index and a rebuilt index a_new (relfilenode 2) as the one we just built. If we do it properly nobody will have 'a' open for querying, just for modifications (its indisready) as we had waited for everyone that could have seen a as valid to finish. As far as I understand the code a session using a_new will also have built a relcache entry for a_old. Two problems: * relying on the relcache to be built for both indexes seems hinky * As the relcache is built with SnapshotNow it could read the old definition for a_new and the new one for a_old (or the reverse) and thus end up with both pointing to the same relfilenode. Which would be ungood. Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Sharing more infrastructure between walsenders and regular backends (was Re: Switching timeline over streaming replication)
On Thursday, October 04, 2012 10:58:53 PM Tom Lane wrote: > Simon Riggs writes: > > On 4 October 2012 17:23, Heikki Linnakangas wrote: > >> Perhaps we could make walsenders even more like regular backends than > >> what I was proposing, so that the replication commands are parsed and > >> executed just like regular utility commands. However, that'd require > >> some transaction support in walsender, for starters, which seems messy. > >> It might become sensible in the future if the replication command set > >> gets even more complicated, but it doesn't seem like a good idea at the > >> moment. > > > > It's come up a few times now that people want to run a few queries > > either before or after running a base backup. ... > > Andres suggested to me the other day we make walsender more like > > regular backends. At the time I wasn't sure I agreed, but reading this > > it looks like a sensible way to go. I only went that way after you've disliked my other suggestions ;) > That was what I was thinking too, but on reflection there's at least one > huge problem: how could we run queries without being connected to a > specific database? Which walsender isn't. I had quite some problems with that too. For now I've hacked walsender to connect to the database specified in the connection, not sure whether that's the way to go. Seems to work so far. I wanted to start a thread about this anyway, but as it came up here... The reason "we" (as in logical rep) need a database in the current approach is that we need to access the catalog (in a timetraveling fashion) to know how to decode the data in the wal... The patch I sent two weeks ago does the decoding from inside a normal backend but that was just because I couldn't make walsender work inside a database in time...
Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Sharing more infrastructure between walsenders and regular backends (was Re: [HACKERS] Switching timeline over streaming replication)
Simon Riggs writes: > On 4 October 2012 17:23, Heikki Linnakangas wrote: >> Perhaps we could make walsenders even more like regular backends than what I >> was proposing, so that the replication commands are parsed and executed just >> like regular utility commands. However, that'd require some transaction >> support in walsender, for starters, which seems messy. It might become >> sensible in the future if the replication command set gets even more >> complicated, but it doesn't seem like a good idea at the moment. > It's come up a few times now that people want to run a few queries > either before or after running a base backup. ... > Andres suggested to me the other day we make walsender more like > regular backends. At the time I wasn't sure I agreed, but reading this > it looks like a sensible way to go. That was what I was thinking too, but on reflection there's at least one huge problem: how could we run queries without being connected to a specific database? Which walsender isn't. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: Sharing more infrastructure between walsenders and regular backends (was Re: [HACKERS] Switching timeline over streaming replication)
On 4 October 2012 17:23, Heikki Linnakangas wrote: > On 04.10.2012 19:00, Tom Lane wrote: >> >> Heikki Linnakangas writes: >>> >>> So I propose the attached patch. I made small changes to postgres.c to >>> make it call exec_replication_command() instead of exec_simple_query(), >>> and reject extend query protocol, in a WAL sender process. A lot of code >>> related to handling the main command loop and signals is removed from >>> walsender.c. >> >> >> Why do we need the forbidden_in_wal_sender stuff? If we're going in >> this direction, I suggest there is little reason to restrict what the >> replication client can do. This seems to be both ugly and a drag on >> the performance of normal backends. > > > Well, there's not much need for parameterized queries or cursors with the > replication command set at the moment. I don't think it's worth it to try to > support them. Fastpath function calls make no sense either, as you can't > call user-defined functions in a walsender anyway. > > Perhaps we could make walsenders even more like regular backends than what I > was proposing, so that the replication commands are parsed and executed just > like regular utility commands. However, that'd require some transaction > support in walsender, for starters, which seems messy. It might become > sensible in the future if the replication command set gets even more > complicated, but it doesn't seem like a good idea at the moment. It's come up a few times now that people want to run a few queries either before or after running a base backup. Since the pg_basebackup stuff uses walsender, this make such things impossible. So to support that, we need to allow two kinds of connection, one to "replication" and one to something else, and since the something else is not guaranteed to exist that makes it even harder. Andres suggested to me the other day we make walsender more like regular backends. At the time I wasn't sure I agreed, but reading this it looks like a sensible way to go. 
-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Promoting a standby during base backup (was Re: [HACKERS] Switching timeline over streaming replication)
On Thu, Oct 4, 2012 at 4:59 PM, Heikki Linnakangas wrote: > On 03.10.2012 18:15, Amit Kapila wrote: >> >> On Tuesday, October 02, 2012 4:21 PM Heikki Linnakangas wrote: >>> >>> Hmm, should a base backup be aborted when the standby is promoted? Does >>> the promotion render the backup corrupt? >> >> >> I think currently it does so. Pls refer >> 1. >> do_pg_stop_backup(char *labelfile, bool waitforarchive) >> { >> .. >> if (strcmp(backupfrom, "standby") == 0&& !backup_started_in_recovery) >> ereport(ERROR, >> >> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), >> errmsg("the standby was promoted during >> online backup"), >> errhint("This means that the backup >> being >> taken is corrupt " >> "and should not be used. >> " >> "Try taking another >> online >> backup."))); >> .. >> >> } > > > Okay. I think that check in do_pg_stop_backup() actually already ensures > that you don't end up with a corrupt backup, even if the standby is promoted > while a backup is being taken. Admittedly it would be nicer to abort it > immediately rather than error out at the end. > > But I wonder why promoting a standby renders the backup invalid in the first > place? Fujii, Simon, can you explain that? Simon had the same question and I answered it before. http://archives.postgresql.org/message-id/cahgqgwfu04oo8yl5sucdjvq3brni7wtfzty9wa2kxtznhic...@mail.gmail.com --- > You say > "If the standby is promoted to the master during online backup, the > backup fails." > but no explanation of why? > > I could work those things out, but I don't want to have to, plus we > may disagree if I did. If the backup succeeds in that case, when we start an archive recovery from that backup, the recovery needs to cross between two timelines. Which means that we need to set recovery_target_timeline before starting recovery. Whether recovery_target_timeline needs to be set or not depends on whether the standby was promoted during taking the backup. Leaving such a decision to a user seems fragile. 
pg_basebackup -x ensures that all required files are included in the backup and we can start recovery without restoring any file from the archive. But if the standby is promoted during the backup, the timeline history file would become an essential file for recovery, but it's not included in the backup. --- The situation may change if your switching-timeline patch has been committed. It's useful if we can continue the backup even if the standby is promoted. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Make CREATE AGGREGATE check validity of initcond value?
On 03/10/2012 21:38, "Tom Lane" wrote: > > Does anyone have an objection to this? I can imagine cases where the > check would reject values that would get accepted at runtime, if the > type's input function was sensitive to the phase of the moon or > something. But it doesn't seem very probable, whereas checking the > value seems like an eminently useful thing to do. Or maybe I'm just > overreacting to the report --- I can't recall any previous complaints > like this, so maybe entering a bogus initcond is a corner case too. I guess a wrong initcond value is probably a pilot error, so my first reaction is +1 to make it an error. But if you feel there are corner cases, maybe at least make it a warning. -- Jaime Casanova 2ndQuadrant: Your PostgreSQL partner
Re: [HACKERS] PQping command line tool
> I was wondering recently if there was any command line tool that > utilized PQping() or PQpingParams(). I searched the code and couldn't > find anything and was wondering if there was any interest to have > something like this included? I wrote something for my purposes of > performing a health check that also supports nagios style status > output. It's probably convenient for scripting purposes as well. I'm not sure how useful this information would be. Most health checks (Nagios or otherwise) really only care if things are working all the way up to point A or not, where point A is usually a simple query such as "SELECT 1". Knowing various failure states as returned by PQping* does not seem to fit into such tools - any failure needs to be handled manually. -- Greg Sabino Mullane g...@turnstep.com End Point Corporation http://www.endpoint.com/ PGP Key: 0x14964AC8 201210041146 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
Re: Sharing more infrastructure between walsenders and regular backends (was Re: [HACKERS] Switching timeline over streaming replication)
On 04.10.2012 19:00, Tom Lane wrote: Heikki Linnakangas writes: So I propose the attached patch. I made small changes to postgres.c to make it call exec_replication_command() instead of exec_simple_query(), and reject extend query protocol, in a WAL sender process. A lot of code related to handling the main command loop and signals is removed from walsender.c. Why do we need the forbidden_in_wal_sender stuff? If we're going in this direction, I suggest there is little reason to restrict what the replication client can do. This seems to be both ugly and a drag on the performance of normal backends. Well, there's not much need for parameterized queries or cursors with the replication command set at the moment. I don't think it's worth it to try to support them. Fastpath function calls make no sense either, as you can't call user-defined functions in a walsender anyway. Perhaps we could make walsenders even more like regular backends than what I was proposing, so that the replication commands are parsed and executed just like regular utility commands. However, that'd require some transaction support in walsender, for starters, which seems messy. It might become sensible in the future if the replication command set gets even more complicated, but it doesn't seem like a good idea at the moment. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Make pg_basebackup configure and start standby [Review]
2012-10-04 16:43 keltezéssel, Tom Lane írta: Boszormenyi Zoltan writes: Did you think about something like the attached code? Or rather this one, which fixes a bug so fillPGconn() and PQconninfo() are symmetric and work for "requiressl". That's incredibly ugly. I'm not sure where we should keep the "R" information, but shoehorning it into the existing PQconninfoOption struct like that seems totally unacceptable. Either we're willing to break backwards compatibility on the Disp-Char strings, or we need to put that info somewhere else. I hope only handling the new flag part is ugly. It can be hidden in the PQconninfoMapping[] array and PQconninfo(conn, true) pre-filters the list as in the attached version. Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ diff -durpN postgresql/doc/src/sgml/libpq.sgml postgresql.1/doc/src/sgml/libpq.sgml --- postgresql/doc/src/sgml/libpq.sgml 2012-08-03 09:39:30.114266570 +0200 +++ postgresql.1/doc/src/sgml/libpq.sgml 2012-10-04 18:15:11.686199926 +0200 @@ -496,6 +496,37 @@ typedef struct + + PQconninfoPQconninfo + + + Returns the connection options used by a live connection. + +PQconninfoOption *PQconninfo(PGconn *conn, bool for_replication); + + + + + Returns a connection options array. This can be used to determine + all possible PQconnectdb options and their + current values that were used to connect to the server. The return + value points to an array of PQconninfoOption + structures, which ends with an entry having a null keyword + pointer. Every notes above for PQconndefaults also apply. + + + + The for_replication argument can be used to exclude some + options from the list which are used by the walreceiver module. + pg_basebackup uses this pre-filtered list + to construct primary_conninfo in the automatically generated + recovery.conf file. 
+ + + + + + PQconninfoParsePQconninfoParse diff -durpN postgresql/src/interfaces/libpq/exports.txt postgresql.1/src/interfaces/libpq/exports.txt --- postgresql/src/interfaces/libpq/exports.txt 2012-08-03 09:39:30.118266598 +0200 +++ postgresql.1/src/interfaces/libpq/exports.txt 2012-10-04 17:10:11.483654334 +0200 @@ -161,3 +161,4 @@ PQping158 PQpingParams 159 PQlibVersion 160 PQsetSingleRowMode161 +PQconninfo162 diff -durpN postgresql/src/interfaces/libpq/fe-connect.c postgresql.1/src/interfaces/libpq/fe-connect.c --- postgresql/src/interfaces/libpq/fe-connect.c 2012-09-09 08:11:09.470401480 +0200 +++ postgresql.1/src/interfaces/libpq/fe-connect.c 2012-10-04 17:53:26.205928500 +0200 @@ -137,6 +137,9 @@ static int ldapServiceLookup(const char * PQconninfoOptions[] *must* be NULL. In a working copy, non-null "val" * fields point to malloc'd strings that should be freed when the working * array is freed (see PQconninfoFree). + * + * If you add a new connection option to this list, remember to add it to + * PQconninfoMappings[] below. * -- */ static const PQconninfoOption PQconninfoOptions[] = { @@ -264,6 +267,62 @@ static const PQconninfoOption PQconninfo NULL, NULL, 0} }; +/* + * We need a mapping between the PQconninfoOptions[] array + * and PGconn members. We have to keep it separate from + * PQconninfoOptions[] to not leak info about PGconn members + * to clients. 
+ */ +typedef struct PQconninfoMapping { + char *keyword; + size_t member_offset; + bool for_replication; + char *conninfoValue; /* Special value mapping */ + char *connValue; +} PQconninfoMapping; +#define PGCONNMEMBERADDR(conn, mapping) ((char **)((char *)conn + mapping->member_offset)) + +static const PQconninfoMapping PQconninfoMappings[] = +{ + /* "authtype" is not used anymore, there is no mapping to PGconn */ + /* there is no mapping of "service" to PGconn */ + { "user", offsetof(struct pg_conn, pguser), true, NULL, NULL }, + { "password", offsetof(struct pg_conn, pgpass), true, NULL, NULL }, + { "connect_timeout", offsetof(struct pg_conn, connect_timeout), true, NULL, NULL }, + { "dbname", offsetof(struct pg_conn, dbName), false, NULL, NULL }, + { "host", offsetof(struct pg_conn, pghost), true, NULL, NULL }, + { "hostaddr", offsetof(struct pg_conn, pghostaddr), true, NULL, NULL }, + { "port", offsetof(struct pg_conn, pgport), true, NULL, NULL }, + { "client_encoding", offsetof(struct pg_conn, client_encoding_initial), false, NULL, NULL }, + { "tty", offsetof(struct pg_conn, pgtty), false, NULL, NULL }, + { "options", offsetof(struct pg_conn, pgoptions), true, NULL, NULL }, + { "application_name", offsetof(struct pg_conn, appname), false, NULL, NULL }, + { "fallback_application_name", offsetof(struc
Re: Sharing more infrastructure between walsenders and regular backends (was Re: [HACKERS] Switching timeline over streaming replication)
Heikki Linnakangas writes: > So I propose the attached patch. I made small changes to postgres.c to > make it call exec_replication_command() instead of exec_simple_query(), > and reject extend query protocol, in a WAL sender process. A lot of code > related to handling the main command loop and signals is removed from > walsender.c. Why do we need the forbidden_in_wal_sender stuff? If we're going in this direction, I suggest there is little reason to restrict what the replication client can do. This seems to be both ugly and a drag on the performance of normal backends. regards, tom lane
Re: [HACKERS] bison location reporting for potentially-empty list productions
I wrote: > To produce a really useful cursor for this type of situation I think > we would have to do something like this: > #define YYLLOC_DEFAULT(Current, Rhs, N) \ > do { \ > int i; > (Current) = -1; \ > for (i = 1; i <= (N); i++) > { > (Current) = (Rhs)[i]; \ > if ((Current) >= 0) > break; > } > } while (0) > This is pretty ugly and seems likely to create a visible hit on the > parser's speed (which is already not that good). I'm not sure it's > worth it for something that seems to be a corner case --- anyway > we've not hit it before in six years since the location tracking > support was added. After another look at the Bison manual I realized that there's a way to fix this without making YYLLOC_DEFAULT slower, because that macro only sets the *default* location for the current nonterminal; the rule action is allowed to override its location value. So I propose the attached patch against HEAD. This is a little bit ugly because we have to remember to put some extra code in productions that have this issue ... but track record so far suggests that there will be very few of them. Anyone have an objection, or a better idea? regards, tom lane diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index 7feadeac1693e7ff46f44ef3ad9ea2fa86bf46a8..e4ff76e66e0990d91790ccc54615c26dd64a143e 100644 *** a/src/backend/parser/gram.y --- b/src/backend/parser/gram.y *** *** 67,82 #include "utils/xml.h" ! /* Location tracking support --- simpler than bison's default */ #define YYLLOC_DEFAULT(Current, Rhs, N) \ do { \ ! if (N) \ (Current) = (Rhs)[1]; \ else \ ! (Current) = (Rhs)[0]; \ } while (0) /* * Bison doesn't allocate anything that needs to live across parser calls, * so we can easily have it use palloc instead of malloc. This prevents * memory leaks if we error out during parsing. Note this only works with --- 67,100 #include "utils/xml.h" ! /* ! * Location tracking support --- simpler than bison's default, since we only ! 
* want to track the start position not the end position of each nonterminal. ! */ #define YYLLOC_DEFAULT(Current, Rhs, N) \ do { \ ! if ((N) > 0) \ (Current) = (Rhs)[1]; \ else \ ! (Current) = (-1); \ } while (0) /* + * The above macro assigns -1 (unknown) as the parse location of any + * nonterminal that was reduced from an empty rule. This is problematic + * for nonterminals defined like + * OptFooList: / * EMPTY * / { ... } | OptFooList Foo { ... } ; + * because we'll set -1 as the location during the first reduction and then + * copy it during each subsequent reduction, leaving us with -1 for the + * location even when the list is not empty. To fix that, do this in the + * action for the nonempty rule(s): + * if (@$ < 0) @$ = @2; + * (Although we have many nonterminals that follow this pattern, we only + * bother with fixing @$ like this when the nonterminal's parse location + * is actually referenced in some rule.) + */ + + /* * Bison doesn't allocate anything that needs to live across parser calls, * so we can easily have it use palloc instead of malloc. This prevents * memory leaks if we error out during parsing. Note this only works with *** OptSchemaName: *** 1223,1230 ; OptSchemaEltList: ! OptSchemaEltList schema_stmt { $$ = lappend($1, $2); } ! | /* EMPTY */ { $$ = NIL; } ; /* --- 1241,1254 ; OptSchemaEltList: ! OptSchemaEltList schema_stmt ! { ! if (@$ < 0) /* see comments for YYLLOC_DEFAULT */ ! @$ = @2; ! $$ = lappend($1, $2); ! } ! | /* EMPTY */ ! { $$ = NIL; } ; /* diff --git a/src/test/regress/expected/namespace.out b/src/test/regress/expected/namespace.out index 5fcd46daf42705147f53599d657e52d8fe7b5b1f..9187c8126a3ffa32e365400816b262daf3b6f2d5 100644 *** a/src/test/regress/expected/namespace.out --- b/src/test/regress/expected/namespace.out *** CREATE SCHEMA IF NOT EXISTS test_schema_ *** 47,54 b int UNIQUE ); ERROR: CREATE SCHEMA IF NOT EXISTS cannot include schema elements ! LINE 1: CREATE SCHEMA IF NOT EXISTS test_schema_1 ! 
^ DROP SCHEMA test_schema_1 CASCADE; NOTICE: drop cascades to 2 other objects DETAIL: drop cascades to table test_schema_1.abc --- 47,54 b int UNIQUE ); ERROR: CREATE SCHEMA IF NOT EXISTS cannot include schema elements ! LINE 2:CREATE TABLE abc ( !^ DROP SCHEMA test_schema_1 CASCADE; NOTICE: drop cascades to 2 other objects DETAIL: drop cascades to table test_schema_1.abc -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] hackers newsgroup hacked ?
Reading this mailing list via newsgroup (news.postgresql.org port 119) I can see that the last "legitimate" message is from 29 August; since then only "RUSSIAN" posts are present. Regards Gaetano Mendola -- cpp-today.blogspot.com
Sharing more infrastructure between walsenders and regular backends (was Re: [HACKERS] Switching timeline over streaming replication)
On 03.10.2012 18:15, Amit Kapila wrote: 35.WalSenderMain(void) { .. +if (walsender_shutdown_requested) +ereport(FATAL, +(errcode(ERRCODE_ADMIN_SHUTDOWN), + errmsg("terminating replication connection due to administrator command"))); + +/* Tell the client that we are ready to receive commands */ +ReadyForQuery(DestRemote); + .. +if (walsender_shutdown_requested) +ereport(FATAL, +(errcode(ERRCODE_ADMIN_SHUTDOWN), + errmsg("terminating replication connection due to administrator command"))); + is it necessary to check walsender_shutdown_requested 2 times in a loop, if yes, then can we write comment why it is important to check it again. The idea was to check for shutdown request before and after the pq_getbyte() call, because that can block for a long time. Looking closer, we don't currently (ie. without this patch) make any effort to react to SIGTERM in a walsender, while it's waiting for a command from the client. After starting replication, it does check walsender_shutdown_requested in the loop, and it's also checked during a base backup (although only when switching to send next file, which seems too seldom). This issue is orthogonal to handling timeline changes over streaming replication, although that patch will make it more important to handle SIGTERM quickly while waiting for a command, because you stay in that mode for longer and more often. I think walsender needs to share more infrastructure with regular backends to handle this better. When we first implemented streaming replication in 9.0, it made sense to implement just the bare minimum needed to accept the handshake commands before entering the Copy state, but now that the replication command set has grown to cover base backups, and fetching timelines with the patch being discussed, we should bite the bullet and make the command loop more feature-complete and robust. 
In a regular backend, the command loop reacts to SIGTERM immediately, setting ImmediateInterruptOK at the right places, and calling CHECK_FOR_INTERRUPTS() at strategic places. I propose that we let PostgresMain handle the main command loop for walsender processes too, like it does for regular backends, and use ProcDiePending and the regular die() signal handler for SIGTERM in walsender as well.

So I propose the attached patch. I made small changes to postgres.c to make it call exec_replication_command() instead of exec_simple_query(), and reject the extended query protocol, in a WAL sender process. A lot of code related to handling the main command loop and signals is removed from walsender.c.

- Heikki

*** a/src/backend/replication/basebackup.c
--- b/src/backend/replication/basebackup.c
***************
*** 22,27 ****
--- 22,28 ----
  #include "lib/stringinfo.h"
  #include "libpq/libpq.h"
  #include "libpq/pqformat.h"
+ #include "miscadmin.h"
  #include "nodes/pg_list.h"
  #include "replication/basebackup.h"
  #include "replication/walsender.h"
***************
*** 30,36 ****
  #include "storage/ipc.h"
  #include "utils/builtins.h"
  #include "utils/elog.h"
- #include "utils/memutils.h"
  #include "utils/ps_status.h"

  typedef struct
--- 31,36 ----
***************
*** 370,388 ****
  void
  SendBaseBackup(BaseBackupCmd *cmd)
  {
  	DIR		   *dir;
- 	MemoryContext backup_context;
- 	MemoryContext old_context;
  	basebackup_options opt;

  	parse_basebackup_options(cmd->options, &opt);

- 	backup_context = AllocSetContextCreate(CurrentMemoryContext,
- 										   "Streaming base backup context",
- 										   ALLOCSET_DEFAULT_MINSIZE,
- 										   ALLOCSET_DEFAULT_INITSIZE,
- 										   ALLOCSET_DEFAULT_MAXSIZE);
- 	old_context = MemoryContextSwitchTo(backup_context);
-
  	WalSndSetState(WALSNDSTATE_BACKUP);

  	if (update_process_title)
--- 370,379 ----
*************** SendBaseBackup(BaseBackupCmd *cmd)
*** 403,411 ****
  	perform_base_backup(&opt, dir);

  	FreeDir(dir);
-
- 	MemoryContextSwitchTo(old_context);
- 	MemoryContextDelete(backup_context);
  }

  static void
--- 394,399 ----
*************** sendDir(char *path, int basepathlen, bool sizeonly)
*** 606,612 ****
  		 * error in that case.  The error handler further up will call
  		 * do_pg_abort_backup() for us.
  		 */
! 		if (walsender_shutdown_requested || walsender_ready_to_stop)
  			ereport(ERROR,
  					(errmsg("shutdown requested, aborting active base backup")));
--- 594,600 ----
  		 * error in that case.  The error handler further up will call
  		 * do_pg_abort_backup() for us.
  		 */
! 		if (ProcDiePending || walsender_ready_to_stop)
  			ereport(ERROR,
  					(errmsg("shutdown requested, aborting active base backup")));
*** a/src/backend/replication/walsender.c
--- b/src/backend/replication/w
Re: [HACKERS] [PATCH] Make pg_basebackup configure and start standby [Review]
Boszormenyi Zoltan writes:
>> Did you think about something like the attached code?
> Or rather this one, which fixes a bug so fillPGconn() and PQconninfo() are
> symmetric and work for "requiressl".

That's incredibly ugly. I'm not sure where we should keep the "R" information, but shoehorning it into the existing PQconninfoOption struct like that seems totally unacceptable. Either we're willing to break backwards compatibility on the Disp-Char strings, or we need to put that info somewhere else.

			regards, tom lane
Re: [HACKERS] Raise a WARNING if a REVOKE affects nothing?
Robert Haas writes:
> Just to ask a possibly stupid question: why is attempting to REVOKE a
> non-existent privilege anything other than an ERROR?

Because the SQL standard says so?

			regards, tom lane
Re: [HACKERS] xmalloc => pg_malloc
On Wed, Oct 3, 2012 at 11:36 PM, Tom Lane wrote:
> Peter Eisentraut writes:
>> xmalloc, xstrdup, etc. are pretty common names for functions that do
>> alloc-or-die (another possible naming scheme ;-) ). The naming
>> pg_malloc etc. on the other hand suggests that the allocation is being
>> done in a PostgreSQL-specific way, and anyway sounds too close to
>> palloc.
>>
>> So I'd be more in favor of xmalloc <= pg_malloc.
>
> Meh. The fact that other people use that name is not really an
> advantage from where I sit. I'm concerned about possible name
> collisions, eg in libraries loaded into the backend.
>
> There are probably not any actual risks of collision right now, given
> that all these functions are currently in our client-side programs ---
> but it's foreseeable that we might use this same naming convention in
> more-exposed places in future. In fact, somebody was already proposing
> creating such functions in the core backend.
>
> But having said that, I'm not absolutely wedded to these names; they
> were just the majority of existing cases.

Why not split the difference and use pg_xmalloc? As in: "PostgreSQL-special malloc that dies on failure."

--
Jon
Re: [HACKERS] Switching timeline over streaming replication
> On Wednesday, October 03, 2012 8:45 PM Heikki Linnakangas wrote:
> On Tuesday, October 02, 2012 4:21 PM Heikki Linnakangas wrote:
>> Thanks for the thorough review! I committed the xlog.c refactoring patch
>> now. Attached is a new version of the main patch, comments on specific
>> points below. I didn't adjust the docs per your comments yet, will do
>> that next.
>
> I have some doubts regarding the comments fixed by you and some more new
> review comments. After this I shall focus majorly towards testing of this
> patch.

Testing
---
Failed Case
--
1. Promotion of standby to master and follow standby to new master.
2. Stop standby and master. Restart standby first and then master.
3. Restart of standby gives the below errors:

   E:\pg_git_code\installation\bin>
   LOG: database system was shut down in recovery at 2012-10-04 18:36:00 IST
   LOG: entering standby mode
   LOG: consistent recovery state reached at 0/176B800
   LOG: redo starts at 0/176B800
   LOG: record with zero length at 0/176BD68
   LOG: database system is ready to accept read only connections
   LOG: streaming replication successfully connected to primary
   LOG: out-of-sequence timeline ID 1 (after 2) in log segment 0002000 1, offset 0
   FATAL: terminating walreceiver process due to administrator command
   LOG: out-of-sequence timeline ID 1 (after 2) in log segment 0002000 1, offset 0
   LOG: out-of-sequence timeline ID 1 (after 2) in log segment 0002000 1, offset 0
   LOG: out-of-sequence timeline ID 1 (after 2) in log segment 0002000 1, offset 0
   LOG: out-of-sequence timeline ID 1 (after 2) in log segment 0002000 1, offset 0

   Once this error comes, restarting master/standby in any order, or doing some operations on master, always gives the above error on standby.

Passed Cases
--
1. After promoting standby as new master, try to make the previous master (having the same WAL as the new master) a standby. In this case recovery.conf has recovery_target_timeline set to latest. It is able to connect to the new master and starts streaming, as per expected behavior.
2. After promoting standby as new master, try to make the previous master (having more WAL compared to the new master) a standby; an error is displayed, as per expected behavior.
3. After promoting standby as new master, try to make the previous master (having the same WAL as the new master) a standby. In this case recovery_target_timeline is not set in recovery.conf. The following LOG is displayed, as per expected behavior:

   LOG: fetching timeline history file for timeline 2 from primary server
   LOG: replication terminated by primary server
   DETAIL: End of WAL reached on timeline 1
   LOG: walreceiver ended streaming and awaits new instructions
   LOG: re-handshaking at position 0/100 on tli 1
   LOG: replication terminated by primary server
   DETAIL: End of WAL reached on timeline 1
   LOG: walreceiver ended streaming and awaits new instructions
   LOG: re-handshaking at position 0/100 on tli 1
   LOG: replication terminated by primary server
   DETAIL: End of WAL reached on timeline 1

Pending Cases which need to be tested (these are scenarios; some more testing I will do based on these scenarios)
---
1. a. Master M-1
   b. Standby S-1 follows M-1
   c. Standby S-2 follows M-1
   d. Promote S-1 as master
   e. Try to follow S-2 to S-1 -- operation should be success
2. a. Master M-1
   b. Standby S-1 follows M-1
   c. Stop S-1, M-1
   d. Do the PITR in M-1 2 times. This is to increment the timeline in M-1
   e. Try to follow standby S-1 to M-1 -- it should be success.
3. a. Master M-1
   b. Standby S-1, S-2 follow M-1
   c. Standby S-3, S-4 follow S-1
   d. Promote the standby which has the highest WAL.
   e. Follow all standbys to the new master.
4. a. Master M-1
   b. Synchronous standby S-1, S-2
   c. Promote S-1
   d. Follow M-1, S-2 to S-1 -- this operation should be success.

Concurrent Operations
---
1. a. Master M-1, Standby S-1 follows M-1, Standby S-2 follows M-1
   b. Many concurrent operations on master M-1
   c. During concurrent ops, promote S-1
   d. Try S-2 to follow S-1 -- it should happen successfully.
2. During promotion, call pg_basebackup.
3. During promotion, try to connect a client.

Resource Testing
--
1. a. Make standby follow a master which is many timelines ahead
   b. Observe if there is any resource leak
   c. Allow the streaming replication for 30 mins
   d. Observe if there is any resource leak

Code Review
-
Libpqrcv_readtimelinehistoryfile()
{
..
..
+ if (PQnfields(res) != 2 || PQntuples(res) != 1)
+ {
+     int ntuples = PQntuples(res);
+     int
Re: [HACKERS] Raise a WARNING if a REVOKE affects nothing?
On Tue, Oct 2, 2012 at 3:01 PM, Noah Misch wrote:
> On Tue, Aug 21, 2012 at 02:31:29PM +0800, Craig Ringer wrote:
>> It'd really help if REVOKE consistently raised warnings when it didn't
>> actually revoke anything.
>
> +1
>
> This will invite the same mixed feelings as the CREATE x IF NOT EXISTS
> notices, but I think it's worthwhile.

Just to ask a possibly stupid question: why is attempting to REVOKE a non-existent privilege anything other than an ERROR? We would throw an ERROR if you tried to insert into a nonexistent table, or if you tried to drop a nonexistent table, or if you tried to call a nonexistent function, so why not also here? We could have REVOKE IF EXISTS for the current behavior (and users could boost client_min_messages to suppress the notice when desired).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown
> -----Original Message-----
> From: pgsql-bugs-ow...@postgresql.org [mailto:pgsql-bugs-ow...@postgresql.org] On Behalf Of Amit kapila
> Sent: Thursday, October 04, 2012 3:43 PM
> To: Heikki Linnakangas
> Cc: Fujii Masao; pgsql-b...@postgresql.org; pgsql-hackers@postgresql.org
> Subject: Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown
>
> On Tuesday, October 02, 2012 1:56 PM Heikki Linnakangas wrote:
> On 02.10.2012 10:36, Amit kapila wrote:
>> On Monday, October 01, 2012 4:08 PM Heikki Linnakangas wrote:
>>> So let's think how this should ideally work from a user's point of view.
>>> I think there should be just two settings: walsender_timeout and
>>> walreceiver_timeout. walsender_timeout specifies how long a
>>> walsender will keep a connection open if it doesn't hear from the
>
> Thank you for suggestions.
> I have addressed your suggestions in the patch attached with this mail.
>
> Following changes are done to support replication timeout in sender as
> well as receiver:

Testing Done for the Patch
---
1. Verified the value of the new configuration parameter and changed the configuration parameter using the show command (using show of a specific parameter as well as show all).
2. Verified the new configuration parameter in --describe-config.
3. Verified the existing parameter replication_timeout's new name in --describe-config.
4. Start primary and standby node with default timeout, leave it for some time in an idle situation. It should not error out due to a network break error.
5. a. Start primary and standby node with default timeout, bring down the network.
   b. Both sender and receiver should be able to detect the network break-down at almost the same time.
   c. Once the network is up again, the connection should get re-established successfully.
6. a. Start primary and standby node with wal_sender_timeout less than wal_receiver_timeout, bring down the network.
   b. The sender should be able to detect the network break-down before the receiver task.
   c. Once the network is up again, the connection should get re-established successfully.
7. a. Start primary and standby node with wal_receiver_timeout less than wal_sender_timeout, bring down the network.
   b. The receiver should be able to detect the network break-down before the sender task.
   c. Once the network is up again, the connection should get re-established successfully.
8. a. In the 5th test case, change the value of wal_receiver_status_interval to more than wal_receiver_timeout and hence more than wal_sender_timeout.
   b. Then bring the network down.
   c. The sender task should be able to detect the network break-down once wal_sender_timeout has lapsed.
   d. Once the network is up again, the connection should get re-established successfully.
   The intent of this test is to check that there is no dependency of wal_sender_timeout on wal_receiver_status_interval for detection of a network break.

All the above tests passed.

With Regards,
Amit Kapila.
Re: [HACKERS] [PATCH] Make pg_basebackup configure and start standby [Review]
On 2012-10-04 12:42, Boszormenyi Zoltan wrote:
> On 2012-10-04 06:47, Boszormenyi Zoltan wrote:
>> On 2012-10-04 05:24, Peter Eisentraut wrote:
>>> On Wed, 2012-10-03 at 18:15 +0200, Boszormenyi Zoltan wrote:
>>>> The second generation of this work is now attached and contains a new
>>>> feature as was discussed and suggested by Magnus Hagander, Fujii Masao
>>>> and Peter Eisentraut. So libpq has grown a new function:
>>>>
>>>> +/* return the connection options used by a live connection */
>>>> +extern PQconninfoOption *PQconninfo(PGconn *conn);
>>>>
>>>> This copies all the connection parameters back from the live PGconn
>>>> itself so everything that's needed to connect is already validated.
>>>
>>> I don't like that this code maintains a second list of all possible
>>> libpq connection parameters.
>>
>> Where does it do that? In PQconninfo() itself? Why is it a problem? Or to
>> put it bluntly: the same problem is in fillPGconn() too, as it also has
>> the same set of parameters listed. So there is already code that you
>> don't like. :-)
>>
>> How about a static mapping between option names and offsetof(struct
>> pg_conn, member) values? This way both fillPGconn() and PQconninfo() can
>> avoid maintaining the list of parameter names.
>
> Did you think about something like the attached code?

Or rather this one, which fixes a bug so fillPGconn() and PQconninfo() are symmetric and work for "requiressl".
Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ diff -durpN postgresql/src/interfaces/libpq/exports.txt postgresql.1/src/interfaces/libpq/exports.txt --- postgresql/src/interfaces/libpq/exports.txt 2012-08-03 09:39:30.118266598 +0200 +++ postgresql.1/src/interfaces/libpq/exports.txt 2012-10-04 11:29:24.864309200 +0200 @@ -161,3 +161,5 @@ PQping158 PQpingParams 159 PQlibVersion 160 PQsetSingleRowMode161 +PQconninfo162 +PQconninfoForReplication 163 diff -durpN postgresql/src/interfaces/libpq/fe-connect.c postgresql.1/src/interfaces/libpq/fe-connect.c --- postgresql/src/interfaces/libpq/fe-connect.c 2012-09-09 08:11:09.470401480 +0200 +++ postgresql.1/src/interfaces/libpq/fe-connect.c 2012-10-04 13:10:58.737418824 +0200 @@ -131,12 +131,17 @@ static int ldapServiceLookup(const char * "" Normal input field * "*" Password field - hide value * "D" Debug option - don't show by default + * A special "usable for replication connections" ("R") flag is + * added to Disp-Char in a backwards compatible fashion. * * PQconninfoOptions[] is a constant static array that we use to initialize * a dynamically allocated working copy. All the "val" fields in * PQconninfoOptions[] *must* be NULL. In a working copy, non-null "val" * fields point to malloc'd strings that should be freed when the working * array is freed (see PQconninfoFree). + * + * If you add a new connection option to this list, remember to add it to + * PQconninfoMappings[] below. * -- */ static const PQconninfoOption PQconninfoOptions[] = { @@ -146,62 +151,62 @@ static const PQconninfoOption PQconninfo * still try to set it. 
*/ {"authtype", "PGAUTHTYPE", DefaultAuthtype, NULL, - "Database-Authtype", "D", 20}, + "Database-Authtype", "D\0", 20}, {"service", "PGSERVICE", NULL, NULL, - "Database-Service", "", 20}, + "Database-Service", "\0", 20}, {"user", "PGUSER", NULL, NULL, - "Database-User", "", 20}, + "Database-User", "\0R", 20}, {"password", "PGPASSWORD", NULL, NULL, - "Database-Password", "*", 20}, + "Database-Password", "*\0R", 20}, {"connect_timeout", "PGCONNECT_TIMEOUT", NULL, NULL, - "Connect-timeout", "", 10}, /* strlen(INT32_MAX) == 10 */ + "Connect-timeout", "\0R", 10}, /* strlen(INT32_MAX) == 10 */ {"dbname", "PGDATABASE", NULL, NULL, - "Database-Name", "", 20}, + "Database-Name", "\0", 20}, {"host", "PGHOST", NULL, NULL, - "Database-Host", "", 40}, + "Database-Host", "\0R", 40}, {"hostaddr", "PGHOSTADDR", NULL, NULL, - "Database-Host-IP-Address", "", 45}, + "Database-Host-IP-Address", "\0R", 45}, {"port", "PGPORT", DEF_PGPORT_STR, NULL, - "Database-Port", "", 6}, + "Database-Port", "\0R", 6}, {"client_encoding", "PGCLIENTENCODING", NULL, NULL, - "Client-Encoding", "", 10}, + "Client-Encoding", "\0", 10}, /* * "tty" is no longer used either, but keep it present for backwards * compatibility. */ {"tty", "PGTTY", DefaultTty, NULL, - "Backend-Debug-TTY", "D", 40}, + "Backend-Debug-TTY", "D\0", 40}, {"options", "PGOPTIONS", DefaultOption, NULL, - "Backend-Debug-Options", "D", 40}, + "Backend-Debug-Options", "D\0R", 40}, {"application_name", "PGAPPNAME", NULL, NULL, - "Application-Name", "", 64}, + "Application-Name", "\0", 64}, {"fallback_application_name", NULL, NULL, NULL, - "Fallback-Application-Name", "", 64}, + "Fallback-Applicat
Re: [HACKERS] Re: [WIP] Performance Improvement by reducing WAL for Update Operation
> On Thursday, October 04, 2012 12:54 PM Heikki Linnakangas wrote:
> On 03.10.2012 19:03, Amit Kapila wrote:
>> Any comments/suggestions regarding performance/functionality test?
>
> Hmm. Doing a lot of UPDATEs concurrently can be limited by the
> WALInsertLock, which each inserter holds while copying the WAL record to
> the buffer. Reducing the size of the WAL records, by compression or
> delta encoding, alleviates that bottleneck: when WAL records are
> smaller, the lock needs to be held for a shorter duration. That improves
> throughput, even if individual backends need to do more CPU work to
> compress the records, because that work can be done in parallel. I
> suspect much of the benefit you're seeing in these tests might be
> because of that effect.
>
> As it happens, I've been working on making WAL insertion scale better in
> general:
> http://archives.postgresql.org/message-id/5064779a.3050...@vmware.com.
> That should also help most when inserting large WAL records. The
> question is: assuming we commit the xloginsert-scale patch, how much
> benefit is there left from the compression? It will surely still help to
> reduce the size of WAL, which can certainly help if you're limited by
> the WAL I/O, but I suspect the results from the pgbench tests you run
> might look quite different.
>
> So, could you rerun these tests with the xloginsert-scale patch applied?

I shall take care of doing the performance test with the xloginsert-scale patch as well, both for single and multi-thread.

With Regards,
Amit Kapila.
Re: [HACKERS] [PATCH] Make pg_basebackup configure and start standby [Review]
On 2012-10-04 06:47, Boszormenyi Zoltan wrote:
> On 2012-10-04 05:24, Peter Eisentraut wrote:
>> On Wed, 2012-10-03 at 18:15 +0200, Boszormenyi Zoltan wrote:
>>> The second generation of this work is now attached and contains a new
>>> feature as was discussed and suggested by Magnus Hagander, Fujii Masao
>>> and Peter Eisentraut. So libpq has grown a new function:
>>>
>>> +/* return the connection options used by a live connection */
>>> +extern PQconninfoOption *PQconninfo(PGconn *conn);
>>>
>>> This copies all the connection parameters back from the live PGconn
>>> itself so everything that's needed to connect is already validated.
>>
>> I don't like that this code maintains a second list of all possible
>> libpq connection parameters.
>
> Where does it do that? In PQconninfo() itself? Why is it a problem? Or to
> put it bluntly: the same problem is in fillPGconn() too, as it also has
> the same set of parameters listed. So there is already code that you
> don't like. :-)
>
> How about a static mapping between option names and offsetof(struct
> pg_conn, member) values? This way both fillPGconn() and PQconninfo() can
> avoid maintaining the list of parameter names.

Did you think about something like the attached code?
Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ diff -durpN postgresql/src/interfaces/libpq/exports.txt postgresql.1/src/interfaces/libpq/exports.txt --- postgresql/src/interfaces/libpq/exports.txt 2012-08-03 09:39:30.118266598 +0200 +++ postgresql.1/src/interfaces/libpq/exports.txt 2012-10-04 11:29:24.864309200 +0200 @@ -161,3 +161,5 @@ PQping158 PQpingParams 159 PQlibVersion 160 PQsetSingleRowMode161 +PQconninfo162 +PQconninfoForReplication 163 diff -durpN postgresql/src/interfaces/libpq/fe-connect.c postgresql.1/src/interfaces/libpq/fe-connect.c --- postgresql/src/interfaces/libpq/fe-connect.c 2012-09-09 08:11:09.470401480 +0200 +++ postgresql.1/src/interfaces/libpq/fe-connect.c 2012-10-04 12:30:56.705605588 +0200 @@ -131,12 +131,17 @@ static int ldapServiceLookup(const char * "" Normal input field * "*" Password field - hide value * "D" Debug option - don't show by default + * A special "usable for replication connections" ("R") flag is + * added to Disp-Char in a backwards compatible fashion. * * PQconninfoOptions[] is a constant static array that we use to initialize * a dynamically allocated working copy. All the "val" fields in * PQconninfoOptions[] *must* be NULL. In a working copy, non-null "val" * fields point to malloc'd strings that should be freed when the working * array is freed (see PQconninfoFree). + * + * If you add a new connection option to this list, remember to add it to + * PQconninfoMappings[] below. * -- */ static const PQconninfoOption PQconninfoOptions[] = { @@ -146,62 +151,62 @@ static const PQconninfoOption PQconninfo * still try to set it. 
*/ {"authtype", "PGAUTHTYPE", DefaultAuthtype, NULL, - "Database-Authtype", "D", 20}, + "Database-Authtype", "D\0", 20}, {"service", "PGSERVICE", NULL, NULL, - "Database-Service", "", 20}, + "Database-Service", "\0", 20}, {"user", "PGUSER", NULL, NULL, - "Database-User", "", 20}, + "Database-User", "\0R", 20}, {"password", "PGPASSWORD", NULL, NULL, - "Database-Password", "*", 20}, + "Database-Password", "*\0R", 20}, {"connect_timeout", "PGCONNECT_TIMEOUT", NULL, NULL, - "Connect-timeout", "", 10}, /* strlen(INT32_MAX) == 10 */ + "Connect-timeout", "\0R", 10}, /* strlen(INT32_MAX) == 10 */ {"dbname", "PGDATABASE", NULL, NULL, - "Database-Name", "", 20}, + "Database-Name", "\0", 20}, {"host", "PGHOST", NULL, NULL, - "Database-Host", "", 40}, + "Database-Host", "\0R", 40}, {"hostaddr", "PGHOSTADDR", NULL, NULL, - "Database-Host-IP-Address", "", 45}, + "Database-Host-IP-Address", "\0R", 45}, {"port", "PGPORT", DEF_PGPORT_STR, NULL, - "Database-Port", "", 6}, + "Database-Port", "\0R", 6}, {"client_encoding", "PGCLIENTENCODING", NULL, NULL, - "Client-Encoding", "", 10}, + "Client-Encoding", "\0", 10}, /* * "tty" is no longer used either, but keep it present for backwards * compatibility. */ {"tty", "PGTTY", DefaultTty, NULL, - "Backend-Debug-TTY", "D", 40}, + "Backend-Debug-TTY", "D\0", 40}, {"options", "PGOPTIONS", DefaultOption, NULL, - "Backend-Debug-Options", "D", 40}, + "Backend-Debug-Options", "D\0R", 40}, {"application_name", "PGAPPNAME", NULL, NULL, - "Application-Name", "", 64}, + "Application-Name", "\0", 64}, {"fallback_application_name", NULL, NULL, NULL, - "Fallback-Application-Name", "", 64}, + "Fallback-Application-Name", "\0", 64}, {"keepalives", NULL, NULL, NULL, - "TCP-Keepalives", "", 1}, /* should be just '0' or '1' */ + "TCP-Keepalives", "\0R", 1}, /* should be just '0' or
[HACKERS] Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown
On Tuesday, October 02, 2012 1:56 PM Heikki Linnakangas wrote:
> On 02.10.2012 10:36, Amit kapila wrote:
>> On Monday, October 01, 2012 4:08 PM Heikki Linnakangas wrote:
>>> So let's think how this should ideally work from a user's point of view.
>>> I think there should be just two settings: walsender_timeout and
>>> walreceiver_timeout. walsender_timeout specifies how long a walsender
>>> will keep a connection open if it doesn't hear from the walreceiver, and
>>> walreceiver_timeout is the same for walreceiver. The system should
>>> The Ping/Pong messages don't necessarily need to be new message types,
>>> we can use the message types we currently have, perhaps with an
>>> additional flag attached to them, to request the other side to reply
>>> immediately.
>>
>> Can't we make the decision to send reply immediately based on message type,
>> because these message types will be unique.
>>
>> To clarify my understanding,
>> 1. the heartbeat message from walsender side will be keepalive message ('k')
>>    and from walreceiver side it will be Hot Standby feedback message ('h').
>> 2. the reply message from walreceiver side will be current reply message ('r').
>
> Yep. I wonder why we need separate message types for Hot Standby Feedback
> 'h' and Reply 'r', though. Seems it would be simpler to have just one
> message type that includes all the fields from both messages.

Moved the contents of Hot Standby Feedback 'h' to Reply 'r' and use 'h' for heart-beat purposes.

>> 3. currently there is no reply kind of message from walsender, so do we need
>>    to introduce one new message for it or can we use some existing message only?
>>    If new, do we need to send any additional information along with it?
>>    For existing messages, can we use the keepalive message itself as the reply
>>    message but with an additional byte to indicate it is a reply?
>
> Hmm, I think I'd prefer to use the existing Keepalive message 'k', with an
> additional flag.

Okay. I have done it in the patch.

Thank you for the suggestions. I have addressed them in the patch attached with this mail.

The following changes are done to support replication timeout in sender as well as receiver:

1. One new configuration parameter wal_receiver_timeout is added to detect timeout at the receiver task.
2. The existing parameter replication_timeout is renamed to wal_sender_timeout.
3. The PrimaryKeepaliveMessage structure is modified to add one more field to indicate whether the keep-alive is of type 'r' (i.e. reply) or 'h' (i.e. heart-beat).
4. The keep-alive message from the sender will now be sent to the standby if the connection was idle for more than or equal to half of wal_sender_timeout. In this case it will send a keep-alive of type 'h'.
5. Once the standby receives a keep-alive, it needs to send an immediate reply to the primary to indicate the connection is alive.
6. The Reply message (which sends the WAL offset) and the Feedback message (which sends the oldest transaction) are now merged into a single Reply message. So the StandbyReplyMessage structure is changed to add two more fields, xmin and epoch, and the StandbyHSFeedbackMessage structure is changed to remove the xmin and epoch fields (as these are moved to StandbyReplyMessage).
7. Because of the changes in step 6, once the receiver task receives some data from the primary, it will only send a Reply message.
8. The same Reply message is sent in step 5 and step 7, but in case of step 5 the reply is sent immediately, whereas in case of step 7 the reply is sent only if wal_receiver_status_interval has lapsed (this part is the same as earlier).
9. Similar to the sender, if the receiver finds itself idle for more than or equal to half of the configured wal_receiver_timeout, it will send the hot-standby heartbeat. This heart-beat has been modified to send only sendTime.
10. Once the sender task receives a heart-beat message from the standby, it sends back the reply immediately. In this case the keep-alive message is sent with type 'r'.
11. If even after wal_sender_timeout no message is received from the standby, it will be considered a network break at the sender task.
12. If even after wal_receiver_timeout no message is received from the primary, it will be considered a network break at the receiver task.

With Regards,
Amit Kapila.

Attachment: replication_timeout_patch_v3.patch
Promoting a standby during base backup (was Re: [HACKERS] Switching timeline over streaming replication)
On 03.10.2012 18:15, Amit Kapila wrote:
> On Tuesday, October 02, 2012 4:21 PM Heikki Linnakangas wrote:
>> Hmm, should a base backup be aborted when the standby is promoted? Does
>> the promotion render the backup corrupt?
>
> I think currently it does so. Pls refer
>
> 1. do_pg_stop_backup(char *labelfile, bool waitforarchive)
> {
> ..
>     if (strcmp(backupfrom, "standby") == 0 && !backup_started_in_recovery)
>         ereport(ERROR,
>                 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>                  errmsg("the standby was promoted during online backup"),
>                  errhint("This means that the backup being taken is corrupt "
>                          "and should not be used. "
>                          "Try taking another online backup.")));
> ..
> }

Okay. I think that check in do_pg_stop_backup() actually already ensures that you don't end up with a corrupt backup, even if the standby is promoted while a backup is being taken. Admittedly it would be nicer to abort it immediately rather than error out at the end.

But I wonder why promoting a standby renders the backup invalid in the first place? Fujii, Simon, can you explain that?

- Heikki
Re: [HACKERS] Re: [WIP] Performance Improvement by reducing WAL for Update Operation
On 03.10.2012 19:03, Amit Kapila wrote:
> Any comments/suggestions regarding performance/functionality test?

Hmm. Doing a lot of UPDATEs concurrently can be limited by the WALInsertLock, which each inserter holds while copying the WAL record to the buffer. Reducing the size of the WAL records, by compression or delta encoding, alleviates that bottleneck: when WAL records are smaller, the lock needs to be held for a shorter duration. That improves throughput, even if individual backends need to do more CPU work to compress the records, because that work can be done in parallel. I suspect much of the benefit you're seeing in these tests might be because of that effect.

As it happens, I've been working on making WAL insertion scale better in general: http://archives.postgresql.org/message-id/5064779a.3050...@vmware.com. That should also help most when inserting large WAL records. The question is: assuming we commit the xloginsert-scale patch, how much benefit is there left from the compression? It will surely still help to reduce the size of WAL, which can certainly help if you're limited by the WAL I/O, but I suspect the results from the pgbench tests you run might look quite different.

So, could you rerun these tests with the xloginsert-scale patch applied? Reducing the WAL size might still be a good idea even if the patch doesn't have much effect on TPS, but I'd like to make sure that the compression doesn't hurt performance. Also, it would be a good idea to repeat the tests with just a single client; we don't want to hurt the performance in that scenario either.

- Heikki