Re: Sharing more infrastructure between walsenders and regular backends (was Re: [HACKERS] Switching timeline over streaming replication)

2012-10-04 Thread Amit Kapila
On Thursday, October 04, 2012 8:40 PM Heikki Linnakangas wrote:
> On 03.10.2012 18:15, Amit Kapila wrote:
> > 35.WalSenderMain(void)
> > {
> > ..
> > +if (walsender_shutdown_requested)
> > +ereport(FATAL,
> > +
> (errcode(ERRCODE_ADMIN_SHUTDOWN),
> > + errmsg("terminating
> > + replication
> > connection due to administrator command")));
> > +
> > +/* Tell the client that we are ready to receive
> > + commands */
> >
> > +ReadyForQuery(DestRemote);
> > +
> > ..
> >
> > +if (walsender_shutdown_requested)
> > +ereport(FATAL,
> > +
> (errcode(ERRCODE_ADMIN_SHUTDOWN),
> > + errmsg("terminating
> > + replication
> > connection due to administrator command")));
> > +
> >
> > is it necessary to check walsender_shutdown_requested 2 times in a
> > loop, if yes, then can we write comment why it is important to check
> > it again.
> 
> The idea was to check for shutdown request before and after the
> pq_getbyte() call, because that can block for a long time.
> 
> Looking closer, we don't currently (ie. without this patch) make any
> effort to react to SIGTERM in a walsender, while it's waiting for a
> command from the client. After starting replication, it does check
> walsender_shutdown_requested in the loop, and it's also checked during a
> base backup (although only when switching to send next file, which seems
> too seldom). This issue is orthogonal to handling timeline changes over
> streaming replication, although that patch will make it more important
> to handle SIGTERM quickly while waiting for a command, because you stay
> in that mode for longer and more often.
> 
> I think walsender needs to share more infrastructure with regular
> backends to handle this better. When we first implemented streaming
> replication in 9.0, it made sense to implement just the bare minimum
> needed to accept the handshake commands before entering the Copy state,
> but now that the replication command set has grown to cover base
> backups, and fetching timelines with the patch being discussed, we
> should bite the bullet and make the command loop more feature-complete
> and robust.

Certainly there are benefits of making walsender and backend infrastructure
similar as mentioned by Simon, Andres and Tom.
However shall we not do this as a separate feature along with some other
requirements and let switchtimeline patch get committed first
as this is altogether a different feature.

With Regards,
Amit Kapila.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Disable _FORTIFY_SOURCE with ICC

2012-10-04 Thread Tom Lane
I wrote:
> I've seen this warning in recent Fedora builds, though not on my F16
> production box.  Some googling suggests the locution

> #if __OPTIMIZE__
> #define _FORTIFY_SOURCE 2
> #endif

> but I've not tested that.

A bit later: testing on an F17 box (glibc-2.15-56.fc17.x86_64)
confirms Peter G's complaint, and also confirms that putting
the above into port/linux.h (instead of the template file) fixes the
problem.  Don't know how to disable it on ICC, but I suppose there's
some way to test for ICC via an #if.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Disable _FORTIFY_SOURCE with ICC

2012-10-04 Thread Tom Lane
Peter Eisentraut  writes:
> On Tue, 2012-10-02 at 17:05 +0100, Peter Geoghegan wrote:
>> /usr/include/features.h:314:4: warning: #warning _FORTIFY_SOURCE
>> requires compiling with optimization (-O) [-Wcpp]

> Which glibc version is this?  (Run /lib/libc.so)

I've seen this warning in recent Fedora builds, though not on my F16
production box.  Some googling suggests the locution

#if __OPTIMIZE__
#define _FORTIFY_SOURCE 2
#endif

but I've not tested that.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for REINDEX CONCURRENTLY

2012-10-04 Thread Michael Paquier
On Fri, Oct 5, 2012 at 6:58 AM, Andres Freund wrote:

> On Thursday, October 04, 2012 04:51:29 AM Tom Lane wrote:
> I can understand hesitation around that.. I would like to make sure I
> understand the problem correctly. When we get to the point where we switch
> indexes we should be in the following state:
> - both indexes are indisready
> - old should be invalid
> - new index should be valid
> - have the same indcheckxmin
> - be locked by us preventing anybody else from making changes
>
Looks like a good presentation of the problem. I am not sure if marking the
new index as valid is necessary though. As long as it is done inside the
same transaction as the swap there are no problems, no?


> Lets assume we have index a_old(relfilenode 1) as the old index and a
> rebuilt
> index a_new (relfilenode 2) as the one we just built. If we do it properly
> nobody will have 'a' open for querying, just for modifications (its
> indisready)
> as we had waited for everyone that could have seen a as valid to finish.
>
> As far as I understand the code a session using a_new will also have built
> a
> relcache entry for a_old.
> Two problems:
> * relying on the relcache to be built for both indexes seems hinky
> * As the relcache is built with SnapshotNow it could read the old
> definition
> for a_new and the new one for a_old (or the reverse) and thus end up with
> both
> pointing to the same relfilenode. Which would be ungood.
>
OK, so the problem here is that the relcache, as the syscache, are relying
on SnapshotNow which cannot be used safely as the false index definition
could be read by other backends. So this looks to bring back the discussion
to the point where a higher lock level is necessary to perform a safe
switch of the indexes.

I assume that the switch phase is not the longest phase of the concurrent
operation, as you also need to build and validate the new index at prior
steps. I am just wondering if it is acceptable to you guys to take a
stronger lock only during this switch phase. This won't make the reindex
being concurrently all the time but it would avoid any visibility issues
and have an index switch processing which is more consistent with the
existing implementation as it could rely on the same mechanism as normal
reindex that switches relfilenode.
-- 
Michael Paquier
http://michael.otacoo.com


Re: [HACKERS] Support for REINDEX CONCURRENTLY

2012-10-04 Thread Andres Freund
On Thursday, October 04, 2012 04:51:29 AM Tom Lane wrote:
> Greg Stark  writes:
> > I'm a bit puzzled why we're so afraid of swapping the relfilenodes
> > when that's what the current REINDEX does.
> 
> Swapping the relfilenodes is fine *as long as you have exclusive lock*.
> The trick is to make it safe without that.  It will definitely not work
> to do that without exclusive lock, because at the instant you would try
> it, people will be accessing the new index (by OID).
I can understand hesitation around that.. I would like to make sure I 
understand the problem correctly. When we get to the point where we switch 
indexes we should be in the following state:
- both indexes are indisready
- old should be invalid
- new index should be valid
- have the same indcheckxmin
- be locked by us preventing anybody else from making changes

Lets assume we have index a_old(relfilenode 1) as the old index and a rebuilt 
index a_new (relfilenode 2) as the one we just built. If we do it properly 
nobody will have 'a' open for querying, just for modifications (its indisready) 
as we had waited for everyone that could have seen a as valid to finish.

As far as I understand the code a session using a_new will also have built a 
relcache entry for a_old.
Two problems:
* relying on the relcache to be built for both indexes seems hinky
* As the relcache is built with SnapshotNow it could read the old definition 
for a_new and the new one for a_old (or the reverse) and thus end up with both 
pointing to the same relfilenode. Which would be ungood.

Greetings,

Andres
-- 
Andres Freund   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Sharing more infrastructure between walsenders and regular backends (was Re: Switching timeline over streaming replication)

2012-10-04 Thread Andres Freund
On Thursday, October 04, 2012 10:58:53 PM Tom Lane wrote:
> Simon Riggs  writes:
> > On 4 October 2012 17:23, Heikki Linnakangas  
wrote:
> >> Perhaps we could make walsenders even more like regular backends than
> >> what I was proposing, so that the replication commands are parsed and
> >> executed just like regular utility commands. However, that'd require
> >> some transaction support in walsender, for starters, which seems messy.
> >> It might become sensible in the future if the replication command set
> >> gets even more complicated, but it doesn't seem like a good idea at the
> >> moment.
> > 
> > It's come up a few times now that people want to run a few queries
> > either before or after running a base backup. ...
> > Andres suggested to me the other day we make walsender more like
> > regular backends. At the time I wasn't sure I agreed, but reading this
> > it looks like a sensible way to go.
I only went that way after youve disliked my other suggestions ;)

> That was what I was thinking too, but on reflection there's at least one
> huge problem: how could we run queries without being connected to a
> specific database?  Which walsender isn't.
I had quite some problems with that too. For now I've hacked walsender to 
connect to the database specified in the connection, not sure whether thats the 
way to go. Seems to work so far.

I wanted to start a thread about this anyway, but as it came up here...

The reason "we" (as in logical rep) need a database in the current approach is 
that we need to access the catalog (in a timetraveling fashion) to know how to 
decode the data in the wal... The patch I sent two weeks ago does the decoding 
from inside a normal backend but that was just because I couldn't make 
walsender work inside a database in time...

Greetings,

Andres
-- 
Andres Freund   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Sharing more infrastructure between walsenders and regular backends (was Re: [HACKERS] Switching timeline over streaming replication)

2012-10-04 Thread Tom Lane
Simon Riggs  writes:
> On 4 October 2012 17:23, Heikki Linnakangas  wrote:
>> Perhaps we could make walsenders even more like regular backends than what I
>> was proposing, so that the replication commands are parsed and executed just
>> like regular utility commands. However, that'd require some transaction
>> support in walsender, for starters, which seems messy. It might become
>> sensible in the future if the replication command set gets even more
>> complicated, but it doesn't seem like a good idea at the moment.

> It's come up a few times now that people want to run a few queries
> either before or after running a base backup. ...
> Andres suggested to me the other day we make walsender more like
> regular backends. At the time I wasn't sure I agreed, but reading this
> it looks like a sensible way to go.

That was what I was thinking too, but on reflection there's at least one
huge problem: how could we run queries without being connected to a
specific database?  Which walsender isn't.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Sharing more infrastructure between walsenders and regular backends (was Re: [HACKERS] Switching timeline over streaming replication)

2012-10-04 Thread Simon Riggs
On 4 October 2012 17:23, Heikki Linnakangas  wrote:
> On 04.10.2012 19:00, Tom Lane wrote:
>>
>> Heikki Linnakangas  writes:
>>>
>>> So I propose the attached patch. I made small changes to postgres.c to
>>> make it call exec_replication_command() instead of exec_simple_query(),
>>> and reject extend query protocol, in a WAL sender process. A lot of code
>>> related to handling the main command loop and signals is removed from
>>> walsender.c.
>>
>>
>> Why do we need the forbidden_in_wal_sender stuff?  If we're going in
>> this direction, I suggest there is little reason to restrict what the
>> replication client can do.  This seems to be both ugly and a drag on
>> the performance of normal backends.
>
>
> Well, there's not much need for parameterized queries or cursors with the
> replication command set at the moment. I don't think it's worth it to try to
> support them. Fastpath function calls make no sense either, as you can't
> call user-defined functions in a walsender anyway.
>
> Perhaps we could make walsenders even more like regular backends than what I
> was proposing, so that the replication commands are parsed and executed just
> like regular utility commands. However, that'd require some transaction
> support in walsender, for starters, which seems messy. It might become
> sensible in the future if the replication command set gets even more
> complicated, but it doesn't seem like a good idea at the moment.

It's come up a few times now that people want to run a few queries
either before or after running a base backup.

Since the pg_basebackup stuff uses walsender, this make such things impossible.

So to support that, we need to allow two kinds of connection, one to
"replication" and one to something else, and since the something else
is not guaranteed to exist that makes it even harder.

Andres suggested to me the other day we make walsender more like
regular backends. At the time I wasn't sure I agreed, but reading this
it looks like a sensible way to go.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Promoting a standby during base backup (was Re: [HACKERS] Switching timeline over streaming replication)

2012-10-04 Thread Fujii Masao
On Thu, Oct 4, 2012 at 4:59 PM, Heikki Linnakangas
 wrote:
> On 03.10.2012 18:15, Amit Kapila wrote:
>>
>> On Tuesday, October 02, 2012 4:21 PM Heikki Linnakangas wrote:
>>>
>>> Hmm, should a base backup be aborted when the standby is promoted? Does
>>> the promotion render the backup corrupt?
>>
>>
>> I think currently it does so. Pls refer
>> 1.
>> do_pg_stop_backup(char *labelfile, bool waitforarchive)
>> {
>> ..
>> if (strcmp(backupfrom, "standby") == 0&&  !backup_started_in_recovery)
>>  ereport(ERROR,
>>
>> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>>   errmsg("the standby was promoted during
>> online backup"),
>>   errhint("This means that the backup
>> being
>> taken is corrupt "
>>   "and should not be used.
>> "
>>   "Try taking another
>> online
>> backup.")));
>> ..
>>
>> }
>
>
> Okay. I think that check in do_pg_stop_backup() actually already ensures
> that you don't end up with a corrupt backup, even if the standby is promoted
> while a backup is being taken. Admittedly it would be nicer to abort it
> immediately rather than error out at the end.
>
> But I wonder why promoting a standby renders the backup invalid in the first
> place? Fujii, Simon, can you explain that?

Simon had the same question and I answered it before.

http://archives.postgresql.org/message-id/cahgqgwfu04oo8yl5sucdjvq3brni7wtfzty9wa2kxtznhic...@mail.gmail.com
---
> You say
> "If the standby is promoted to the master during online backup, the
> backup fails."
> but no explanation of why?
>
> I could work those things out, but I don't want to have to, plus we
> may disagree if I did.

If the backup succeeds in that case, when we start an archive recovery from that
backup, the recovery needs to cross between two timelines. Which means that
we need to set recovery_target_timeline before starting recovery. Whether
recovery_target_timeline needs to be set or not depends on whether the standby
was promoted during taking the backup. Leaving such a decision to a user seems
fragile.

pg_basebackup -x ensures that all required files are included in the backup and
we can start recovery without restoring any file from the archive. But
if the standby is promoted during the backup, the timeline history
file would become
an essential file for recovery, but it's not included in the backup.
---

The situation may change if your switching-timeline patch has been committed.
It's useful if we can continue the backup even if the standby is promoted.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Make CREATE AGGREGATE check validity of initcond value?

2012-10-04 Thread Jaime Casanova
El 03/10/2012 21:38, "Tom Lane"  escribió:
>
> Does anyone have an objection to this?  I can imagine cases where the
> check would reject values that would get accepted at runtime, if the
> type's input function was sensitive to the phase of the moon or
> something.  But it doesn't seem very probable, whereas checking the
> value seems like an eminently useful thing to do.  Or maybe I'm just
> overreacting to the report --- I can't recall any previous complaints
> like this, so maybe entering a bogus initcond is a corner case too.

I guess a wrong initcond value, probably is a pilot error.
So, my first reaction is +1 to make it an error.

But if you feel there a corner cases maybe at least a warning

--
Jaime Casanova
2ndQuadrant: Your PostgreSQL partner


Re: [HACKERS] PQping command line tool

2012-10-04 Thread Greg Sabino Mullane

-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160


> I was wondering recently if there was any command line tool that
> utilized PQping() or PQpingParams(). I searched the code and couldn't
> find anything and was wondering if there was any interest to have
> something like this included? I wrote something for my purposes of
> performing a health check that also supports nagios style status
> output. It's probably convenient for scripting purposes as well.

I'm not sure how useful this information would be. Most health 
checks (Nagios or otherwise) really only care if things are 
working all the up to point A or not, where point A is usually 
a simple query such as "SELECT 1". Knowing various failure states 
as returned by PQping* does not seem to fit into such tools - 
any failure needs to be handled manually.

- -- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201210041146
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-BEGIN PGP SIGNATURE-

iEYEAREDAAYFAlBtukQACgkQvJuQZxSWSsiCbACePHFhTefoQnLwVuvIONH0JcSD
jq8AoIPusD88fX1rBcse5IreaADH7wkZ
=IRgc
-END PGP SIGNATURE-




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Sharing more infrastructure between walsenders and regular backends (was Re: [HACKERS] Switching timeline over streaming replication)

2012-10-04 Thread Heikki Linnakangas

On 04.10.2012 19:00, Tom Lane wrote:

Heikki Linnakangas  writes:

So I propose the attached patch. I made small changes to postgres.c to
make it call exec_replication_command() instead of exec_simple_query(),
and reject extend query protocol, in a WAL sender process. A lot of code
related to handling the main command loop and signals is removed from
walsender.c.


Why do we need the forbidden_in_wal_sender stuff?  If we're going in
this direction, I suggest there is little reason to restrict what the
replication client can do.  This seems to be both ugly and a drag on
the performance of normal backends.


Well, there's not much need for parameterized queries or cursors with 
the replication command set at the moment. I don't think it's worth it 
to try to support them. Fastpath function calls make no sense either, as 
you can't call user-defined functions in a walsender anyway.


Perhaps we could make walsenders even more like regular backends than 
what I was proposing, so that the replication commands are parsed and 
executed just like regular utility commands. However, that'd require 
some transaction support in walsender, for starters, which seems messy. 
It might become sensible in the future if the replication command set 
gets even more complicated, but it doesn't seem like a good idea at the 
moment.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Make pg_basebackup configure and start standby [Review]

2012-10-04 Thread Boszormenyi Zoltan

2012-10-04 16:43 keltezéssel, Tom Lane írta:

Boszormenyi Zoltan  writes:

Did you think about something like the attached code?

Or rather this one, which fixes a bug so fillPGconn() and PQconninfo() are
symmetric and work for "requiressl".

That's incredibly ugly.  I'm not sure where we should keep the "R"
information, but shoehorning it into the existing PQconninfoOption
struct like that seems totally unacceptable.  Either we're willing to
break backwards compatibility on the Disp-Char strings, or we need
to put that info somewhere else.


I hope only handling the new flag part is ugly. It can be hidden
in the PQconninfoMapping[] array and PQconninfo(conn, true)
pre-filters the list as in the attached version.

Best regards,
Zoltán Böszörményi

--
--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/

diff -durpN postgresql/doc/src/sgml/libpq.sgml postgresql.1/doc/src/sgml/libpq.sgml
--- postgresql/doc/src/sgml/libpq.sgml	2012-08-03 09:39:30.114266570 +0200
+++ postgresql.1/doc/src/sgml/libpq.sgml	2012-10-04 18:15:11.686199926 +0200
@@ -496,6 +496,37 @@ typedef struct
  
 
 
+
+ PQconninfoPQconninfo
+ 
+  
+   Returns the connection options used by a live connection.
+
+PQconninfoOption *PQconninfo(PGconn *conn, bool for_replication);
+
+  
+
+  
+   Returns a connection options array.  This can be used to determine
+   all possible PQconnectdb options and their
+   current values that were used to connect to the server. The return
+   value points to an array of PQconninfoOption
+   structures, which ends with an entry having a null keyword
+   pointer.  Every notes above for PQconndefaults also apply.
+  
+
+  
+   The for_replication argument can be used to exclude some
+   options from the list which are used by the walreceiver module.
+   pg_basebackup uses this pre-filtered list
+   to construct primary_conninfo in the automatically generated
+   recovery.conf file.
+  
+
+ 
+
+
+
 
  PQconninfoParsePQconninfoParse
  
diff -durpN postgresql/src/interfaces/libpq/exports.txt postgresql.1/src/interfaces/libpq/exports.txt
--- postgresql/src/interfaces/libpq/exports.txt	2012-08-03 09:39:30.118266598 +0200
+++ postgresql.1/src/interfaces/libpq/exports.txt	2012-10-04 17:10:11.483654334 +0200
@@ -161,3 +161,4 @@ PQping158
 PQpingParams  159
 PQlibVersion  160
 PQsetSingleRowMode161
+PQconninfo162
diff -durpN postgresql/src/interfaces/libpq/fe-connect.c postgresql.1/src/interfaces/libpq/fe-connect.c
--- postgresql/src/interfaces/libpq/fe-connect.c	2012-09-09 08:11:09.470401480 +0200
+++ postgresql.1/src/interfaces/libpq/fe-connect.c	2012-10-04 17:53:26.205928500 +0200
@@ -137,6 +137,9 @@ static int ldapServiceLookup(const char
  * PQconninfoOptions[] *must* be NULL.	In a working copy, non-null "val"
  * fields point to malloc'd strings that should be freed when the working
  * array is freed (see PQconninfoFree).
+ *
+ * If you add a new connection option to this list, remember to add it to
+ * PQconninfoMappings[] below.
  * --
  */
 static const PQconninfoOption PQconninfoOptions[] = {
@@ -264,6 +267,62 @@ static const PQconninfoOption PQconninfo
 	NULL, NULL, 0}
 };
 
+/*
+ * We need a mapping between the PQconninfoOptions[] array
+ * and PGconn members. We have to keep it separate from
+ * PQconninfoOptions[] to not leak info about PGconn members
+ * to clients.
+ */
+typedef struct PQconninfoMapping {
+	char		*keyword;
+	size_t		member_offset;
+	bool		for_replication;
+	char		*conninfoValue;	/* Special value mapping */
+	char		*connValue;
+} PQconninfoMapping;
+#define PGCONNMEMBERADDR(conn, mapping) ((char **)((char *)conn + mapping->member_offset))
+
+static const PQconninfoMapping PQconninfoMappings[] =
+{
+	/* "authtype" is not used anymore, there is no mapping to PGconn */
+	/* there is no mapping of "service" to PGconn */
+	{ "user", offsetof(struct pg_conn, pguser), true, NULL, NULL },
+	{ "password", offsetof(struct pg_conn, pgpass), true, NULL, NULL },
+	{ "connect_timeout", offsetof(struct pg_conn, connect_timeout), true, NULL, NULL },
+	{ "dbname", offsetof(struct pg_conn, dbName), false, NULL, NULL },
+	{ "host", offsetof(struct pg_conn, pghost), true, NULL, NULL },
+	{ "hostaddr", offsetof(struct pg_conn, pghostaddr), true, NULL, NULL },
+	{ "port", offsetof(struct pg_conn, pgport), true, NULL, NULL },
+	{ "client_encoding", offsetof(struct pg_conn, client_encoding_initial), false, NULL, NULL },
+	{ "tty", offsetof(struct pg_conn, pgtty), false, NULL, NULL },
+	{ "options", offsetof(struct pg_conn, pgoptions), true, NULL, NULL },
+	{ "application_name", offsetof(struct pg_conn, appname), false, NULL, NULL },
+	{ "fallback_application_name", offsetof(struc

Re: Sharing more infrastructure between walsenders and regular backends (was Re: [HACKERS] Switching timeline over streaming replication)

2012-10-04 Thread Tom Lane
Heikki Linnakangas  writes:
> So I propose the attached patch. I made small changes to postgres.c to 
> make it call exec_replication_command() instead of exec_simple_query(), 
> and reject extend query protocol, in a WAL sender process. A lot of code 
> related to handling the main command loop and signals is removed from 
> walsender.c.

Why do we need the forbidden_in_wal_sender stuff?  If we're going in
this direction, I suggest there is little reason to restrict what the
replication client can do.  This seems to be both ugly and a drag on
the performance of normal backends.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bison location reporting for potentially-empty list productions

2012-10-04 Thread Tom Lane
I wrote:
> To produce a really useful cursor for this type of situation I think
> we would have to do something like this:

> #define YYLLOC_DEFAULT(Current, Rhs, N) \
> do { \
> int i;
> (Current) = -1; \
> for (i = 1; i <= (N); i++)
> {
> (Current) = (Rhs)[i]; \
> if ((Current) >= 0)
> break;
> }
> } while (0)

> This is pretty ugly and seems likely to create a visible hit on the
> parser's speed (which is already not that good).  I'm not sure it's
> worth it for something that seems to be a corner case --- anyway
> we've not hit it before in six years since the location tracking
> support was added.

After another look at the Bison manual I realized that there's a way to
fix this without making YYLLOC_DEFAULT slower, because that macro only
sets the *default* location for the current nonterminal; the rule action
is allowed to override its location value.  So I propose the attached
patch against HEAD.  This is a little bit ugly because we have to
remember to put some extra code in productions that have this issue ...
but track record so far suggests that there will be very few of them.

Anyone have an objection, or a better idea?

regards, tom lane

diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 7feadeac1693e7ff46f44ef3ad9ea2fa86bf46a8..e4ff76e66e0990d91790ccc54615c26dd64a143e 100644
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
***
*** 67,82 
  #include "utils/xml.h"
  
  
! /* Location tracking support --- simpler than bison's default */
  #define YYLLOC_DEFAULT(Current, Rhs, N) \
  	do { \
! 		if (N) \
  			(Current) = (Rhs)[1]; \
  		else \
! 			(Current) = (Rhs)[0]; \
  	} while (0)
  
  /*
   * Bison doesn't allocate anything that needs to live across parser calls,
   * so we can easily have it use palloc instead of malloc.  This prevents
   * memory leaks if we error out during parsing.  Note this only works with
--- 67,100 
  #include "utils/xml.h"
  
  
! /*
!  * Location tracking support --- simpler than bison's default, since we only
!  * want to track the start position not the end position of each nonterminal.
!  */
  #define YYLLOC_DEFAULT(Current, Rhs, N) \
  	do { \
! 		if ((N) > 0) \
  			(Current) = (Rhs)[1]; \
  		else \
! 			(Current) = (-1); \
  	} while (0)
  
  /*
+  * The above macro assigns -1 (unknown) as the parse location of any
+  * nonterminal that was reduced from an empty rule.  This is problematic
+  * for nonterminals defined like
+  *		OptFooList: / * EMPTY * / { ... } | OptFooList Foo { ... } ;
+  * because we'll set -1 as the location during the first reduction and then
+  * copy it during each subsequent reduction, leaving us with -1 for the
+  * location even when the list is not empty.  To fix that, do this in the
+  * action for the nonempty rule(s):
+  *		if (@$ < 0) @$ = @2;
+  * (Although we have many nonterminals that follow this pattern, we only
+  * bother with fixing @$ like this when the nonterminal's parse location
+  * is actually referenced in some rule.)
+  */
+ 
+ /*
   * Bison doesn't allocate anything that needs to live across parser calls,
   * so we can easily have it use palloc instead of malloc.  This prevents
   * memory leaks if we error out during parsing.  Note this only works with
*** OptSchemaName:
*** 1223,1230 
  		;
  
  OptSchemaEltList:
! 			OptSchemaEltList schema_stmt			{ $$ = lappend($1, $2); }
! 			| /* EMPTY */			{ $$ = NIL; }
  		;
  
  /*
--- 1241,1254 
  		;
  
  OptSchemaEltList:
! 			OptSchemaEltList schema_stmt
! {
! 	if (@$ < 0)			/* see comments for YYLLOC_DEFAULT */
! 		@$ = @2;
! 	$$ = lappend($1, $2);
! }
! 			| /* EMPTY */
! { $$ = NIL; }
  		;
  
  /*
diff --git a/src/test/regress/expected/namespace.out b/src/test/regress/expected/namespace.out
index 5fcd46daf42705147f53599d657e52d8fe7b5b1f..9187c8126a3ffa32e365400816b262daf3b6f2d5 100644
*** a/src/test/regress/expected/namespace.out
--- b/src/test/regress/expected/namespace.out
*** CREATE SCHEMA IF NOT EXISTS test_schema_
*** 47,54 
b int UNIQUE
 );
  ERROR:  CREATE SCHEMA IF NOT EXISTS cannot include schema elements
! LINE 1: CREATE SCHEMA IF NOT EXISTS test_schema_1 
! ^
  DROP SCHEMA test_schema_1 CASCADE;
  NOTICE:  drop cascades to 2 other objects
  DETAIL:  drop cascades to table test_schema_1.abc
--- 47,54 
b int UNIQUE
 );
  ERROR:  CREATE SCHEMA IF NOT EXISTS cannot include schema elements
! LINE 2:CREATE TABLE abc (
!^
  DROP SCHEMA test_schema_1 CASCADE;
  NOTICE:  drop cascades to 2 other objects
  DETAIL:  drop cascades to table test_schema_1.abc

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] hackers newsgroup hacked ?

2012-10-04 Thread Gaetano Mendola
Reading this mailing list via newsgroup  (news.postgresql.org port
119) I can see that last "legitimate" message is from
29 August since then only "RUSSIAN" posts are present.

Regards
Gaetano Mendola

-- 
cpp-today.blogspot.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Sharing more infrastructure between walsenders and regular backends (was Re: [HACKERS] Switching timeline over streaming replication)

2012-10-04 Thread Heikki Linnakangas

On 03.10.2012 18:15, Amit Kapila wrote:

35.WalSenderMain(void)
{
..
+if (walsender_shutdown_requested)
+ereport(FATAL,
+(errcode(ERRCODE_ADMIN_SHUTDOWN),
+ errmsg("terminating replication
connection due to administrator command")));
+
+/* Tell the client that we are ready to receive commands */

+ReadyForQuery(DestRemote);
+
..

+if (walsender_shutdown_requested)
+ereport(FATAL,
+(errcode(ERRCODE_ADMIN_SHUTDOWN),
+ errmsg("terminating replication
connection due to administrator command")));
+

is it necessary to check walsender_shutdown_requested 2 times in a
loop, if yes, then can we write comment why it is important to check
it again.


The idea was to check for shutdown request before and after the 
pq_getbyte() call, because that can block for a long time.


Looking closer, we don't currently (ie. without this patch) make any 
effort to react to SIGTERM in a walsender, while it's waiting for a 
command from the client. After starting replication, it does check 
walsender_shutdown_requested in the loop, and it's also checked during a 
base backup (although only when switching to send next file, which seems 
too seldom). This issue is orthogonal to handling timeline changes over 
streaming replication, although that patch will make it more important 
to handle SIGTERM quickly while waiting for a command, because you stay 
in that mode for longer and more often.


I think walsender needs to share more infrastructure with regular 
backends to handle this better. When we first implemented streaming 
replication in 9.0, it made sense to implement just the bare minimum 
needed to accept the handshake commands before entering the Copy state, 
but now that the replication command set has grown to cover base 
backups, and fetching timelines with the patch being discussed, we 
should bite the bullet and make the command loop more feature-complete 
and robust.


In a regular backend, the command loop reacts to SIGTERM immediately, 
setting ImmediateInterruptOK at the right places, and calling 
CHECK_FOR_INTERRUPTS() at strategic places. I propose that we let 
PostgresMain handle the main command loop for walsender processes too, 
like it does for regular backends, and use ProcDiePending and the 
regular die() signal handler for SIGTERM in walsender as well.


So I propose the attached patch. I made small changes to postgres.c to 
make it call exec_replication_command() instead of exec_simple_query(), 
and reject extend query protocol, in a WAL sender process. A lot of code 
related to handling the main command loop and signals is removed from 
walsender.c.


- Heikki
*** a/src/backend/replication/basebackup.c
--- b/src/backend/replication/basebackup.c
***
*** 22,27 
--- 22,28 
  #include "lib/stringinfo.h"
  #include "libpq/libpq.h"
  #include "libpq/pqformat.h"
+ #include "miscadmin.h"
  #include "nodes/pg_list.h"
  #include "replication/basebackup.h"
  #include "replication/walsender.h"
***
*** 30,36 
  #include "storage/ipc.h"
  #include "utils/builtins.h"
  #include "utils/elog.h"
- #include "utils/memutils.h"
  #include "utils/ps_status.h"
  
  typedef struct
--- 31,36 
***
*** 370,388  void
  SendBaseBackup(BaseBackupCmd *cmd)
  {
  	DIR		   *dir;
- 	MemoryContext backup_context;
- 	MemoryContext old_context;
  	basebackup_options opt;
  
  	parse_basebackup_options(cmd->options, &opt);
  
- 	backup_context = AllocSetContextCreate(CurrentMemoryContext,
- 		   "Streaming base backup context",
- 		   ALLOCSET_DEFAULT_MINSIZE,
- 		   ALLOCSET_DEFAULT_INITSIZE,
- 		   ALLOCSET_DEFAULT_MAXSIZE);
- 	old_context = MemoryContextSwitchTo(backup_context);
- 
  	WalSndSetState(WALSNDSTATE_BACKUP);
  
  	if (update_process_title)
--- 370,379 
***
*** 403,411  SendBaseBackup(BaseBackupCmd *cmd)
  	perform_base_backup(&opt, dir);
  
  	FreeDir(dir);
- 
- 	MemoryContextSwitchTo(old_context);
- 	MemoryContextDelete(backup_context);
  }
  
  static void
--- 394,399 
***
*** 606,612  sendDir(char *path, int basepathlen, bool sizeonly)
  		 * error in that case. The error handler further up will call
  		 * do_pg_abort_backup() for us.
  		 */
! 		if (walsender_shutdown_requested || walsender_ready_to_stop)
  			ereport(ERROR,
  (errmsg("shutdown requested, aborting active base backup")));
  
--- 594,600 
  		 * error in that case. The error handler further up will call
  		 * do_pg_abort_backup() for us.
  		 */
! 		if (ProcDiePending || walsender_ready_to_stop)
  			ereport(ERROR,
  (errmsg("shutdown requested, aborting active base backup")));
  
*** a/src/backend/replication/walsender.c
--- b/src/backend/replication/w

Re: [HACKERS] [PATCH] Make pg_basebackup configure and start standby [Review]

2012-10-04 Thread Tom Lane
Boszormenyi Zoltan  writes:
>> Did you think about something like the attached code?

> Or rather this one, which fixes a bug so fillPGconn() and PQconninfo() are
> symmetric and work for "requiressl".

That's incredibly ugly.  I'm not sure where we should keep the "R"
information, but shoehorning it into the existing PQconninfoOption
struct like that seems totally unacceptable.  Either we're willing to
break backwards compatibility on the Disp-Char strings, or we need
to put that info somewhere else.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Raise a WARNING if a REVOKE affects nothing?

2012-10-04 Thread Tom Lane
Robert Haas  writes:
> Just to ask a possibly stupid question: why is attempting to a REVOKE
> a non-existent privilege anything other than an ERROR?

Because the SQL standard says so?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] xmalloc => pg_malloc

2012-10-04 Thread Jon Nelson
On Wed, Oct 3, 2012 at 11:36 PM, Tom Lane  wrote:
> Peter Eisentraut  writes:
>> xmalloc, xstrdup, etc. are pretty common names for functions that do
>> alloc-or-die (another possible naming scheme ;-) ).  The naming
>> pg_malloc etc. on the other hand suggests that the allocation is being
>> done in a PostgreSQL-specific way, and anyway sounds too close to
>> palloc.
>
>> So I'd be more in favor of xmalloc <= pg_malloc.
>
> Meh.  The fact that other people use that name is not really an
> advantage from where I sit.  I'm concerned about possible name
> collisions, eg in libraries loaded into the backend.
>
> There are probably not any actual risks of collision right now, given
> that all these functions are currently in our client-side programs ---
> but it's foreseeable that we might use this same naming convention in
> more-exposed places in future.  In fact, somebody was already proposing
> creating such functions in the core backend.
>
> But having said that, I'm not absolutely wedded to these names; they
> were just the majority of existing cases.

Why not split the difference and use pg_xmalloc?
As in: "PostgreSQL-special malloc that dies on failure."

-- 
Jon


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Switching timeline over streaming replication

2012-10-04 Thread Amit Kapila
> On Wednesday, October 03, 2012 8:45 PM Heikki Linnakangas wrote:
> On Tuesday, October 02, 2012 4:21 PM Heikki Linnakangas wrote:
> > Thanks for the thorough review! I committed the xlog.c refactoring
> patch
> > now. Attached is a new version of the main patch, comments on specific
> > points below. I didn't adjust the docs per your comments yet, will do
> > that next.
> 
> I have some doubts regarding the comments fixed by you and some more new
> review comments.
> After this I shall focus majorly towards testing of this Patch.
> 

Testing
---

Failed Case
--
1. promotion of standby to master and follow standby to new master.
2. Stop standby and master. Restart standby first and then master
3. Restart of standby gives below errors
E:\pg_git_code\installation\bin>LOG:  database system was shut down in
recovery 
at 2012-10-04 18:36:00 IST 
LOG:  entering standby mode 
LOG:  consistent recovery state reached at 0/176B800 
LOG:  redo starts at 0/176B800 
LOG:  record with zero length at 0/176BD68 
LOG:  database system is ready to accept read only connections 
LOG:  streaming replication successfully connected to primary 
LOG:  out-of-sequence timeline ID 1 (after 2) in log segment
0002000 
1, offset 0 
FATAL:  terminating walreceiver process due to administrator command 
LOG:  out-of-sequence timeline ID 1 (after 2) in log segment
0002000 
1, offset 0 
LOG:  out-of-sequence timeline ID 1 (after 2) in log segment
0002000 
1, offset 0 
LOG:  out-of-sequence timeline ID 1 (after 2) in log segment
0002000 
1, offset 0 
LOG:  out-of-sequence timeline ID 1 (after 2) in log segment
0002000 
1, offset 0

Once this error comes, restart master/standby in any order or do some
operations on master, always there is above error
On standby.


Passed Cases
-
1. After promoting standby as new master, try to make previous master
(having same WAL as new master) as standby. 
   In this case recovery.conf recovery_target_timeline set to latest. It
ables to connect to new master and started 
   streaming as per expectation. 
   - As per expected behavior. 

2. After promoting standby as new master, try to make previous master
(having more WAL compare to new master) as standby, 
   error is displayed. 
   - As per expected behavior 

3. After promoting standby as new master, try to make previous master
(having same WAL as new master) as standby. 
   In this case recovery.conf recovery_target_timeline is not set. Following
LOG is displayed. 
   LOG:  fetching timeline history file for timeline 2 from primary server 
   LOG:  replication terminated by primary server 
   DETAIL:  End of WAL reached on timeline 1 
   LOG:  walreceiver ended streaming and awaits new instructions 
   LOG:  re-handshaking at position 0/100 on tli 1 
   LOG:  replication terminated by primary server 
   DETAIL:  End of WAL reached on timeline 1 
   LOG:  walreceiver ended streaming and awaits new instructions 
   LOG:  re-handshaking at position 0/100 on tli 1 
   LOG:  replication terminated by primary server 
   DETAIL:  End of WAL reached on timeline 1 
   - As per expected behavior


Pending Cases which needs to be tested (these are scenarios, some more
testing I will do based on these scenarios)
---
1. a. Master  M-1 
   b. Standby S-1 follows M-1 
   c. Standby S-2 follows M-1 
   d. Promote S-1 as master 
   e. Try to follow S-2  to S-1 -- operation should be success 

2. a. Master M-1 
   b. Standby S-1 follows M-1 
   c. Stop S-1, M-1 
   d. Do the PITR in M-1 2 times. This is to increment timeline in M-1 
   e. try to follow standby S-1 to M-1 -- it should be success. 

3. a. Master M-1 
   b. Standby S-1, S-2 follows M1 
   c. Standby S-3, S-4 follows S-1 
   d. Promote Standby which has highest WAL. 
   e. follow all standby's to the new master. 

4. a. Master M-1 
   b. Synchronous Standby S-1, S-2 
   c. Promote S-1 
   d. Follow M-1, S-2 to S-1 -- this operation should be success. 


   Concurrent Operations 
   --- 
   1. a. Master M-1 , Standby S-1 follows M-1, Standby S-2 follows M-1 
  b. Many concurrent operations on master M-1 
  c. During concurrent ops, Promote S-1 
  d. try S-2 to follow S-1 -- it should happen successfully. 

   2. During Promotion, call pg_basebackup 

   3. During Promotion, try to connect client 

   Resource Testing 
   -- 
   1. a.Make standby follow master which is many time lines ahead 
  b. Observe if there is any resource leak 
  c. Allow the streaming replication for 30 mins 
  d. Observe if there is any resource leak

Code Review
-
Libpqrcv_readtimelinehistoryfile()
{
  ..
  ..
+   if (PQnfields(res) != 2 || PQntuples(res) != 1) 
+   { 
+   int ntuples = PQntuples(res); 
+   int

Re: [HACKERS] Raise a WARNING if a REVOKE affects nothing?

2012-10-04 Thread Robert Haas
On Tue, Oct 2, 2012 at 3:01 PM, Noah Misch  wrote:
> On Tue, Aug 21, 2012 at 02:31:29PM +0800, Craig Ringer wrote:
>> It'd really help if REVOKE consistently raised warnings when it didn't
>> actually revoke anything.
>
> +1
>
> This will invite the same mixed feelings as the CREATE x IF NOT EXISTS
> notices, but I think it's worthwhile.

Just to ask a possibly stupid question: why is attempting to a REVOKE
a non-existent privilege anything other than an ERROR?

We would throw an ERROR if you tried to insert into a nonexistent
table, or if you tried to drop a nonexistent table, or if you tried to
call a nonexistent function, so why not also here?

We could have REVOKE IF EXISTS for the current behavior (and users
could boost client_min_messages to suppress the notice when deisred).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown

2012-10-04 Thread Amit Kapila


> -Original Message-
> From: pgsql-bugs-ow...@postgresql.org [mailto:pgsql-bugs-
> ow...@postgresql.org] On Behalf Of Amit kapila
> Sent: Thursday, October 04, 2012 3:43 PM
> To: Heikki Linnakangas
> Cc: Fujii Masao; pgsql-b...@postgresql.org; pgsql-hackers@postgresql.org
> Subject: Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w
> breakdown
> 
> On Tuesday, October 02, 2012 1:56 PM Heikki Linnakangas wrote:
> On 02.10.2012 10:36, Amit kapila wrote:
> > On Monday, October 01, 2012 4:08 PM Heikki Linnakangas wrote:
> >>> So let's think how this should ideally work from a user's point of
> view.
> >>> I think there should be just two settings: walsender_timeout and
> >>> walreceiver_timeout. walsender_timeout specifies how long a
> >>> walsender will keep a connection open if it doesn't hear from the

> 
> Thank you for suggestions.
> I have addressed your suggestions in patch attached with this mail.
> 
> Following changes are done to support replication timeout in sender as
> well as receiver:


Testing Done for the Patch

1. Verified the value of new configuration parameter and changed
configuration parameter using the show command (using Show of specific 
   parameter as well as show all). 
2. Verified the new configuration parameter in --describe-config. 
3. Verified the existing parameter replication_timeout's new name in
--describe-config. 
4. Start primary and standby node with default timeout, leave it for
sometime in idle situation. 
   It should not error out due to network break error. 
5. a. Start primary and standby node with default timeout, bring down the
network. 
   b. Both sender and receiver should be able to detect network break-down
almost at same time. 
   c. Once the network is up again, connection should get re-established
successfully. 
5. a. Start primary and standby node with wal_sender_timeout less than
wal_receiver_timeout, bring down the network. 
   b. Sender should be able to detect network break-down before receiver
task. 
   c. Once the network is up again, connection should get re-established
successfully. 
6. a. Start primary and standby node with wal_receiver_timeout less than
wal_sender_timeout, bring down the network. 
   b. Receiver should be able to detect network break-down before sender
task. 
   c. Once the network is up again, connection should get re-established
successfully. 
7. a. In 5th test case, change the value of wal_receiver_status_interval to
more than wal_receiver_timeout and hence more than  
  wal_sender_timeout. 
   b. Then bring down the network down.
   c. Sender task should be able to detect network break-down once
wal_sender_timeout has lapsed. 
   d. Once the network is up again, connection should get re-established
successfully.
   Intent of this test is to check there is no dependency of
wal_sender_timeout on wal_receiver_status_interval for detection of
   Network break.

All the above tests are passed. 

With Regards,
Amit Kapila.
 



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Make pg_basebackup configure and start standby [Review]

2012-10-04 Thread Boszormenyi Zoltan

2012-10-04 12:42 keltezéssel, Boszormenyi Zoltan írta:

2012-10-04 06:47 keltezéssel, Boszormenyi Zoltan írta:

2012-10-04 05:24 keltezéssel, Peter Eisentraut írta:

On Wed, 2012-10-03 at 18:15 +0200, Boszormenyi Zoltan wrote:

The second generation of this work is now attached and contains a new
feature as was discussed and suggested by Magnus Hagander, Fujii Masao
and Peter Eisentraut. So libpq has grown a new function:

+/* return the connection options used by a live connection */
+extern PQconninfoOption *PQconninfo(PGconn *conn);

This copies all the connection parameters back from the live PGconn
itself
so everything that's needed to connect is already validated.

I don't like that this code maintains a second list of all possible
libpq connection parameters.


Where does it do that? In PQconninfo() itself? Why is it a problem?
Or to put it bluntly: the same problem is in fillPGconn() too, as it also
has the same set of parameters listed. So there is already code
that you don't like. :-)

How about a static mapping between option names and
offsetof(struct pg_conn, member) values? This way both fillPGconn()
and PQconninfo() can avoid maintaining the list of parameter names.


Did you think about something like the attached code?


Or rather this one, which fixes a bug so fillPGconn() and PQconninfo() are
symmetric and work for "requiressl".



Best regards,
Zoltán Böszörményi




--
--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/

diff -durpN postgresql/src/interfaces/libpq/exports.txt postgresql.1/src/interfaces/libpq/exports.txt
--- postgresql/src/interfaces/libpq/exports.txt	2012-08-03 09:39:30.118266598 +0200
+++ postgresql.1/src/interfaces/libpq/exports.txt	2012-10-04 11:29:24.864309200 +0200
@@ -161,3 +161,5 @@ PQping158
 PQpingParams  159
 PQlibVersion  160
 PQsetSingleRowMode161
+PQconninfo162
+PQconninfoForReplication  163
diff -durpN postgresql/src/interfaces/libpq/fe-connect.c postgresql.1/src/interfaces/libpq/fe-connect.c
--- postgresql/src/interfaces/libpq/fe-connect.c	2012-09-09 08:11:09.470401480 +0200
+++ postgresql.1/src/interfaces/libpq/fe-connect.c	2012-10-04 13:10:58.737418824 +0200
@@ -131,12 +131,17 @@ static int ldapServiceLookup(const char
  *		""		Normal input field
  *		"*"		Password field - hide value
  *		"D"		Debug option - don't show by default
+ * A special "usable for replication connections" ("R") flag is
+ * added to Disp-Char in a backwards compatible fashion.
  *
  * PQconninfoOptions[] is a constant static array that we use to initialize
  * a dynamically allocated working copy.  All the "val" fields in
  * PQconninfoOptions[] *must* be NULL.	In a working copy, non-null "val"
  * fields point to malloc'd strings that should be freed when the working
  * array is freed (see PQconninfoFree).
+ *
+ * If you add a new connection option to this list, remember to add it to
+ * PQconninfoMappings[] below.
  * --
  */
 static const PQconninfoOption PQconninfoOptions[] = {
@@ -146,62 +151,62 @@ static const PQconninfoOption PQconninfo
 	 * still try to set it.
 	 */
 	{"authtype", "PGAUTHTYPE", DefaultAuthtype, NULL,
-	"Database-Authtype", "D", 20},
+	"Database-Authtype", "D\0", 20},
 
 	{"service", "PGSERVICE", NULL, NULL,
-	"Database-Service", "", 20},
+	"Database-Service", "\0", 20},
 
 	{"user", "PGUSER", NULL, NULL,
-	"Database-User", "", 20},
+	"Database-User", "\0R", 20},
 
 	{"password", "PGPASSWORD", NULL, NULL,
-	"Database-Password", "*", 20},
+	"Database-Password", "*\0R", 20},
 
 	{"connect_timeout", "PGCONNECT_TIMEOUT", NULL, NULL,
-	"Connect-timeout", "", 10}, /* strlen(INT32_MAX) == 10 */
+	"Connect-timeout", "\0R", 10}, /* strlen(INT32_MAX) == 10 */
 
 	{"dbname", "PGDATABASE", NULL, NULL,
-	"Database-Name", "", 20},
+	"Database-Name", "\0", 20},
 
 	{"host", "PGHOST", NULL, NULL,
-	"Database-Host", "", 40},
+	"Database-Host", "\0R", 40},
 
 	{"hostaddr", "PGHOSTADDR", NULL, NULL,
-	"Database-Host-IP-Address", "", 45},
+	"Database-Host-IP-Address", "\0R", 45},
 
 	{"port", "PGPORT", DEF_PGPORT_STR, NULL,
-	"Database-Port", "", 6},
+	"Database-Port", "\0R", 6},
 
 	{"client_encoding", "PGCLIENTENCODING", NULL, NULL,
-	"Client-Encoding", "", 10},
+	"Client-Encoding", "\0", 10},
 
 	/*
 	 * "tty" is no longer used either, but keep it present for backwards
 	 * compatibility.
 	 */
 	{"tty", "PGTTY", DefaultTty, NULL,
-	"Backend-Debug-TTY", "D", 40},
+	"Backend-Debug-TTY", "D\0", 40},
 
 	{"options", "PGOPTIONS", DefaultOption, NULL,
-	"Backend-Debug-Options", "D", 40},
+	"Backend-Debug-Options", "D\0R", 40},
 
 	{"application_name", "PGAPPNAME", NULL, NULL,
-	"Application-Name", "", 64},
+	"Application-Name", "\0", 64},
 
 	{"fallback_application_name", NULL, NULL, NULL,
-	"Fallback-Application-Name", "", 64},
+	"Fallback-Applicat

Re: [HACKERS] Re: [WIP] Performance Improvement by reducing WAL for Update Operation

2012-10-04 Thread Amit Kapila
> On Thursday, October 04, 2012 12:54 PM Heikki Linnakangas
> On 03.10.2012 19:03, Amit Kapila wrote:
> > Any comments/suggestions regarding performance/functionality test?
> 
> Hmm. Doing a lot of UPDATEs concurrently can be limited by the
> WALInsertLock, which each inserter holds while copying the WAL record to
> the buffer. Reducing the size of the WAL records, by compression or
> delta encoding, alleviates that bottleneck: when WAL records are
> smaller, the lock needs to be held for a shorter duration. That improves
> throughput, even if individual backends need to do more CPU work to
> compress the records, because that work can be done in parallel. I
> suspect much of the benefit you're seeing in these tests might be
> because of that effect.
> 
> As it happens, I've been working on making WAL insertion scale better in
> general:
> http://archives.postgresql.org/message-id/5064779a.3050...@vmware.com.
> That should also help most when inserting large WAL records. The
> question is: assuming we commit the xloginsert-scale patch, how much
> benefit is there left from the compression? It will surely still help to
> reduce the size of WAL, which can certainly help if you're limited by
> the WAL I/O, but I suspect the results from the pgbench tests you run
> might look quite different.
> 
> So, could you rerun these tests with the xloginsert-scale patch applied?

I shall take care of doing the performance test with xloginsert-scale patch
as well
both for single and multi-thread.

With Regards,
Amit Kapila.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Make pg_basebackup configure and start standby [Review]

2012-10-04 Thread Boszormenyi Zoltan

2012-10-04 06:47 keltezéssel, Boszormenyi Zoltan írta:

2012-10-04 05:24 keltezéssel, Peter Eisentraut írta:

On Wed, 2012-10-03 at 18:15 +0200, Boszormenyi Zoltan wrote:

The second generation of this work is now attached and contains a new
feature as was discussed and suggested by Magnus Hagander, Fujii Masao
and Peter Eisentraut. So libpq has grown a new function:

+/* return the connection options used by a live connection */
+extern PQconninfoOption *PQconninfo(PGconn *conn);

This copies all the connection parameters back from the live PGconn
itself
so everything that's needed to connect is already validated.

I don't like that this code maintains a second list of all possible
libpq connection parameters.


Where does it do that? In PQconninfo() itself? Why is it a problem?
Or to put it bluntly: the same problem is in fillPGconn() too, as it also
has the same set of parameters listed. So there is already code
that you don't like. :-)

How about a static mapping between option names and
offsetof(struct pg_conn, member) values? This way both fillPGconn()
and PQconninfo() can avoid maintaining the list of parameter names.


Did you think about something like the attached code?

Best regards,
Zoltán Böszörményi

--
--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/

diff -durpN postgresql/src/interfaces/libpq/exports.txt postgresql.1/src/interfaces/libpq/exports.txt
--- postgresql/src/interfaces/libpq/exports.txt	2012-08-03 09:39:30.118266598 +0200
+++ postgresql.1/src/interfaces/libpq/exports.txt	2012-10-04 11:29:24.864309200 +0200
@@ -161,3 +161,5 @@ PQping158
 PQpingParams  159
 PQlibVersion  160
 PQsetSingleRowMode161
+PQconninfo162
+PQconninfoForReplication  163
diff -durpN postgresql/src/interfaces/libpq/fe-connect.c postgresql.1/src/interfaces/libpq/fe-connect.c
--- postgresql/src/interfaces/libpq/fe-connect.c	2012-09-09 08:11:09.470401480 +0200
+++ postgresql.1/src/interfaces/libpq/fe-connect.c	2012-10-04 12:30:56.705605588 +0200
@@ -131,12 +131,17 @@ static int ldapServiceLookup(const char
  *		""		Normal input field
  *		"*"		Password field - hide value
  *		"D"		Debug option - don't show by default
+ * A special "usable for replication connections" ("R") flag is
+ * added to Disp-Char in a backwards compatible fashion.
  *
  * PQconninfoOptions[] is a constant static array that we use to initialize
  * a dynamically allocated working copy.  All the "val" fields in
  * PQconninfoOptions[] *must* be NULL.	In a working copy, non-null "val"
  * fields point to malloc'd strings that should be freed when the working
  * array is freed (see PQconninfoFree).
+ *
+ * If you add a new connection option to this list, remember to add it to
+ * PQconninfoMappings[] below.
  * --
  */
 static const PQconninfoOption PQconninfoOptions[] = {
@@ -146,62 +151,62 @@ static const PQconninfoOption PQconninfo
 	 * still try to set it.
 	 */
 	{"authtype", "PGAUTHTYPE", DefaultAuthtype, NULL,
-	"Database-Authtype", "D", 20},
+	"Database-Authtype", "D\0", 20},
 
 	{"service", "PGSERVICE", NULL, NULL,
-	"Database-Service", "", 20},
+	"Database-Service", "\0", 20},
 
 	{"user", "PGUSER", NULL, NULL,
-	"Database-User", "", 20},
+	"Database-User", "\0R", 20},
 
 	{"password", "PGPASSWORD", NULL, NULL,
-	"Database-Password", "*", 20},
+	"Database-Password", "*\0R", 20},
 
 	{"connect_timeout", "PGCONNECT_TIMEOUT", NULL, NULL,
-	"Connect-timeout", "", 10}, /* strlen(INT32_MAX) == 10 */
+	"Connect-timeout", "\0R", 10}, /* strlen(INT32_MAX) == 10 */
 
 	{"dbname", "PGDATABASE", NULL, NULL,
-	"Database-Name", "", 20},
+	"Database-Name", "\0", 20},
 
 	{"host", "PGHOST", NULL, NULL,
-	"Database-Host", "", 40},
+	"Database-Host", "\0R", 40},
 
 	{"hostaddr", "PGHOSTADDR", NULL, NULL,
-	"Database-Host-IP-Address", "", 45},
+	"Database-Host-IP-Address", "\0R", 45},
 
 	{"port", "PGPORT", DEF_PGPORT_STR, NULL,
-	"Database-Port", "", 6},
+	"Database-Port", "\0R", 6},
 
 	{"client_encoding", "PGCLIENTENCODING", NULL, NULL,
-	"Client-Encoding", "", 10},
+	"Client-Encoding", "\0", 10},
 
 	/*
 	 * "tty" is no longer used either, but keep it present for backwards
 	 * compatibility.
 	 */
 	{"tty", "PGTTY", DefaultTty, NULL,
-	"Backend-Debug-TTY", "D", 40},
+	"Backend-Debug-TTY", "D\0", 40},
 
 	{"options", "PGOPTIONS", DefaultOption, NULL,
-	"Backend-Debug-Options", "D", 40},
+	"Backend-Debug-Options", "D\0R", 40},
 
 	{"application_name", "PGAPPNAME", NULL, NULL,
-	"Application-Name", "", 64},
+	"Application-Name", "\0", 64},
 
 	{"fallback_application_name", NULL, NULL, NULL,
-	"Fallback-Application-Name", "", 64},
+	"Fallback-Application-Name", "\0", 64},
 
 	{"keepalives", NULL, NULL, NULL,
-	"TCP-Keepalives", "", 1},	/* should be just '0' or '1' */
+	"TCP-Keepalives", "\0R", 1},	/* should be just '0' or 

[HACKERS] Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown

2012-10-04 Thread Amit kapila
On Tuesday, October 02, 2012 1:56 PM Heikki Linnakangas wrote:
On 02.10.2012 10:36, Amit kapila wrote:
> On Monday, October 01, 2012 4:08 PM Heikki Linnakangas wrote:
>>> So let's think how this should ideally work from a user's point of view.
>>> I think there should be just two settings: walsender_timeout and
>>> walreceiver_timeout. walsender_timeout specifies how long a walsender
>>> will keep a connection open if it doesn't hear from the walreceiver, and
>>> walreceiver_timeout is the same for walreceiver. The system should


>>> The Ping/Pong messages don't necessarily need to be new message types,
>>> we can use the message types we currently have, perhaps with an
>>> additional flag attached to them, to request the other side to reply
>>> immediately.
>
>> Can't we make the decision to send reply immediately based on message type, 
>> because these message types will be unique.
>
>> To clarify my understanding,
>> 1. the heartbeat message from walsender side will be keepalive message ('k') 
>> and from walreceiver side it will be Hot Standby feedback message ('h').
>> 2. the reply message from walreceiver side will be current reply message 
>> ('r').

> Yep. I wonder why need separate message types for Hot Standby Feedback
> 'h' and Reply 'r', though. Seems it would be simpler to have just one
> messasge type that includes all the fields from both messages.

moved the contents for Hot Standby Feedback 'h' to Reply 'r' and use 'h' for 
heart-beat purpose.

>> 3. currently there is no reply kind of message from walsender, so do we need 
>> to introduce one new message for it or can use some existing message only?
>>  if new, do we need to send any additional information along with it, 
>> for existing messages can we use keepalive message it self as reply message 
>> but with an additional byte
>>  to indicate it is reply?

> Hmm, I think I'd prefer to use the existing Keepalive message 'k', with an 
> additional flag.
   Okay. I have done it in Patch.

Thank you for suggestions. 
I have addressed your suggestions in patch attached with this mail.

Following changes are done to support replication timeout in sender as well as 
receiver: 

1. One new configuration parameter wal_receiver_timeout is added to detect 
timeout at receiver task. 
2. Existing parameter replication_timeout is renamed to wal_sender_timeout. 
3. Now PrimaryKeepaliveMessage structure is modified to add one more field to 
indicate whether keep-alive is of type 'r' (i.e. 
reply) or 'h' (i.e. heart-beat). 
4. Now the keep-alive message from sender will be sent to standby if it was 
idle for more than or equal to half of wal_sender_timeout. 
In this case it will send keep-alive of type 'h'. 
5. Once the standby receiver a keep-alive, it needs to send an immediate reply 
to primary to indicate connection is alive. 
6. Now Reply message to send wal offset and Feedback message to send oldest 
transaction are merged into single Reply message. 
So now the structure StandbyReplyMessage is changed to add two more fields 
as xmin and epoch. Also StandbyHSFeedbackMessage 
structure is changed to remove xmin and epoch fields (as these are moved to 
StandbyReplyMessage). 
7. Because of changes as in step-6, once receiver task receives some data from 
primary then it will only send Reply Message. 
8. Same Reply message is sent in step-5 and step-7 but incase of step-5, then 
reply is sent immediately but incase of step-7, reply is sent 
 if wal_receiver_status_interval has lapsed (this part is same as earlier). 
9. Similar to sender, if receiver finds itself idle for more than or equal to 
half of configured wal_receiver_timeout, then it will send the 
 hot-standby heartbeat. This heart-beat has been modified to send only 
sendTime. 
10. Once sender task receiver heart-beat message from standby then it sends 
back the reply immediately. In this keep-alive message is 
   sent of type 'r'. 
11. If even after wal_sender_timeout no message received from standby then it 
will be considered as network break at sender task. 
12. If even after wal_receiver_timeout no message received from primary then it 
will be considered as network break at receiver task. 


With Regards,
Amit Kapila.

replication_timeout_patch_v3.patch
Description: replication_timeout_patch_v3.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Promoting a standby during base backup (was Re: [HACKERS] Switching timeline over streaming replication)

2012-10-04 Thread Heikki Linnakangas

On 03.10.2012 18:15, Amit Kapila wrote:

On Tuesday, October 02, 2012 4:21 PM Heikki Linnakangas wrote:

Hmm, should a base backup be aborted when the standby is promoted? Does
the promotion render the backup corrupt?


I think currently it does so. Pls refer
1.
do_pg_stop_backup(char *labelfile, bool waitforarchive)
{
..
if (strcmp(backupfrom, "standby") == 0&&  !backup_started_in_recovery)
 ereport(ERROR,

(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
  errmsg("the standby was promoted during
online backup"),
  errhint("This means that the backup being
taken is corrupt "
  "and should not be used. "
  "Try taking another online
backup.")));
..

}


Okay. I think that check in do_pg_stop_backup() actually already ensures 
that you don't end up with a corrupt backup, even if the standby is 
promoted while a backup is being taken. Admittedly it would be nicer to 
abort it immediately rather than error out at the end.


But I wonder why promoting a standby renders the backup invalid in the 
first place? Fujii, Simon, can you explain that?


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [WIP] Performance Improvement by reducing WAL for Update Operation

2012-10-04 Thread Heikki Linnakangas

On 03.10.2012 19:03, Amit Kapila wrote:

Any comments/suggestions regarding performance/functionality test?


Hmm. Doing a lot of UPDATEs concurrently can be limited by the 
WALInsertLock, which each inserter holds while copying the WAL record to 
the buffer. Reducing the size of the WAL records, by compression or 
delta encoding, alleviates that bottleneck: when WAL records are 
smaller, the lock needs to be held for a shorter duration. That improves 
throughput, even if individual backends need to do more CPU work to 
compress the records, because that work can be done in parallel. I 
suspect much of the benefit you're seeing in these tests might be 
because of that effect.


As it happens, I've been working on making WAL insertion scale better in 
general: 
http://archives.postgresql.org/message-id/5064779a.3050...@vmware.com. 
That should also help most when inserting large WAL records. The 
question is: assuming we commit the xloginsert-scale patch, how much 
benefit is there left from the compression? It will surely still help to 
reduce the size of WAL, which can certainly help if you're limited by 
the WAL I/O, but I suspect the results from the pgbench tests you run 
might look quite different.


So, could you rerun these tests with the xloginsert-scale patch applied? 
Reducing the WAL size might still be a good idea even if the patch 
doesn't have much effect on TPS, but I'd like to make sure that the 
compression doesn't hurt performance. Also, it would be a good idea to 
repeat the tests with just a single client; we don't want to hurt the 
performance in that scenario either.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers