Re: [HACKERS] proposal: PL/Pythonu - function ereport

2015-10-22 Thread Catalin Iacob
On Sat, Oct 17, 2015 at 8:18 PM, Pavel Stehule  wrote:
> here is new patch
>
> cleaned, all unwanted artefacts removed. I am not sure if the way used
> for method registration is 100% valid, but I didn't find any related
> documentation.

Pavel, some notes about the patch, not a full review (yet?).

In PLy_add_exceptions PyDict_New is not checked for failure.

In PLy_spi_error__init__, kw will be NULL if SPIError is called with
no arguments. With the current code, NULL will get passed to
PyDict_Size, which will generate something like "SystemError: bad
internal function call". This also means a test using just SPIError()
is needed.
By the way, it seems args should never be NULL: if there are no
arguments it's an empty tuple, so the other side of the check is ok.
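
To be explicit, the guard I have in mind is roughly the following (this
is only my guess at the shape of the function, not your actual code):

    static PyObject *
    PLy_spi_error__init__(PyObject *self, PyObject *args, PyObject *kw)
    {
        /* SPIError() called with no keyword arguments passes kw == NULL */
        if (kw != NULL && PyDict_Size(kw) > 0)
        {
            /* ... pick up keyword arguments such as detail or hint ... */
        }
        Py_RETURN_NONE;
    }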

The current code doesn't build on Python3 because the 3rd argument of
PyMethod_New, the troubled one you need to set to NULL, has been removed.
This has to do with the distinction between bound and unbound methods,
which is gone in Python3.

There is http://bugs.python.org/issue1587 which discusses how to
replace the "third argument" functionality for PyMethod_New in
Python3. One of the messages there links to
http://bugs.python.org/issue1505 and
https://hg.python.org/cpython/rev/429cadbc5b10/ which has an example
very similar to what you're trying to do, rewritten to work in
Python3. But this is still confusing: note that the replaced code
*didn't really use PyMethod_New at all* as the removed line meth =
PyMethod_New(func, NULL, ComError); was commented out and the replaced
code used to simply assign the function to the class dictionary
without even creating a method.
Still, the above link shows a (more verbose but maybe better)
alternative: don't use PyErr_NewException and instead define an
SPIError type with each slot spelled out explicitly. This will remove
the "is it safe to set the third argument to NULL" question.

I tried to answer the "is it safe to use NULL for the third argument
of PyMethod_New in Python2?" question and don't have a definite answer
yet. If you look at the CPython code it's used to set im_class
(Objects/classobject.c) of PyMethodObject which is accessible from
Python.
But this code:
init = plpy.SPIError.__init__
plpy.notice("repr {} str {} im_class {}".format(repr(init), str(init),
init.im_class))
produces:
NOTICE:  repr  str  im_class 
so the SPIError class name is set in im_class from somewhere. But this
is all moot if you use the explicit SPIError type definition.

>> Any help is welcome

I can work with you on this. I don't have a lot of experience with the
C API but not zero either.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-10-22 Thread Robert Haas
On Fri, Oct 23, 2015 at 12:31 AM, Tom Lane  wrote:
> BTW, my Salesforce colleagues have been bit^H^H^Hgriping for quite some
> time about the performance costs associated with translating between
> plpgsql's internal PLpgSQL_datum-array format and the ParamListInfo
> representation.  Maybe it's time to think about some wholesale redesign of
> ParamListInfo?  Because TBH this patch doesn't seem like much but a kluge.
> It's mostly layering still-another bunch of ad-hoc restrictions on
> copyParamList, without removing any one of the kluges we had already.

I have no objection to some kind of a redesign there, but (1) I don't
think we're going to be better off doing that before getting Partial
Seq Scan committed and (2) I don't think I'm the best-qualified person
to do the work.  With respect to the first point, despite my best
efforts, this feature is going to have bugs, and getting it committed
in November without a ParamListInfo redesign is surely going to be
better for the overall stability of PostgreSQL and the timeliness of
our release schedule than getting it committed in February with such a
redesign -- never mind that this is far from the only redesign into
which I could get sucked.  I want to put in place some narrow fix for
this issue so that I can move forward.  Three alternatives have been
proposed so far: (1) this, (2) the fix I coded and posted previously,
which made plpgsql_param_fetch's bms_is_member test unconditional, and
(3) not allowing PL/pgsql to run parallel queries.  (3) sounds worse
to me than either (1) or (2); I defer to others on which of (1) and
(2) is preferable, or perhaps you have another proposal.

On the second point, I really don't know enough about the problems
with ParamListInfo to know what would be better, so I can't really
help there.  If you do and want to redesign it, fine, but I really
need whatever you replace it with to have an easy way of serializing and
restoring it - be it nodeToString() and stringToNode(),
SerializeParamList and RestoreParamList, or whatever.  Without that,
parallel query is going to have to be disabled for any query involving
parameters, and that would be, uh, extremely sad.  Also, FWIW, in my
opinion, it would be far more useful to PostgreSQL for you to finish
the work on upper planner path-ification ... an awful lot of people
are waiting for that to be completed to start their own work, or are
doing work that may have to be completely redone when that lands.
YMMV, of course.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-10-22 Thread Robert Haas
On Tue, Oct 20, 2015 at 3:04 AM, Amit Kapila  wrote:
> I have rebased the partial seq scan patch based on the above committed
> patches.  Now for rescanning it reuses the dsm, and to achieve that we
> need to ensure that workers have been completely shut down before
> reinitializing the dsm.  To ensure complete shutdown of workers, the
> current function WaitForParallelWorkersToFinish is not sufficient, as
> that just waits for the last message to be received from the worker
> backend, so I have written a new function WaitForParallelWorkersToDie.
> Also, on receiving the 'X' message in HandleParallelMessage, it just
> frees the worker handle without ensuring that the worker has died,
> which later makes it difficult to even find out whether the worker has
> died, so I have removed that code from HandleParallelMessage.  Another
> change is that after receiving the last tuple in the Gather node, it
> just shuts down the workers without destroying the dsm.

+   /*
+    * We can't finish transaction commit or abort until all of the
+    * workers are dead.  This means, in particular, that we can't respond
+    * to interrupts at this stage.
+    */
+   HOLD_INTERRUPTS();
+   status = WaitForBackgroundWorkerShutdown(pcxt->worker[i].bgwhandle);
+   RESUME_INTERRUPTS();

These comments are correct when this code is called from
DestroyParallelContext(), but they're flat wrong when called from
ReinitializeParallelDSM().  I suggest moving the comment back to
DestroyParallelContext and following it with this:

HOLD_INTERRUPTS();
WaitForParallelWorkersToDie();
RESUME_INTERRUPTS();

Then ditch the HOLD/RESUME interrupts in WaitForParallelWorkersToDie() itself.

This hunk is a problem:

case 'X':   /* Terminate, indicating clean exit */
{
-   pfree(pcxt->worker[i].bgwhandle);
pfree(pcxt->worker[i].error_mqh);
-   pcxt->worker[i].bgwhandle = NULL;
pcxt->worker[i].error_mqh = NULL;
break;
}

If you do that on receipt of the 'X' message, then
DestroyParallelContext() might SIGTERM a worker that has supposedly
exited cleanly.  That seems bad.  I think maybe the solution is to
make DestroyParallelContext() terminate the worker only if
pcxt->worker[i].error_mqh != NULL.  So make error_mqh == NULL mean a
clean loss of a worker: either we couldn't register it, or it exited
cleanly.  And bgwhandle == NULL would mean it's actually gone.
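
Concretely, the check I have in mind in DestroyParallelContext() would
look something like this (sketch only, not tested):

    for (i = 0; i < pcxt->nworkers; ++i)
    {
        /* only terminate workers that haven't already reported a clean
         * exit (error_mqh == NULL) or gone away entirely (bgwhandle == NULL) */
        if (pcxt->worker[i].error_mqh != NULL &&
            pcxt->worker[i].bgwhandle != NULL)
            TerminateBackgroundWorker(pcxt->worker[i].bgwhandle);
    }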

It makes sense to have ExecShutdownGather and
ExecShutdownGatherWorkers, but couldn't the former call the latter
instead of duplicating the code?

I think ReInitialize should be capitalized as Reinitialize throughout.

ExecParallelReInitializeTupleQueues is almost a cut-and-paste
duplicate of ExecParallelSetupTupleQueues.  Please refactor this to
avoid duplication - e.g. change
ExecParallelSetupTupleQueues(ParallelContext *pcxt) to take a second
argument bool reinit. ExecParallelReInitializeTupleQueues can just do
ExecParallelSetupTupleQueues(pcxt, true).
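
Something along these lines (an illustrative sketch; the signatures are
guessed from the existing code):

    /* ExecParallelSetupTupleQueues() grows a reinit parameter ... */
    static shm_mq_handle **
    ExecParallelSetupTupleQueues(ParallelContext *pcxt, bool reinit);

    /* ... and the reinitialize variant becomes a thin wrapper */
    static shm_mq_handle **
    ExecParallelReinitializeTupleQueues(ParallelContext *pcxt)
    {
        return ExecParallelSetupTupleQueues(pcxt, true);
    }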

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-10-22 Thread Tom Lane
Robert Haas  writes:
> On Thu, Oct 15, 2015 at 8:23 PM, Noah Misch  wrote:
>>> Would it work to define this as "if non-NULL,
>>> params lacking a 1-bit may be safely ignored"?  Or some other tweak
>>> that basically says that you don't need to care about this, but you
>>> can if you want to.

>> ... this is a better specification.

> Here's an attempt to implement that.

BTW, my Salesforce colleagues have been bit^H^H^Hgriping for quite some
time about the performance costs associated with translating between
plpgsql's internal PLpgSQL_datum-array format and the ParamListInfo
representation.  Maybe it's time to think about some wholesale redesign of
ParamListInfo?  Because TBH this patch doesn't seem like much but a kluge.
It's mostly layering still-another bunch of ad-hoc restrictions on
copyParamList, without removing any one of the kluges we had already.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-10-22 Thread Robert Haas
On Thu, Oct 15, 2015 at 8:23 PM, Noah Misch  wrote:
> Agreed.  More specifically, I had in mind for copyParamList() to check the
> mask while e.g. ExecEvalParamExtern() would either check nothing or merely
> assert that any mask included the requested parameter.  It would be tricky to
> verify that as safe, so ...
>
>> Would it work to define this as "if non-NULL,
>> params lacking a 1-bit may be safely ignored"?  Or some other tweak
>> that basically says that you don't need to care about this, but you
>> can if you want to.
>
> ... this is a better specification.

Here's an attempt to implement that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
commit 1ffc4a3a1bac686b46d47dfa40bd0eb3cb8b0be4
Author: Robert Haas 
Date:   Thu Oct 22 23:56:51 2015 -0400

Fix problems with ParamListInfo serialization mechanism.

Commit d1b7c1ffe72e86932b5395f29e006c3f503bc53d introduced a mechanism
for serializing a ParamListInfo structure to be passed to a parallel
worker.  However, this mechanism failed to handle external expanded
values, as pointed out by Noah Misch.  Moreover, plpgsql_param_fetch
requires adjustment because the serialization mechanism needs it to skip
evaluating unused parameters just as we would do when it is called from
copyParamList, but params == estate->paramLI in that case.  To fix,
have setup_param_list set a new ParamListInfo field, paramMask, to
the parameters actually used in the expression, so that we don't try
to fetch those that are not needed.

diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index fb33d30..0d4aa69 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -392,6 +392,7 @@ EvaluateParams(PreparedStatement *pstmt, List *params,
 	paramLI->parserSetup = NULL;
 	paramLI->parserSetupArg = NULL;
 	paramLI->numParams = num_params;
+	paramLI->paramMask = NULL;
 
 	i = 0;
 	foreach(l, exprstates)
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 812a610..0919c04 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -910,6 +910,7 @@ postquel_sub_params(SQLFunctionCachePtr fcache,
 			paramLI->parserSetup = NULL;
 			paramLI->parserSetupArg = NULL;
 			paramLI->numParams = nargs;
+			paramLI->paramMask = NULL;
 			fcache->paramLI = paramLI;
 		}
 		else
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 300401e..13ddb8f 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -2330,6 +2330,7 @@ _SPI_convert_params(int nargs, Oid *argtypes,
 		paramLI->parserSetup = NULL;
 		paramLI->parserSetupArg = NULL;
 		paramLI->numParams = nargs;
+		paramLI->paramMask = NULL;
 
 		for (i = 0; i < nargs; i++)
 		{
diff --git a/src/backend/nodes/params.c b/src/backend/nodes/params.c
index d093263..9f5fd3a 100644
--- a/src/backend/nodes/params.c
+++ b/src/backend/nodes/params.c
@@ -15,6 +15,7 @@
 
 #include "postgres.h"
 
+#include "nodes/bitmapset.h"
 #include "nodes/params.h"
 #include "storage/shmem.h"
 #include "utils/datum.h"
@@ -50,6 +51,7 @@ copyParamList(ParamListInfo from)
 	retval->parserSetup = NULL;
 	retval->parserSetupArg = NULL;
 	retval->numParams = from->numParams;
+	retval->paramMask = bms_copy(from->paramMask);
 
 	for (i = 0; i < from->numParams; i++)
 	{
@@ -58,6 +60,20 @@ copyParamList(ParamListInfo from)
 		int16		typLen;
 		bool		typByVal;
 
+		/*
+		 * Ignore parameters we don't need, to save cycles and space, and
+		 * in case the fetch hook might fail.
+		 */
+		if (retval->paramMask != NULL &&
+			!bms_is_member(i, retval->paramMask))
+		{
+			nprm->value = (Datum) 0;
+			nprm->isnull = true;
+			nprm->pflags = 0;
+			nprm->ptype = InvalidOid;
+			continue;
+		}
+
 		/* give hook a chance in case parameter is dynamic */
 		if (!OidIsValid(oprm->ptype) && from->paramFetch != NULL)
 			(*from->paramFetch) (from, i + 1);
@@ -90,19 +106,31 @@ EstimateParamListSpace(ParamListInfo paramLI)
 	for (i = 0; i < paramLI->numParams; i++)
 	{
		ParamExternData *prm = &paramLI->params[i];
+		Oid			typeOid;
 		int16		typLen;
 		bool		typByVal;
 
-		/* give hook a chance in case parameter is dynamic */
-		if (!OidIsValid(prm->ptype) && paramLI->paramFetch != NULL)
-			(*paramLI->paramFetch) (paramLI, i + 1);
+		/*
+		 * Ignore parameters we don't need, to save cycles and space, and
+		 * in case the fetch hook might fail.
+		 */
+		if (paramLI->paramMask != NULL &&
+			!bms_is_member(i, paramLI->paramMask))
+			typeOid = InvalidOid;
+		else
+		{
+			/* give hook a chance in case parameter is dynamic */
+			if (!OidIsValid(prm->ptype) && paramLI->paramFetch != NULL)
+				(*paramLI->paramFetch) (paramLI, i + 1);
+			typeOid = prm->ptype;
+		}
 
 		sz = add_size(sz, sizeof(Oid));			/* space for type OID */
 		sz = add_size(sz, sizeof(uint16));		/* space for pflags */
 
 		/* space for datum/isnull */
-		if (OidIsValid(prm->ptype))
-	

Re: [HACKERS] Making tab-complete.c easier to maintain

2015-10-22 Thread Stephen Frost
* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
> Stephen Frost wrote:
> 
> > The issue for Debian, at least, isn't really about libreadline or
> > libedit, it's about the GPL and the OpenSSL license.  From the Debian
> > perspective, if we were able to build with GNUTLS or another SSL library
> > other than OpenSSL (which I know we've made some progress on, thanks
> > Heikki...) then Debian wouldn't have any problem shipping PG linked with
> > libreadline.
> 
> What if OpenSSL were to switch to APL2?
> http://www.postgresql.org/message-id/20150801151410.ga28...@awork2.anarazel.de

According to that post, it would depend on what libreadline is licensed
under.  There may be other complications with GPL3 and other libraries
we link against, though I can't think of any offhand and perhaps there
aren't any.

Thanks!

Stephen




Re: [HACKERS] Making tab-complete.c easier to maintain

2015-10-22 Thread Alvaro Herrera
Stephen Frost wrote:

> The issue for Debian, at least, isn't really about libreadline or
> libedit, it's about the GPL and the OpenSSL license.  From the Debian
> perspective, if we were able to build with GNUTLS or another SSL library
> other than OpenSSL (which I know we've made some progress on, thanks
> Heikki...) then Debian wouldn't have any problem shipping PG linked with
> libreadline.

What if OpenSSL were to switch to APL2?
http://www.postgresql.org/message-id/20150801151410.ga28...@awork2.anarazel.de


-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [BUGS] BUG #13611: test_postmaster_connection failed (Windows, listen_addresses = '0.0.0.0' or '::')

2015-10-22 Thread Tom Lane
Noah Misch  writes:
> On Thu, Oct 22, 2015 at 04:15:10PM -0700, Tom Lane wrote:
>> I continue to think
>> that the OP's complaint is somehow founded on a bad address obtained by
>> looking up "localhost", because where else would it've come from?

> pg_ctl reads the address from postmaster.pid, which in turn derives from
> listen_addresses:

> $ grep -E '(unix|listen)' postgresql.conf
> listen_addresses = '0.0.0.0'
> unix_socket_directories = ''

Hmm, now I see.  I was about to object that that's a pretty silly setting,
but I see that we actually document it as supported.

> $ strace -e connect pg_ctl -D . -w start
> --- SIGCHLD (Child exited) @ 0 (0) ---
> waiting for server to start...connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
> connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
> connect(3, {sa_family=AF_INET, sin_port=htons(6432), sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EINPROGRESS (Operation now in progress)
> 403978 2015-10-23 00:45:06.677 GMT LOG:  redirecting log output to logging collector process
> 403978 2015-10-23 00:45:06.677 GMT HINT:  Future log output will appear in directory "..".
>  done
> server started
> Process 403975 detached

... although this trace appears to show pg_ctl working just fine with this
setting, which kinda weakens your theory.  Still, it wouldn't be the first
thing we've seen fail on Windows but work elsewhere.

I'd be inclined to suggest fixing it like this:

/* If postmaster is listening on "*", use localhost */
-   if (strcmp(host_str, "*") == 0)
+   if (strcmp(host_str, "*") == 0 ||
+   strcmp(host_str, "0.0.0.0") == 0 ||
+   strcmp(host_str, "::") == 0)
strcpy(host_str, "localhost");

which covers the cases we document as supported.

A different line of thought would be to teach the postmaster to record
actually bound-to addresses in postmaster.pid, rather than regurgitating
the listen_addresses setting verbatim.  That would likely be a lot harder
(and less portable); though if we think there's anything besides pg_ctl
looking at this field, it might be worth trying.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] shm_mq fix for non-blocking mode

2015-10-22 Thread Robert Haas
On Thu, Oct 22, 2015 at 4:45 PM, Robert Haas  wrote:
> On Fri, Oct 16, 2015 at 5:08 PM, Robert Haas  wrote:
>> The shm_mq code handles blocking mode and non-blocking mode
>> asymmetrically in a couple of places, with the unfortunate result that
>> if you are using non-blocking mode, and your counterparty dies before
>> attaching the queue, operations on the queue continue to return
>> SHM_MQ_WOULD_BLOCK instead of, as they should, returning
>> SHM_MQ_DETACHED.  The attached patch fixes the problem.  Thanks to my
>> colleague Rushabh Lathia for helping track this down.
>>
>> (There are some further bugs in this area outside the shm_mq code
>> ... but I'm still trying to figure out exactly what they are and what
>> we should do about them.  This much, however, seems clear-cut.)
>
> ...and so I've committed it and back-patched to 9.4.

Sigh.  This was buggy; I have no idea how it survived my earlier testing.

I will go fix it.  Sorry.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Making tab-complete.c easier to maintain

2015-10-22 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> David Fetter  writes:
> > On Thu, Oct 22, 2015 at 04:05:06PM -0700, Tom Lane wrote:
> >> Given the license issues around GNU readline, requiring it seems like
> >> probably a non-starter.
> 
> > I should have made this more clear.  I am not an IP attorney and don't
> > play one on TV, but this is what I've gotten out of conversations with
> > IP attorneys on this subject:
> 
> I'm not an IP attorney either, and the ones I've talked to seem to agree
> with you.  (Red Hat's corporate attorneys, at least, certainly thought
> that when I last asked them.)  But the observed facts are that some
> distros refuse to ship psql-linked-to-readline, and substitute libedit
> instead, because they got different advice from their attorneys.

The issue for Debian, at least, isn't really about libreadline or
libedit, it's about the GPL and the OpenSSL license.  From the Debian
perspective, if we were able to build with GNUTLS or another SSL library
other than OpenSSL (which I know we've made some progress on, thanks
Heikki...) then Debian wouldn't have any problem shipping PG linked with
libreadline.

> If we desupport libedit, the end result is going to be these distros
> shipping psql with no command-line-editing support at all.  Our opinion,
> or even our lawyers' opinion, is unlikely to prevent that outcome.
> 
> While libedit certainly has got its bugs and deficiencies, it's way better
> than nothing at all, so I think we'd better keep supporting it.

I agree that it's probably not a good idea to desupport libedit.
However, I don't believe supporting libedit necessarily prevents us
from providing better support when libreadline is available.
Debian-based distributions won't be able to reap the benefits of that
better support until we remove the OpenSSL dependency, which is
unfortunate, but other distributions would be able to.

Also, if the PG project is agreeable to it, there's no reason why we
couldn't provide full libreadline-enabled builds off of
apt.postgresql.org, regardless of what Debian wants.  That would add a
bit of extra burden on our package maintainers, as I don't believe there
is currently any difference between packages built for Debian and
packages built for apt.p.o, but it seems like it'd be a pretty
minor change that would be well worth it if we get better libreadline
integration.

The one bit of all of this that worries me is that we would take
libreadline for granted and the libedit support would end up broken.
That seems a relatively minor concern though, as it sounds like there is
a fair bit of exercise of the libedit path due to the commercial forks
of PG which make use of it to avoid the GPL.

Thanks!

Stephen




Re: [HACKERS] Avoid full page images in streaming replication?

2015-10-22 Thread Alvaro Herrera
Andres Freund wrote:
> On 2015-10-22 20:27:20 -0300, Alvaro Herrera wrote:
> > You can run the new one in old pg_xlog ...
> 
> You can? The xlog format between 9.4 and 9.5 changed, so I can't see how
> that'd work?

Oh, crap.  Must have been some other cross-version trial run I did,
then.  I would hope it's at least not terribly difficult to back-patch
that commit locally, anyway.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Making tab-complete.c easier to maintain

2015-10-22 Thread Robert Haas
On Thu, Oct 22, 2015 at 8:09 PM, Andres Freund  wrote:
> On 2015-10-22 16:26:10 -0700, David Fetter wrote:
>> To be affective negatively by libreadline's viral license, an entity
>> would need to fork the psql client in proprietary ways that they did
>> not wish not to make available to end users, at the same time linking
>> in libreadline.
>
>> Maybe I'm missing something big, but I really don't see people out
>> there shipping a libreadline-enabled psql client, details of whose
>> source they'd want to keep a deep, dark secret.
>
> Isn't that just about every proprietary fork of postgres? Most have
> added backend features and I guess many of those have in turn added
> support to psql for those features.  Sure it'd probably in reality be
> relatively harmless for them to release these psql modifications, but I
> rather doubt their management will generally see it that way.

Yeah, exactly.  EnterpriseDB would have to keep libedit working even if the
PostgreSQL community dropped support, so I hope we don't decide to do
that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY FREEZE and PD_ALL_VISIBLE

2015-10-22 Thread Simon Riggs
On 18 October 2015 at 17:23, Jeff Janes  wrote:

> I'm planning on adding a todo item to have COPY FREEZE set
> PD_ALL_VISIBLE.  Or is there some reason this can't be done?
>
> Since the whole point of COPY FREEZE is to avoid needing to rewrite the
> entire table, it seems rather perverse that the first time the table is
> vacuumed, it needs to rewrite the entire table.
>

That's pretty darn weird. I remember measuring the benefit during testing.
I remember Jeff D was talking about removing that flag at that time, so
perhaps that's it.

Either way, my bug, my bad, thanks for the report.

-- 
Simon Riggs                   http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] COPY FREEZE and PD_ALL_VISIBLE

2015-10-22 Thread Simon Riggs
On 21 October 2015 at 13:31, Jeff Janes  wrote:

> On Tue, Oct 20, 2015 at 7:02 AM, Robert Haas 
> wrote:
>
>> On Sun, Oct 18, 2015 at 5:23 PM, Jeff Janes  wrote:
>> > I'm planning on adding a todo item to have COPY FREEZE set
>> PD_ALL_VISIBLE.
>> > Or is there some reason this can't be done?
>> >
>> > Since the whole point of COPY FREEZE is to avoid needing to rewrite the
>> > entire table, it seems rather perverse that the first time the table is
>> > vacuumed, it needs to rewrite the entire table.
>>
>> *facepalm*
>>
>> I don't know how hard that is to implement, but +1 for trying to
>> figure out a way.
>>
>
>
> It turns out it was pretty easy to set PD_ALL_VISIBLE on the new pages,
> since the code in hio that requests the relation to be extended already has
> info on the tuple's intended freeze status.
>
> Then you just need to refrain from clearing PD_ALL_VISIBLE when that tuple
> is actually written into the page.  Not only because clearing would defeat
> the purpose, but also because it will cause an error--apparently the
> incipient page is not yet in a state where visibilitymap_clear is willing
> to deal with it.
>
> With this patch, you get a table which has PD_ALL_VISIBLE set for all
> pages, but which doesn't have a _vm file.
>

Patch is simple enough. All usage looks safe, so I reckon this is good.
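
For anyone skimming, the change Jeff describes amounts to something like
this in the hio.c page-extension path (a schematic sketch, not the actual
patch; "options" and "buffer" stand for the surrounding code's variables):

    /* If the tuple being inserted is already frozen (COPY FREEZE), mark
     * the newly extended page all-visible up front instead of leaving
     * that to the first VACUUM. */
    if (options & HEAP_INSERT_FROZEN)
        PageSetAllVisible(BufferGetPage(buffer));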


> Index-only scans will visit the heap for each tuple until the first VACUUM
> is done.
>
> The first vacuum will read the entire table, but not need to write it
> anymore.  And will create the _vm file.
>
> I think we really want to create _vm file as well as set PD_ALL_VISIBLE,
> but I don't know the best way to do that.  Set a flag somewhere and then
> create it in bulk at the end of the transaction?  Set it bit by bit as the
> pages are extended and initialized?
>

Easy enough to do it at the end of the COPY FREEZE in one step. No need to
wait until EOXact.

-- 
Simon Riggs                   http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] Re: [BUGS] BUG #13611: test_postmaster_connection failed (Windows, listen_addresses = '0.0.0.0' or '::')

2015-10-22 Thread Noah Misch
On Thu, Oct 22, 2015 at 04:15:10PM -0700, Tom Lane wrote:
> Tatsuo Ishii  writes:
> >> The original post used only "0.0.0.0" and "::", not "localhost" or anything
> >> else entailing name resolution.  As I wrote above, Kondo proposed for 
> >> pg_ctl
> >> to use PQping("host='127.0.0.1'") in place of PQping("host='0.0.0.0'").
> >> That's all.  pg_ctl would continue to use PQping("host='localhost'") where
> >> it's doing so today.
> > 
> > Has anybody already written a patch in this direction, or is anybody
> > willing to do it? If not, I (or Kondo) would like to write the patch.

I have not; please do.

> AFAICS, the only hard-wired hostname reference in pg_ctl is "localhost",
> not "127.0.0.1" (much less "0.0.0.0").  So what you're proposing doesn't
> seem to me to have anything to do with what's there.  I continue to think
> that the OP's complaint is somehow founded on a bad address obtained by
> looking up "localhost", because where else would it've come from?

pg_ctl reads the address from postmaster.pid, which in turn derives from
listen_addresses:

$ grep -E '(unix|listen)' postgresql.conf
listen_addresses = '0.0.0.0'
unix_socket_directories = ''
$ strace -e connect pg_ctl -D . -w start
--- SIGCHLD (Child exited) @ 0 (0) ---
waiting for server to start...connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_INET, sin_port=htons(6432), sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EINPROGRESS (Operation now in progress)
403978 2015-10-23 00:45:06.677 GMT LOG:  redirecting log output to logging collector process
403978 2015-10-23 00:45:06.677 GMT HINT:  Future log output will appear in directory "..".
 done
server started
Process 403975 detached


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Making tab-complete.c easier to maintain

2015-10-22 Thread Andres Freund
On 2015-10-22 16:26:10 -0700, David Fetter wrote:
> To be affected negatively by libreadline's viral license, an entity
> would need to fork the psql client in proprietary ways that they did
> not wish to make available to end users, at the same time linking
> in libreadline.

> Maybe I'm missing something big, but I really don't see people out
> there shipping a libreadline-enabled psql client, details of whose
> source they'd want to keep a deep, dark secret.

Isn't that just about every proprietary fork of postgres? Most have
added backend features and I guess many of those have in turn added
support to psql for those features.  Sure it'd probably in reality be
relatively harmless for them to release these psql modifications, but I
rather doubt their management will generally see it that way.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoid full page images in streaming replication?

2015-10-22 Thread Michael Paquier
On Fri, Oct 23, 2015 at 8:42 AM, Andres Freund  wrote:
> On 2015-10-22 20:27:20 -0300, Alvaro Herrera wrote:
>> You can run the new one in old pg_xlog ...
>
> You can? The xlog format between 9.4 and 9.5 changed, so I can't see how
> that'd work?

That's not going to work.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoid full page images in streaming replication?

2015-10-22 Thread Michael Paquier
On Fri, Oct 23, 2015 at 7:37 AM, Jim Nasby wrote:
> What I'm wondering is how compressible a 'normal' FPI is. Certainly if the
> hole is zero'd out and the page is mostly empty you'll get great
> compression. What about other workloads? For reference, if a 'FPI
> placeholder' WAL record is 16 bytes, that's 51,200% compression. If it's 12
> bytes, it's 68,200% compression. (I'm assuming we write the hole too, but
> maybe that's not true?)

Well, to begin with, FPIs usually avoid including the page hole in the
middle. Now, regarding the compressibility of a page taken without its
hole, that's highly schema-dependent. Based on some measurements I did
some time ago, a page with repetitive data could compress up to 40%;
with less compressible stuff like UUIDs I recall it being 20~25%. For
the FPW compression patch I hacked up a module able to work directly
on raw pages to test their compressibility:
https://github.com/michaelpq/pg_plugins/tree/master/compress_test
get_raw_page() has been taken from pageinspect, and I added to it an
option to remove the hole in the middle of the page. Using that you
can estimate how much pages can get compressed.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Making tab-complete.c easier to maintain

2015-10-22 Thread Tom Lane
David Fetter  writes:
> On Thu, Oct 22, 2015 at 04:05:06PM -0700, Tom Lane wrote:
>> Given the license issues around GNU readline, requiring it seems like
>> probably a non-starter.

> I should have made this more clear.  I am not an IP attorney and don't
> play one on TV, but this is what I've gotten out of conversations with
> IP attorneys on this subject:

I'm not an IP attorney either, and the ones I've talked to seem to agree
with you.  (Red Hat's corporate attorneys, at least, certainly thought
that when I last asked them.)  But the observed facts are that some
distros refuse to ship psql-linked-to-readline, and substitute libedit
instead, because they got different advice from their attorneys.

If we desupport libedit, the end result is going to be these distros
shipping psql with no command-line-editing support at all.  Our opinion,
or even our lawyers' opinion, is unlikely to prevent that outcome.

While libedit certainly has got its bugs and deficiencies, it's way better
than nothing at all, so I think we'd better keep supporting it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [BUGS] BUG #13611: test_postmaster_connection failed (Windows, listen_addresses = '0.0.0.0' or '::')

2015-10-22 Thread Tom Lane
Andres Freund  writes:
> Perhaps we should start to emit a notice at startup if localhost doesn't
> resolve to either v4 or v6 definitions. The few environments where
> that's indeed intentionally not the case, should be fine with such a
> message at pgstat startup.

I think there's already a complaint of that sort coming out of the
pgstat machinery, because it will fail to establish the statistics
reporting socket.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-10-22 Thread Robert Haas
On Tue, Oct 13, 2015 at 5:59 PM, Robert Haas  wrote:
> - Although the changes in parallelpaths.c are in a good direction, I'm
> pretty sure this is not yet up to scratch.  I am less sure exactly
> what needs to be fixed, so I'll have to give some more thought to
> that.

Please find attached a proposed set of changes that I think are
better.  These changes compute a consider_parallel flag for each
RelOptInfo, which is true if it's a non-temporary relation whose
baserestrictinfo references no PARAM_EXEC parameters, sublinks, or
parallel-restricted functions.  Actually, I made an effort to set the
flag correctly even for baserels other than plain tables, and for
joinrels, though we don't technically need that stuff until we get to
the point of pushing joins beneath Gather nodes.  When we get there,
it will be important - any joinrel for which consider_parallel = false
needn't even try to generate parallel paths, while if
consider_parallel = true then we can consider it, if the costing makes
sense.

The advantage of this is that the logic is centralized.  If we have
parallel seq scan and also, say, parallel bitmap heap scan, your
approach would require that we duplicate the logic to check for
parallel-restricted functions for each path generation function.  By
caching it in the RelOptInfo, we don't have to do that.  The function
you wrote to generate parallel paths can just check the flag; if it's
false, return without generating any paths.  If it's true, then
parallel paths can be considered.
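
To illustrate the pattern (a sketch only; the function name here is just
for illustration, not the actual patch):

    static void
    create_parallel_paths(PlannerInfo *root, RelOptInfo *rel)
    {
        /* bail out early if the planner already decided this rel can't
         * be scanned inside a parallel worker */
        if (!rel->consider_parallel)
            return;

        /* ... otherwise build a Partial Seq Scan plus Gather path ... */
    }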

Ultimately, I think that each RelOptInfo should have a new List *
member containing a list of partial paths for that relation.  For a
baserel, we generate a partial path (e.g. Partial Seq Scan).  Then, we
can consider turning each partial path into a complete path by pushing
a Gather path on top of it.  For a joinrel, we can consider generating
a partial hash join or partial nest loop path by taking an outer
partial path and an ordinary inner path and putting the appropriate
path on top.  In theory it would also be correct to generate merge
join paths this way, but it's difficult to believe that such a plan
would ever be anything but a disaster.  These can then be used to
generate a complete path by putting a Gather node on top of them, or
they can bubble up to the next level of the join tree in the same way.
However, I think for the first version of this we can keep it simple:
if the consider_parallel flag is set on a relation, consider Gather ->
Partial Seq Scan.  If not, forget it.

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
commit 79eb838ba87b19788617aac611712ed734e0102c
Author: Robert Haas 
Date:   Fri Oct 2 23:57:46 2015 -0400

Strengthen planner infrastructure for parallelism.

Add a new flag, consider_parallel, to each RelOptInfo, indicating
whether a plan for that relation could conceivably be run inside of
a parallel worker.  Right now, we're pretty conservative: for example,
it might be possible to defer applying a parallel-restricted qual
in a worker, and later do it in the leader, but right now we just
don't try to parallelize access to that relation.  That's probably
the right decision in most cases, anyway.

diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 3e75cd1..0030a9b 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1868,6 +1868,7 @@ _outRelOptInfo(StringInfo str, const RelOptInfo *node)
 	WRITE_INT_FIELD(width);
 	WRITE_BOOL_FIELD(consider_startup);
 	WRITE_BOOL_FIELD(consider_param_startup);
+	WRITE_BOOL_FIELD(consider_parallel);
 	WRITE_NODE_FIELD(reltargetlist);
 	WRITE_NODE_FIELD(pathlist);
 	WRITE_NODE_FIELD(ppilist);
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 8fc1cfd..f582b86 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -21,6 +21,7 @@
 #include "access/tsmapi.h"
 #include "catalog/pg_class.h"
 #include "catalog/pg_operator.h"
+#include "catalog/pg_proc.h"
 #include "foreign/fdwapi.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
@@ -71,6 +72,9 @@ static void set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
  Index rti, RangeTblEntry *rte);
 static void set_plain_rel_size(PlannerInfo *root, RelOptInfo *rel,
    RangeTblEntry *rte);
+static void set_rel_consider_parallel(PlannerInfo *root, RelOptInfo *rel,
+		  RangeTblEntry *rte);
+static bool function_rte_parallel_ok(RangeTblEntry *rte);
 static void set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	   RangeTblEntry *rte);
 static void set_tablesample_rel_size(PlannerInfo *root, RelOptInfo *rel,
@@ -158,7 +162,8 @@ make_one_rel(PlannerInfo *root, List *joinlist)
 	set_base_rel_consider_startup(root);
 
 	/*
-	 * Generate access paths for the base rels.
+	 * Generate access paths for the base rels.  set_base_rel_sizes also
+	 * sets the co

Re: [HACKERS] Avoid full page images in streaming replication?

2015-10-22 Thread Andres Freund
On 2015-10-22 20:27:20 -0300, Alvaro Herrera wrote:
> You can run the new one in old pg_xlog ...

You can? The xlog format between 9.4 and 9.5 changed, so I can't see how
that'd work?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Freezing without cleanup lock

2015-10-22 Thread Alvaro Herrera
Jim Nasby wrote:

> That would be the minimal-impact version, yes. But I suspect if we went
> through the trouble to do that, it would be just as easy to attempt the
> freeze regardless of what scan_all is set to.

You mean if !scan_all we conditionally get the cleanup lock, and if we get
it then prune, if not then freeze?  That seems nice on paper, but I think
it's useless, because unless scan_all is true, relfrozenxid doesn't
advance anyway.

> What I wish I knew is whether this problem was worth worrying about or not.
> Hopefully the extra logging in 9.5 will shed some light at some point...

As I recall, Andres says it isn't, but I have recollections of scans
that take a very long time to finish because they keep running into a
vacuum that has a page locked.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Freezing without cleanup lock

2015-10-22 Thread Jim Nasby

On 10/21/15 3:14 PM, Alvaro Herrera wrote:

Jim Nasby wrote:

While warning a client that just did a Slony-based version upgrade to make
sure to freeze the new database, it occurred to me that it should be safe to
freeze without the cleanup lock. This is interesting because it would allow
a scan_all vacuum to do it's job without blocking on the cleanup lock.

Does anyone have a feel for whether scan_all vacuums blocking on the cleanup
lock is an actual problem?


Yeah, I remember we discussed this and some other possible improvements
related to freezing.  I think other ideas proposed were that (1) during
an emergency (uncancellable) autovacuum run, we process only the tables
that are past the age limit, and (2) we remove the cost-based sleep so
that it finishes as quickly as possible.  (Yours is (3) only freeze and
not do any actual pruning -- did I get that right?)


That would be the minimal-impact version, yes. But I suspect if we went 
through the trouble to do that, it would be just as easy to attempt the 
freeze regardless of what scan_all is set to.


What I wish I knew is whether this problem was worth worrying about or 
not. Hopefully the extra logging in 9.5 will shed some light at some 
point...

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ATT_FOREIGN_TABLE and ATWrongRelkindError()

2015-10-22 Thread Amit Langote
On 2015/10/23 6:06, Robert Haas wrote:
> On Wed, Oct 21, 2015 at 1:51 AM, Amit Langote
>  wrote:
>> Attached adds those.
> 
> Good catch.  Committed and back-patched to 9.5.
> 
> (Yes, one of these problems goes back to 9.3, but it's a minor issue
> so I didn't bother.  If someone really feels the urge, have at it.)

Thanks, Robert!

Regards,
Amit



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoid full page images in streaming replication?

2015-10-22 Thread Alvaro Herrera
Jim Nasby wrote:
> On 10/22/15 5:53 PM, Alvaro Herrera wrote:
> >Jim Nasby wrote:
> >
> >>But yes, this is all very hand-wavy without any actual data on what
> >>percentage of the WAL stream is FPIs. Looks like pageinspect doesn't work
> >>for WAL... does anyone have a script/tool that breaks out what percentage of
> >>a WAL file is FPIs?
> >
> >pg_xlogdump -z
> 
> Hrm, any option for 9.4? I was hoping to get numbers from some real
> workloads...

You can run the new one in old pg_xlog ...

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Making tab-complete.c easier to maintain

2015-10-22 Thread David Fetter
On Thu, Oct 22, 2015 at 04:05:06PM -0700, Tom Lane wrote:
> David Fetter  writes:
> > On Thu, Oct 22, 2015 at 02:36:53PM -0700, Tom Lane wrote:
> >> I have no very good idea how to do that, though.  Bison does have a
> >> notion of which symbols are possible as the next symbol at any given
> >> parse point, but it doesn't really make that accessible.  There's a lack
> >> of cooperation on the readline side too: we'd need to be able to see the
> >> whole query buffer not just the current line.
> 
> > This may be on point:
> 
> > http://stackoverflow.com/questions/161495/is-there-a-nice-way-of-handling-multi-line-input-with-gnu-readline
> 
> > I suspect we might have to stop pretending to support alternatives to
> > libreadline if we went that direction, not that that would necessarily
> > be a bad idea.
> 
> Given the license issues around GNU readline, requiring it seems like
> probably a non-starter.

I should have made this more clear.  I am not an IP attorney and don't
play one on TV, but this is what I've gotten out of conversations with
IP attorneys on this subject:

- If we allow people to disable readline at compile time, the software
  is not GPL even if only libreadline would add the command line
  editing capability.

- When people do compile with libreadline, the only software affected
  by libreadline's license is the binary to which it is linked, namely
  the psql client.
  
To be affected negatively by libreadline's viral license, an entity
would need to fork the psql client in proprietary ways that they did
not wish to make available to end users, at the same time linking
in libreadline.

Maybe I'm missing something big, but I really don't see people out
there shipping a libreadline-enabled psql client, details of whose
source they'd want to keep a deep, dark secret.

If this gets to matter, we can probably get /pro bono/ work from IP
attorneys specializing in just this kind of thing.

> It strikes me though that maybe we don't need readline's
> cooperation.  I think it's already true that the previous lines of
> the query buffer are stashed somewhere that psql knows about, so in
> principle we could sew them together with the current line.  That
> might be a project worth tackling on its own, since we could make
> the existing code smarter about multiline queries, whether or not we
> ever get to a grammar-based solution.

Great!

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Change behavior of (m)xid_age

2015-10-22 Thread Robert Haas
On Thu, Oct 22, 2015 at 6:57 PM, Jim Nasby  wrote:
> Is there no case where it can be a permanent XID for a table or toast table?

I don't think so.

> The other issue is relminmxid, which if you're looking at relfrozenxid you'd
> want to look at as well. So you can't do a simple WHERE here.

I don't really know what you're referring to here, but it doesn't sound serious.

> Perhaps a better way to handle this is to just add correctly computed age
> fields to pg_stat_all_tables and pg_stat_database.

I'll defer to others on that point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [BUGS] BUG #13611: test_postmaster_connection failed (Windows, listen_addresses = '0.0.0.0' or '::')

2015-10-22 Thread Andres Freund
On 2015-10-22 16:15:10 -0700, Tom Lane wrote:
> AFAICS, the only hard-wired hostname reference in pg_ctl is "localhost",
> not "127.0.0.1" (much less "0.0.0.0").  So what you're proposing doesn't
> seem to me to have anything to do with what's there.  I continue to think
> that the OP's complaint is somehow founded on a bad address obtained by
> looking up "localhost", because where else would it've come from?

I've not read this thread, and this is just referencing the above:

Perhaps we should start to emit a notice at startup if localhost doesn't
resolve to either v4 or v6 definitions. The few environments where
that's indeed intentionally not the case, should be fine with such a
message at pgstat startup.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoid full page images in streaming replication?

2015-10-22 Thread Jim Nasby

On 10/22/15 5:53 PM, Alvaro Herrera wrote:

Jim Nasby wrote:


But yes, this is all very hand-wavy without any actual data on what
percentage of the WAL stream is FPIs. Looks like pageinspect doesn't work
for WAL... does anyone have a script/tool that breaks out what percentage of
a WAL file is FPIs?


pg_xlogdump -z


Hrm, any option for 9.4? I was hoping to get numbers from some real 
workloads...

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [BUGS] BUG #13611: test_postmaster_connection failed (Windows, listen_addresses = '0.0.0.0' or '::')

2015-10-22 Thread Tom Lane
Tatsuo Ishii  writes:
>> The original post used only "0.0.0.0" and "::", not "localhost" or anything
>> else entailing name resolution.  As I wrote above, Kondo proposed for pg_ctl
>> to use PQping("host='127.0.0.1'") in place of PQping("host='0.0.0.0'").
>> That's all.  pg_ctl would continue to use PQping("host='localhost'") where
>> it's doing so today.

AFAICS, the only hard-wired hostname reference in pg_ctl is "localhost",
not "127.0.0.1" (much less "0.0.0.0").  So what you're proposing doesn't
seem to me to have anything to do with what's there.  I continue to think
that the OP's complaint is somehow founded on a bad address obtained by
looking up "localhost", because where else would it've come from?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Making tab-complete.c easier to maintain

2015-10-22 Thread David G. Johnston
On Thu, Oct 22, 2015 at 7:05 PM, Tom Lane  wrote:

> I think it's already true that the previous lines of the query buffer
> are stashed somewhere that psql knows about,


Just skimming but:

"""
\p or \print
Print the current query buffer to the standard output.
"""
http://www.postgresql.org/docs/9.4/interactive/app-psql.html

David J.


Re: [HACKERS] Making tab-complete.c easier to maintain

2015-10-22 Thread Tom Lane
David Fetter  writes:
> On Thu, Oct 22, 2015 at 02:36:53PM -0700, Tom Lane wrote:
>> I have no very good idea how to do that, though.  Bison does have a
>> notion of which symbols are possible as the next symbol at any given
>> parse point, but it doesn't really make that accessible.  There's a lack
>> of cooperation on the readline side too: we'd need to be able to see the
>> whole query buffer not just the current line.

> This may be on point:

> http://stackoverflow.com/questions/161495/is-there-a-nice-way-of-handling-multi-line-input-with-gnu-readline

> I suspect we might have to stop pretending to support alternatives to
> libreadline if we went that direction, not that that would necessarily
> be a bad idea.

Given the license issues around GNU readline, requiring it seems like
probably a non-starter.

It strikes me though that maybe we don't need readline's cooperation.
I think it's already true that the previous lines of the query buffer
are stashed somewhere that psql knows about, so in principle we could
sew them together with the current line.  That might be a project worth
tackling on its own, since we could make the existing code smarter about
multiline queries, whether or not we ever get to a grammar-based solution.
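
Very roughly, something like this (a sketch under the assumption that the
stashed buffer and the readline globals are reachable from the completion
code; the names are illustrative):

    PQExpBufferData full;

    initPQExpBuffer(&full);
    appendPQExpBufferStr(&full, query_buf->data);   /* earlier lines */
    appendPQExpBufferStr(&full, rl_line_buffer);    /* line being edited */
    /* ... run the completion logic against full.data ... */
    termPQExpBuffer(&full);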

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [BUGS] BUG #13611: test_postmaster_connection failed (Windows, listen_addresses = '0.0.0.0' or '::')

2015-10-22 Thread Tatsuo Ishii
> The original post used only "0.0.0.0" and "::", not "localhost" or anything
> else entailing name resolution.  As I wrote above, Kondo proposed for pg_ctl
> to use PQping("host='127.0.0.1'") in place of PQping("host='0.0.0.0'").
> That's all.  pg_ctl would continue to use PQping("host='localhost'") where
> it's doing so today.  A patch that changes the 0.0.0.0 case in this way should
> also change PQping("host='::'") to PQping("host='::1'"); I suspect that was
> implicit in the original proposal.

Has anybody already written a patch in this direction, or is anybody
willing to do it? If not, I (or Kondo) would like to write the patch.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoid full page images in streaming replication?

2015-10-22 Thread Andres Freund
On 2015-10-22 17:59:06 -0500, Jim Nasby wrote:
> The WAL would *not* differ.

It would. Hint bits and all.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoid full page images in streaming replication?

2015-10-22 Thread Jim Nasby

On 10/22/15 5:52 PM, Andres Freund wrote:

> If the receiver didn't write the WAL before processing it then it can just
> stick the page image into the WAL it's writing for itself. Probably not good
> for syncrep, but I don't think you'd want this on for syncrep anyway.

To me this sounds like a recipe for disaster (i.e. complex bugs). WAL
(and thus CRC checksums) differing between nodes. Ugh.


The WAL would *not* differ. This would only affect streaming 
replication, and only the stream itself.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Change behavior of (m)xid_age

2015-10-22 Thread Jim Nasby

On 10/22/15 5:07 PM, Robert Haas wrote:

On Thu, Oct 22, 2015 at 5:51 PM, Jim Nasby  wrote:

> It's also a permanent ID when the relation is first created.

No it isn't.


Is there no case where it can be a permanent XID for a table or toast table?

The other issue is relminmxid, which if you're looking at relfrozenxid 
you'd want to look at as well. So you can't do a simple WHERE here.


Perhaps a better way to handle this is to just add correctly computed 
age fields to pg_stat_all_tables and pg_stat_database.
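
As a rough sketch of that idea (the view name, the relkind list, and the
zero handling below are my assumptions, not a concrete proposal):

-- Rough sketch only: a user-level view with "correctly computed" ages,
-- treating relfrozenxid/relminmxid = 0 (relations without storage) as age 0
-- instead of letting age() report INT_MAX.
CREATE VIEW relation_freeze_ages AS
SELECT c.oid::regclass AS relation,
       CASE WHEN c.relfrozenxid::text = '0' THEN 0
            ELSE age(c.relfrozenxid) END   AS frozenxid_age,
       CASE WHEN c.relminmxid::text = '0' THEN 0
            ELSE mxid_age(c.relminmxid) END AS minmxid_age
FROM pg_class c
WHERE c.relkind IN ('r', 't', 'm');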

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoid full page images in streaming replication?

2015-10-22 Thread Alvaro Herrera
Jim Nasby wrote:

> But yes, this is all very hand-wavy without any actual data on what
> percentage of the WAL stream is FPIs. Looks like pageinspect doesn't work
> for WAL... does anyone have a script/tool that breaks out what percentage of
> a WAL file is FPIs?

pg_xlogdump -z

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoid full page images in streaming replication?

2015-10-22 Thread Andres Freund
On 2015-10-22 17:47:01 -0500, Jim Nasby wrote:
> On 10/22/15 5:11 PM, Robert Haas wrote:
> >It's true that if the standby didn't have the master's FPIs, it could
> >generate its own in some side location that behaves like a separate
> >WAL stream or a double-write buffer.  But that would be a heck of a
> >lot of work to implement for an uncertain benefit.
> 
> If the receiver didn't write the WAL before processing it then it can just
> stick the page image into the WAL it's writing for itself. Probably not good
> for syncrep, but I don't think you'd want this on for syncrep anyway.

To me this sounds like a recipe for disaster (i.e. complex bugs). WAL
(and thus CRC checksums) differing between nodes. Ugh.

> But yes, this is all very hand-wavy without any actual data on what
> percentage of the WAL stream is FPIs. Looks like pageinspect doesn't work
> for WAL... does anyone have a script/tool that breaks out what percentage of
> a WAL file is FPIs?

pg_xlogdump --stats


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoid full page images in streaming replication?

2015-10-22 Thread Jim Nasby

On 10/22/15 5:11 PM, Robert Haas wrote:

It's true that if the standby didn't have the master's FPIs, it could
generate its own in some side location that behaves like a separate
WAL stream or a double-write buffer.  But that would be a heck of a
lot of work to implement for an uncertain benefit.


If the receiver didn't write the WAL before processing it then it can 
just stick the page image into the WAL it's writing for itself. Probably 
not good for syncrep, but I don't think you'd want this on for syncrep 
anyway.


But yes, this is all very hand-wavy without any actual data on what 
percentage of the WAL stream is FPIs. Looks like pageinspect doesn't 
work for WAL... does anyone have a script/tool that breaks out what 
percentage of a WAL file is FPIs?

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoid full page images in streaming replication?

2015-10-22 Thread Jim Nasby

On 10/22/15 5:03 PM, Andres Freund wrote:

On 2015-10-22 16:34:38 -0500, Jim Nasby wrote:

ISTM it should be possible to avoid sending full page writes to a streaming
replica once the replica has reached a consistent state. I assume that the
replica would still need to write full pages to its disk in case of a
crash, but the sender could insert special WAL records to tell it when to do
so, instead of sending the full page image. Presumably this would be a big
win for replication over a WAN.


Note that FPIs are often pretty good for replay performance, avoiding
lots of synchronous random reads.


Right. You'd only want to use this if you're streaming over a slow link 
(like a WAN).



I think FPI compression is a better solution for now. I found it to be
extremely effective in some benchmarks I recently ran.


What I'm wondering is how compressible a 'normal' FPI is. Certainly if 
the hole is zero'd out and the page is mostly empty you'll get great 
compression. What about other workloads? For reference, if an 'FPI
placeholder' WAL record is 16 bytes, that's 51,200% compression. If it's 
12 bytes, it's 68,200% compression. (I'm assuming we write the hole too, 
but maybe that's not true?)


FWIW, I started thinking about this when a client overwhelmed a remote 
slave doing VACUUM FREEZE after a Slony upgrade to 9.4. Granted, that's 
not normal, but it looks like normal vacuuming generates 2-6 bytes per 
modified tuple (depending on what was done). So even if you vacuumed 100 
rows on a page (which seems pretty high for most cases) that's only 
~200-600 bytes, compared to ~8200 bytes for the FPI.


The other interesting thing is that even their local slaves (with 20Gbps 
bandwidth) fell behind with vacuum_cost_delay=0, because replay was 
CPU-bound. But presumably that's not due to FPIs.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Change behavior of (m)xid_age

2015-10-22 Thread Alvaro Herrera
Andres Freund wrote:

> FWIW, adding an <> operator for xid seems like a perfectly good idea.

+1 (two of them actually -- another one for <>(xid,int) which mirrors
the =(xid,int) we already have).
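
Until something like that exists, a minimal user-level sketch on top of the
existing =(xid,int) operator could look like this (the function name is made
up; this is not the proposed built-in):

CREATE FUNCTION xid_ne_int4(xid, integer) RETURNS boolean
    LANGUAGE sql IMMUTABLE AS 'SELECT NOT ($1 = $2)';

CREATE OPERATOR <> (
    LEFTARG = xid,
    RIGHTARG = integer,
    PROCEDURE = xid_ne_int4
);

-- after which the query from upthread works as written:
-- SELECT count(*) FROM pg_class WHERE relfrozenxid <> 0;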

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] SQL function to report log message

2015-10-22 Thread Pavel Stehule
2015-10-23 0:07 GMT+02:00 Jim Nasby :

> On 10/22/15 4:59 PM, Pavel Stehule wrote:
>
>> It prevents everyone from reinventing the 'create a function wrapper
>> around RAISE' wheel that several people on this list alone have
>> admitted to. I think there's plenty of value in that.
>>
>>
>> I have a different opinion, I am sorry. The RAISE statement is designed
>> differently, with different possibilities - the function is limited by
>> being a variadic function, and should not have the same behavior as RAISE.
>> And I don't like the idea of pushing RAISE towards the behavior of a
>> variadic function.
>>
>
> I thought the only issue here was that RAISE currently pukes on a NULL
> input, and I thought you'd changed your mind and agreed that it makes sense
> for RAISE to just silently ignore anything that's NULL (except maybe for
> message). Am I wrong on one or both counts?
>
>
Maybe I didn't use some words exactly - but I never thought that RAISE should
ignore NULLs. The current behavior of RAISE is probably too strict - the
exception is too hard - but the NULL value should be displayed. In the
function, NULL can be ignored, because we cannot do better (we don't have the
same possibilities that Python has) - and I am able to accept it in this
case.


> IIRC 3 or 4 people on this list liked the idea of a function roughly
> equivalent to RAISE, to avoid the make-work of writing that function.
> That's why I disagree with your statement that there's no point to this
> function even if it acts the same as RAISE.


I didn't see any strong agreement to change the RAISE statement here. The
talk here was about the displayed information - the function should display
all possible fields - but that is based more on the ErrorData structure than
on the RAISE statement.



>
> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com
>


Re: [HACKERS] Making tab-complete.c easier to maintain

2015-10-22 Thread David Fetter
On Thu, Oct 22, 2015 at 02:36:53PM -0700, Tom Lane wrote:
> Robert Haas  writes:
> > On Wed, Oct 21, 2015 at 8:54 PM, Thomas Munro
> >  wrote:
> >> (Apologies for sending so many versions.  tab-complete.c keeps moving
> >> and I want to keep a version that applies on top of master out there,
> >> for anyone interested in looking at this.  As long as no one objects
> >> and there is interest in the patch, I'll keep doing that.)
> 
> > I don't want to rain on the parade since other people seem to like
> > this, but I'm sort of unimpressed by this.  Yes, it removes >1000
> > lines of code, and that's not nothing.  But it's all mechanical code,
> > so, not to be dismissive, but who really cares?  Is it really worth
> > replacing the existing notation that we all know with a new one that
> > we have to learn?  I'm not violently opposed if someone else wants to
> > commit this, but I'm unexcited about it.
> 
> What I would like is to find a way to auto-generate basically this entire
> file from gram.y.

I've been hoping we could use a principled approach for years.  My
fondest hope along that line would also involve catalog access, so it
could correctly tab-complete user-defined things, but I have the
impression that the catalog access variant is "much later" even if
autogeneration from gram.y is merely "soon."  I'd love to be wrong
about that.

> That would imply going over to something at least
> somewhat parser-based, instead of the current way that is more or less
> totally ad-hoc.  That would be a very good thing though, because the
> current way gives wrong answers not-infrequently, even discounting cases
> that it's simply not been taught about.

Indeed.

> I have no very good idea how to do that, though.  Bison does have a
> notion of which symbols are possible as the next symbol at any given
> parse point, but it doesn't really make that accessible.  There's a lack
> of cooperation on the readline side too: we'd need to be able to see the
> whole query buffer not just the current line.

This may be on point:

http://stackoverflow.com/questions/161495/is-there-a-nice-way-of-handling-multi-line-input-with-gnu-readline

I suspect we might have to stop pretending to support alternatives to
libreadline if we went that direction, not that that would necessarily
be a bad idea.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Change behavior of (m)xid_age

2015-10-22 Thread Andres Freund
On 2015-10-22 18:07:06 -0400, Robert Haas wrote:
> > BTW, ignoring relfrozenxid = 0 also isn't as easy as you'd think:
> >
> > select count(*) from pg_class where relfrozenxid <> 0;
> > ERROR:  operator does not exist: xid <> integer at character 50
> 
> It takes a few more characters than that, but it's not really that hard.
> 
> rhaas=# select count(*) from pg_class where relfrozenxid::text <> '0';

"WHERE NOT relfrozenxid = 0" should work as well and is a bit nicer.

FWIW, adding an <> operator for xid seems like a perfectly good idea.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoid full page images in streaming replication?

2015-10-22 Thread Robert Haas
On Thu, Oct 22, 2015 at 5:57 PM, Jim Nasby  wrote:
>> We could in theory send a "this would have been an fpi but it's skipped"
>> record which would only exist in streaming and just make the standby
>> write a noop of some kind? It would still be on the standby but it would
>> at least not consume the bandwidth. Just skip sending the actual
>> contents of the fpi.
>
> I don't think it can be a noop on the receiver though... doesn't the
> receiver still need full page images in case of a crash? (Assuming
> full_page_writes is enabled...)

Yes. If the standby is in the middle of writing a page updated by a
WAL record and crashes, it can end up with a torn page.  We restart
from a restartpoint at a location where the master checkpointed so
that we can be certain that replay of the FPI will fix the problem.
If you got rid of the FPIs, you'd be dead.

This is true both before and after reaching a consistent state, which
seems like a fatal flaw in this plan.

It's true that if the standby didn't have the master's FPIs, it could
generate its own in some side location that behaves like a separate
WAL stream or a double-write buffer.  But that would be a heck of a
lot of work to implement for an uncertain benefit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] SQL function to report log message

2015-10-22 Thread Jim Nasby

On 10/22/15 4:59 PM, Pavel Stehule wrote:

It prevents everyone from reinventing the 'create a function wrapper
around RAISE' wheel that several people on this list alone have
admitted to. I think there's plenty of value in that.


I have a different opinion, I am sorry. The RAISE statement is designed
differently, with different possibilities - the function is limited by being
a variadic function, and should not have the same behavior as RAISE. And I
don't like the idea of pushing RAISE towards the behavior of a variadic
function.


I thought the only issue here was that RAISE currently pukes on a NULL 
input, and I thought you'd changed your mind and agreed that it makes 
sense for RAISE to just silently ignore anything that's NULL (except 
maybe for message). Am I wrong on one or both counts?


IIRC 3 or 4 people on this list liked the idea of a function roughly 
equivalent to RAISE, to avoid the make-work of writing that function. 
That's why I disagree with your statement that there's no point to this 
function even if it acts the same as RAISE.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Change behavior of (m)xid_age

2015-10-22 Thread Robert Haas
On Thu, Oct 22, 2015 at 5:51 PM, Jim Nasby  wrote:
> It's also a permanent ID when the relation is first created.

No it isn't.  If it were, the first insert into the table would have
to update the pg_class tuple, which it certainly doesn't.  Before we
had MVCC catalog scans, that wouldn't have been possible with less
than AccessExclusiveLock, and it would still require a self-exclusive
relation lock, which would be a deadlock hazard if multiple processes
tried to access the relation at once.  Also:

rhaas=# create table foo (a int);
CREATE TABLE
rhaas=# select relfrozenxid from pg_class where relname = 'foo';
 relfrozenxid 
--------------
          946
(1 row)

> I agree that you can just ignore relfrozenxid = 0, but it seems kinda silly
> to force everyone to do that (unless there's some use case for the current
> 'infinity behavior' that I'm not seeing).

Well, if the only purpose of age() were to be applied to every
pg_class.relfrozenxid value, I might agree with you.  But I'm not sure
that's so; for example, it could be applied to XID fields from
individual tuples.  And there is certainly a backward-compatibility
argument for not changing the semantics now.

> BTW, ignoring relfrozenxid = 0 also isn't as easy as you'd think:
>
> select count(*) from pg_class where relfrozenxid <> 0;
> ERROR:  operator does not exist: xid <> integer at character 50

It takes a few more characters than that, but it's not really that hard.

rhaas=# select count(*) from pg_class where relfrozenxid::text <> '0';
 count 
-------
    81
(1 row)

You can alternatively search for the correct set of relkinds.
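
A sketch of that alternative, assuming ordinary tables, TOAST tables, and
materialized views are the relkinds that carry a meaningful relfrozenxid:

SELECT count(*) FROM pg_class WHERE relkind IN ('r', 't', 'm');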

> So first we make the user add the WHERE clause, then we make them figure out
> how to work around the missing operator...

Before any of that, we make them learn what relfrozenxid is and what
age() does.  Once they've learned that, I don't think the few extra
characters to filter out zeroes is really a big deal.  Most of these
queries are presumably being issued by monitoring software anyway, and
hopefully commonly-used monitoring tools already include a suitable
query.  Rolling your own monitoring queries from scratch for a
high-value production system is not an especially good idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoid full page images in streaming replication?

2015-10-22 Thread Andres Freund
On 2015-10-22 16:34:38 -0500, Jim Nasby wrote:
> ISTM it should be possible to avoid sending full page writes to a streaming
> replica once the replica has reached a consistent state. I assume that the
> replica would still need to write full pages to its disk in case of a
> crash, but the sender could insert special WAL records to tell it when to do
> so, instead of sending the full page image. Presumably this would be a big
> win for replication over a WAN.

Note that FPIs are often pretty good for replay performance, avoiding
lots of synchronous random reads.

I think FPI compression is a better solution for now. I found it to be
extremely effective in some benchmarks I recently ran.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] SQL function to report log message

2015-10-22 Thread Pavel Stehule
2015-10-22 23:54 GMT+02:00 Jim Nasby :

> On 10/22/15 4:42 PM, Pavel Stehule wrote:
>
>> The behavior of pg_report_log should not be exactly the same as the RAISE
>> statement in PL/pgSQL.
>>
>
> That makes no sense to me. Having different behaviors is just going to
> lead to confusion.
>
>> If this function were exactly the same, then it would lose its purpose and
>> anybody could just use the RAISE statement.
>>
>
> It prevents everyone from reinventing the 'create a function wrapper
> around RAISE' wheel that several people on this list alone have admitted
> to. I think there's plenty of value in that.


I have a different opinion, I am sorry. The RAISE statement is designed
differently, with different possibilities - the function is limited by being
a variadic function, and should not have the same behavior as RAISE. And I
don't like the idea of pushing RAISE towards the behavior of a variadic
function.

Regards

Pavel




>
> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com
>


Re: [HACKERS] Avoid full page images in streaming replication?

2015-10-22 Thread Jim Nasby

On 10/22/15 4:42 PM, Magnus Hagander wrote:

 > How are you going to make that work without LSNs in the WAL received by
 > the replica diverging from those in the master's WAL?
 >

We could in theory send a "this would have been an fpi but it's skipped"
record which would only exist in streaming and just make the standby
write a noop of some kind? It would still be on the standby but it would
at least not consume the bandwidth. Just skip sending the actual
contents of the fpi.


I don't think it can be a noop on the receiver though... doesn't the 
receiver still need full page images in case of a crash? (Assuming 
full_page_writes is enabled...)


The other issue is chained replicas, where one of the children may need 
full page writes (during initial copy).

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] SQL function to report log message

2015-10-22 Thread Pavel Stehule
2015-10-22 20:15 GMT+02:00 Jim Nasby :

> On 10/22/15 2:20 AM, dinesh kumar wrote:
>
>>
>> 2. Using this function, if we provide any "NULL" argument to the function,
>>   we should either skip it or report it. I see this is what the function
>> is doing.
>>
>> postgres=# SELECT pg_report_log('INFO', 'NULL', false, NULL, NULL);
>> INFO:  NULL
>>
>> postgres=# SELECT pg_report_log('INFO', 'NULL', false, 'NULL', 'NULL');
>> INFO:  NULL
>> DETAIL:  NULL /-- Are you suggesting to change this behaviour/
>> HINT:  NULL
>>
>
> It should operate the same as what was decided for RAISE.
>

I am sorry, there were more opinions - what was decided for RAISE?


>
> I'd say it should also support the remaining RAISE options as well
> (COLUMN, CONSTRAINT, DATATYPE, TABLE, SCHEMA).
>
> I think hide_statement is a better name than ishidestmt. It would be nice
> if RAISE supported that too...
>
> I think the function should also allow specifying a condition name instead
> of a SQL state, same as RAISE does.
>
> In other words, this function and raise should operate exactly the same
> unless there's a really strong reason not to. Otherwise it's just going to
> create confusion.


I have a different opinion - if RAISE and this function are exactly the same,
then the function makes no sense. There should not be a fundamental
difference, but I don't see any sense in identical behavior.


>
> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com
>


Re: [HACKERS] [PATCH] SQL function to report log message

2015-10-22 Thread Jim Nasby

On 10/22/15 4:42 PM, Pavel Stehule wrote:

The behavior of pg_report_log should not be exactly the same as the RAISE
statement in PL/pgSQL.


That makes no sense to me. Having different behaviors is just going to 
lead to confusion.



If this function were exactly the same, then it would lose its purpose and
anybody could just use the RAISE statement.


It prevents everyone from reinventing the 'create a function wrapper 
around RAISE' wheel that several people on this list alone have admitted 
to. I think there's plenty of value in that.
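
For reference, the kind of wrapper that keeps getting reinvented looks
roughly like this sketch (the name, signature, and coalesce-based NULL
handling are my assumptions, not the behavior of the proposed
pg_report_log):

CREATE FUNCTION report_log(elevel text, message text,
                           detail text DEFAULT NULL,
                           hint text DEFAULT NULL)
RETURNS void LANGUAGE plpgsql AS $$
BEGIN
    -- RAISE wants its level as a keyword, hence the IF chain, and it errors
    -- out on NULL USING options, hence the coalesce calls.
    IF upper(elevel) = 'WARNING' THEN
        RAISE WARNING '%', message
            USING DETAIL = coalesce(detail, ''), HINT = coalesce(hint, '');
    ELSIF upper(elevel) = 'NOTICE' THEN
        RAISE NOTICE '%', message
            USING DETAIL = coalesce(detail, ''), HINT = coalesce(hint, '');
    ELSE
        RAISE INFO '%', message
            USING DETAIL = coalesce(detail, ''), HINT = coalesce(hint, '');
    END IF;
END;
$$;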

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Change behavior of (m)xid_age

2015-10-22 Thread Jim Nasby

On 10/22/15 4:18 PM, Robert Haas wrote:

On Wed, Oct 21, 2015 at 1:33 PM, Jim Nasby  wrote:

Currently, xid_age() returns INT_MAX for a permanent xid. The comment in the
function that 'Permanent XIDs are always infinitely old' may be technically
correct, but returning INT_MAX is a useless behavior because it actually
makes it look like that XID is in immediate wraparound danger. I think we
should change it to return either 0, -1, or INT_MIN. To me, 0 makes the most
sense for monitoring relfrozenxid.


As far as I know, relfrozenxid is only a permanent XID for relkinds
that don't have storage; then it's zero.  So I think you should just
change your query to ignore pg_class rows where relfrozenxid = 0, and
leave xid_age() alone.


It's also a permanent ID when the relation is first created.

I agree that you can just ignore relfrozenxid = 0, but it seems kinda 
silly to force everyone to do that (unless there's some use case for the 
current 'infinity behavior' that I'm not seeing).
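
For reference, the 'infinity behavior' applies to any non-normal XID, which
is easy to see directly (quick illustration; the literal casts are mine, and
2147483647 is simply INT_MAX):

SELECT age('0'::xid) AS invalid_xid_age,
       age('2'::xid) AS frozen_xid_age;
-- both columns return 2147483647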


BTW, ignoring relfrozenxid = 0 also isn't as easy as you'd think:

select count(*) from pg_class where relfrozenxid <> 0;
ERROR:  operator does not exist: xid <> integer at character 50

So first we make the user add the WHERE clause, then we make them figure 
out how to work around the missing operator...

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Making tab-complete.c easier to maintain

2015-10-22 Thread Alvaro Herrera
Tom Lane wrote:

> What I would like is to find a way to auto-generate basically this entire
> file from gram.y.  That would imply going over to something at least
> somewhat parser-based, instead of the current way that is more or less
> totally ad-hoc.  That would be a very good thing though, because the
> current way gives wrong answers not-infrequently, even discounting cases
> that it's simply not been taught about.

I did discuss exactly this topic with Thomas a month ago or so in
private email, and our conclusion was that it would be a really neat
project but a lot more effort than this patch.  And after skimming over
the patch back then, I think it is well worth it for the reduced maintenance
effort.

> I have no very good idea how to do that, though.  Bison does have a
> notion of which symbols are possible as the next symbol at any given
> parse point, but it doesn't really make that accessible.  There's a lack
> of cooperation on the readline side too: we'd need to be able to see the
> whole query buffer not just the current line.

At the current pace, a project like this might take years to
turn into a real patch.  My own vote is for applying this for the time
being.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Avoid full page images in streaming replication?

2015-10-22 Thread Magnus Hagander
On Oct 22, 2015 23:38, "Tom Lane"  wrote:
>
> Jim Nasby  writes:
> > ISTM it should be possible to avoid sending full page writes to a
> > streaming replica once the replica has reached a consistent state. I
> > assume that the replica would still need to write full pages to its
> > disk in case of a crash, but the sender could insert special WAL records
> > to tell it when to do so, instead of sending the full page image.
> > Presumably this would be a big win for replication over a WAN.
>
> How are you going to make that work without LSNs in the WAL received by
> the replica diverging from those in the master's WAL?
>

We could in theory send a "this would have been an fpi but it's skipped"
record which would only exist in streaming and just make the standby write
a noop of some kind? It would still be on the standby but it would at least
not consume the bandwidth. Just skip sending the actual contents of the
fpi.

/Magnus


Re: [HACKERS] [PATCH] SQL function to report log message

2015-10-22 Thread Pavel Stehule
Hi

2015-10-22 22:03 GMT+02:00 Peter Eisentraut :

> On 10/22/15 3:20 AM, dinesh kumar wrote:
> > postgres=# SELECT pg_report_log('INFO', 'NULL', false, NULL, NULL);
> > INFO:  NULL
> >
> > postgres=# SELECT pg_report_log('INFO', 'NULL', false, 'NULL', 'NULL');
> > INFO:  NULL
> > DETAIL:  NULL  /-- Are you suggesting to change this behaviour/
> > HINT:  NULL
>
> These look wrong to me.
>
> I'd throw an error if a null message is passed.
>
> (Not particularly in favor of this patch, but just saying ...)
>
>
We talked about this behavior - and in this case, I think any fields whose
value is the same as the default value should be ignored.

The behavior of pg_report_log should not be exactly the same as the RAISE
statement in PL/pgSQL. If this function were exactly the same, then it would
lose its purpose and anybody could just use the RAISE statement. The RAISE
statement is strict - at the moment too strict (it could be a little less so)
- while pg_report_log can be NULL tolerant. It is limited by our
implementation of keyword parameters, which needs some default value.

Regards

Pavel


Re: [HACKERS] Avoid full page images in streaming replication?

2015-10-22 Thread Tom Lane
Jim Nasby  writes:
> ISTM it should be possible to avoid sending full page writes to a 
> streaming replica once the replica has reached a consistent state. I 
> assume that the replica would still need to write full pages to its
> disk in case of a crash, but the sender could insert special WAL records 
> to tell it when to do so, instead of sending the full page image. 
> Presumably this would be a big win for replication over a WAN.

How are you going to make that work without LSNs in the WAL received by
the replica diverging from those in the master's WAL?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Making tab-complete.c easier to maintain

2015-10-22 Thread Tom Lane
Robert Haas  writes:
> On Wed, Oct 21, 2015 at 8:54 PM, Thomas Munro
>  wrote:
>> (Apologies for sending so many versions.  tab-complete.c keeps moving
>> and I want to keep a version that applies on top of master out there,
>> for anyone interested in looking at this.  As long as no one objects
>> and there is interest in the patch, I'll keep doing that.)

> I don't want to rain on the parade since other people seem to like
> this, but I'm sort of unimpressed by this.  Yes, it removes >1000
> lines of code, and that's not nothing.  But it's all mechanical code,
> so, not to be dismissive, but who really cares?  Is it really worth
> replacing the existing notation that we all know with a new one that
> we have to learn?  I'm not violently opposed if someone else wants to
> commit this, but I'm unexcited about it.

What I would like is to find a way to auto-generate basically this entire
file from gram.y.  That would imply going over to something at least
somewhat parser-based, instead of the current way that is more or less
totally ad-hoc.  That would be a very good thing though, because the
current way gives wrong answers not-infrequently, even discounting cases
that it's simply not been taught about.

I have no very good idea how to do that, though.  Bison does have a
notion of which symbols are possible as the next symbol at any given
parse point, but it doesn't really make that accessible.  There's a lack
of cooperation on the readline side too: we'd need to be able to see the
whole query buffer not just the current line.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Avoid full page images in streaming replication?

2015-10-22 Thread Jim Nasby
ISTM it should be possible to avoid sending full page writes to a 
streaming replica once the replica has reached a consistent state. I 
assume that the replica would still need to write full pages to its
disk in case of a crash, but the sender could insert special WAL records 
to tell it when to do so, instead of sending the full page image. 
Presumably this would be a big win for replication over a WAN.


Am I missing something? I see that pglesslog is no longer supported but 
couldn't find any particular reason for that...

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Making tab-complete.c easier to maintain

2015-10-22 Thread Robert Haas
On Wed, Oct 21, 2015 at 8:54 PM, Thomas Munro
 wrote:
> Here is a new version merging the recent CREATE EXTENSION ... VERSION
> patch from master.
>
> (Apologies for sending so many versions.  tab-complete.c keeps moving
> and I want to keep a version that applies on top of master out there,
> for anyone interested in looking at this.  As long as no one objects
> and there is interest in the patch, I'll keep doing that.)

I don't want to rain on the parade since other people seem to like
this, but I'm sort of unimpressed by this.  Yes, it removes >1000
lines of code, and that's not nothing.  But it's all mechanical code,
so, not to be dismissive, but who really cares?  Is it really worth
replacing the existing notation that we all know with a new one that
we have to learn?  I'm not violently opposed if someone else wants to
commit this, but I'm unexcited about it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Change behavior of (m)xid_age

2015-10-22 Thread Robert Haas
On Wed, Oct 21, 2015 at 1:33 PM, Jim Nasby  wrote:
> Currently, xid_age() returns INT_MAX for a permanent xid. The comment in the
> function that 'Permanent XIDs are always infinitely old' may be technically
> correct, but returning INT_MAX is a useless behavior because it actually
> makes it look like that XID is in immediate wraparound danger. I think we
> should change it to return either 0, -1, or INT_MIN. To me, 0 makes the most
> sense for monitoring relfrozenxid.

As far as I know, relfrozenxid is only a permanent XID for relkinds
that don't have storage; then it's zero.  So I think you should just
change your query to ignore pg_class rows where relfrozenxid = 0, and
leave xid_age() alone.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] COPY FREEZE and PD_ALL_VISIBLE

2015-10-22 Thread Robert Haas
On Wed, Oct 21, 2015 at 1:31 PM, Jeff Janes  wrote:
> It turns out it was pretty easy to set PD_ALL_VISIBLE on the new pages,
> since the code in hio that requests the relation to be extended already has
> info on the tuple's intended freeze status.
>
> Then you just need to refrain from clearing PD_ALL_VISIBLE when that tuple
> is actually written into the page.  Not only because clearing would defeat
> the purpose, but also because it will cause an error--apparently the
> incipient page is not yet in a state where visibilitymap_clear is willing to
> deal with it.

Wouldn't it be better to instead fill the page with tuples first and
THEN set PD_ALL_VISIBLE?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Getting sorted data from foreign server

2015-10-22 Thread Robert Haas
On Wed, Oct 21, 2015 at 5:23 AM, Ashutosh Bapat
 wrote:
> Increasing the sorting cost factor (when use_remote_estimates = false) from
> 1.1 to 1.2 makes the difference disappear.
>
> Since the startup costs for postgres_fdw are large portion of total cost,
> extra 10% of rest of the cost is comparable to 1% fuzzy limit. IMO, we
> shouldn't bother too much about it as the path costs are not much different.

My feeling is that cranking the sorting cost factor up to 20-25% would
be a good idea, just so we have less unnecessary plan churn.  I dunno
if sorting always costs that much, but if a 10% cost overhead is
really 1% because it only applies to a fraction of the cost, I don't
think that's good.  The whole point was to pick something large enough
that we wouldn't take the sorted path unless we will benefit from the
sort, and clearly that's not what happened here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] clearing opfuncid vs. parallel query

2015-10-22 Thread Robert Haas
On Thu, Oct 22, 2015 at 5:09 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Thu, Oct 22, 2015 at 4:32 PM, Tom Lane  wrote:
>>> I'm not sure that you could get to a point where you were
>>> generating this stuff from anything that wasn't in essence an arcane
>>> representation of the .c files.  It might be slightly harder to make
>>> errors of omission that way, but I'm suspicious that that would come at
>>> the cost of a net loss of readability.
>
>> That is possible, but the current situation isn't good either.
>> Despite everybody's best efforts, we mess this up more than is really
>> desirable. Commit b8fe12a83622b350dc6849f8bb933bd8a86c1424 fixed bugs
>> in a whole bunch of preceding commits, and I wonder if anybody else
>> would have found those if Noah hadn't.  It's just too easy to miss
>> these things today.  If generating the .c files isn't practical,
>> another alternative worth exploring would be a tool to cross-check
>> them against the .h files.
>
> Yeah, I could get on board with that.  It doesn't seem terribly hard:
> just make sure that all fields mentioned in the struct declaration are
> mentioned in each relevant routine.  (Cases where we intentionally skip
> a field could be handled by adding a no-op macro, eg "COPY_IGNORE(foo);")
>
> It would be nice if we could also check that the macro type is sane for
> the field datatype (eg, not use COPY_SCALAR_FIELD() for a pointer field).
> But that would probably increase the difficulty very substantially for
> just a small gain in error detection.

I actually think that's quite an easy mistake to make, so I'd be happy
if we could catch it.  But anything would be better than nothing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] clearing opfuncid vs. parallel query

2015-10-22 Thread Tom Lane
Robert Haas  writes:
> On Thu, Oct 22, 2015 at 4:32 PM, Tom Lane  wrote:
>> I'm not sure that you could get to a point where you were
>> generating this stuff from anything that wasn't in essence an arcane
>> representation of the .c files.  It might be slightly harder to make
>> errors of omission that way, but I'm suspicious that that would come at
>> the cost of a net loss of readability.

> That is possible, but the current situation isn't good either.
> Despite everybody's best efforts, we mess this up more than is really
> desirable. Commit b8fe12a83622b350dc6849f8bb933bd8a86c1424 fixed bugs
> in a whole bunch of preceding commits, and I wonder if anybody else
> would have found those if Noah hadn't.  It's just too easy to miss
> these things today.  If generating the .c files isn't practical,
> another alternative worth exploring would be a tool to cross-check
> them against the .h files.

Yeah, I could get on board with that.  It doesn't seem terribly hard:
just make sure that all fields mentioned in the struct declaration are
mentioned in each relevant routine.  (Cases where we intentionally skip
a field could be handled by adding a no-op macro, eg "COPY_IGNORE(foo);")

It would be nice if we could also check that the macro type is sane for
the field datatype (eg, not use COPY_SCALAR_FIELD() for a pointer field).
But that would probably increase the difficulty very substantially for
just a small gain in error detection.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ATT_FOREIGN_TABLE and ATWrongRelkindError()

2015-10-22 Thread Robert Haas
On Wed, Oct 21, 2015 at 1:51 AM, Amit Langote
 wrote:
> This may just be nitpicking but I noticed that ATWrongRelkindError() could
> emit a better message in case of such errors during ALTER COLUMN DEFAULT
> and  ALTER COLUMN SET STORAGE than "%s is of the wrong type" which is what
> it would emit now. Just need to add a couple of cases to the switch there:
>
> + case ATT_TABLE | ATT_VIEW | ATT_FOREIGN_TABLE:
> + msg = _("\"%s\" is not a table, view or foreign table");
> + break;
>
> + case ATT_TABLE | ATT_MATVIEW | ATT_FOREIGN_TABLE:
> + msg = _("\"%s\" is not a table, materialized view, or foreign table");
> + break;
>
> Attached adds those.

Good catch.  Committed and back-patched to 9.5.

(Yes, one of these problems goes back to 9.3, but it's a minor issue
so I didn't bother.  If someone really feels the urge, have at it.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] clearing opfuncid vs. parallel query

2015-10-22 Thread Robert Haas
On Thu, Oct 22, 2015 at 4:32 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Thu, Oct 22, 2015 at 1:40 PM, YUriy Zhuravlev
>>  wrote:
>>> I can generate xml/json from the actual structs. I suggested XML/JSON
>>> because analysis of C code is much more difficult. But my current
>>> generator just uses the structure from the header files (via pycparser).
>
>> Anything that is part of the build process will have to be done in C or Perl.
>
> Yeah.  The bar for introducing new build tool requirements is very high;
> *way* higher than the likely benefit from not having to hand-maintain
> outfuncs.c et al.  If you wanna do this in Perl, fine, but we're not going
> to introduce a requirement to have Python to build just because somebody
> wants to write a tool in the latter not the former.
>
> Having said that, there is more knowledge embedded in the nodes/*.c files
> than there is in the nodes/*.h headers; an example being that there are
> certain fields that we deliberately don't dump and restore.  (This is
> still true despite Robert's recent changes to take opfuncid out of that
> class.)

Are those things, ah, documented somewhere?

> I'm not sure that you could get to a point where you were
> generating this stuff from anything that wasn't in essence an arcane
> representation of the .c files.  It might be slightly harder to make
> errors of omission that way, but I'm suspicious that that would come at
> the cost of a net loss of readability.

That is possible, but the current situation isn't good either.
Despite everybody's best efforts, we mess this up more than is really
desirable. Commit b8fe12a83622b350dc6849f8bb933bd8a86c1424 fixed bugs
in a whole bunch of preceding commits, and I wonder if anybody else
would have found those if Noah hadn't.  It's just too easy to miss
these things today.  If generating the .c files isn't practical,
another alternative worth exploring would be a tool to cross-check
them against the .h files.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] shm_mq fix for non-blocking mode

2015-10-22 Thread Robert Haas
On Fri, Oct 16, 2015 at 5:08 PM, Robert Haas  wrote:
> The shm_mq code handles blocking mode and non-blocking mode
> asymmetrically in a couple of places, with the unfortunate result that
> if you are using non-blocking mode, and your counterparty dies before
> attaching the queue, operations on the queue continue to return
> SHM_MQ_WOULD_BLOCK instead of, as they should, returning
> SHM_MQ_DETACHED.  The attached patch fixes the problem.  Thanks to my
> colleague Rushabh Lathia for helping track this down.
>
> (There's are some further bugs in this area outside the shm_mq code
> ... but I'm still trying to figure out exactly what they are and what
> we should do about them.  This much, however, seems clear-cut.)

...and so I've committed it and back-patched to 9.4.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] clearing opfuncid vs. parallel query

2015-10-22 Thread Tom Lane
Robert Haas  writes:
> On Thu, Oct 22, 2015 at 1:40 PM, YUriy Zhuravlev
>  wrote:
>> I can generate xml/json from the actual structs. I suggested XML/JSON
>> because analysis of C code is much more difficult. But my current
>> generator just uses the structure from the header files (via pycparser).

> Anything that is part of the build process will have to be done in C or Perl.

Yeah.  The bar for introducing new build tool requirements is very high;
*way* higher than the likely benefit from not having to hand-maintain
outfuncs.c et al.  If you wanna do this in Perl, fine, but we're not going
to introduce a requirement to have Python to build just because somebody
wants to write a tool in the latter not the former.

Having said that, there is more knowledge embedded in the nodes/*.c files
than there is in the nodes/*.h headers; an example being that there are
certain fields that we deliberately don't dump and restore.  (This is
still true despite Robert's recent changes to take opfuncid out of that
class.)  I'm not sure that you could get to a point where you were
generating this stuff from anything that wasn't in essence an arcane
representation of the .c files.  It might be slightly harder to make
errors of omission that way, but I'm suspicious that that would come at
the cost of a net loss of readability.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Dangling Client Backend Process

2015-10-22 Thread Robert Haas
On Tue, Oct 20, 2015 at 11:42 PM, Rajeev rastogi
 wrote:
> Agreed. Attached is the patch with changes.

Well, I'm not buying this extra PostmasterIsAlive() call on every pass
through the main loop.  That seems more expensive than we can really
justify. Checking this when we're already calling WaitLatchOrSocket is
basically free, but the other part is not.

Here's a version with that removed and some changes to the comments.
I still don't think this is quite working right, though, because
here's what happened when I killed the postmaster:

rhaas=# select 1;
 ?column? 
----------
        1
(1 row)

rhaas=# \watch
Watch every 2s    Thu Oct 22 16:24:10 2015

 ?column? 
----------
        1
(1 row)

Watch every 2s    Thu Oct 22 16:24:12 2015

 ?column? 
----------
        1
(1 row)

Watch every 2s    Thu Oct 22 16:24:14 2015

 ?column? 
----------
        1
(1 row)

Watch every 2s    Thu Oct 22 16:24:16 2015

 ?column? 
----------
        1
(1 row)

server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

Note that the error message doesn't actually show up on the client (it
did show up in the log).  I guess that may be inevitable if we're
blocked in secure_write(), but if we're in secure_read() maybe it
should work?  I haven't investigated why it doesn't.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index 26d8faa..089435d 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -36,7 +36,7 @@
 #include "tcop/tcopprot.h"
 #include "utils/memutils.h"
 #include "storage/proc.h"
-
+#include "storage/ipc.h"
 
 char	   *ssl_cert_file;
 char	   *ssl_key_file;
@@ -144,9 +144,31 @@ retry:
 		Assert(waitfor);
 
 		w = WaitLatchOrSocket(MyLatch,
-			  WL_LATCH_SET | waitfor,
+			  WL_LATCH_SET | WL_POSTMASTER_DEATH | waitfor,
 			  port->sock, 0);
 
+		/*
+		 * If the postmaster has died, it's not safe to continue running,
+		 * because it is the postmaster's job to kill us if some other backend
+		 * exists uncleanly.  Moreover, we won't run very well in this state;
+		 * exits uncleanly.  Moreover, we won't run very well in this state;
+		 * performance may be poor.  Finally, if we don't exit, pg_ctl will
+		 * be unable to restart the postmaster without manual intervention,
+		 * so no new connections can be accepted.  Exiting clears the deck
+		 * for a postmaster restart.
+		 *
+		 * (Note that we only make this check when we would otherwise sleep
+		 * on our latch.  We might still continue running for a while if the
+		 * postmaster is killed in mid-query, or even through multiple queries
+		 * if we never have to wait for read.  We don't want to burn too many
+		 * cycles checking for this very rare condition, and this should cause
+		 * us to exit quickly in most cases.)
+		 */
+		if (w & WL_POSTMASTER_DEATH)
+			ereport(FATAL,
+	(errcode(ERRCODE_ADMIN_SHUTDOWN),
+	errmsg("terminating connection due to unexpected postmaster exit")));
+
 		/* Handle interrupt. */
 		if (w & WL_LATCH_SET)
 		{
@@ -223,9 +245,15 @@ retry:
 		Assert(waitfor);
 
 		w = WaitLatchOrSocket(MyLatch,
-			  WL_LATCH_SET | waitfor,
+			  WL_LATCH_SET | WL_POSTMASTER_DEATH | waitfor,
 			  port->sock, 0);
 
+		/* See comments in secure_read. */
+		if (w & WL_POSTMASTER_DEATH)
+			ereport(FATAL,
+	(errcode(ERRCODE_ADMIN_SHUTDOWN),
+	errmsg("terminating connection due to unexpected postmaster exit")));
+
 		/* Handle interrupt. */
 		if (w & WL_LATCH_SET)
 		{

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] SQL function to report log message

2015-10-22 Thread Peter Eisentraut
On 10/22/15 3:20 AM, dinesh kumar wrote:
> postgres=# SELECT pg_report_log('INFO', 'NULL', false, NULL, NULL);
> INFO:  NULL
> 
> postgres=# SELECT pg_report_log('INFO', 'NULL', false, 'NULL', 'NULL');
> INFO:  NULL
> DETAIL:  NULL  /-- Are you suggesting to change this behaviour/
> HINT:  NULL

These look wrong to me.

I'd throw an error if a null message is passed.

(Not particularly in favor of this patch, but just saying ...)



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make Gather node projection-capable

2015-10-22 Thread Robert Haas
On Thu, Oct 22, 2015 at 3:18 PM, Alvaro Herrera
 wrote:
>> You probably would, but sometimes that might not be possible; for
>> example, the tlist might contain a parallel-restricted function (which
>> therefore has to run in the leader).
>
> I don't understand your reply.  Failing to parallelize the child node
> does not prevent it from doing the projection itself, does it?

OK, now I'm confused.  If you can't parallelize the child, there's no
Gather node, and this discussion is irrelevant.  The case that matters
is something like:

SELECT pg_backend_pid(), a, b, c FROM foo WHERE x = 1;

What happens today is that the system first produces a SeqScan plan
with an all-Var tlist, which may either be a physical tlist or the
exact columns needed.  Then, after join planning, which is trivial
here, we substitute for that tlist the output tlist of the plan.
Since SeqScan is projection-capable, we just emit a single-node plan
tree: SeqScan.  But say we choose a PartialSeqScan plan.  The tlist
can't be pushed down because pg_backend_pid() must run in the leader.
So, if Gather can't do projection, we'll end up with
Result->Gather->PartialSeqScan.  If we make Gather projection-capable,
we can just end up with Gather->PartialSeqScan.

I prefer the second outcome.  TBH, the major reason is that I've just
been experimenting with injecting single-copy Gather nodes into Plan
trees above the join nest and below any upper plan nodes that get
stuck on top and seeing which regression tests fail.  Since we do a
lot of EXPLAIN-ing in the plan, I hacked EXPLAIN not to show the
Gather nodes.  But it does show the extra Result nodes which get
generated because Gather isn't projection-capable, and that causes a
huge pile of spurious test failures.  Even with the patch I posted
applied, there are some residual failures that don't look simple to
resolve, because sometimes an initplan or subplan attaches to the
Gather node since something is being projected there.  So if you just
have EXPLAIN look through those nodes and show their children instead,
you still don't get the same plans you would without the test code in
all cases, but it helps a lot.

> That said, I don't understand Tom's comment either.  Surely the planner
> is going to choose to do the projection in the innermost node possible,
> so that the children nodes are going to do the projections most of the
> time.  But if for whatever reason this fails to happen, wouldn't it make
> more sense to do it at Gather than having to put a Result on top?

The planner doesn't seem to choose to do projection in the innermost
node possible.  The final tlist only gets projected at the top of the
join tree.  Beneath that, it seems like we project in order to avoid
carrying Vars through nodes where that would be a needless expense,
but that's just dropping columns, not computing anything.  That having
been said, I don't think that takes anything away from your chain of
reasoning here, and I agree with your conclusion.  There seems to be
little reason to force a Result node atop a Gather node when we don't
do that for most other node types.

Also, the patch I'm proposing I think actually makes things quite a
bit cleaner than the status quo, because the current code initializes
the projection info and then ignores the projection info itself, while
using the slot that gets set up by initializing the projection info.
That just seems goofy.  If there's some reason not to do what I'm
proposing here, I'm happy to do whatever we agree is better, but I
don't think leaving it the way it is now makes any sense.

> Projections are a weird construct as implemented, yeah.  I imagine it's
> because of performance reasons, because having separate Result nodes
> everywhere would be a lot slower, wouldn't it?

I'm not sure.  It'd be my guess that this is why it wasn't made a
separate node to begin with, but I don't know.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] DIAGNOSTICS = SKIPPED_ROW_COUNT

2015-10-22 Thread dinesh kumar
On Wed, Oct 14, 2015 at 4:06 PM, Robert Haas  wrote:

> On Wed, Oct 14, 2015 at 6:28 PM, dinesh kumar 
> wrote:
> > I see this feature as an add on to do the parallel DML operations.
> > There won't be any problem, if operations are mutually exclusive.
> > I mean, each session operates on unique set of tuples.
> >
> > In the above case, we don't even need of SKIP LOCKED wait policy.
> >
> > But when it comes to mutually dependent operations, isn't it nice to
> > provide how many were locked by the other sessions? Or at least a HINT to
> > the other session like,
> >
> > GET DIAGNOSTICS var = DID_I_MISS_ANYTHING_FROM_OTHER_SESSIONS;
> >
> > I agree that adding a counter will take a performance hit.
> > Rather than going back to my actual proposal of providing the counter value,
> > isn't it good to provide a boolean-type HINT if we miss at least a single
> > tuple?
>
> Suppose there are 5 locked rows and 5 unlocked rows in the heap and you do
> this:
>
> select * from t1 for share skip locked limit 5
>
> The Boolean you propose will be true if the first 5 rows in physical
> order are locked, and otherwise it will be false.  But there's no
> difference between those two scenarios from the perspective of the
> application.  Here's another example:
>
> with foo as (select * from t1 for share skip locked) select * from foo
> where a = 2;
>
> If foo contains any locked rows at all, this will return true,
> regardless of whether a = 2.
>
> It's true that, for a lot of normal-ish queries, LockRows is applied
> late enough that your proposed Boolean would return the intended
> answer.  But there are a bunch of exceptions, like the ones shown
> above, and there might be more in the future.
>
>
Hi Robert,

As usual, great guidance from you. Thanks :-)

But I'm still trying to see whether there is a way we can implement this for
all use cases.
Will update this thread with my findings.
Will update this thread with my findings.


> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>



-- 

Regards,
Dinesh
manojadinesh.blogspot.com


Re: [HACKERS] make Gather node projection-capable

2015-10-22 Thread Alvaro Herrera
Robert Haas wrote:
> On Thu, Oct 22, 2015 at 1:38 PM, Tom Lane  wrote:
> > Robert Haas  writes:
> >> The Gather node, as currently committed, is neither projection-capable
> >> nor listed as an exception in is_projection_capable_plan.  Amit
> >> discovered this in testing, and I hit it in my testing as well.  We
> >> could just mark it as being not projection-capable, but I think it
> >> might be better to go the other way and give it projection
> >> capabilities.
> >
> > Um ... why would you not want the projections to happen in the child
> > nodes, where they could be parallelized?  Or am I missing something?
> 
> You probably would, but sometimes that might not be possible; for
> example, the tlist might contain a parallel-restricted function (which
> therefore has to run in the leader).

I don't understand your reply.  Failing to parallelize the child node
does not prevent it from doing the projection itself, does it?

That said, I don't understand Tom's comment either.  Surely the planner
is going to choose to do the projection in the innermost node possible,
so that the children nodes are going to do the projections most of the
time.  But if for whatever reason this fails to happen, wouldn't it make
more sense to do it at Gather than having to put a Result on top?


> >> While that's not the end of the world, it seems to needlessly fly in
> >> the face of the general principle that nodes should generally try to
> >> support projection.
> >
> > I'm not sure there is any such principle.
> 
> I just inferred that this was the principle from reading the code; it
> doesn't seem to be documented anywhere.  In fact, what projection
> actually means doesn't seem to be documented anywhere.  Feel free to
> set me straight.  That having been said, I hope there's SOME principle
> other than "whatever we happened to implement".  All of our scan and
> join nodes seem to have projection capability  - I assume that's not
> an accident.  It would simplify the executor code if we ripped all of
> that out and instead had a separate Project node (or used Result), but
> for some reason we have not.

Projections are a weird construct as implemented, yeah.  I imagine it's
because of performance reasons, because having separate Result nodes
everywhere would be a lot slower, wouldn't it?

-- 
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [PATCH] Typos in comments

2015-10-22 Thread Robert Haas
On Tue, Oct 20, 2015 at 6:29 PM, Jim Nasby  wrote:
> On 10/20/15 11:08 AM, CharSyam wrote:
>>
>> I fixed some typos in postgres.
>> They are all in comments. :)
>
> These all look good to me. RFC.

Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] make Gather node projection-capable

2015-10-22 Thread Robert Haas
On Thu, Oct 22, 2015 at 1:38 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> The Gather node, as currently committed, is neither projection-capable
>> nor listed as an exception in is_projection_capable_plan.  Amit
>> discovered this in testing, and I hit it in my testing as well.  We
>> could just mark it as being not projection-capable, but I think it
>> might be better to go the other way and give it projection
>> capabilities.
>
> Um ... why would you not want the projections to happen in the child
> nodes, where they could be parallelized?  Or am I missing something?

You probably would, but sometimes that might not be possible; for
example, the tlist might contain a parallel-restricted function (which
therefore has to run in the leader).

>> While that's not the end of the world, it seems to needlessly fly in
>> the face of the general principle that nodes should generally try to
>> support projection.
>
> I'm not sure there is any such principle.

I just inferred that this was the principle from reading the code; it
doesn't seem to be documented anywhere.  In fact, what projection
actually means doesn't seem to be documented anywhere.  Feel free to
set me straight.  That having been said, I hope there's SOME principle
other than "whatever we happened to implement".  All of our scan and
join nodes seem to have projection capability  - I assume that's not
an accident.  It would simplify the executor code if we ripped all of
that out and instead had a separate Project node (or used Result), but
for some reason we have not.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] clearing opfuncid vs. parallel query

2015-10-22 Thread Robert Haas
On Thu, Oct 22, 2015 at 1:40 PM, YUriy Zhuravlev
 wrote:
> On Thursday 22 October 2015 13:25:52 you wrote:
>> It would be more useful, if we're going to autogenerate code,
> Are we going to use autogenerated code?

I thought that's what you were proposing.  Process the struct
definitions and emit .c files.

>> to do it from the actual struct definitions.
> I can generate XML/JSON from the actual structs. I suggested XML/JSON because
> analyzing C code is much more difficult. But my current generator just uses
> the structures from the header files (via pycparser).

Anything that is part of the build process will have to be done in C or Perl.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH] SQL function to report log message

2015-10-22 Thread dinesh kumar
On Thu, Oct 22, 2015 at 11:15 AM, Jim Nasby 
wrote:

> On 10/22/15 2:20 AM, dinesh kumar wrote:
>
>>
>> 2. Using this function, if we provide any "NULL" argument to the function,
>>   we should either skip it or report it. I see this is what the function
>> is doing.
>>
>> postgres=# SELECT pg_report_log('INFO', 'NULL', false, NULL, NULL);
>> INFO:  NULL
>>
>> postgres=# SELECT pg_report_log('INFO', 'NULL', false, 'NULL', 'NULL');
>> INFO:  NULL
>> DETAIL:  NULL /-- Are you suggesting to change this behaviour/
>> HINT:  NULL
>>
>
> It should operate the same as what was decided for RAISE.
>
> I'd say it should also support the remaining RAISE options as well
> (COLUMN, CONSTRAINT, DATATYPE, TABLE, SCHEMA).
>
> I think hide_statement is a better name than ishidestmt. It would be nice
> if RAISE supported that too...
>
> I think the function should also allow specifying a condition name instead
> of a SQL state, same as RAISE does.
>
> In other words, this function and raise should operate exactly the same
> unless there's a really strong reason not to. Otherwise it's just going to
> create confusion.
>
>
Thanks Jim,

That makes sense to me. Let me cover these options too, and I will update here.


> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com
>



-- 

Regards,
Dinesh
manojadinesh.blogspot.com


Re: [HACKERS] pg_recvlogical fixes

2015-10-22 Thread Euler Taveira

On 21-10-2015 18:36, Andres Freund wrote:

On 2015-10-21 11:52:31 -0300, Euler Taveira wrote:

While testing wal2json, I faced some problems with pg_recvlogical. Attached
is a series of patches that can improve pg_recvlogical. Patches #2 and #3 are
bugfixes (and should be applied to 9.5 too). Patch #1 is not mandatory for
9.5.



#1: add a bunch of checks to complain when using an option that is not
available in the specified action;


I'm not a fan of doing that. Doing that for every option tends to be
more annoying than helpful. E.g. pgbench's checks require you to
pointlessly remove a lot of harmless options just to be able to pass -i.

Your comparison is not fair (8 vs. 28 options). I tend to agree that a lot
of checks are not nice to maintain. However, it is good UI practice.
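
For example, a check of this kind costs almost nothing to write and maintain
(just a sketch; variable names as I remember them from pg_recvlogical):

	/* complain about options that the chosen action cannot use */
	if (do_drop_slot && outfile != NULL)
	{
		fprintf(stderr,
				_("%s: --file cannot be used together with --drop-slot\n"),
				progname);
		exit(1);
	}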



#2: there is a wrong check because startpos option can be specified with
--create-slot;



#3: doesn't ignore startpos in --create-slot because that action could be
specified together with --start action (that uses that option);


It doesn't make sense to specify it with --create-slot though - you
can't even know what the start position would mean before the slot is
created. You can't get older records than the ones at slot creation
time, and you can't know what a future xlog position would mean.

If it doesn't make sense, why can --create-slot be specified together 
with --start at all? Maybe a comment could clarify your point (circa 
line 902), because the docs don't make it clear that a past (or even 
future) LSN cannot be specified when both options are given together.



--
   Euler Taveira   Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento




Re: [HACKERS] [PATCH] SQL function to report log message

2015-10-22 Thread Jim Nasby

On 10/22/15 2:20 AM, dinesh kumar wrote:


2. Using this function, if we provide any "NULL" argument to the function,
  we should either skip it or report it. I see this is what the function
is doing.

postgres=# SELECT pg_report_log('INFO', 'NULL', false, NULL, NULL);
INFO:  NULL

postgres=# SELECT pg_report_log('INFO', 'NULL', false, 'NULL', 'NULL');
INFO:  NULL
DETAIL:  NULL /-- Are you suggesting to change this behaviour/
HINT:  NULL


It should operate the same as what was decided for RAISE.

I'd say it should also support the remaining RAISE options as well 
(COLUMN, CONSTRAINT, DATATYPE, TABLE, SCHEMA).


I think hide_statement is a better name than ishidestmt. It would be 
nice if RAISE supported that too...


I think the function should also allow specifying a condition name 
instead of a SQL state, same as RAISE does.


In other words, this function and raise should operate exactly the same 
unless there's a really strong reason not to. Otherwise it's just going 
to create confusion.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] clearing opfuncid vs. parallel query

2015-10-22 Thread YUriy Zhuravlev
On Thursday 22 October 2015 13:25:52 you wrote:
> It would be more useful, if we're going to autogenerate code,
Are we going to use autogenerated code?

> to do it from the actual struct definitions.
I can generate XML/JSON from the actual structs. I suggested XML/JSON because 
analyzing C code is much more difficult. But my current generator just uses 
the structures from the header files (via pycparser).

Thanks.
-- 
YUriy Zhuravlev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] make Gather node projection-capable

2015-10-22 Thread Tom Lane
Robert Haas  writes:
> The Gather node, as currently committed, is neither projection-capable
> nor listed as an exception in is_projection_capable_plan.  Amit
> discovered this in testing, and I hit it in my testing as well.  We
> could just mark it as being not projection-capable, but I think it
> might be better to go the other way and give it projection
> capabilities.

Um ... why would you not want the projections to happen in the child
nodes, where they could be parallelized?  Or am I missing something?

> While that's not the end of the world, it seems to needlessly fly in
> the face of the general principle that nodes should generally try to
> support projection.

I'm not sure there is any such principle.

regards, tom lane




Re: [HACKERS] clearing opfuncid vs. parallel query

2015-10-22 Thread Robert Haas
On Thu, Oct 22, 2015 at 1:19 PM, YUriy Zhuravlev
 wrote:
> And then another question:
> What do you think about generating equalfuncs.c, copyfuncs.c, readfuncs.c,
> and outfuncs.c from XML or JSON?
> That way we would not have to change the code in four places when changing nodes.

It would be more useful, if we're going to autogenerate code, to do it
from the actual struct definitions.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] make Gather node projection-capable

2015-10-22 Thread Robert Haas
The Gather node, as currently committed, is neither projection-capable
nor listed as an exception in is_projection_capable_plan.  Amit
discovered this in testing, and I hit it in my testing as well.  We
could just mark it as being not projection-capable, but I think it
might be better to go the other way and give it projection
capabilities.  Otherwise, we're going to start generating lots of
plans like this:

Result
-> Gather
  -> Partial Seq Scan

While that's not the end of the world, it seems to needlessly fly in
the face of the general principle that nodes should generally try to
support projection.  So attached is a patch to make Gather
projection-capable (gather-project.patch).  It has a slight dependency
on my patch to fix up the tqueue machinery for record types, so I've
attached that patch here as well (tqueue-record-types.patch).

Comments?  Reviews?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
commit d17bac203638c0d74696602069693aa41dea1894
Author: Robert Haas 
Date:   Wed Oct 7 12:43:22 2015 -0400

Modify tqueue infrastructure to support transient record types.

Commit 4a4e6893aa080b9094dadbe0e65f8a75fee41ac6, which introduced this
mechanism, failed to account for the fact that the RECORD pseudo-type
uses transient typmods that are only meaningful within a single
backend.  Transferring such tuples without modification between two
cooperating backends does not work.  This commit installs a system
for passing the tuple descriptors over the same shm_mq being used to
send the tuples themselves.  The two sides might not assign the same
transient typmod to any given tuple descriptor, so we must also
substitute the appropriate receiver-side typmod for the one used by
the sender.  That adds some CPU overhead, but still seems better than
being unable to pass records between cooperating parallel processes.

diff --git a/src/backend/executor/nodeGather.c b/src/backend/executor/nodeGather.c
index 69df9e3..4791320 100644
--- a/src/backend/executor/nodeGather.c
+++ b/src/backend/executor/nodeGather.c
@@ -221,6 +221,7 @@ gather_getnext(GatherState *gatherstate)
 
 			/* wait only if local scan is done */
 			tup = TupleQueueFunnelNext(gatherstate->funnel,
+	   slot->tts_tupleDescriptor,
 	   gatherstate->need_to_scan_locally,
 	   &done);
 			if (done)
diff --git a/src/backend/executor/tqueue.c b/src/backend/executor/tqueue.c
index 67143d3..53b69e0 100644
--- a/src/backend/executor/tqueue.c
+++ b/src/backend/executor/tqueue.c
@@ -21,23 +21,55 @@
 #include "postgres.h"
 
 #include "access/htup_details.h"
+#include "catalog/pg_type.h"
 #include "executor/tqueue.h"
+#include "funcapi.h"
+#include "lib/stringinfo.h"
 #include "miscadmin.h"
+#include "utils/array.h"
+#include "utils/memutils.h"
+#include "utils/typcache.h"
 
 typedef struct
 {
 	DestReceiver pub;
 	shm_mq_handle *handle;
+	MemoryContext	tmpcontext;
+	HTAB	   *recordhtab;
+	char		mode;
 }	TQueueDestReceiver;
 
+typedef struct RecordTypemodMap
+{
+	int			remotetypmod;
+	int			localtypmod;
+} RecordTypemodMap;
+
 struct TupleQueueFunnel
 {
 	int			nqueues;
 	int			maxqueues;
 	int			nextqueue;
 	shm_mq_handle **queue;
+	char	   *mode;
+	HTAB	   *typmodmap;
 };
 
+#define		TUPLE_QUEUE_MODE_CONTROL			'c'
+#define		TUPLE_QUEUE_MODE_DATA				'd'
+
+static void tqueueWalkRecord(TQueueDestReceiver *tqueue, Datum value);
+static void tqueueWalkRecordArray(TQueueDestReceiver *tqueue, Datum value);
+static void TupleQueueHandleControlMessage(TupleQueueFunnel *funnel,
+			Size nbytes, char *data);
+static HeapTuple TupleQueueHandleDataMessage(TupleQueueFunnel *funnel,
+			TupleDesc tupledesc, Size nbytes,
+			HeapTupleHeader data);
+static HeapTuple TupleQueueRemapTuple(TupleQueueFunnel *funnel,
+	 TupleDesc tupledesc, HeapTuple tuple);
+static Datum TupleQueueRemapRecord(TupleQueueFunnel *funnel, Datum value);
+static Datum TupleQueueRemapRecordArray(TupleQueueFunnel *funnel, Datum value);
+
 /*
  * Receive a tuple.
  */
@@ -46,12 +78,178 @@ tqueueReceiveSlot(TupleTableSlot *slot, DestReceiver *self)
 {
 	TQueueDestReceiver *tqueue = (TQueueDestReceiver *) self;
 	HeapTuple	tuple;
+	HeapTupleHeader tup;
+	AttrNumber	i;
 
 	tuple = ExecMaterializeSlot(slot);
+	tup = tuple->t_data;
+
+	/*
+	 * If any of the columns that we're sending back are records, special
+	 * handling is required, because the tuple descriptors are stored in a
+	 * backend-local cache, and the backend receiving data from us need not
+	 * have the same cache contents we do.  We grovel through the tuple,
+	 * find all the transient record types contained therein, and send
+	 * special control messages through the queue so that the receiving
+	 * process can interpret them correctly.
+	 */
+	for (i = 0; i < slot->tts_tupleDescriptor->natts; ++i)
+	{
+		Form_pg_attribute attr = slot->tts_tupleDescriptor->attrs[i];
+		MemoryContext	oldcontext;
+
+		/* Ignore null

Re: [HACKERS] clearing opfuncid vs. parallel query

2015-10-22 Thread YUriy Zhuravlev
On Thursday 22 October 2015 12:53:49 you wrote:
> On Thu, Oct 22, 2015 at 12:15 PM, YUriy Zhuravlev
> 
>  wrote:
> > Hello.
> > Currently, using nodeToString and stringToNode you cannot pass a full
> > plan. In this regard, what is the plan to fix it? Or does the parallel
> > query work not have this problem?
> 
> It's already fixed.  See commits
> a0d9f6e434bb56f7e5441b7988f3982feead33b3 and
> 9f1255ac859364a86264a67729dbd1a36dd63ff2.

Ahh. Thanks.

And then another question:
What do you think about generating equalfuncs.c, copyfuncs.c, readfuncs.c, 
and outfuncs.c from XML or JSON?
That way we would not have to change the code in four places when changing nodes.

-- 
YUriy Zhuravlev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] Freeze avoidance of very large table.

2015-10-22 Thread Masahiko Sawada
On Thu, Oct 22, 2015 at 4:11 PM, Torsten Zühlsdorff
 wrote:
> On 21.10.2015 02:05, Masahiko Sawada wrote:
>>
>> On Sat, Oct 10, 2015 at 4:20 AM, Robert Haas 
>> wrote:
>>>
>>> On Thu, Oct 8, 2015 at 1:52 PM, Andres Freund  wrote:

 I don't see the problem? I mean catversion will reliably tell you which
 format the vm is in?
>>>
>>>
>>> Totally agreed.
>>>
 We could additionally use the opportunity to as a metapage, but that
 seems like an independent thing.
>>>
>>>
>>> I agree with that, too.
>>>
>>
>> Attached the updated v18 patch fixes some bugs.
>> Please review the patch.
>
>
> I've just checked the comments:

Thank you for taking the time to review this patch.
Attached is the updated patch (v19).

> File: /doc/src/sgml/catalogs.sgml
>
> +Number of pages that are marked all-frozen in the tables's
> Should be:
> +Number of pages that are marked all-frozen in the tables

I changed it as follows.
+Number of pages that are marked all-frozen in the table's

A similar sentence exists for relallvisible.

> +ANALYZE, and a few DDL coomand such as
> Should be:
> +ANALYZE, and a few DDL command such as

Fixed.

> File: doc/src/sgml/maintenance.sgml
>
> +When the all pages of table are eventually marked as frozen by
> VACUUM,
> Should be:
> +When all pages of the table are eventually marked as frozen by
> VACUUM,

Fixed.

> File: /src/backend/access/heap/visibilitymap.c
>
> + * visibility map bit.  Then, we lock the buffer.  But this creates a race
> Should be:
> + * visibility map bit.  Than we lock the buffer.  But this creates a race

I didn't change this sentence actually, so I kept it.

> + * buffer, the PD_ALL_VISIBLE or PD_ALL_FROZEN bit gets set.  If that
> happens,
> Should be:
> + * buffer, the PD_ALL_VISIBLE or PD_ALL_FROZEN bit gets set. If that
> happens,
> (Remove duplicate white space before if)

The other sentences seem to have double white space after the period,
so I kept it.

Please review it.

Regards,

--
Masahiko Sawada


000_add_frozen_bit_into_visibilitymap_v19.patch
Description: Binary data



Re: [HACKERS] clearing opfuncid vs. parallel query

2015-10-22 Thread Robert Haas
On Thu, Oct 22, 2015 at 12:15 PM, YUriy Zhuravlev
 wrote:
> Hello.
> Currently, using nodeToString and stringToNode you cannot pass a full plan. In
> this regard, what is the plan to fix it? Or does the parallel query work not
> have this problem?

It's already fixed.  See commits
a0d9f6e434bb56f7e5441b7988f3982feead33b3 and
9f1255ac859364a86264a67729dbd1a36dd63ff2.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Checkpoint throttling issues

2015-10-22 Thread Andres Freund
On 2015-10-22 09:52:25 -0400, Robert Haas wrote:
> On Mon, Oct 19, 2015 at 6:10 AM, Andres Freund  wrote:
> > 1) The progress passed to CheckpointWriteDelay() will often be wrong -
> >it's calculated as num_written / num_to_write, but num_written is only
> >incremented if the buffer hasn't since independently been written
> >out. That's bad because it mean's we'll think we're further and
> >further behind if there's independent writeout activity.
> >
> >Simple enough to fix, we gotta split num_written into num_written
> >(for stats purposes) and num_processed (for progress).
> >
> >This is pretty much a bug, but I'm slightly worried about
> >backpatching a fix because it can have a rather noticeable
> >behavioural impact.
> 
> I think this is an algorithmic improvement, not a bug fix.

progress, passed to CheckpointWriteDelay, is defined as:
 * 'progress' is an estimate of how much of the work has been done, as a
 * fraction between 0.0 meaning none, and 1.0 meaning all done.

passing a potentially wildly inaccurate value (like close to 0, because
most buffers have since been displaced) seems more like a bug than an
algorithmic improvement... I can't see how that could have been
intentional.
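
Concretely, the fix I have in mind is essentially this (sketch only, not the
final patch):

	num_processed++;			/* every buffer we have dealt with */

	if (SyncOneBuffer(buf_id, false) & BUF_WRITTEN)
		num_written++;			/* only what we actually wrote, for the stats */

	/* pace against buffers processed, not buffers we happened to write */
	CheckpointWriteDelay(flags, (double) num_processed / num_to_write);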

But I'm not wild about backpatching these, so it's probably a moot
distinction.

> >I think the sleep time should be computed adaptively based on the
> >number of buffers remaining and the remaining time. There's probably
> >better formulations, but that seems like an easy enough improvement
> >and considerably better than now.
> 
> One thing to keep in mind here is that somebody did work a few years
> ago to reduce the number of wake-ups per second that PostgreSQL
> generates when idle.

As far as I can see checkpointing hasn't been touched in the course of
that work. For power usage it's probably not bad to complete checkpoints
faster than necessary and then sleep for longer :/. So it might be good
to have an upper limit to the sleep time.
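
Something along these lines is what I'd picture (rough sketch, names invented),
replacing the fixed pg_usleep(100000L) in CheckpointWriteDelay():

	double	secs_left = checkpoint_deadline - elapsed_secs;
	int		bufs_left = num_to_write - num_processed;
	long	sleep_usec = 0;

	if (bufs_left > 0 && secs_left > 0)
		sleep_usec = Min((long) (1000000.0 * secs_left / bufs_left), 1000000L);

	if (sleep_usec > 0)
		pg_usleep(sleep_usec);	/* capped at 1 s to limit oversleeping */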

> I like the idea of an adaptive sleep time.

Good. I do too ;)

Greetings,

Andres Freund




Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-10-22 Thread Alvaro Herrera
Syed, Rahila wrote:


> @@ -355,6 +356,7 @@ vacuum(int options, RangeVar *relation, Oid relid, 
> VacuumParams *params,
>   vac_update_datfrozenxid();
>   }
>  
> + pgstat_reset_activityflag;
>   /*
>* Clean up working storage --- note we must do this after
>* StartTransactionCommand, else we might be trying to delete the active

Does this actually compile?


> @@ -596,11 +630,42 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
>   /* Log cleanup info before we touch indexes */
>   vacuum_log_cleanup_info(onerel, vacrelstats);
>  
> + /*
> +  * If passes through indexes exceed 1 add
> +  * pages equal to rel_index_pages to the count of
> +  * total pages to be scanned.
> +  */
> + if (vacrelstats->num_index_scans >= 1)
> + {
> + total_index_pages = total_index_pages + 
> rel_index_pages;
> + total_pages = total_heap_pages + 
> total_index_pages;
> + }

Having to keep total_pages updated each time you change one of the
summands seems tedious and error-prone.  Why can't it be computed
whenever it is going to be used instead?

> + memcpy((char *) progress_message[0], 
> schemaname, schemaname_len);
> + progress_message[0][schemaname_len] = '\0';
> + strcat(progress_message[0],".");
> + strcat(progress_message[0],relname);

snprintf()?  I don't think you need to keep track of schemaname_len at
all.
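
i.e. the whole memcpy/strcat dance can just be (sketch):

	snprintf(progress_message[0], sizeof(progress_message[0]), "%s.%s",
			 schemaname, relname);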

> + scanned_index_pages += 
> RelationGetNumberOfBlocks(Irel[i]);
> + scanned_total_pages = scanned_total_pages + 
> RelationGetNumberOfBlocks(Irel[i]);
> + /* Report progress to the statistics collector */
> + progress_param[0] = total_pages;
> + progress_param[1] = scanned_total_pages;
> + progress_param[2] = total_heap_pages;
> + progress_param[3] = vacrelstats->scanned_pages;
> + progress_param[4] = total_index_pages;
> + progress_param[5] = scanned_index_pages;

In fact, I wonder if you need to send total_pages at all -- surely the
client can add both total_heap_pages and total_index_pages by itself ...

> + memcpy((char *) progress_message[0], schemaname, 
> schemaname_len);
> + progress_message[0][schemaname_len] = '\0';
> + strcat(progress_message[0],".");
> + strcat(progress_message[0],relname);

snprintf().

> diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
> index ab018c4..f97759e 100644
> --- a/src/backend/postmaster/pgstat.c
> +++ b/src/backend/postmaster/pgstat.c
> @@ -2851,6 +2851,55 @@ pgstat_report_activity(BackendState state, const char 
> *cmd_str)
>   pgstat_increment_changecount_after(beentry);
>  }
>  
> +/* -
> + * Called from VACUUM  after every heap page scan or index scan
> + * to report progress
> + * -
> + */
> +
> +void
> +pgstat_report_progress(uint *param1, int num_of_int, double *param2, int 
> num_of_float,
> + char 
> param3[PROGRESS_MESSAGE_LENGTH][N_PROGRESS_PARAM],
> + int num_of_string)
> +{
> + volatile PgBackendStatus *beentry = MyBEEntry;
> + int i;
> +
> + if (!beentry)
> + return;
> +
> + if (!pgstat_track_activities)
> + return;
> +
> + pgstat_increment_changecount_before(beentry);
> +
> + for(i = 0; i < num_of_int; i++)
> + {
> + beentry->progress_param[i] = param1[i];
> + }
> + for (i = 0; i < num_of_float; i++)
> + {
> + beentry->progress_param_float[i] = param2[i];
> + }
> + for (i = 0; i < num_of_string; i++)
> + {
> + strcpy((char *)beentry->progress_message[i], param3[i]);
> + }
> + pgstat_increment_changecount_after(beentry);
> +}

It seems a bit strange that the remaining progress_param entries are not
initialized to anything.  Also, why aren't the number of params of each
type saved too?  In the receiving code you check whether each value
equals 0, and if it does then report NULL, but imagine vacuuming a table
with no indexes where the number of index pages is going to be zero.
Shouldn't we display zero there rather than null?  Maybe I'm missing
something and that does work fine.
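
What I would expect is roughly this (sketch; the count fields are invented and
the arrays are assumed to have N_PROGRESS_PARAM entries):

	beentry->progress_num_int = num_of_int;
	beentry->progress_num_float = num_of_float;
	beentry->progress_num_string = num_of_string;

	/* clear the slots this report does not use, so readers never see stale data */
	for (i = num_of_int; i < N_PROGRESS_PARAM; i++)
		beentry->progress_param[i] = 0;
	for (i = num_of_float; i < N_PROGRESS_PARAM; i++)
		beentry->progress_param_float[i] = 0;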

This patch lacks a comment somewhere explaining how this whole thing
works.

> diff --git a/src/include/pgstat.h b/src/include/pgstat.h
> index 9ecc163..4214b3d 100644
> --- a/src/include/pgsta

Re: [HACKERS] Re: [BUGS] BUG #13694: Row Level Security by-passed with CREATEUSER permission

2015-10-22 Thread Tom Lane
Stephen Frost  writes:
> * Joe Conway (m...@joeconway.com) wrote:
>> On 10/21/2015 12:46 PM, Tom Lane wrote:
>>> Attached patch rips out CREATEUSER and NOCREATEUSER options lock, stock,
>>> and barrel.

>> Looks good to me.

>>> Another possibility is to change them to actually mean CREATEROLE and
>>> NOCREATEROLE.  I think probably a clean break is better though.

>> I think that would be too confusing. I'd rather see them go away ala
>> your patch.

> Agreed.

Hearing no objections, done that way.

regards, tom lane




Re: [HACKERS] clearing opfuncid vs. parallel query

2015-10-22 Thread David Fetter
On Thu, Oct 22, 2015 at 07:15:35PM +0300, YUriy Zhuravlev wrote:
> Hello.
> Currently, using nodeToString and stringToNode you cannot pass a
> full plan. In this regard, what is the plan to fix it? Or does the
> parallel query work not have this problem?
> 
> > This turns out not to be straightforward to code, because we don't
> > have a generic plan tree walker, 
> 
> I have an in-house development: a Python tool that analyzes the header
> files and generates a universal walker (for parser, path, executor, etc.
> trees), as well as a serializer and deserializer to jsonb.
> Maybe I should publish this code?

Please do.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] clearing opfuncid vs. parallel query

2015-10-22 Thread YUriy Zhuravlev
Hello.
Currently, using nodeToString and stringToNode you cannot pass a full plan. In 
this regard, what is the plan to fix it? Or does the parallel query work not 
have this problem?

> This turns out not to be straightforward to code, because we
> don't have a generic plan tree walker, 

I have an in-house development: a Python tool that analyzes the header files and 
generates a universal walker (for parser, path, executor, etc. trees), as well 
as a serializer and deserializer to jsonb. Maybe I should publish this code?

Thanks.

-- 
YUriy Zhuravlev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] checkpointer continuous flushing

2015-10-22 Thread Fabien COELHO


Hello Andres,


there would be as much as was written since the last sleep, about 100
ms ago, which is not likely to be gigabytes?


In many cases we don't sleep all that frequently - after one 100ms sleep
we're already behind a lot.


I think that "being behind" is not a problem as such; it is really the way 
the scheduler has been designed and works, keeping pace with wall-clock 
progress through little bursts of writes. If you reduce the sleep time a 
lot then it would end up having writes interleaved with small sleeps, but 
this would be bad for performance as the OS would lose the ability 
to write much data sequentially on the disk.


It does not mean that the default 100 ms is a good figure, but the "being 
behind" is a feature, not an issue as such.


And even so, it's pretty easy to get into checkpoint scenarios with ~500 
mbyte/s as a writeout rate.


Hmmm. Not with my hardware :-)

Only issuing a sync_file_range() 10 times for that is obviously 
problematic.


Hmmm. Then it should depend on the expected write capacity of the 
underlying disks...



The implementation pretty always will go behind schedule for some
time. Since sync_file_range() doesn't flush in the foreground I don't
think it's important to do the flushing in concert with sleeping.


For me it is important to avoid accumulating too large flushes, and that is
the point of the call before sleeping.


I don't follow this argument. It's important to avoid large flushes,
therefore we potentially allow large flushes to accumulate?


On my simple test hardware the flushes are not large, I think, so the 
problem does not arise. Maybe I should check.



My testing seems to show that just adding a limit of 32 buffers to
FileAsynchronousFlush() leads to markedly better results.


Hmmm. 32 buffers means 256 KB, which is quite small.


Why?


Because the point of sorting is to generate sequential writes so that the
HDD has a lot of aligned stuff to write without moving the head, and 32 is
rather small for that.


A sync_file_range(SYNC_FILE_RANGE_WRITE) doesn't synchronously write
data back. It just puts it into the write queue.


Yes.

You can have merging between IOs from either side.  But more importantly 
you can't merge that many requests together anyway.


Probably.


The aim is to not overwhelm the request queue - which is where the
coalescing is done. And usually that's rather small.


That is an argument. How small, though? It seems to be 128 by default, so
I'd rather have 128? Also, it can be changed, so maybe it should really be a
guc?
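
To make the discussion concrete, what I understand is being suggested is
roughly this (sketch only; the names are invented, and the 32 is exactly the
kind of thing that could be the guc):

	/* FLUSH_AFTER_BLOCKS = 32, i.e. 32 * 8 kB = 256 kB */
	pending_blocks++;
	if (pending_blocks >= FLUSH_AFTER_BLOCKS)
	{
		(void) sync_file_range(fd,
							   (off_t) pending_start * BLCKSZ,
							   (off_t) pending_blocks * BLCKSZ,
							   SYNC_FILE_RANGE_WRITE);
		pending_start += pending_blocks;
		pending_blocks = 0;
	}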


I couldn't see any benefits above (and below) 32 on a 20 drive system,


So it is one kind of (big) hardware. Assuming that pages are contiguous, 
how much is written on each disk depends on the RAID type, the stripe 
size, and when it is really written depends on the various caches (in the 
RAID HW card if any, on the disk, ...), so whether 32 at the OS level is 
the right size is pretty unclear to me. I would have said the larger the 
better, but indeed you should avoid blocking.



so I doubt it's worthwhile. It's actually good for interactivity to
allow other requests into the queue concurrently - otherwise other
reads/writes will obviously have a higher latency...


Sure. Now on my tests, with my (old & little) hardware it seemed quite 
smooth. What I'm driving at is that what is good may be relative and 
depend on the underlying hardware, which makes it not obvious to choose 
the right parameter.



If you flush much more sync_file_range starts to do work in the
foreground.


Argh, too bad. I would have hoped that it would just deal with it in an
asynchronous way,


It's even in the man page:
"Note  that  even  this  may  block if you attempt to write more than
request queue size."


Hmmm. What about choosing "request queue size * 0.5", then ?


Because it should be in shared buffers where pg needs it?


Huh? I'm just suggesting p = mmap(fd, offset, bytes);msync(p, bytes);munmap(p);
instead of sync_file_range().
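
Spelled out with the actual calls, that would presumably look like this
(sketch; needs <sys/mman.h>):

	void   *p = mmap(NULL, nbytes, PROT_READ | PROT_WRITE, MAP_SHARED,
					 fd, offset);

	if (p != MAP_FAILED)
	{
		(void) msync(p, nbytes, MS_ASYNC);	/* or MS_SYNC to wait for it */
		(void) munmap(p, nbytes);
	}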


I think that I do not really understand how it may work, but possibly it 
could.



ISTM that it is rather an argument for taking the tablespace into the
sorting, not necessarily for a binary heap.


I don't understand your problem with that. The heap specific code is
small, smaller than your NextBufferToWrite() implementation?


You have not yet posted the updated version of the patch.

The complexity of the round-robin scan on the array is O(1) and very few 
instructions, plus some stop condition which is mostly true I think if the 
writes are balanced between table spaces, there is no dynamic allocation 
in the data structure (it is an array). The binary heap is O(log(n)), 
probably there are dynamic allocations and frees when extracting/inserting 
something, there are functions calls to rebalance the tree, and so on. Ok, 
"n" is expected to be small.


So basically, for me it is not obviously superior to the previous version. 
Now I'm also tired, so if it works reasonably I

Re: [HACKERS] Re: [BUGS] BUG #13694: Row Level Security by-passed with CREATEUSER permission

2015-10-22 Thread Stephen Frost
* Joe Conway (m...@joeconway.com) wrote:
> On 10/21/2015 12:46 PM, Tom Lane wrote:
> > Attached patch rips out CREATEUSER and NOCREATEUSER options lock, stock,
> > and barrel.
> 
> Looks good to me.
> 
> > Another possibility is to change them to actually mean CREATEROLE and
> > NOCREATEROLE.  I think probably a clean break is better though.
> 
> 
> I think that would be too confusing. I'd rather see them go away ala
> your patch.

Agreed.

Thanks!

Stephen




Re: [HACKERS] pgbench throttling latency limit

2015-10-22 Thread Fabien COELHO



Argh. It's just because I used -P5. It's a bit confusing that the other
options are per second, and this is per interval...


I agree, but I'm unsure of a fix, beyond what is already done which is to
show units next to the figures...

ISTM that people expect "tps" for performance, even on several seconds.
When it comes to skipped transactions, a count seemed more natural. I
really just see this as an indicator that things are not going smoothly.

Maybe it could be shown as a percentage of scheduled transactions,
possibly with an option?

A mitigation is to always run with -P 1 :-).


Wouldn't printing average (per second) over the interval work?


Yes it would. That would be "skipped tps". Why not. The percentage also seems 
attractive to me, because it does not matter whether you get big 
figures or small figures, as it is relative.
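
E.g. something like this in the progress line (sketch; the variable names only
approximate pgbench's progress code, with "run" the report interval in µs):

	if (latency_limit && (skipped + ntx) > 0)
		fprintf(stderr, ", %.1f skipped/s (%.2f%% of scheduled)",
				1000000.0 * skipped / run,
				100.0 * skipped / (double) (skipped + ntx));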


--
Fabien.




Re: [HACKERS] a raft of parallelism-related bug fixes

2015-10-22 Thread Robert Haas
On Wed, Oct 21, 2015 at 9:04 AM, Amit Langote  wrote:
> ... node *need* not be parallel aware?

Yes, thanks.  Committed that way.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] a raft of parallelism-related bug fixes

2015-10-22 Thread Robert Haas
On Tue, Oct 20, 2015 at 6:12 PM, Simon Riggs  wrote:
> Not on your case in a big way, just noting the need for change there.

Yes, I appreciate your attitude.  I think we are on the same wavelength.

> I'll help as well, but if you could start with enough basics to allow me to
> ask questions that will help. Thanks.

Will try to keep pushing in that direction.  May be easier once some
of the dust has settled.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH v3] GSSAPI encryption support

2015-10-22 Thread Robbie Harwood
Andres Freund  writes:

> On 2015-10-22 16:47:09 +0900, Michael Paquier wrote:
>> Hm, and that's why you chose this way of going. My main concern about
>> this patch is that it adds on top of the existing Postgres protocol a
>> layer to encrypt and decrypt the messages between server and client
>> based on GSSAPI. All messages transmitted between client and server
>> are changed to 'g' messages on the fly and switched back to their
>> original state at reception. This is symbolized by the four routines
>> you added in the patch in this purpose, two for frontend and two for
>> backend, each one for encryption and decryption. I may be wrong of
>> course, but it seems to me that this approach will not survive
>> committer-level screening because of the fact that context-level
>> things invade higher level protocol messages.
>
> Agreed. At least one committer here indeed thinks this approach is not
> acceptable (and I've said so upthread).

Okay, I'll make some changes.  Before I do, though, since this is not
the approach I came up with, can you explicitly state what you're
looking for here?  It subjectively seems that I'm getting a lot of
feedback of "this feels wrong" without suggestion for improvement.

To be clear, what I need to know is:

- What changes do you want to see in the wire protocol?  (And how will
  fallback be supported if that's affected?)

- Since this seems to be an important sticking point, what files am I
  encouraged to change (or prohibited from changing)?  (Fallback makes
  this complex.)

- I've been assuming that we care about fallback, but I'd like to be
  told that it's something postgres actually wants to see because it's
  the most intricate part of these changes.  (I'm reasonably confident
  that the code becomes simpler without it, and I myself have no use for
  it.)

If I understand what you're asking for (and the above is intended to be
sure that I will), this will not be a trivial rework, so I want to be
really sure before doing that because writing this code a third time is
something I don't relish.

Thanks,
--Robbie




Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-10-22 Thread Robert Haas
On Tue, Oct 20, 2015 at 4:58 AM, Syed, Rahila  wrote:
>>I think that you should add the flag or something which indicates whether 
>>this backend is running VACUUM or not, into PgBackendStatus.
>>pg_stat_vacuum_progress should display the entries of only backends with that 
>>flag set true. This design means that you need to set the flag to true when 
>>starting VACUUM and reset at the end of VACUUM progressing.
> Please find attached  updated patch which adds a flag in PgBackendStatus 
> which indicates whether this backend in running VACUUM.

Flag isn't reset on error.
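
Roughly, something like this is needed around the vacuum work (sketch, using
the patch's pgstat_reset_activityflag as the reset call):

	PG_TRY();
	{
		/* ... the actual vacuum work ... */
	}
	PG_CATCH();
	{
		/* make sure the "running VACUUM" flag is cleared on error too */
		pgstat_reset_activityflag();
		PG_RE_THROW();
	}
	PG_END_TRY();

	pgstat_reset_activityflag();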

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



