Re: [HACKERS] UPDATE of partition key

2018-01-21 Thread Amit Khandekar
On 22 January 2018 at 02:40, Robert Haas  wrote:
> On Sun, Jan 21, 2018 at 1:43 AM, Tom Lane  wrote:
>> Robert Haas  writes:
>>> Committed with a bunch of mostly-cosmetic revisions.
>>
>> Buildfarm member skink has been unhappy since this patch went in.
>> Running the regression tests under valgrind easily reproduces the
>> failure.  Now, I might be wrong about which of the patches committed
>> on Friday caused the unhappiness, but the valgrind backtrace sure
>> looks like it's to do with partition routing:
>
> Yeah, that must be the fault of this patch.  We assign to
> proute->subplan_partition_offsets[update_rri_index] from
> update_rri_index = 0 .. num_update_rri, and there's an Assert() at the
> bottom of this function that checks this, so probably this is indexing
> off the end of the array.  I bet the issue happens when we find all of
> the UPDATE result rels while there are still partitions left; then,
> subplan_index will be equal to the length of the
> proute->subplan_partition_offsets array and we'll be indexing just off
> the end.

Yes, right, that's what is happening. It is not happening at an Assert,
though (there is no assert in that function). It is happening when we
try to access the array here:

if (proute->subplan_partition_offsets &&
    proute->subplan_partition_offsets[subplan_index] == i)

Attached is a fix, where I have introduced another field,
PartitionTupleRouting.num_subplan_partition_offsets, so that above, we
can add another condition (subplan_index <
proute->num_subplan_partition_offsets) in order to stop accessing the
array once we are done with all the offset array elements.
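
With the new field, the guarded access becomes roughly this (a sketch
based on the description above; the exact code is in the attached patch):

if (proute->subplan_partition_offsets &&
    subplan_index < proute->num_subplan_partition_offsets &&
    proute->subplan_partition_offsets[subplan_index] == i)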

Ran the update.sql test with valgrind enabled on my laptop, and the
valgrind output now does not show errors.

-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company


fix_valgrind_issue.patch
Description: Binary data


Handling better supported channel binding types for SSL implementations

2018-01-21 Thread Michael Paquier
Hi all,

Per the recent discussions around support for new SSL implementations in
Postgres, it is becoming rather clear to me that the backend needs to be
a bit smarter about the way it decides whether to publish
SCRAM-SHA-256-PLUS in the list of SASL mechanisms that clients can choose
from for authentication.

This has been discussed here for the macOS SSL implementation:
https://www.postgresql.org/message-id/20180122014637.GE22690%40paquier.xyz
As well as here for GnuTLS:
https://www.postgresql.org/message-id/CAB7nPqRJteyoyyj8E__v1D1SMoj8jpv6ZPyHuc7Md45%2BED-uMA%40mail.gmail.com

Attached is a patch which is an attempt to make this choice cleaner for
new SSL implementations. As we are (rightly!) calling the APIs to fetch
the channel binding data only when necessary, I think that we need an
extra API for SSL implementations to let the server decide if channel
binding mechanisms should be published or not. There could be multiple
possibilities, like making this API return only a boolean flag. However
I have made a more extensible choice by having each SSL implementation
build a list of supported channel bindings. This matters for each
implementation as:
- GnuTLS only supports tls-unique.
- OpenSSL supports both tls-unique and tls-server-end-point.
- MacOS would support none.
There is also the argument that this list's contents are not directly
used now; and, based on what I saw of the macOS SSL and GnuTLS patches,
that would not change afterwards either.
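
For illustration, here is what the same API could return in the other
implementations discussed above (a sketch only, not code from any posted
patch):

/* in a hypothetical be-secure-gnutls.c: only tls-unique is available */
List *
be_tls_list_channel_bindings(void)
{
	return list_make1(pstrdup(SCRAM_CHANNEL_BINDING_TLS_UNIQUE));
}

/* in be-secure-securetransport.c: no channel binding data is obtainable,
 * so the server would publish no -PLUS mechanisms at all */
List *
be_tls_list_channel_bindings(void)
{
	return NIL;
}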

I am adding that to the next CF for consideration.

Thoughts?
--
Michael
diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index 746d7cbb8a..ecaab21e13 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -873,6 +873,7 @@ CheckSCRAMAuth(Port *port, char *shadow_pass, char **logdetail)
 	int			inputlen;
 	int			result;
 	bool		initial;
+	List	   *channel_bindings = NIL;
 
 	/*
 	 * SASL auth is not supported for protocol versions before 3, because it
@@ -898,7 +899,17 @@ CheckSCRAMAuth(Port *port, char *shadow_pass, char **logdetail)
 		strlen(SCRAM_SHA256_NAME) + 3);
 	p = sasl_mechs;
 
-	if (port->ssl_in_use)
+#ifdef USE_SSL
+	/*
+	 * Get the list of channel binding types supported by this SSL
+	 * implementation to determine if server should publish -PLUS
+	 * mechanisms or not.
+	 */
+	channel_bindings = be_tls_list_channel_bindings();
+#endif
+
+	if (port->ssl_in_use &&
+		list_length(channel_bindings) > 0)
 	{
 		strcpy(p, SCRAM_SHA256_PLUS_NAME);
 		p += strlen(SCRAM_SHA256_PLUS_NAME) + 1;
diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c
index fc6e8a0a88..95511a61b3 100644
--- a/src/backend/libpq/be-secure-openssl.c
+++ b/src/backend/libpq/be-secure-openssl.c
@@ -58,6 +58,7 @@
 #include 
 #endif
 
+#include "common/scram-common.h"
 #include "libpq/libpq.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -1215,6 +1216,18 @@ be_tls_get_peerdn_name(Port *port, char *ptr, size_t len)
 		ptr[0] = '\0';
 }
 
+/*
+ * Routine to get the list of channel binding types available in this SSL
+ * implementation. For OpenSSL, both tls-unique and tls-server-end-point
+ * are supported.
+ */
+List *
+be_tls_list_channel_bindings(void)
+{
+	return list_make2(pstrdup(SCRAM_CHANNEL_BINDING_TLS_UNIQUE),
+	  pstrdup(SCRAM_CHANNEL_BINDING_TLS_END_POINT));
+}
+
 /*
  * Routine to get the expected TLS Finished message information from the
  * client, useful for authorization when doing channel binding.
diff --git a/src/include/libpq/libpq-be.h b/src/include/libpq/libpq-be.h
index 49cb263110..3c37e800c1 100644
--- a/src/include/libpq/libpq-be.h
+++ b/src/include/libpq/libpq-be.h
@@ -209,6 +209,7 @@ extern bool be_tls_get_compression(Port *port);
 extern void be_tls_get_version(Port *port, char *ptr, size_t len);
 extern void be_tls_get_cipher(Port *port, char *ptr, size_t len);
 extern void be_tls_get_peerdn_name(Port *port, char *ptr, size_t len);
+extern List *be_tls_list_channel_bindings(void);
 extern char *be_tls_get_peer_finished(Port *port, size_t *len);
 extern char *be_tls_get_certificate_hash(Port *port, size_t *len);
 #endif


signature.asc
Description: PGP signature


Re: pgbench - add \if support

2018-01-21 Thread Pavel Stehule
2018-01-21 23:31 GMT+01:00 Craig Ringer :

> On 16 January 2018 at 06:28, Fabien COELHO  wrote:
>
>
>>>> Here is a rebase. I made some tests use actual expressions instead of
>>>> just 0 and 1. No other changes.
>>>
>>> Sigh. Better with the attachment. Sorry for the noise.
>>
>> Here is a very minor rebase.
>
> I'm a smidge worried about this. It seems like psql is growing a scripting
> language. Do we want to go our own way with a kind of organically grown
> scripting system? Or should we be looking at embedding client-side
> scripting in a more structured, formal way instead? Embed a lua interpreter
> or something?
>

A few scripting features don't make a scripting language. \if in psql is a
nice feature that reduces duplicate and unreadable code, and helps with
deployment and test scripts. pgbench and psql should have a similar
environment - and I think \if should be there as well.
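
For example, a deployment script can branch on the server version with the
\if that psql already has (a hypothetical sketch):

SELECT current_setting('server_version_num')::int >= 100000 AS is_v10 \gset
\if :is_v10
	\echo 'version 10 or later: new-style setup'
\else
	\echo 'older server: legacy setup'
\endif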

Using Lua is not a bad idea - in psql too - I have thought about it a lot,
but in that case it would be better to start from zero.

Regards

Pavel

>
> --
>  Craig Ringer   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>


Re: Doc tweak for huge_pages?

2018-01-21 Thread Justin Pryzby
On Mon, Jan 22, 2018 at 07:10:33PM +1300, Thomas Munro wrote:
> On Mon, Jan 22, 2018 at 6:30 PM, Justin Pryzby  wrote:
> > On Mon, Jan 22, 2018 at 03:54:26PM +1300, Thomas Munro wrote:
> >> On Fri, Jan 12, 2018 at 1:12 PM, Thomas Munro
> >>  wrote:
> >> > On Tue, Jan 9, 2018 at 6:24 AM, Catalin Iacob  
> >> > wrote:
> >> > I don't know enough about this to make such a strong recommendation
> >> > myself, which is why I was only trying to report that bad performance
> >> > had been observed on some version, not that you shouldn't do it.  Any
> >> > other views on this stronger statement?
> >>
> >> Now that the Windows huge pages patch has landed, here is a rebase.  I
> >> took your alternative and tweaked it a tiny bit more.  Thoughts?
>
> Sorry, right, that was 100% wrong.  It would probably be correct to
> remove the "not", but let's just remove that bit.  New version
> attached.

+PostgreSQL. On Linux, this is called
+"transparent huge pages", but since that feature is known to cause
+performance degradation with
+PostgreSQL on current Linux versions
+(unlike explicit use of huge_pages), its use is
+discouraged.

Consider this shorter, less-severe sounding alternative:
"... (but note that this feature can degrade performance of some
PostgreSQL workloads)."

Justin



Re: Doc tweak for huge_pages?

2018-01-21 Thread Thomas Munro
On Mon, Jan 22, 2018 at 6:30 PM, Justin Pryzby  wrote:
> On Mon, Jan 22, 2018 at 03:54:26PM +1300, Thomas Munro wrote:
>> On Fri, Jan 12, 2018 at 1:12 PM, Thomas Munro
>>  wrote:
>> > On Tue, Jan 9, 2018 at 6:24 AM, Catalin Iacob  
>> > wrote:
>> > I don't know enough about this to make such a strong recommendation
>> > myself, which is why I was only trying to report that bad performance
>> > had been observed on some version, not that you shouldn't do it.  Any
>> > other views on this stronger statement?
>>
>> Now that the Windows huge pages patch has landed, here is a rebase.  I
>> took your alternative and tweaked it a tiny bit more.  Thoughts?
>
> +   
> +Note that, besides explicitly requesting huge pages via
> +huge_pages,
> => I would just say:
> "Note that, besides huge pages requested explicitly, ..."

+1

> + In Linux this automatic use is
> => ON Linux comma?

+1

> +called "transparent huge pages" and is not enabled by default in
> +popular distributions as of the time of writing, but since 
> transparent
>
> => really ?  I don't know if I've ever seen it not enabled.  In any case,
> that's a strong statement to make (to be disabled in ALL popular 
> distributions).

Argh.

> https://blog.nelhage.com/post/transparent-hugepages/
> => It is enabled ("enabled=always") by default in most Linux distributions.

Sorry, right, that was 100% wrong.  It would probably be correct to
remove the "not", but let's just remove that bit.  New version
attached.

Thanks.

-- 
Thomas Munro
http://www.enterprisedb.com


huge-pages-doc-tweak-v5.patch
Description: Binary data


Re: Doc tweak for huge_pages?

2018-01-21 Thread Justin Pryzby
On Mon, Jan 22, 2018 at 03:54:26PM +1300, Thomas Munro wrote:
> On Fri, Jan 12, 2018 at 1:12 PM, Thomas Munro
>  wrote:
> > On Tue, Jan 9, 2018 at 6:24 AM, Catalin Iacob  
> > wrote:
> > I don't know enough about this to make such a strong recommendation
> > myself, which is why I was only trying to report that bad performance
> > had been observed on some version, not that you shouldn't do it.  Any
> > other views on this stronger statement?
> 
> Now that the Windows huge pages patch has landed, here is a rebase.  I
> took your alternative and tweaked it a tiny bit more.  Thoughts?

+   
+Note that, besides explicitly requesting huge pages via
+huge_pages,
=> I would just say:
"Note that, besides huge pages requested explicitly, ..."

+ In Linux this automatic use is
=> ON Linux comma?

+called "transparent huge pages" and is not enabled by default in
+popular distributions as of the time of writing, but since transparent

=> really ?  I don't know if I've ever seen it not enabled.  In any case,
that's a strong statement to make (to be disabled in ALL popular distributions).

I checked all our servers, including centos6 and ubuntu t-LTS and x-LTS.  On a
limited few where it was disabled, I'd explicitly done so.

On a server on which I just installed ubuntu-x LTS, with 4.13.0-26-generic:
pryzbyj@gta-ubuntu:~$ cat /sys/kernel/mm/transparent_hugepage/enabled
always [madvise] never

https://github.com/torvalds/linux/commit/13ece886d99cd668483113f7238e419d5331af26
=> the compile time default is to disable, but (if enabled at compile time),
the runtime default is "always".

On centos7
Linux template0 3.10.0-693.11.6.el7.x86_64 #1 SMP Thu Jan 4 01:06:37 UTC 2018 
x86_64 x86_64 x86_64 GNU/Linux
$ cat /sys/kernel/mm/transparent_hugepage/enabled 
[always] madvise never
$ grep TRANS /boot/config-3.10.0-693.11.6.el7.x86_64 
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
# CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set

https://blog.nelhage.com/post/transparent-hugepages/
=> It is enabled ("enabled=always") by default in most Linux distributions.

Justin



Re: pgbench - add \if support

2018-01-21 Thread Fabien COELHO


Hello Craig,

> I'm a smidge worried about this. It seems like psql is growing a
> scripting language.

The patch is about aligning pgbench with psql, which already has \if.

> Do we want to go our own way with a kind of organically grown
> scripting system? Or should we be looking at embedding client-side
> scripting in a more structured, formal way instead? Embed a lua
> interpreter or something?

My 0.02€ is that the point is to deal with useful/needed simple client
capabilities while integrating gracefully with bare server-side executed
SQL.

As for useful client-side capabilities, for both psql & pgbench ISTM that
it is more in line with a limited cpp-like thing: include, expressions,
variables, conditions... maybe minimal error handling. No loops.

As for a language interpreter, it would raise the question of which
language (lua, tcl, python, perl, VB, sh, R, ...) and of the graceful
(upward-compatible) integration of any such language: e.g. mixing pieces
of bare SQL with any other existing language would require scanning
conventions that do not exist.

psql & pgbench already have ":x" variables. psql can set variables from
SQL (\gset), and pgbench can set them with limited expressions (\set),
which have been extended to be more complete, and there were use cases
that motivated an \if.

ISTM enough to align both tools for reasonably simple use cases that
could arise when running a basic SQL script or bench. If you have
something really complicated, then full-fledged programming is the
answer, which cannot be bare-SQL compatible.

So the answer is that it is okay to aim at "limited" scripting because it
covers useful use cases.


--
Fabien.

Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2018-01-21 Thread Peter Geoghegan
On Sun, Jan 21, 2018 at 6:34 PM, Amit Kapila  wrote:
>> Why is this okay for Gather nodes, though? nodeGather.c looks at
>> pcxt->nworkers_launched during initialization, and appears to at least
>> trust it to indicate that more than zero actually-launched workers
>> will also show up when "nworkers_launched > 0". This trust seems critical
>> when parallel_leader_participation is off, because "node->nreaders ==
>> 0" overrides the parallel_leader_participation GUC's setting (note
>> that node->nreaders comes directly from pcxt->nworkers_launched). If
>> zero workers show up, and parallel_leader_participation is off, but
>> pcxt->nworkers_launched/node->nreaders is non-zero, won't the Gather
>> never make forward progress?
>
> Ideally, that situation should be detected and we should throw an
> error, but that doesn't happen today.  However, it will be handled
> with Robert's patch on the other thread for CF entry [1].

I knew that, but I was confused by your sketch of the
WaitForParallelWorkerToAttach() API [1]. Specifically, your suggestion
that the problem was unique to nbtsort.c, or was at least something
that nbtsort.c had to take a special interest in. It now appears more
like a general problem with a general solution, and likely one that
won't need *any* changes to code in places like nodeGather.c (or
nbtsort.c, in the case of my patch).

I guess that you meant that parallel CREATE INDEX is the first thing
to care about the *precise* number of nworkers_launched -- that is
kind of a new thing. That doesn't seem like it makes any practical
difference to us, though. I don't see why nbtsort.c should take a
special interest in this problem, for example by calling
WaitForParallelWorkerToAttach() itself. I may have missed something,
but right now ISTM that it would be risky to make the API anything
other than what both nodeGather.c and nbtsort.c already expect (that
they'll either have nworkers_launched workers show up, or be able to
propagate an error).

[1] 
https://postgr.es/m/caa4ek1kzvxtcff8inhceviupxp4ywcs3rzuwjfqmttf75x2...@mail.gmail.com
-- 
Peter Geoghegan



Re: Doc tweak for huge_pages?

2018-01-21 Thread Thomas Munro
On Fri, Jan 12, 2018 at 1:12 PM, Thomas Munro
 wrote:
> On Tue, Jan 9, 2018 at 6:24 AM, Catalin Iacob  wrote:
>> So I tried to redo the second paragraph and ended up with the
>> attached. Rationale for the changes:
>> * changed "this feature" to "explicitly requesting huge pages" to
>> contrast with the automatic one described below
>> * made the wording of Linux THP more negative (but still with some
>> wiggle room for future kernel versions which might improve THP),
>> contrasting with the positive explicit request from this GUC
>> * integrated your mention of other OSes with automatic huge pages
>> * moved the new text to the last paragraph to lower its importance
>>
>> What do you think?
>
> I don't know enough about this to make such a strong recommendation
> myself, which is why I was only trying to report that bad performance
> had been observed on some version, not that you shouldn't do it.  Any
> other views on this stronger statement?

Now that the Windows huge pages patch has landed, here is a rebase.  I
took your alternative and tweaked it a tiny bit more.  Thoughts?

-- 
Thomas Munro
http://www.enterprisedb.com


huge-pages-doc-tweak-v4.patch
Description: Binary data


Re: [Sender Address Forgery]Re: pg_(total_)relation_size and partitioned tables

2018-01-21 Thread Michael Paquier
On Sun, Jan 21, 2018 at 07:16:38PM +1300, David Rowley wrote:
> On 20 January 2018 at 23:14, Michael Paquier  
> wrote:
>> If pg_partition_tree_tables() uses the top of the partition as input,
>> all the tree should show up. If you use a leaf, anything under the leaf
>> should show up. If a leaf is defined and it has no underlying leaves,
>> then only this outmost leaf should be listed.
> 
> hmm, that's thoroughly confusing. Just in case anyone else is stuck on
> that, I just need to mention that a leaf is the part that does not have
> branches, in nature or in computer science.

OK, sorry if my words are confusing. Imagine that you have the following
partition set:

        p
       / \
      /   \
    p1     p2
          /  \
         /    \
       p21    p22

If pg_partition_tree_tables() uses 'p' as input argument, then I would
imagine that it should return p, p1, p2, p21 and p22. If 'p2' is used,
then p2, p21 and p22 are the results. If either one of p1, p21 or p22 is
used as input, then the result is respectively p1, p21 or p22.
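
In SQL terms (illustration only; pg_partition_tree_tables does not exist
yet, so its call syntax here is an assumption):

CREATE TABLE p (a int) PARTITION BY LIST (a);
CREATE TABLE p1 PARTITION OF p FOR VALUES IN (1);
CREATE TABLE p2 PARTITION OF p FOR VALUES IN (2, 3) PARTITION BY LIST (a);
CREATE TABLE p21 PARTITION OF p2 FOR VALUES IN (2);
CREATE TABLE p22 PARTITION OF p2 FOR VALUES IN (3);

SELECT * FROM pg_partition_tree_tables('p');    -- p, p1, p2, p21, p22
SELECT * FROM pg_partition_tree_tables('p2');   -- p2, p21, p22
SELECT * FROM pg_partition_tree_tables('p21');  -- p21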

Amit's patch seems to be doing that kind of logic by using
find_all_inheritors, which is good. We need to distinguish relations that
are part of a partition set from those that are part of an INHERIT link,
and, at least it seems to me, the patch is not careful about that. I
haven't tested what is proposed though, but this portion likely needs
more thought.
--
Michael


signature.asc
Description: PGP signature


Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2018-01-21 Thread Amit Kapila
On Mon, Jan 22, 2018 at 12:50 AM, Peter Geoghegan  wrote:
> On Sat, Jan 20, 2018 at 8:38 PM, Amit Kapila  wrote:
>> It would have been better if there were some comments besides that
>> field, but I think it has been covered at another place in the code.
>> See comments in LaunchParallelWorkers().
>>
>> /*
>> * Start workers.
>> *
>> * The caller must be able to tolerate ending up with fewer workers than
>> * expected, so there is no need to throw an error here if registration
>> * fails.  It wouldn't help much anyway, because registering the worker in
>> * no way guarantees that it will start up and initialize successfully.
>> */
>
> Why is this okay for Gather nodes, though? nodeGather.c looks at
> pcxt->nworkers_launched during initialization, and appears to at least
> trust it to indicate that more than zero actually-launched workers
> will also show up when "nworkers_launched > 0". This trust seems critical
> when parallel_leader_participation is off, because "node->nreaders ==
> 0" overrides the parallel_leader_participation GUC's setting (note
> that node->nreaders comes directly from pcxt->nworkers_launched). If
> zero workers show up, and parallel_leader_participation is off, but
> pcxt->nworkers_launched/node->nreaders is non-zero, won't the Gather
> never make forward progress?
>

Ideally, that situation should be detected and we should throw an
error, but that doesn't happen today.  However, it will be handled
with Robert's patch on the other thread for CF entry [1].


[1] - https://commitfest.postgresql.org/16/1341/

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] Supporting huge pages on Windows

2018-01-21 Thread Thomas Munro
On Mon, Jan 22, 2018 at 3:45 AM, Magnus Hagander  wrote:
> I got myself a working build env now so I can at least verify it builds,
> which it does.
>
> With that, I'm pushing this. Let's see what the buildfarm thinks of it. And
> if others end up complaining about the platform drop, but I doubt that.
>
> Once again, apologies for the delay, and thanks for the patch!

+To start the database server on the command prompt as a standalone process,
+not as a Windows service, the command prompt must be run as an administrator
+User Access Control (UAC) must be disabled. When the UAC is enabled, the normal
+command prompt revokes the user right Lock Pages in Memory when started.

Is that first sentence missing the word "and" after "administrator"?

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] Support for Secure Transport SSL library on macOS as OpenSSL alternative

2018-01-21 Thread Michael Paquier
On Mon, Jan 22, 2018 at 12:08:03AM +0100, Daniel Gustafsson wrote:
> The attached patchset rebases Secure Transport support over HEAD and adds
> stub functions for those that the SCRAM support added, to make everything
> compile and run the SSL testsuite.  There are no new features or bugfixes
> over the previously posted patches.
> 
> Wrt SCRAM, I’m probably thick but I can’t really see what I need to do to
> handle SCRAM, so I wouldn’t mind some cluesticks on that.  The Secure 
> Transport
> API doesn’t allow for getting the TLS Finished message (at least I haven’t 
> been
> able to find a way), so channel binding can’t be supported afaict.

OK, thanks. That sucks; perhaps Apple will improve things in the future,
as this is the kind of area where there is always interest.

> The testcode has been updated to handle Secure Transport, but it’s not
> in a clean form, rather a quick hack to get something running while the 
> project
> settles on how to handle multiple SSL implementations.
> 
> I have for now excluded the previous doc changes awaiting the discussion on the
> patch in 1f34fa82-52a0-1682-87ba-4c3c3d0af...@2ndquadrant.com, once that
> settles I’ll revive and write the documentation.  The same goes for GUCs etc
> which are discussed in other threads.
> 
> As per before, my patch for running tests against another set of binaries is
> included as well as a fix for connstrings with spaces, but with the recent
> hacking by Peter I assume this is superfluous.  It was handy for development 
> so
> I’ve kept it around though.

Yes that looks useful to test.

be-secure-securetransport.c is quite a long name by the way, perhaps we
could just live with be-secure-osx.c or be-secure-mac.c?

Here are some comments about the SCRAM/channel binding part..

+be_tls_get_peer_finished(Port *port, size_t *len)
+{
+   ereport(ERROR,
+   (errcode(ERRCODE_PROTOCOL_VIOLATION),
+errmsg("channel binding is not supported by this build")));
+   return NULL;
Those should mention the channel binding type.

In CheckSCRAMAuth(), you are missing the fact that SCRAM-SHA-256-PLUS is
still published to the client. I think that this is a mistake as no
channel binding types are supported. We may want to add an API for each
SSL implementation to help with that as I want to keep the code paths
fetching the channel binding data only invoked when necessary. So adding
a new API which returns a list of channel binding types supported by a
given backend would make the most sense to me. If the list is empty,
then -PLUS is not published by the backend. This is not a problem
related to your patch, more a problem that I need to solve as gnutls
only supports tls-unique. So I'll create a new thread on the matter with
a proper patch.

 note "setting up data directory";
-my $node = get_new_node('master');
+my $node = get_new_node('master', $ENV{SSLTEST_SERVER_BIN});
 $node->init;
This bit is in 0001, but this concept is introduced in 0002. (Not sure
how to feel about that yet, there are similar use-cases with
pg_upgrade's TAP tests where you may want to enforce a PATH.)

Nit: the patch has some whitespace issues. You may want to run git diff
--check or such before sending.
--
Michael


signature.asc
Description: PGP signature


Re: Bogus tags for comments, ACLs, and security labels in pg_dump

2018-01-21 Thread Tom Lane
I wrote:
> What I think we should do is fix pg_dump so that these classes of
> TOC entries do indeed have tags that meet the expectation held by
> _tocEntryRequired, as per the first attached patch.

Some further poking around identified another place that was taking
dubious shortcuts to identify large-object TOC entries, further down
in _tocEntryRequired.  Since that code is executed only during
binary upgrade, which will be working from a schema-only dump anyway,
it *might* be harmless.  But I'm not sure, and it's certainly far from
future-proof.  I also noted a copy-and-pasteo in dumpSecLabel.

Hence, I intend to add the attached to the back-patched fix proposed
previously.

regards, tom lane

diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 41741ae..acef20f 100644
*** a/src/bin/pg_dump/pg_backup_archiver.c
--- b/src/bin/pg_dump/pg_backup_archiver.c
*** _tocEntryRequired(TocEntry *te, teSectio
*** 2944,2957 
  	if (ropt->schemaOnly)
  	{
  		/*
  		 * In binary-upgrade mode, even with schema-only set, we do not mask
  		 * out large objects.  Only large object definitions, comments and
  		 * other information should be generated in binary-upgrade mode (not
  		 * the actual data).
  		 */
  		if (!(ropt->sequence_data && strcmp(te->desc, "SEQUENCE SET") == 0) &&
! 			!(ropt->binary_upgrade && strcmp(te->desc, "BLOB") == 0) &&
! 			!(ropt->binary_upgrade && strncmp(te->tag, "LARGE OBJECT ", 13) == 0))
  			res = res & REQ_SCHEMA;
  	}
  
--- 2944,2965 
  	if (ropt->schemaOnly)
  	{
  		/*
+ 		 * The sequence_data option overrides schema-only for SEQUENCE SET.
+ 		 *
  		 * In binary-upgrade mode, even with schema-only set, we do not mask
  		 * out large objects.  Only large object definitions, comments and
  		 * other information should be generated in binary-upgrade mode (not
  		 * the actual data).
  		 */
  		if (!(ropt->sequence_data && strcmp(te->desc, "SEQUENCE SET") == 0) &&
! 			!(ropt->binary_upgrade &&
! 			  (strcmp(te->desc, "BLOB") == 0 ||
! 			   (strcmp(te->desc, "ACL") == 0 &&
! 				strncmp(te->tag, "LARGE OBJECT ", 13) == 0) ||
! 			   (strcmp(te->desc, "COMMENT") == 0 &&
! 				strncmp(te->tag, "LARGE OBJECT ", 13) == 0) ||
! 			   (strcmp(te->desc, "SECURITY LABEL") == 0 &&
! 				strncmp(te->tag, "LARGE OBJECT ", 13) == 0))))
  			res = res & REQ_SCHEMA;
  	}
  
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 0bdd398..8e5e2d5 100644
*** a/src/bin/pg_dump/pg_dump.c
--- b/src/bin/pg_dump/pg_dump.c
*** dumpSecLabel(Archive *fout, const char *
*** 14811,14817 
  	if (dopt->no_security_labels)
  		return;
  
! 	/* Comments are schema not data ... except blob comments are data */
  	if (strncmp(target, "LARGE OBJECT ", 13) != 0)
  	{
  		if (dopt->dataOnly)
--- 14811,14817 
  	if (dopt->no_security_labels)
  		return;
  
! 	/* Security labels are schema not data ... except blob labels are data */
  	if (strncmp(target, "LARGE OBJECT ", 13) != 0)
  	{
  		if (dopt->dataOnly)


Re: BUGFIX: standby disconnect can corrupt serialized reorder buffers

2018-01-21 Thread Craig Ringer
On 6 January 2018 at 08:28, Alvaro Herrera  wrote:

> I think this should use ReadDirExtended with an elevel less than ERROR,
> and do nothing.
>
> Why have strcmp(".") and strcmp("..")?  These are going to be skipped by the
> comparison to the "xid" prefix anyway.  Looks like a waste of strcmp
> processing power.
>
> Please don't use bare sprintf() -- upgrade to snprintf.
>
> With this coding, if I put a root-owned file "xidfoo" in a replslot
> directory, it will PANIC the server.  Is that okay?  Why not read the
> file name with sscanf(), since we know the precise format it has?  Then
> we don't need to bother with random crap left around.  Maybe a good time
> to put the "xid-%u-lsn-%X-%X.snap" literal in a macro?   I guess the
> rationale is that if you let random people put "xidfoo" files in your
> replication slot dirs, you deserve a PANIC anyway, but I'm not sure.
>

I'm happy to address those comments.

The PANIC probably made sense when it was only done on startup, but not
now that it happens at runtime.

The rest is mainly retained from the prior code, but it's a good chance to
make those changes.
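
For reference, the sscanf()-based check could look roughly like this (a
sketch; the macro name is invented here and not from any posted patch):

#include <stdio.h>
#include <stdbool.h>

/* format string from Alvaro's review comment */
#define REORDER_SNAP_FNAME_FORMAT "xid-%u-lsn-%X-%X.snap"

/*
 * Return true only if fname parses as a serialized snapshot file name.
 * Stray entries such as ".", "..", or a root-owned "xidfoo" simply fail
 * the parse and can be skipped instead of PANICing.
 */
static bool
parse_snap_fname(const char *fname,
				 unsigned int *xid, unsigned int *hi, unsigned int *lo)
{
	return sscanf(fname, REORDER_SNAP_FNAME_FORMAT, xid, hi, lo) == 3;
}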

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: let's make the list of reportable GUCs configurable (was Re: Add %r substitution for psql prompts to show recovery status)

2018-01-21 Thread Craig Ringer
On 11 January 2018 at 09:55, Tom Lane  wrote:

> Robert Haas  writes:
> > But if we add this feature and somebody wants to use it for
> > server_version_num, it's really pretty simple.  In the startup packet,
> > you say _pq_.report=server_version_num.  Then, you call
> > PQparameterStatus(conn, "server_version_num").  If you don't get a
> > value, you try calling PQparameterStatus(conn, "server_version") and
> > extracting the second word.  If that still doesn't work then you give
> > up.  That doesn't seem either useless or difficult to implement
> > correctly from here.
>
> Yeah, but what's the point?  If yuou have to maintain the server_version
> parsing code anyway, you're not saving any complexity with this.  You're
> just creating a code backwater that you probably won't test adequately.
>
>
If we'd done server_version_num in 9.5, for example, less stuff would've
broken with pg10.

I just don't get it. The argument you use can be applied to almost any
change, to say "why bother making change  because people can't rely on
it for  years, so it's pointless to have it at all". Why add protocol v3?

PostgreSQL is a stable, long term project, and I'd like to plan for the
future. I also think people are way more likely to handle things like
--with-extra-version correctly when dealing with server_version_num, where
I don't much *care* if code parsing old server versions breaks.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: bytea bitwise logical operations implementation (xor / and / or / not)

2018-01-21 Thread Craig Ringer
On 13 January 2018 at 01:57, Christian Rossow 
wrote:

> Hi Fabien,
>
> > I think that the probability of getting these useful things into pg is
> > alas small. In the mean time, you may package and register it as an
> > extension?
> I aimed to close the asymmetry between bit vector operations (they also
> offer xor/and/etc.) and bytea operations. My code is more or less a 1:1
> copy of the corresponding bit vector operations (in fact, just a tiny
> bit easier, since bytea is byte-aligned, whereas bit vectors aren't).
>
> I just wanted to dump the code here, given that Tom suggested (in 2006)
> such an implementation might be a useful addition to bytea. But I won't
> feel offended in any way if the code won't be merged.
>

Seems like a good idea to me.

Conversion between bit varying and bytea would be a big help.

So would casts from bit varying or bytea <-> integer and other types.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: pgbench - add \if support

2018-01-21 Thread Craig Ringer
On 16 January 2018 at 06:28, Fabien COELHO  wrote:

>>> Here is a rebase. I made some tests use actual expressions instead of just
>>> 0 and 1. No other changes.
>>
>> Sigh. Better with the attachment. Sorry for the noise.
>
> Here is a very minor rebase.
>

I'm a smidge worried about this. It seems like psql is growing a scripting
language. Do we want to go our own way with a kind of organically grown
scripting system? Or should we be looking at embedding client-side
scripting in a more structured, formal way instead? Embed a lua interpreter
or something?

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: pl/perl extension fails on Windows

2018-01-21 Thread Noah Misch
On Sun, Jan 21, 2018 at 11:34:07AM +0100, Christian Ullrich wrote:
> * Christian Ullrich wrote:
> >* Noah Misch wrote:
> >>On Wed, Nov 29, 2017 at 11:45:35PM -0500, Tom Lane wrote:
> >>>Oh, OK.  In that case, we need to get some representatives of these
> >>>more modern builds into the buildfarm while we're at it.
> >>
> >>Yep.  Among machines already in the buildfarm, the one running member
> >>woodlouse is the best candidate for this.  Its owner could install
> >>http://strawberryperl.com/download/5.26.1.1/strawberry-perl-5.26.1.1-32bit.msi
> >>and setup another animal on the same machine that builds 32-bit and enables
> >>Perl.  Christian, are you interested in doing this?
> >
> >Ready to go, waiting for animal assignment. For now, I can confirm that it 
> >works, that is, the buildfarm --test run is successful.
> 
> Up and running now, name is whelk, first report on REL9_6_STABLE.
> 
> Sorry it took me another ten days to complete the configuration.

This is great.  Thanks.

Buildfarm metadata reports whelk using "Microsoft Visual C++ 2010", but the
run logs show it's using MSVC 2013, like woodlouse does.  Would you update
that buildfarm metadata (with update_personality.pl)?



Re: [HACKERS] Supporting huge pages on Windows

2018-01-21 Thread Andrew Dunstan


On 01/21/2018 01:02 PM, Andres Freund wrote:
> Hi,
>
> On 2018-01-21 13:42:13 +0200, Magnus Hagander wrote:
>> To add some more notes on this. Again, the API appears in Vista/2003.
>> Windows Vista went EOL (out of extended support even) in April 2017,
>> Windows 2003 did so in July 2015. Those are the versions that it's *in* --
>> obviously the versions without it are even older.
>>
>> Our binaries only support Windows 2008 and up (at least the ones by EDB,
>> which are the ones that have a supported-version matrix documented on our
>> site).
>>
>> We have traditionally supported older versions of Windows as long as people
>> build from source. But perhaps I'm way overreading that and we should just
>> bite the bullet, commit this patch, and declare those platforms as
>> completely dead by PostgreSQL 11?
> Yea, I think it's beyond time that we declare some old windows versions
> dead. There's enough weird behaviour in supported versions of windows
> (especially its socket API) that I really don't want to support more
> than necessary. And supporting versions that've been out of date for a
> while seems more than unnecessary.
>


I'll be quite happy to retire the XP machine running brolga, currawong
and frogmouth, if that's the consensus. XP is now long out of support.
OTOH I have personal experience of it running in many potentially
critical situations, still (hospitals, for example). I can, if people
want, keep the machine running just building the back branches.

I should probably look at setting up a modern 32-bit replacement (say
Windows 10 Pro-32).

cheers

andrew

-- 
Andrew Dunstan  https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [PATCH] Move all am-related reloption code into src/backend/access/[am-name] and get rid of relopt_kind for custom AM

2018-01-21 Thread Nikolay Shaplov
In a message of 3 September 2017 11:45:43, Alvaro Herrera wrote:

> I think we should split this in at least two commits,
Added another part for commit:
https://www.postgresql.org/message-id/43332102.S2V5pIjXRx@x200m
This one adds an enum relation option type.

-- 
Do code for fun.

signature.asc
Description: This is a digitally signed message part.


[PATCH][PROPOSAL] Add enum relation option type

2018-01-21 Thread Nikolay Shaplov

Hi!
While working with my big reloption patch,
https://www.postgresql.org/message-id/flat/2146419.veIEZdk4E4@x200m#2146419.veIEZdk4E4@x200m
I found out that all relation options of string type in current postgres
actually behave as an "enum" type. But each time this behavior is implemented
as a validate function plus strcmp calls to compare the option value against
each of the possible values.

I think this is not the best practice. It is better to have an enum type where
the option is technically an enum, and keep the string type for future usage
(for example for some kind of index patterns or something like that).

Right now the strcmp does not really slow postgres down, as both string
options are checked only when an index is created; one check there does not
matter. But if in the future somebody wants such an option checked on a
regular basis, it is better to have it as an enum type: "strcmp" it once while
parsing, and then just use "==" whenever the option value must be checked at
runtime.
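
To illustrate the difference (a minimal sketch; GIST_OPTION_BUFFERING_AUTO
is a name from the attached patch, the other names are invented here):

#include <string.h>
#include <stdbool.h>

typedef enum
{
	GIST_OPTION_BUFFERING_AUTO,
	GIST_OPTION_BUFFERING_ON,	/* hypothetical */
	GIST_OPTION_BUFFERING_OFF	/* hypothetical */
} GistBufferingMode;

/* string reloption: every runtime check pays for a strcmp */
static bool
buffering_enabled_string(const char *buffering)
{
	return strcmp(buffering, "on") == 0;
}

/* enum reloption: validated once at parse time, checked with plain "==" */
static bool
buffering_enabled_enum(GistBufferingMode buffering)
{
	return buffering == GIST_OPTION_BUFFERING_ON;
}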

The patch is in attachment. I hope the code is quite clear.

Possible flaws:

1. I've changed the error message from 'Valid values are "XXX", "YYY" and
"ZZZ".' to 'Valid values are "XXX", "YYY", "ZZZ".' to make the code a bit
simpler. If that is not acceptable, please let me know and I will add the
"and" back to the string.

2. Also about the string with the list of acceptable values: the code that
creates this string now lives inside the parse_one_reloption function, which
is a bit too complex. Maybe there already exists some helper function that
builds a string '"XXX", "YYY", "ZZZ"' from a list of XXX YYY ZZZ values that I
do not know of? Or maybe this is a reason to create such a function. If so,
where should it live? Inside reloptions.c? Or can it be reused, in which case
I'd better put it somewhere else?
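
If no such helper exists, a minimal sketch of one could be (the name and
placement are just a suggestion):

#include "postgres.h"
#include "lib/stringinfo.h"

/*
 * Build the string: "XXX", "YYY", "ZZZ" from a NULL-terminated array of
 * allowed values, for use in the "Valid values are ..." error message.
 */
static char *
format_allowed_values(const char **values)
{
	StringInfoData buf;
	int			i;

	initStringInfo(&buf);
	for (i = 0; values[i] != NULL; i++)
		appendStringInfo(&buf, "%s\"%s\"", i > 0 ? ", " : "", values[i]);
	return buf.data;
}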

I hope there will be no further difficulties with this patch, and that it can
be committed in due time.

--
Do code for fun.

diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 274f7aa..3d12a30 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -404,7 +404,9 @@ static relopt_real realRelOpts[] =
 	{{NULL}}
 };
 
-static relopt_string stringRelOpts[] =
+static const char *gist_buffering_enum_names[] = GIST_OPTION_BUFFERING_VALUE_NAMES;
+static const char *view_check_option_names[] = VIEW_OPTION_CHECK_OPTION_VALUE_NAMES;
+static relopt_enum enumRelOpts[] =
 {
 	{
 		{
@@ -413,10 +415,8 @@ static relopt_string stringRelOpts[] =
 			RELOPT_KIND_GIST,
 			AccessExclusiveLock
 		},
-		4,
-		false,
-		gistValidateBufferingOption,
-		"auto"
+		gist_buffering_enum_names,
+		GIST_OPTION_BUFFERING_AUTO
 	},
 	{
 		{
@@ -425,11 +425,14 @@ static relopt_string stringRelOpts[] =
 			RELOPT_KIND_VIEW,
 			AccessExclusiveLock
 		},
-		0,
-		true,
-		validateWithCheckOption,
-		NULL
+		view_check_option_names,
+		VIEW_OPTION_CHECK_OPTION_NOT_SET
 	},
+	{{NULL}}
+};
+
+static relopt_string stringRelOpts[] =
+{
 	/* list terminator */
 	{{NULL}}
 };
@@ -476,6 +479,12 @@ initialize_reloptions(void)
 								   realRelOpts[i].gen.lockmode));
 		j++;
 	}
+	for (i = 0; enumRelOpts[i].gen.name; i++)
+	{
+		Assert(DoLockModesConflict(enumRelOpts[i].gen.lockmode,
+								   enumRelOpts[i].gen.lockmode));
+		j++;
+	}
 	for (i = 0; stringRelOpts[i].gen.name; i++)
 	{
 		Assert(DoLockModesConflict(stringRelOpts[i].gen.lockmode,
@@ -514,6 +523,14 @@ initialize_reloptions(void)
 		j++;
 	}
 
+	for (i = 0; enumRelOpts[i].gen.name; i++)
+	{
+		relOpts[j] = &enumRelOpts[i].gen;
+		relOpts[j]->type = RELOPT_TYPE_ENUM;
+		relOpts[j]->namelen = strlen(relOpts[j]->name);
+		j++;
+	}
+
 	for (i = 0; stringRelOpts[i].gen.name; i++)
 	{
 		relOpts[j] = &stringRelOpts[i].gen;
@@ -611,6 +628,9 @@ allocate_reloption(bits32 kinds, int type, const char *name, const char *desc)
 		case RELOPT_TYPE_REAL:
 			size = sizeof(relopt_real);
 			break;
+		case RELOPT_TYPE_ENUM:
+			size = sizeof(relopt_enum);
+			break;
 		case RELOPT_TYPE_STRING:
 			size = sizeof(relopt_string);
 			break;
@@ -690,6 +710,24 @@ add_real_reloption(bits32 kinds, const char *name, const char *desc, double defa
 }
 
 /*
+ * add_enum_reloption
+ *		Add a new enum reloption
+ */
+void
+add_enum_reloption(bits32 kinds, const char *name, const char *desc,
+				   const char **allowed_values, int default_val)
+{
+	relopt_enum *newoption;
+
+	newoption = (relopt_enum *) allocate_reloption(kinds, RELOPT_TYPE_ENUM,
+												   name, desc);
+	newoption->allowed_values = allowed_values;
+	newoption->default_val = default_val;
+
+	add_reloption((relopt_gen *) newoption);
+}
+
+/*
  * add_string_reloption
  *		Add a new string reloption
  *
@@ -1192,6 +1230,78 @@ parse_one_reloption(relopt_value *option, char *text_str, int text_len,
 									   optreal->min, optreal->max)));
 			}
 			break;
+		case RELOPT_TYPE_ENUM:
+			{
+				relopt_enum *opt_enum = (relopt_enum *) option->gen;
+				int			i = 0;
+
+				parsed = false;
+				while (opt_enum->allowed_values[i])
+				{
+					if (pg_strcasecmp(value, 

Re: [HACKERS] UPDATE of partition key

2018-01-21 Thread Robert Haas
On Sun, Jan 21, 2018 at 1:43 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> Committed with a bunch of mostly-cosmetic revisions.
>
> Buildfarm member skink has been unhappy since this patch went in.
> Running the regression tests under valgrind easily reproduces the
> failure.  Now, I might be wrong about which of the patches committed
> on Friday caused the unhappiness, but the valgrind backtrace sure
> looks like it's to do with partition routing:

Yeah, that must be the fault of this patch.  We assign to
proute->subplan_partition_offsets[update_rri_index] from
update_rri_index = 0 .. num_update_rri, and there's an Assert() at the
bottom of this function that checks this, so probably this is indexing
off the end of the array.  I bet the issue happens when we find all of
the UPDATE result rels while there are still partitions left; then,
subplan_index will be equal to the length of the
proute->subplan_partition_offsets array and we'll be indexing just off
the end.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Remove PARTIAL_LINKING?

2018-01-21 Thread Tom Lane
Andres Freund  writes:
> We've still some support for building the backend with PARTIAL_LINKING /
> SUBSYS.o instead of the current objfiles.txt approach.
> Any objections to removing that?

+1.  I think the reason for holding onto the old code was mainly fear of
hitting command-line length limits in the link step.  Since that still
hasn't happened 10 years later, even on the oldest buildfarm critters,
it seems moot.  We'd probably just decide to desupport any platform where
that did happen in future.

regards, tom lane



Remove PARTIAL_LINKING?

2018-01-21 Thread Andres Freund
Hi,

We still have some support for building the backend with PARTIAL_LINKING /
SUBSYS.o instead of the current objfiles.txt approach.
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=9956ddc19164b02dc1925fb389a1af77472eba5e

Any objections to removing that? Seems that's largely just removing a
bit of logic from common.mk.

Note I'm not talking about whatever that aix logic in src/backend is,
that seems independent except for sharing the SUBSYS.o name.


I just encountered this while polishing my JIT patch, which has
*optional* support for building inlinable versions of backend
functions. It'd not be hard to make it work for PARTIAL_LINKING, but I
kinda don't see the point 10 years later...

Greetings,

Andres Freund



Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2018-01-21 Thread Peter Geoghegan
On Sat, Jan 20, 2018 at 8:38 PM, Amit Kapila  wrote:
> It would have been better if there were some comments besides that
> field, but I think it has been covered at another place in the code.
> See comments in LaunchParallelWorkers().
>
> /*
> * Start workers.
> *
> * The caller must be able to tolerate ending up with fewer workers than
> * expected, so there is no need to throw an error here if registration
> * fails.  It wouldn't help much anyway, because registering the worker in
> * no way guarantees that it will start up and initialize successfully.
> */

Why is this okay for Gather nodes, though? nodeGather.c looks at
pcxt->nworkers_launched during initialization, and appears to at least
trust it to indicate that more than zero actually-launched workers
will also show up when "nworkers_launched > 0". This trust seems critical
when parallel_leader_participation is off, because "node->nreaders ==
0" overrides the parallel_leader_participation GUC's setting (note
that node->nreaders comes directly from pcxt->nworkers_launched). If
zero workers show up, and parallel_leader_participation is off, but
pcxt->nworkers_launched/node->nreaders is non-zero, won't the Gather
never make forward progress?

Parallel CREATE INDEX does go a bit further. It assumes that
nworkers_launched *exactly* indicates the number of workers that
successfully underwent parallel initialization, and therefore can be
expected to show up.

Is there actually a meaningful difference between the way
nworkers_launched is depended upon in each case, though?

-- 
Peter Geoghegan



Re: [HACKERS] Supporting huge pages on Windows

2018-01-21 Thread Andres Freund
Hi,

On 2018-01-21 13:42:13 +0200, Magnus Hagander wrote:
> To add some more notes on this. Again, the API appears in Vista/2003.
> Windows Vista went EOL (out of extended support even) in April 2017,
> Windows 2003 did so in July 2015. Those are the versions that it's *in* --
> obviously the versions without it are even older.
> 
> Our binaries only support Windows 2008 and up (at least the ones by EDB,
> which are the ones that have a supported-version matrix documented on our
> site).
> 
> We have traditionally supported older versions of Windows as long as people
> build from source. But perhaps I'm way overreading that and we should just
> bite the bullet, commit this patch, and declare those platforms as
> completely dead by PostgreSQL 11?

Yea, I think it's beyond time that we declare some old windows versions
dead. There's enough weird behaviour in supported versions of windows
(especially its socket API) that I really don't want to support more
than necessary. And supporting versions that've been out of date for a
while seems more than unnecessary.

Greetings,

Andres Freund



Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall

2018-01-21 Thread Tom Lane
I've been hacking away at this patch, and attached is what I've got
so far.  I think this is committable, but if anyone wants to do
further review and testing, that'd be fine.

Per discussion, I got rid of the separate --set-db-properties switch:
additional database properties are now applied if and only if --create is
specified.  But the DATABASE TOC entry itself still contains only CREATE
DATABASE; the additional commands are carried in a "DATABASE PROPERTIES"
TOC entry so that they will be issued after reconnecting to the new DB.
This dodges the "ALTER DATABASE SET default_transaction_read_only"
problem.  (Furthermore, it turns out that we have to do it like that
because pg_restore issues any one TOC entry's contents as a single PQexec.
If you try to cram anything else in with the CREATE DATABASE, then the
server spits up because CREATE DATABASE becomes part of an implicit
transaction block.)
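
The restriction is easy to demonstrate by hand: anything bundled into a
single PQexec together with CREATE DATABASE fails, for example via psql -c,
which sends its argument as one query string:

$ psql -c 'CREATE DATABASE foo; ALTER DATABASE foo SET default_transaction_read_only = on;'
ERROR:  CREATE DATABASE cannot run inside a transaction block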

I also fixed it so that a database's comment, security label, and ACL are
restored only when saying --create.  This is different from the previous
behavior for DB comments and security labels, but it seems a great deal
more useful and consistent.  In no other case would pg_dump/pg_restore
attempt to restore a comment, security label, or ACL for an object it
hadn't created, so why should it act differently for databases?

Worth noting here is that if we accept that behavior, the problem for
which "COMMENT ON CURRENT_DATABASE" was proposed goes away, because
there's no longer a risk of trying to apply COMMENT ON DATABASE to the
wrong database.  We might still want that patch as a standalone feature,
but pg_dump no longer needs it.

Another point worth commenting on (this change was also in the v13 patch)
is that pg_dumpall formerly took some pains to omit the encoding and
locale specifications in its CREATE DATABASE commands, for databases whose
settings matched the cluster default.  This new implementation won't do
that.  We could change it to do so, but then a standalone "pg_dump
--create" would also act that way, which is not really easy to defend.
IIRC, the argument for pg_dumpall doing it like that was to make it
less painful to migrate a cluster to a new machine that might not have the
identical set of locales, or if one wanted to migrate everything to a new
encoding.  But those are not mainstream use-cases, and if you really need
to do that you can still get there by dumping databases individually
without using --create.  On the other hand, there are obvious gotchas
involved in letting a dump/reload silently change to another locale or
encoding.  So I think this is an improvement overall, but it bears
noting as a behavioral change.

Another point to note is that I dithered about whether to bump the
pg_dump archive version number, which would have the effect of preventing
pre-v11 versions of pg_restore from processing dumps made by this version.
The argument for breaking archive compatibility is that older pg_restore
versions will not know that it'd be a good idea to skip DATABASE
PROPERTIES TOC entries and database ACL entries if not using --create.
However, in default cases there won't be a DATABASE PROPERTIES entry to
skip.  Moreover, applying these entries unconditionally isn't that much
different conceptually from applying database comments or sec labels
unconditionally, as such older pg_restore versions would do anyway.
It also seems like if your back were against the wall, being able to
read a newer archive file with an older pg_restore is better than being
locked out of it completely.  So I'm leaning to no version bump, but
it could still be discussed.

One thing we could possibly use here is more regression test cases.
The only existing regression test that's affected by this patch is
002_pg_dump.pl's expectations about which cases will print a database
comment, so it seems like we're missing some behavioral coverage.
Not sure what that should look like though.

This patch has to be applied over the patches proposed in
https://www.postgresql.org/message-id/21714.1516553459%40sss.pgh.pa.us
Aside from touching some of the same code, this is dependent on
the changes made there to make comment, seclabel, and ACL entries
reliably identifiable.

As far as notable code changes go, I got rid of the previous patch's
move of executeQuery() into dumputils.c.  That had some undesirable
knock-on effects in terms of creating even more coupling between
different modules (through the g_verbose global), and it was basically
misguided anyway.  pg_dump executes queries via ExecuteSqlQuery; we
do not need a few of its functions to be using low-level routines with
behavior different from that.

If anyone wants to do further review on this, let me know; otherwise
I'll push it in a day or so.

regards, tom lane

diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 08cad68..11582dd 100644
*** a/doc/src/sgml/ref/pg_dump.sgml
--- b/doc/src/sgml/ref/pg_dump.sgml

Bogus tags for comments, ACLs, and security labels in pg_dump

2018-01-21 Thread Tom Lane
There is a convention in pg_dump/pg_restore that the "tag" field in a
COMMENT TOC entry should be exactly what you'd write after COMMENT ON
to specify the comment's target; for example it might be "INDEX foo".
This is depended on in _tocEntryRequired, which wants to know which
comments are for large objects:

if (strcmp(te->desc, "SEQUENCE SET") == 0 ||
strcmp(te->desc, "BLOB") == 0 ||
(strcmp(te->desc, "ACL") == 0 &&
 strncmp(te->tag, "LARGE OBJECT ", 13) == 0) ||
--> (strcmp(te->desc, "COMMENT") == 0 &&
-->  strncmp(te->tag, "LARGE OBJECT ", 13) == 0) ||
(strcmp(te->desc, "SECURITY LABEL") == 0 &&
 strncmp(te->tag, "LARGE OBJECT ", 13) == 0))
res = res & REQ_DATA;
else
res = res & ~REQ_DATA;

While fooling with the pg_dump/pg_dumpall refactoring patch, I noticed
that somebody broke this for comments on databases, back around 8.2:
if you look at the code path for emitting database comments from >= 8.2,
it just uses the raw database name as the tag.  The pre-8.2 code path gets
that right, and it's easy to see by inspection that everyplace else that
creates a COMMENT TOC entry also gets it right.

This is problematic because a database named "LARGE OBJECT something"
will fool the above-mentioned code into misclassifying its comment
as data rather than schema.
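
For instance (illustration only):

CREATE DATABASE "LARGE OBJECT gotcha";
COMMENT ON DATABASE "LARGE OBJECT gotcha" IS 'tagged with the bare database
name, so _tocEntryRequired mistakes this comment for large-object data';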

As you can see from the above stanza, there's a similar expectation
for SECURITY LABEL TOC entries, and again dumpDatabase() is alone in
getting it wrong.

And we also see a similar expectation for ACLs, and that's a bit more of
a problem because there is not, in fact, any convention about including
an object type prefix in ACL tags.  Large objects do use tags like "LARGE
OBJECT nnn", but for other kinds of objects the tag is generally just the
bare object name.  So any object named "LARGE OBJECT something" will
have its ACL misclassified by this code.

While there's no visible effect in a pg_dump run with default options,
it's easy to demonstrate that pg_dump with -s will indeed omit comments,
ACLs, etc on objects named this way, which in turn means that pg_upgrade
would lose those properties.  Conversely, a data-only dump might
unexpectedly include commands to set comments, ACLs, etc on objects
named this way.

There's a potential security issue here, but the security team discussed
it and decided that there's not much chance for an interesting exploit.
An object owner can just change the comment or ACL on his object; there's
no need to try to trick someone else into doing it.  The security-label
case is more interesting, but since it only applies to databases the
attack surface seems small, and anyway there seem to be few people using
security labels so far.  Hence, I'm just treating this as a regular bug.

What I think we should do is fix pg_dump so that these classes of
TOC entries do indeed have tags that meet the expectation held by
_tocEntryRequired, as per the first attached patch.  (That patch also
fixes some lesser errors in the same place: dumpDatabase was passing the
database's OID as catalogId for its comment and seclabel TOC entries,
and the pre-8.2 code path neglected to specify the owner of the comment.
I'm not sure that these mistakes can cause any visible symptoms right
now, but they're still wrong.)

The second attached patch just rearranges some code so that pg_dump
uniformly emits comments, sec labels, and ACLs immediately after emitting
the TOC entry for the object they belong to.  In dumpDatabase, somebody
decided to stick a bunch of binary_upgrade TOC items in between.  That's
harmless at the moment, AFAIK, since pg_restore isn't tasked with making
very interesting choices during a binary upgrade, but it seems like
trouble waiting to happen; at minimum it violates the API contract
specified for dumpComment.  Similarly, dumpForeignDataWrapper was randomly
doing things in an atypical order (perhaps as a result of careless patch
application?) while dumpForeignServer dumped a server's comment only after
dumping user mappings for it.  Again I don't know of live bugs there, but
I have little patience for code that seems to have been assembled with the
aid of a dartboard.

I propose to apply the first patch to all supported branches.  I have
no reason to think the second patch is more than cosmetics and perhaps
future-proofing, so I only propose it for HEAD.

If there are not objections, I plan to push these pretty soon.

regards, tom lane

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 0bdd398..a8a41d3 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -2529,6 +2529,7 @@ dumpDatabase(Archive *fout)
 	PQExpBuffer dbQry = createPQExpBuffer();
 	PQExpBuffer delQry = createPQExpBuffer();
 	PQExpBuffer creaQry = createPQExpBuffer();
+	PQExpBuffer labelq = createPQExpBuffer();
 	PGconn	   *conn = GetConnection(fout);
 	PGresult   *res;
 	int			

Re: [HACKERS] Supporting huge pages on Windows

2018-01-21 Thread Magnus Hagander
On Sun, Jan 21, 2018 at 6:11 PM, Tom Lane  wrote:

> Magnus Hagander  writes:
> > With that, I'm pushing this. Let's see what the buildfarm thinks of it.
> And
> > if others end up complaining about the platform drop, but I doubt that.
>
> frogmouth:
>
> pg_shmem.c: In function 'PGSharedMemoryCreate':
> pg_shmem.c:205:3: warning: implicit declaration of function
> 'GetLargePageMinimum'
> pg_shmem.c:222:38: error: 'SEC_LARGE_PAGES' undeclared (first use in this
> function)
> pg_shmem.c:222:38: note: each undeclared identifier is reported only once
> for each function it appears in
> make[3]: *** [pg_shmem.o] Error 1
>
> so you were right to guess that this functionality isn't in XP.
>
> I wonder whether this could be band-aided around by using "#ifdef
> SEC_LARGE_PAGES" to protect the new code.  I have no particular desire to
> spend effort on supporting old Windows versions, but if there's someone
> out there who does, they could be asked to look into that.
>

I think that is not actually XP; in this case the SDK is too old, which
*might* happen with mingw on a modern platform as well.

The question is whether we care enough, or whether we just declare XP
unsupported, in which case frogmouth should stop building master.

I think the second option is OK, as long as it only affects things as old
as XP. But I think we have to wait for a more modern mingw (jacana?) to
complete before we can be sure.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [HACKERS] Supporting huge pages on Windows

2018-01-21 Thread Tom Lane
Magnus Hagander  writes:
> With that, I'm pushing this. Let's see what the buildfarm thinks of it. And
> if others end up complaining about the platform drop, but I doubt that.

frogmouth:

pg_shmem.c: In function 'PGSharedMemoryCreate':
pg_shmem.c:205:3: warning: implicit declaration of function 'GetLargePageMinimum'
pg_shmem.c:222:38: error: 'SEC_LARGE_PAGES' undeclared (first use in this function)
pg_shmem.c:222:38: note: each undeclared identifier is reported only once for each function it appears in
make[3]: *** [pg_shmem.o] Error 1

so you were right to guess that this functionality isn't in XP.

I wonder whether this could be band-aided around by using "#ifdef
SEC_LARGE_PAGES" to protect the new code.  I have no particular desire to
spend effort on supporting old Windows versions, but if there's someone
out there who does, they could be asked to look into that.
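
For illustration, a minimal sketch of such a guard, assuming the flag is
OR'ed into the CreateFileMapping() protection flags as the Win32 docs
describe. The helper and its parameter are made up for this sketch, not
taken from the committed patch:

#include <windows.h>
#include <stdbool.h>

/*
 * Sketch only: compute section protection flags, enabling large pages
 * solely when the SDK defines SEC_LARGE_PAGES.  On an old SDK this
 * compiles down to the plain path, so the build no longer fails.
 */
static DWORD
section_protect_flags(bool try_huge_pages)
{
    DWORD flProtect = PAGE_READWRITE;

#ifdef SEC_LARGE_PAGES
    if (try_huge_pages)
        flProtect |= SEC_COMMIT | SEC_LARGE_PAGES;
#endif
    return flProtect;
}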

regards, tom lane



Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall

2018-01-21 Thread Tom Lane
Bruce Momjian  writes:
> On Fri, Jan 19, 2018 at 02:54:25PM -0500, Tom Lane wrote:
>> Command was: DROP DATABASE "template1";

> Uh, the oid of the template1 database is 1, and I assume we would want
> to preserve that too.

I don't feel any huge attachment to that.  In the first place, under
this proposal recreating template1 is something you would only need to do
if you weren't satisfied with its default properties as set by initdb.
Which ought to be a minority of users.  In the second place, if you are
changing those properties from the way initdb set it up, it's not really
virgin template1 anymore, so why shouldn't it have a new OID?

regards, tom lane



Re: [HACKERS] Refactor handling of database attributes between pg_dump and pg_dumpall

2018-01-21 Thread Bruce Momjian
On Fri, Jan 19, 2018 at 02:54:25PM -0500, Tom Lane wrote:
> Hmm ... so there's a small problem with this idea of dropping and
> recreating template1:
> 
> pg_restore: connecting to database for restore
> pg_restore: dropping DATABASE template1
> pg_restore: [archiver (db)] Error while PROCESSING TOC:
> pg_restore: [archiver (db)] Error from TOC entry 3024; 1262 1 DATABASE template1 postgres
> pg_restore: [archiver (db)] could not execute query: ERROR:  cannot drop a template database
> Command was: DROP DATABASE "template1";

Uh, the oid of the template1 database is 1, and I assume we would want
to preserve that too.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: WIP: Covering + unique indexes.

2018-01-21 Thread Andrey Borodin
I'm sorry for the noise, switching this patch back and forth. But the patch
needs a rebase again. It still applies with -3, but it no longer compiles.

Best regards, Andrey Borodin.

The new status of this patch is: Waiting on Author


Re: [HACKERS] Supporting huge pages on Windows

2018-01-21 Thread Magnus Hagander
On Sun, Jan 21, 2018 at 2:30 PM, Michael Paquier 
wrote:

> On Sun, Jan 21, 2018 at 01:42:13PM +0200, Magnus Hagander wrote:
> > We have traditionally supported older versions of Windows as long as
> people
> > build from source. But perhaps I'm way overreading that and we should
> just
> > bite the bullet, commit this patch, and declare those platforms as
> > completely dead by PostgreSQL 11?
>
> Yeah, I'd like to think that this is a good idea for HEAD. Microsoft is
> pushing hard to have all its users move to newer versions like 10
> anyway.
>

I got myself a working build env now so I can at least verify it builds,
which it does.

With that, I'm pushing this. Let's see what the buildfarm thinks of it. And
if others end up complaining about the platform drop, but I doubt that.

Once again, apologies for the delay, and thanks for the patch!

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: GSoC 2018 Project Ideas & Mentors - Last Call

2018-01-21 Thread Amit Kapila
On Thu, Jan 18, 2018 at 10:29 PM, Stephen Frost  wrote:
> Greetings!
>
> I've gone through and cleaned up our GSoC 2018 Wiki page:
>
> https://wiki.postgresql.org/wiki/GSoC_2018
>
> Please review!  If you have any last-minute items, please add them!
>

How about adding a project to support a unique capability for hash
indexes?  I have previously shared some ideas on how to implement it [1]
and can elaborate further if needed, but I think some discussion is
required first.  I think it will be a good project for a newcomer, and I
can help complete it once we decide on a particular design.  If the
student finishes this in time and has the bandwidth to do more, we can
ask them to add support for multi-column hash indexes as well; the idea
Tom described for implementing that in one of his emails [2] sounds
doable in a reasonable amount of time (a hedged sketch of the
multi-column part follows the reference links below).

[1] - 
https://www.postgresql.org/message-id/CAA4eK1%2Bf%3DKpUW3TOW0P9LUCA8xhFujcjKhEnYRtoaB83LSsSMg%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/7192.1506527843%40sss.pgh.pa.us
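
For concreteness, one way per-column hash values could be folded into a
single key for a multi-column hash index; whether this matches the
approach in [2] is not established here, and the mixing constant is the
widely used boost-style hash_combine value, chosen only for illustration:

#include <stdint.h>

/*
 * Hedged sketch: combine two per-column 32-bit hashes into one.
 * Applied left to right across the index columns.
 */
static uint32_t
hash_combine32(uint32_t a, uint32_t b)
{
    a ^= b + 0x9e3779b9 + (a << 6) + (a >> 2);
    return a;
}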

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Use of term hostaddrs for multiple hostaddr values

2018-01-21 Thread Michael Paquier
On Sun, Jan 21, 2018 at 02:44:43PM +0200, Magnus Hagander wrote:
> Fair enough. And since you do the translation merging at least, I'll go
> with that.
> 
> Thus, applied and backpatched to 10.
> 
> Thanks!

OK, thanks all!
--
Michael




Re: Use of term hostaddrs for multiple hostaddr values

2018-01-21 Thread Magnus Hagander
On Sun, Jan 21, 2018 at 12:45 AM, Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> On 1/20/18 17:39, Michael Paquier wrote:
> > On Sat, Jan 20, 2018 at 08:30:43PM +0200, Magnus Hagander wrote:
> >> On Tue, Jan 9, 2018 at 5:34 AM, Michael Paquier <
> michael.paqu...@gmail.com>
> >> wrote:
> >> These are both clear bugs, and the docs one should definitely be both
> >> applied and backpatched.
> >>
> >> How much do we care about the error message when it comes to
> backpatching?
> >> Maybe we should leave that one to 11 only, to avoid breaking that, as
> the
> >> way it's written it's actually less wrong there.
> >>
> >> Thoughts?
> >
> > Thanks for your input!
> >
> > Applying the error message portion only on HEAD is a good plan; there is
> > no point in making the life of translators unnecessarily painful.
>
> I would backpatch both.  The updated error message is arguably easier to
> translate.
>

Fair enough. And since you do the translation merging at least, I'll go
with that.

Thus, applied and backpatched to 10.

Thanks!

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [HACKERS] Supporting huge pages on Windows

2018-01-21 Thread Michael Paquier
On Sun, Jan 21, 2018 at 01:42:13PM +0200, Magnus Hagander wrote:
> We have traditionally supported older versions of Windows as long as people
> build from source. But perhaps I'm way overreading that and we should just
> bite the bullet, commit this patch, and declare those platforms as
> completely dead by PostgreSQL 11?

Yeah, I'd like to think that this is a good idea for HEAD. Microsoft is
pushing hard to have all its users move to newer versions like 10
anyway.
--
Michael




Re: [HACKERS] Supporting huge pages on Windows

2018-01-21 Thread Magnus Hagander
On Sat, Jan 20, 2018 at 2:33 PM, Magnus Hagander  wrote:

>
>
>> On Fri, Dec 1, 2017 at 5:02 AM, Michael Paquier  wrote:
>
>> On Fri, Nov 24, 2017 at 9:28 AM, Tsunakawa, Takayuki
>>  wrote:
>> > From: Thomas Munro [mailto:thomas.mu...@enterprisedb.com]
>> >> I hope Tsunakawa-san doesn't mind me posting another rebased version of
>> >> his patch.  The last version conflicted with the change from SGML to
>> XML
>> >> that just landed in commit 3c49c6fa.
>> >
>> > Thank you very much for keeping it fresh.  I hope the committer could
>> have some time.
>>
>> I have moved this patch to next CF. Magnus, you are registered as a
>> reviewer of this patch. Are you planning to look at it and potentially
>> commit it?
>>
>
> Apologies for the delays here. Yes, I was and am.
>
> I took a look at this patch again. I've made some small updates to
> comments etc. There was also from what I can tell one actual bug in the
> code -- a missing free() of delPrivs.
>
> The debug message for ERROR_NO_SYSTEM_RESOURCES said it turned off huge
> pages because of not enough huge pages, but AIUI it can be because of other
> resources as well. So I dropped that specific wording.
>
>
> However, reading through a few things -- we now use the
> API GetLargePageMinimum() unconditionally. This API appeared in Windows
> Vista and Windows 2003. That means that if we apply this patch, those are
> now our minimum versions of Windows. At least unless Microsoft backpatched
> things.
>
> This is probably equally true of some other things we do, but those are
> runtime, and we would just give an error on old platforms. If I'm thinking
> right, then direct linking to GetLargePageMinimum() will simply make it
> impossible to start a PostgreSQL backend with or without huge pages on
> previous versions, because it will give a linker error.
>
> I wonder if we need to do something similar to what we already do for
> CreateRestrictedToken() in pg_ctl.c for example. That one actually is done
> for compatibility with NT4/2000 -- CreateRestrictedToken was added in XP.
> So while we could consider getting rid of that workaround now, perhaps we
> need to put a similar one in place for this?
>
> I don't have a Windows build box around ATM, or a Windows XP, so if
> somebody could try the attached version, build a postgres and see if it
> even starts on a Windows XP machine, that would be useful input!
>
> If we can confirm/deny the situation around XP and decide what to do about
> it, I am now happy to commit the rest of this patch.
>
>
To add some more notes on this: again, the API appeared in Vista/2003.
Windows Vista went EOL (out of extended support, even) in April 2017, and
Windows 2003 did so in July 2015. Those are the versions the API is *in*;
obviously the versions without it are even older.

Our binaries only support Windows 2008 and up (at least the ones by EDB,
which are the ones that have a supported-version matrix documented on our
site).

We have traditionally supported older versions of Windows as long as people
build from source. But perhaps I'm way overreading that and we should just
bite the bullet, commit this patch, and declare those platforms as
completely dead by PostgreSQL 11?
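
For illustration, a minimal sketch of the run-time lookup mentioned above,
using the usual GetProcAddress() pattern. The wrapper name and the
zero-return fallback are made up for this sketch, not taken from pg_ctl.c
or the patch:

#include <windows.h>

/*
 * Sketch only: resolve GetLargePageMinimum at run time instead of
 * linking to it, so the binary still starts on Windows versions whose
 * kernel32 lacks the export.  The typedef follows the documented
 * Win32 signature.
 */
typedef SIZE_T (WINAPI *GetLargePageMinimum_t) (void);

static SIZE_T
pg_get_large_page_minimum(void)
{
    GetLargePageMinimum_t fn = (GetLargePageMinimum_t)
        GetProcAddress(GetModuleHandle(TEXT("kernel32.dll")),
                       "GetLargePageMinimum");

    return fn ? fn() : 0;   /* 0 => huge pages unavailable */
}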


-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: [HACKERS] WIP: Covering + unique indexes.

2018-01-21 Thread Andrey Borodin

> On 21 Jan 2018, at 3:36, Peter Geoghegan  wrote:
> 
> On Wed, Jan 17, 2018 at 12:45 AM, Andrey Borodin  wrote:
>> Unfortunately, amcheck_next does not work currently on HEAD (there are 
>> problems with AllocSetContextCreate() signature), but I've tested 
>> bt_index_check() before, during and after pgbench, on primary and on slave. 
>> Also, I've checked bt_index_parent_check() on master.
> 
> I fixed that recently. It should be fine now.
Oh, sorry, I missed that I was using a stale, patched amcheck_next. Thanks!
Affirmative: amcheck_next works fine.

I ran pgbench against several covering indexes, checking before, during,
and after the load, both on master and slave.
I did not observe any errors besides the infrequent "canceling statement
due to conflict with recovery", which is not a sign of any malfunction.

Best regards, Andrey Borodin.


Re: MCV lists for highly skewed distributions

2018-01-21 Thread Dean Rasheed
On 21 January 2018 at 07:26, John Naylor  wrote:
> I spent a few hours hacking on this, and it turns out calculating the
> right number of MCVs taking into account both uniform and highly
> non-uniform distributions is too delicate a problem for me to solve
> right now. The logic suggested by Dean Rasheed in [1] always produces
> no MCVs for a perfectly uniform distribution (which is good), but very
> often also for other distributions, which is not good. My efforts to
> tweak that didn't work, so I didn't get as far as adapting it for the
> problem Jeff is trying to solve.

Hmm, Tom suggested that the test based on the average frequency over
all values might be too strict because the estimated number of
distinct values is often too low, so that might explain what you're
seeing.

It occurs to me that maybe a better test to exclude a value from the
MCV list would be to demand that its relative standard error not be
too high. Such a test, in addition to the existing tests, might be
sufficient to solve the opposite problem of too many values in the MCV
list, because the real problem there is including a value after having
seen relatively few occurrences of it in the sample, and thus having a
wildly inaccurate estimate for it. Setting a bound on the relative
standard error would mean that we could have a reasonable degree of
confidence in estimates produced from the sample.
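
For concreteness, a hedged sketch of such a test, treating the sample
frequency p = count/n as a binomial proportion so that its relative
standard error is sqrt(p*(1-p)/n) / p. The 20% bound below is a
placeholder, not a value settled on in this thread:

#include <math.h>
#include <stdbool.h>

/*
 * Sketch only: keep a candidate MCV when the relative standard error
 * of its sample frequency is small enough to trust the estimate.
 * The real cutoff would need experimentation.
 */
static bool
mcv_estimate_trustworthy(int count, int samplerows)
{
    double p, rse;

    if (count <= 0 || samplerows <= 0)
        return false;
    p = (double) count / samplerows;
    rse = sqrt(p * (1.0 - p) / samplerows) / p;
    return rse <= 0.20;     /* placeholder threshold */
}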

Regards,
Dean



Re: New gist vacuum.

2018-01-21 Thread Andrey Borodin
Hello, Alexander!
> On 16 Jan 2018, at 21:42, Andrey Borodin  wrote:
> Please find README patch attached.

Here's the v2 version. Same code, but with twice the comments. Also fixed an
important typo in the README: BFS -> DFS. Feel free to ping me any time with
questions.

Best regards, Andrey Borodin.


0001-Delete-pages-during-GiST-VACUUM-v2.patch
Description: Binary data


0002-Physical-GiSC-scan-during-VACUUM-v2.patch
Description: Binary data


0003-Update-README-with-info-on-new-GiST-VACUUM-v2.patch
Description: Binary data


Re: pl/perl extension fails on Windows

2018-01-21 Thread Christian Ullrich

* Christian Ullrich wrote:


* Noah Misch wrote:


On Wed, Nov 29, 2017 at 11:45:35PM -0500, Tom Lane wrote:

Oh, OK.  In that case, we need to get some representatives of these
more modern builds into the buildfarm while we're at it.


Yep.  Among machines already in the buildfarm, the one running member
woodlouse is the best candidate for this.  Its owner could install
http://strawberryperl.com/download/5.26.1.1/strawberry-perl-5.26.1.1-32bit.msi
and setup another animal on the same machine that builds 32-bit and enables
Perl.  Christian, are you interested in doing this?


Ready to go, waiting for animal assignment. For now, I can confirm that it 
works, that is, the buildfarm --test run is successful.


Up and running now, name is whelk, first report on REL9_6_STABLE.

Sorry it took me another ten days to complete the configuration.

--
Christian