Re: [BUG] recovery of prepared transactions during promotion can fail

2023-06-21 Thread Julian Markwort
First off, thanks for the quick reaction and reviews, I appreciate it.

On Wed, 2023-06-21 at 14:14 +0900, Michael Paquier wrote:
> But that won't connect work as the segment requested is now a partial
> one in the primary's pg_wal, still the standby wants it.

I think since 009_twophase.pl doesn't use archiving so far, it's not a good 
idea to enable it generally, for all those
tests. It changes too much of the behaviour.

> I think that it is better
> for now to just undo has_archiving in has_archiving, and tackle the
> coverage with a separate test, perhaps only for HEAD.

I see you've already undone it.
Attached is a patch for 009_twophase.pl to just try this corner case at the 
very end, so as not to influence other
existing tests in suite.

When I run this on REL_14_8 I get the error again, sort of as a sanity check...

Kind regards
Julian
diff --git a/src/test/recovery/t/009_twophase.pl b/src/test/recovery/t/009_twophase.pl
index 900d181788..706fc1bc10 100644
--- a/src/test/recovery/t/009_twophase.pl
+++ b/src/test/recovery/t/009_twophase.pl
@@ -7,7 +7,7 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 24;
+use Test::More tests => 26;
 
 my $psql_out = '';
 my $psql_rc  = '';
@@ -480,3 +480,34 @@ $cur_standby->psql(
 is( $psql_out,
 	qq{27|issued to paris},
 	"Check expected t_009_tbl2 data on standby");
+
+###
+# Check recovery of prepared transaction with DDL inside after a hard restart
+# of the primary, now with archiving enabled, to test handling of .partial files.
+###
+
+$cur_primary->enable_archiving;
+$cur_primary->restart;
+
+$cur_primary->psql(
+	'postgres', "
+	BEGIN;
+	CREATE TABLE t_009_tbl7 (id int, msg text);
+	SAVEPOINT s1;
+	INSERT INTO t_009_tbl7 VALUES (31, 'issued to ${cur_primary_name}');
+	PREPARE TRANSACTION 'xact_009_18';");
+
+$cur_primary->stop('immediate');
+$cur_primary->set_standby_mode;
+$cur_primary->start;
+
+$cur_primary->promote;
+
+my $logfile = slurp_file($cur_primary->logfile());
+ok( $logfile =~
+	  qr/recovering prepared transaction .* from shared memory/,
+	'want to see that a prepared transaction was recovered');
+
+# # verify that recovery and promotion finished and that the prepared transaction still exists.
+$psql_rc = $cur_primary->psql('postgres', "COMMIT PREPARED 'xact_009_18'");
+is($psql_rc, '0', 'Commit prepared transaction after crash recovery');


[BUG] recovery of prepared transactions during promotion can fail

2023-06-16 Thread Julian Markwort
Hey everyone,

I've discovered a serious bug that leads to a server crash upon promoting an 
instance that crashed previously and did
recovery in standby mode.

The bug is present in PostgreSQL versions 13 and 14 (and in earlier versions, 
though it doesn't manifest itself so
catastrophically).
The circumstances to trigger the bug are as follows:
- postgresql is configured for hot_standby, archiving, and prepared transactions
- prepare a transaction
- crash postgresql
- create standby.signal file
- start postgresql, wait for recovery to finish
- promote

The promotion will fail with a FATAL error, stating that "requested WAL segment 
.* has already been removed".
The FATAL error causes the startup process to exit, so postmaster shuts down 
again.

Here's an exemplary log output, maybe this helps people to find this issue when 
they search for it online:

LOG:  consistent recovery state reached at 0/15D8AB0
LOG:  database system is ready to accept read only connections
LOG:  received promote request
LOG:  redo done at 0/15D89B8
LOG:  last completed transaction was at log time 2023-06-16 13:09:53.71118+02
LOG:  selected new timeline ID: 2
LOG:  archive recovery complete
FATAL:  requested WAL segment pg_wal/00010001 has already been 
removed
LOG:  startup process (PID 1650358) exited with exit code 1
LOG:  terminating any other active server processes
LOG:  database system is shut down


The cause of this failure is an oversight (rather obvious in hindsight):
The renaming of the WAL file (that was last written to before the crash 
happened) to .partial is done *before* PostgreSQL
might have to read this very file to recover prepared transactions from it.
The relevant function calls here are durable_rename() and 
RecoverPreparedTransactions() in xlog.c .

Note that it is important that the PREPARE entry is in the WAL file that 
PostgreSQL is writing to prior to the inital
crash.
This has happened repeatedly in production already with a customer that uses 
prepared transactions quite frequently.
I assume that this has happened for others too, but the circumstances of the 
crash and the cause are very dubious, and
troubleshooting it is pretty difficult.


This behaviour has - apparently unintentionally - been fixed in PG 15 and 
upwards (see commit 811051c ), as part of a
general restructure and reorganization of this portion of xlog.c (see commit 
6df1543 ).

Furthermore, it seems this behaviour does not appear in PG 12 and older, due to 
another possible bug:
In PG 13 and newer, the XLogReaderState is reset in XLogBeginRead() before 
reading WAL in XlogReadTwoPhaseData() in
twophase.c .
In the older releases (PG <= 12), this reset is not done, so the requested LSN 
containing the prepared transaction can
(by happy coincidence) be read from in-memory buffers, and PostgreSQL 
consequently manages to come up just fine (as the
WAL has already been read into buffers prior to the .partial rename).
If the older releases also where to properly reset the XLogReaderState, they 
would also fail to find the LSN on disk, and
hence PostgreSQL would crash again.

I've attached patches for PG 14 and PG 13 that mimic the change in PG15 (commit 
811051c ) and reorder the crucial events,
placing the recovery of prepared transactions *before* renaming the file.
I've also attached recovery test scripts for PG >= 12 and PG <= 11 that can be 
used to verify that promote after recovery
with prepared transactions works.

A note for myself in the future and whomever may find it useful:
The test can be copied to src/test/recovery/t/ and selectively run (after 
you've ./configure'd for TAP testing and
compiled everything) from within the src/test/recovery directory using 
something like:
make check PROVE_TESTS='t/PG_geq_12_promote_prepare_xact.pl'


My humble opinion is that this fix should be backpatched to PG 14 and PG 13.
It's debatable whether the fix needs to be brought back to 12 and older also, 
as those do not exhibit this issue, but the
order of renaming is still wrong.
I'm not sure if there could be cases where the in-memory buffers of the 
walreader are too small to cover a whole WAL
file.
There could also be other issues from operations that require reading WAL that 
happen after the .partial rename, I
haven't checked in depth what else happens in the affected codepath.
Please let me know if you think this should also be fixed in PG 12 and earlier, 
so I can produce the patches for those
versions as well.


Kind regards
Julian

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index a05a82119e..e0d9b89d78 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7858,6 +7858,60 @@ StartupXLOG(void)
 			CreateCheckPoint(CHECKPOINT_END_OF_RECOVERY | CHECKPOINT_IMMEDIATE);
 	}
 
+	/*
+	 * Preallocate additional log files, if wanted.
+	 */
+	PreallocXlogFiles(EndOfLog);
+
+	/*
+	 * Okay, we're officially UP.
+	 */
+	InRecovery = false;
+
+	/* 

Re: [PATCH] pg_hba.conf : new auth option : clientcert=verify-full

2018-08-03 Thread Julian Markwort

On 03.08.2018 at 08:09 David Fetter wrote:


I've rebased the patch atop master so it applies and passes 'make
check-world'. I didn't make any other changes.

Best,
David.

Much appreciated!



Re: [PATCH] pg_hba.conf : new auth option : clientcert=verify-full

2018-07-30 Thread Julian Markwort

On 07/19/2018 03:00 AM, Thomas Munro wrote:

Some more comments:

 if (parsedline->auth_method == uaCert)
 {
-   parsedline->clientcert = true;
+   parsedline->clientcert = clientCertOn;
 }

The "cert" method is technically redundant with this patch, because
it's equivalent to "trust clientcert=verify-full", right?  That gave
me an idea: wouldn't the control flow be a bit nicer if you just
treated uaCert the same as uaTrust in the big switch statement, but
also enforced that clientcert = clientCertFull in the hunk above?
Then you could just have one common code path to reach CheckCertAuth()
for all auth methods after that switch statement, instead of the more
complicated conditional coding you have now with two ways to reach it.
Would that be better?
That would result in a couple less LOC and a bit clearer conditionals, I 
agree.
If there are no objections to make uaCert a quasi-alias of uaTrust with 
clientcert=verify-full, I'll go ahead and change the code accordingly.
I'll make sure that the error messages are still handled based on the 
auth method and not only depending on the clientcert flag.



In the "auth-cert" section there are already a couple of examples
using lower case "cn":

 will be sent to the client.  The cn (Common Name)

 is a check that the cn attribute matches the database

I guess you should do the same, or if there is a good reason to use
upper case "CN" instead then you should change those to match.
I see that there are currently no places that use CN  
right now.
However, we refer to Certification Authority as CA in most cases 
(without the literal tag surrounding it).
The different fields of certificates seem to be abbreviated with capital 
letters in most literature; That was my reasoning for capitalizing it in 
the first place.
I'm open for suggestions, but in absence of objections I might just 
capitalize all occurrences of CN.



There is still a place in the documentation where we refer to "1":

 In a pg_hba.conf record specifying certificate
 authentication, the authentication option clientcert is
 assumed to be 1, and it cannot be turned off
since a client
 certificate is necessary for this method.  What the cert
 method adds to the basic clientcert certificate
validity test
 is a check that the cn attribute matches the database
 user name.

Maybe we should use "verify-ca" here, though as I noted above I think
it should really "verify-full" and we should simplify the flow.

Yes, we should adopt the new style in all places.
I'll rewrite that passage to indicate that cert authentication 
essentially results in the same behaviour as clientcert=verify-full.
The existing text is somewhat ambiguous anyway, in case somebody decides 
to skip over the restriction described in the second sentence.



I think we should also document that "1" and "0" are older spellings
still accepted, where the setting names are introduced.

+typedef enum ClientCertMode
+{
+   clientCertOff,
+   clientCertOn,
+   clientCertFull
+} ClientCertMode;
+

+1

What do you think about using clientCertCA for the enumerator name
instead of clientCertOn?  That would correspond better to the names
"verify-ca" and "verify-full".


+1
I'm not sure if Magnus had any other cases in mind when he named it 
clientCertOn?



Should I mark this entry as "Needs review" in the commitfest? It seems 
some discussion is still needed to get this commited...


kind regards
Julian




Re: [PATCH] pg_hba.conf : new auth option : clientcert=verify-full

2018-07-14 Thread Julian Markwort
Hi Thomas,

here's a rebased patch, with your observations corrected.

Thomas Munro wrote on 2018-07-13:
> +   In this case, the CN (nommon name) provided in
> "common name"
> +   CN (Common Name) in the certificate matches
> "common"? (why a capital letter here?)

I've resorted to "CN (Common Name)" on all occurences in 
this patch now.

Also, while writing this part of the docs, I tried to stay below 80 characters, 
but I've exceeded it in some places.
There are several other places (several in the .sgml files touched by this 
patch), where 80 characters are exceeded; How close should one adhere to that 
limit nowadays?


> This line isn't modified by your patch, but I saw it while in
> proof-reading mode:
>   *err_msg = "clientcert can not be set to 0 when using \"cert\"
> authentication";
> I think "can not" is usually written "cannot"?

I'm not sure about can not, cannot, can't... There are 56, respectively 12697, 
and 2024 occurrences in master right now.
We could touch those lines now and change them to the more common cannot, or we 
can leave it as is...


> Yeah.  The packages to install depend on your operating system, and in
> some cases (macOS, Windows?) which bolt-on package thingamajig you
> use, though.  Perhaps the READMEs could be improved with details for
> systems we have reports about (like the recently added "Requirements"
> section of src/test/ldap/README).

That would be nice, however I could only provide the package names for Fedora 
right now...
Would It make sense to add those on their own?
Or should somebody (maybe myself, when I'm less busy) gather those for most 
supported systems and commit them as a whole?

kind regards
Julian
diff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml
index 656d5f94..40dc0ef7 100644
--- a/doc/src/sgml/client-auth.sgml
+++ b/doc/src/sgml/client-auth.sgml
@@ -563,10 +563,17 @@ hostnossl  database  user
   
In addition to the method-specific options listed below, there is one
method-independent authentication option clientcert, which
-   can be specified in any hostssl record.  When set
-   to 1, this option requires the client to present a valid
-   (trusted) SSL certificate, in addition to the other requirements of the
-   authentication method.
+   can be specified in any hostssl record.
+   This option can be set to verify-ca or
+   verify-full. Both options require the client
+   to present a valid (trusted) SSL certificate, while
+   verify-full additionally enforces that the
+   CN (Common Name) in the certificate matches
+   the username or an applicable mapping.
+   This behavior is similar to the cert authentication method
+   (see  ) but enables pairing
+   the verification of client certificates with any authentication
+   method that supports hostssl entries.
   
  
 
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 1c92e7df..afdcc729 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -2287,13 +2287,25 @@ pg_dumpall -p 5432 | psql -d postgres -p 5433
(CAs) you trust in a file in the data
directory, set the parameter  in
postgresql.conf to the new file name, and add the
-   authentication option clientcert=1 to the appropriate
+   authentication option clientcert=verify-ca or
+   clientcert=verify-full to the appropriate
hostssl line(s) in pg_hba.conf.
A certificate will then be requested from the client during SSL
connection startup.  (See  for a description
-   of how to set up certificates on the client.)  The server will
-   verify that the client's certificate is signed by one of the trusted
-   certificate authorities.
+   of how to set up certificates on the client.)
+  
+
+  
+   For a hostssl entry with 
+   clientcert=verify-ca, the server will verify
+   that the client's certificate is signed by one of the trusted
+   certificate authorities. If clientcert=verify-full
+   is specified, the server will not only verify the certificate
+   chain, but it will also check whether the username or its mapping
+   matches the CN (Common Name) of the provided certificate.
+   Note that certificate chain validation is always ensured when the
+   cert authentication method is used
+   (see ).
   
 
   
@@ -2317,14 +2329,33 @@ pg_dumpall -p 5432 | psql -d postgres -p 5433
client certificates against its CA file, if one is configured  but
it will not insist that a client certificate be presented.
   
+  
 
+  
+   Enforcing Client Certificates
   
-   If you are setting up client certificates, you may wish to use
-   the cert authentication method, so that the certificates
-   control user authentication as well as providing connection security.
-   See  for details.  (It is not necessary to
-   specify clientcert=1 explicitly when using
-   the cert authentication method.)
+   There are two approaches to enforce that users provide a certificate 

Re: [FEATURE PATCH] pg_stat_statements with plans (v02)

2018-04-18 Thread Julian Markwort
On Fri, 2018-04-06 at 13:57 -0700, legrand legrand wrote:
> At startup time there are 2 identical plans found in the view
> I thought it should have be only one: the "initial" one 
> (as long as there is no "good" or "bad" one).

Yes, those are 'remnants' from the time where I had two columns, one
for good_plan and one for bad_plan.
Whenever only one plan existed (either because the statement hadn't
been executed more than once or because the deviation from the previous
plan's time wasn't significant enough), the patch would display the
same plan string for both columns.

I'll have to be a little creative, but I think I'll be able to print
only one plan if both the good and bad plan are essentially the
"inital" plan.

You can be assured that the plan string isn't saved twice for those
cases, the good_plan and bad_plan structs just used the same  offset
for the qtexts file.

> maybe 3 "plan_types" like "init", "good" and "bad" should help.

I think, the categorization into good and bad plans can be made based
on the execution time of said plan.
I'd suggest using something like
SELECT * FROM pg_stat_statements_plans GROUP BY queryid ORDER BY
plan_time;
This way, if there is only one row for any queryid, this plan is
essentially the "initial" one. If there are two plans, their goodness
can be infered from the exection times...

> Will a new line be created for good or bad plan if the plan is the
> same ?
> shouldn't we capture "constants" and "parameters" inspite ?

Capturing constants and parameters would require us to normalize the
plan, which isn't done currently. (Primarily. because it would involve
a lot of logic; Secondly, because for my use case, I'm interested in
the specific constants and parameters that led to a good or bad plan)

Do you have a use case in mind which would profit from normalized
plans?

> Is query text needed in plan? 
> it can be huge and is already available (in normalized format) 
> in pg_stat_statements. If realy needed in raw format 
> (with constants) we could force pgss to store it (in replacement of
> normalize one)
> using a guc pg_stat_statements.normalized = false (I have a patch to
> do
> that).

The query is part of what's returned by an explain statement, so I've
thought to keep the 'plan' intact.
Since all long strings (queries and plans alike) are stored in a file
on disk, I'm not too much worried about the space these strings take
up.
I'd think that a hugely long query would lead to an even longer plan,
in which case the problem to focus on might not be to reduce the length
of the string stored by a statistical plugin...

> 
> In Thread:
> http://www.postgresql-archive.org/Poc-pg-stat-statements-with-planid-
> td6014027.html
> pg_stat_statements view has a new key with  dbid,userid,queryid +
> planid
> and 2 columns added: "created" and "last_updated".

I've taken a look at your patch. I agree that having a plan identifier
would be great, but I'm not a big fan of the jumbling. That's a lot of
hashing that needs to be done to decide wether two plans are
essentially equivalent or not.

Maybe, in the future, in a different patch, we could add functionality
that would allow us to compare two plan trees. There is even a remark
in the header of src/backend/nodes/equalfuncs.c regarding this:
 * Currently, in fact, equal() doesn't know how to compare Plan trees
 * either.  This might need to be fixed someday.

> May be your view pg_stat_statements_plans could be transformed :
> same key as pgss: dbid,userid,queryid,planid 
> and THE plan column to store explain result (no more need for
> plan_time nor
> tz that 
> are already available in modified pgss).

The documentation [of pg_stat_statements] gives no clear indication
which fields qualify as primary keys, mainly because the hashing that
generates the queryid isn't guaranteed to produce unique results...
So I'm not sure which columns should be used to create associations
between pg_stat_statements and pg_stat_statements_plans.


> Thoses changes to pgss are far from being accepted by community:
> - planid calculation has to be discussed (I reused the one from
> pg_stat_plans, 
>   but I would have prefered explain command to provide it ...),
> - "created" and "last_updated" columns also,
> - ...
> 
> So maybe your stategy to keep pgss unchanged, and add extensions view
> is
> better.
> There is a risk of duplicated informations (like execution_time,
> calls, ...)

It might be possible to completely seperate these two views into two
extensions. (pg_stat_statements_plans could use it's own hooks to
monitor query execution)
But that would probably result in a lot of duplicated code (for the
querytexts file) and would be a lot more elaborate to create, as I'd
need to recreate most of the hashmaps and locks.

Piggybacking onto the existing logic of pg_stat_statements to store a
plan every once in a while seems simpler to me.


Regards
Julian

smime.p7s
Description: S/MIME cryptographic signature


Re: [PATCH] pg_hba.conf : new auth option : clientcert=verify-full

2018-04-13 Thread Julian Markwort
On Tue, 2018-04-10 at 18:35 +0200, Magnus Hagander wrote:
> As Peter mentionde, there are in src/test/ssl. I forgot about those,
> but yes, it would be useful to have that.
I've added three tests:
- verify-full specified, CN and username match -- should connect ok
- verify-full specified, CN and username do not match -- should fail
- verify-ca specified, CN and username do not match -- should connect
ok (This is current behaviour)
> That depends on your authenticaiton method. Specifically for
> passwords, what happens is there are actually two separate
> connections -- the first one with no password supplied, and the
> second one with a password in it.
Makes sense.
> We could directly reject the connection after the first one and not
> ask for a password. I'm not sure if that's better though -- that
> would leak the information on *why* we rejected the connection.
This wouldn't be desirable, I think...
Most applications will probably supply the password in the connection
string anyway, so there would be only one connection, right?

> It might definitely be worth shorting it yeah. For one, we can just
> use "cn" :) 
I've shortened the clientcert=verify-full specific error-message to
say:
"certificate validation (clientcert=verify-full) failed for
user \"%s\": CN mismatch"


slightly offtopic opinion:
While creating the test cases, I stumbled upon the problem of missing
depencies to run the tests...
It's complicated enough that the binaries used by these perl tests are
not named similar to the packages which provide them (the 'prove'
binary is supplied by 'Test-Harness'), so maybe in the interest of
providing a lower entry-barrier to running these tests, we could give a
more detailed error message in the configure script, when using --
enable-tap-tests ?


Juliandiff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml
index 53832d0..5ee8cbe 100644
--- a/doc/src/sgml/client-auth.sgml
+++ b/doc/src/sgml/client-auth.sgml
@@ -563,10 +563,17 @@ hostnossl  database  user
   
In addition to the method-specific options listed below, there is one
method-independent authentication option clientcert, which
-   can be specified in any hostssl record.  When set
-   to 1, this option requires the client to present a valid
-   (trusted) SSL certificate, in addition to the other requirements of the
-   authentication method.
+   can be specified in any hostssl record.
+   This option can be set to verify-ca or
+   verify-full. Both options require the client
+   to present a valid (trusted) SSL certificate, while
+   verify-full additionally enforces that the
+   CN (Common Name) in the certificate matches
+   the username or an applicable mapping.
+   This behaviour is similar to the cert autentication method
+   (see  ) but enables pairing
+   the verification of client certificates with any authentication
+   method that supports hostssl entries.
   
  
 
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 330e38a..6502df3 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -2287,13 +2287,25 @@ pg_dumpall -p 5432 | psql -d postgres -p 5433
(CAs) you trust in a file in the data
directory, set the parameter  in
postgresql.conf to the new file name, and add the
-   authentication option clientcert=1 to the appropriate
+   authentication option clientcert=verify-ca or
+   clientcert=verify-full to the appropriate
hostssl line(s) in pg_hba.conf.
A certificate will then be requested from the client during SSL
connection startup.  (See  for a description
-   of how to set up certificates on the client.)  The server will
-   verify that the client's certificate is signed by one of the trusted
-   certificate authorities.
+   of how to set up certificates on the client.)
+  
+
+  
+   For a hostssl entry with 
+   clientcert=verify-ca, the server will verify
+   that the client's certificate is signed by one of the trusted
+   certificate authorities. If clientcert=verify-full
+   is specified, the server will not only verify the certificate
+   chain, but it will also check whether the username or it's
+   mapping match the common name (CN) of the provided certificate.
+   Note that certificate chain validation  is always ensured when
+   cert authentication method is used
+   (see ).
   
 
   
@@ -2317,14 +2329,33 @@ pg_dumpall -p 5432 | psql -d postgres -p 5433
client certificates against its CA file, if one is configured  but
it will not insist that a client certificate be presented.
   
+  
 
+  
+   Enforcing Client Certificates
   
-   If you are setting up client certificates, you may wish to use
-   the cert authentication method, so that the certificates
-   control user authentication as well as providing connection security.
-   See  for details.  (It is not necessary to
-   specify clientcert=1 explicitly when using
-   the 

Re: [PATCH] pg_hba.conf : new auth option : clientcert=verify-full

2018-04-10 Thread Julian Markwort
On Fri, 2018-04-06 at 20:31 +0200, Magnus Hagander wrote:
> I've been through this one again.
Thanks for taking the time!

> There is one big omission from it -- it fails to work with the view
> pg_hba_file_rules. When fixing that, things started to look kind of
> ugly with the "two booleans to indicate one thing", so I went ahead
> and changed it to instead be an enum of 3 values.
Oh, I missed that view. To be honest, it wasn't even on my mind that
there could be a view depending on pg_hba

> Also, now when using verify-full or verify-ca, I figured its a lot
> easier to misspell the value, and we don't want that to mean "off".
> Instead, I made it give an error in this case instead of silently
> treating it as 0.
Good catch!

> The option "2" for clientcert was never actually documented, and I
> think we should get rid of that. We need to keep "1" for backwards
> compatibility (which arguably was a bad choice originally, but that
> was many years ago), but let's not add another one. 
> I also made a couple of very small changes to the docs.
> 
> Attached is an updated patch with these changes. I'd appreciate it if
> you can run it through your tests to confirm that it didn't break any
> of those usecases.
I've tested a couple of things with this and it seems to work as
expected. Unforunately, there are no tests for libpq, afaik. But
testing such features would become complicated quite quickly, with the
need to generate certificates and such...

I've noticed that the error message for mismatching CN is now printed
twice when using password prompts, as all authentication details are
checked upon initiation of a connection and again later, after sending
the password.

> 2018-04-10 13:54:19.882 CEST [8735] LOG:  provided user name
> (testuser) and authenticated user name (nottestuser) do not match
> 2018-04-10 13:54:19.882 CEST [8735] LOG:  certificate validation
> failed for user "testuser": common name in client certificate does
> not match user name or mapping, but clientcert=verify-full is
> enabled.
> 2018-04-10 13:54:21.515 CEST [8736] LOG:  provided user name
> (testuser) and authenticated user name (nottestuser) do not match
> 2018-04-10 13:54:21.515 CEST [8736] LOG:  certificate validation
> failed for user "testuser": common name in client certificate does
> not match user name or mapping, but clientcert=verify-full is
> enabled.
> 2018-04-10 13:54:21.515 CEST [8736] FATAL:  password authentication
> failed for user "testuser"
> 2018-04-10 13:54:21.515 CEST [8736] DETAIL:  Connection matched
> pg_hba.conf line 94: "hostssl all testuser
> 127.0.0.1/32  password clientcert=verify-full"

Also, as you can see, there are two LOG messages indicating the
mismatch -- "provided user name (testuser) and authenticated user name
(nottestuser) do not match" comes from hba.c:check_usermap() and
"certificate validation failed for user "testuser": common name in
client certificate does not match user name or mapping, but
clientcert=verify-full is enabled." is the message added in
auth.c:CheckCertAuth()  by the patch.
Maybe we should shorten the latter LOG message?

regards
Julian

smime.p7s
Description: S/MIME cryptographic signature


Re: [FEATURE PATCH] pg_stat_statements with plans (v02)

2018-04-06 Thread Julian Markwort
On Thu, 2018-03-22 at 11:16 -0700, legrand legrand wrote:
> Reading other pg_stat_statements threads on this forum, there are
> also activ
> developments to add:
> - planing duration,
> - first date,
> - last_update date,

As I see it, planning duration, first date, and last update date would
be columns added to the pg_stat_statements view, i.e. they are tracked
for each kind of a (jumbled) query -- just as the good and bad plans,
their associated execution times and timestamps are.

> - parameters for normalized queries,

I've reviewed Vik Fearing's patch for this and have not heard back from
him. Also, as you have already explained in your summary post, these
parameters only aid in the examination of current plans and offers no
information regarding plans used in the past.

> I was wondering about how would your dev behave with all those new
> features.
> It seems to me that bad and good plans will not have any of thoses
> informations.

A patch that adds planning durations and timestamps associated with queries 
wouldn't interefere with my plans patch.
However, we could think about capturing the planning durations for the plans 
recorded by my patch.

The patch for tracking first-time parameters for normalized queries has
a different use case, compared to this patch. It shouldn't interfere
with my patch, anyway.

> Last question, didn't you think about a model to store all the
> different
> plans using a planid like
>
> queryid, planid, query, ...
> aaa plan1 ...
> aaa plan2 ...
> aaa plan3 ...
> ...
>
> I can not imagine that there would be so many of them ;o)

This wasn't obvious to me during development, as each entry (with a certain 
queryid) is directly connected to two plans.
But with future development in mind it probably makes sense to separate the 
plans from the rest of pg_stat_statements.

This would also allow us to keep old plans, while only storing new ones that 
are not equivalent, essentially providing a history of the plans used.
Keep in mind that this check for equivalence would require further development 
and we'd have to make sure we're not consuming too much memory (however much 
that is) when storing possibly infinite amounts of plans.

I've created a draft patch that provides access to plans in a view called 
pg_stat_statements_plans.
There is no column that indicates whether the plan is "good" or "bad", because 
that is evident from the execution time of both plans and because that would 
require some kind of plan_type for all plans that might be stored in future 
versions.

Please take it for a spin and tell me, whether the layout and handling of the 
view make sense to you.

Julian
diff --git a/contrib/pg_stat_statements/Makefile b/contrib/pg_stat_statements/Makefile
index 39b368b..49bb462 100644
--- a/contrib/pg_stat_statements/Makefile
+++ b/contrib/pg_stat_statements/Makefile
@@ -4,7 +4,8 @@ MODULE_big = pg_stat_statements
 OBJS = pg_stat_statements.o $(WIN32RES)
 
 EXTENSION = pg_stat_statements
-DATA = pg_stat_statements--1.4.sql pg_stat_statements--1.4--1.5.sql \
+DATA = pg_stat_statements--1.4.sql \
+	pg_stat_statements--1.5--1.6.sql pg_stat_statements--1.4--1.5.sql \
 	pg_stat_statements--1.3--1.4.sql pg_stat_statements--1.2--1.3.sql \
 	pg_stat_statements--1.1--1.2.sql pg_stat_statements--1.0--1.1.sql \
 	pg_stat_statements--unpackaged--1.0.sql
diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index 5318c35..e6725ed 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -395,4 +395,33 @@ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
  SELECT pg_stat_statements_reset() | 1 |1
 (8 rows)
 
+-- test to see if any plans have been recorded.
+SELECT
+  CASE WHEN plan_time > 0 THEN 1 ELSE 0 END,
+  CASE WHEN plan_timestamp >= timestamp '1970-01-01 00:00:00' THEN 1 ELSE 0 END
+FROM pg_stat_statements_plans ORDER BY plan_timestamp;
+ case | case 
+--+--
+1 |1
+1 |1
+1 |1
+1 |1
+1 |1
+1 |1
+(6 rows)
+
+-- test if there is some text in the recorded plans.
+SELECT substr(plan, 0, 11) FROM pg_stat_statements_plans ORDER BY plan_timestamp;
+   substr   
+
+ Query Text
+ Query Text
+ Query Text
+ Query Text
+ Query Text
+ Query Text
+ Query Text
+ Query Text
+(8 rows)
+
 DROP EXTENSION pg_stat_statements;
diff --git a/contrib/pg_stat_statements/pg_stat_statements--1.5--1.6.sql b/contrib/pg_stat_statements/pg_stat_statements--1.5--1.6.sql
new file mode 100644
index 000..907c84a
--- /dev/null
+++ b/contrib/pg_stat_statements/pg_stat_statements--1.5--1.6.sql
@@ -0,0 +1,86 @@
+/* contrib/pg_stat_statements/pg_stat_statements--1.5--1.6.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION pg_stat_statements UPDATE TO '1.6'" to load this file. 

Re: [PATCH] pg_hba.conf : new auth option : clientcert=verify-full

2018-04-01 Thread Julian Markwort
On 1. of April 2018 17:46:38 MESZ wrote Magnus Hagander :

>I assume this is a patch that's intended to be applied on top of the
>previous patch? If so, please submit the complete pach to make sure the
>correct combination ends up actually being reviewed.

The v02.patch attached to my last mail contains both source and documentation 
changes.

>Keeping patches as short as possible is not a good thing itself. The
>important part is that the resulting code and documentation is the best
>possible. Sometimes you might want to turn it into two patches
>submitted at
>the same time if one is clearly just reorganisation, but avoiding code
>restructure just to keep the lines of patch down is not helpful.

I think I've made the right compromises regarding readability of the 
documentation in my patch then.

A happy Easter, passover, or Sunday to you
Julian



Re: [PATCH] pg_hba.conf : new auth option : clientcert=verify-full

2018-03-23 Thread Julian Markwort
On Sat, 2018-03-17 at 18:24 +0100, Magnus Hagander wrote:
> > The error message "certificate authentication failed for user XYZ:
> > 
> > client certificate contains no user name" is the result of calling
> > 
> > CheckCertAuth when the user presented a certificate without a CN in
> > it.
> 
> That is arguably wrong, since it's actually password authentication
> that fails. That is the authentication type that was picked in
> pg_hba.conf. It's more "certificate validation" that failed.

I think we got confused about this; maybe I didn't graps it fully
before: CheckCertAuth is currently only called when auth method cert is
used. So it actually makes sense to say that certificate authentication
failed, I think.

> I agree that the log message is useful. Though it could be good to
> clearly indicate that it was caused specifically because of the
> verify-full, to differentiate it from other cases of the same
> message.
I've modified my patch so it still uses CheckCertAuth, but now a
different message is written to the log when clientcert=verify-full was
used.
For auth method cert, the function should behave as before.


> For example, what about the scenario where I use GSSAPI
> authentication and clientcert=verify-full. Then we suddenly have
> three usernames (gssapi, certificate and specified) -- how is the
> user supposed to know which one came from the cert and which one came
> from gssapi for example?
The user will only see what's printed in the auth_failed() function in
auth.c with the addition of the logdetail string, which I don't touch
with this patch.
As you said, it makes sense that more detailed information is only
available in the server's log.

I've attached an updated version of the patch.
I'm not sure if it is preferred to keep patches as short as possible
(mostly with respect to the changed lines in the documentation) or to
organize changes so that the text matches the surrounding column width
und text flow? Also, I've omitted mentions of the current usage
'clientcert=1' - this is still supported in code, but I think telling
new users only about 'clientcert=verify-ca' and 'clientcert=verify-
full' is clearer. Or am I wrong on this one?

Greetings
Juliandiff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml
index 53832d0..11c5961 100644
--- a/doc/src/sgml/client-auth.sgml
+++ b/doc/src/sgml/client-auth.sgml
@@ -563,10 +563,17 @@ hostnossl  database  user
   
In addition to the method-specific options listed below, there is one
method-independent authentication option clientcert, which
-   can be specified in any hostssl record.  When set
-   to 1, this option requires the client to present a valid
-   (trusted) SSL certificate, in addition to the other requirements of the
-   authentication method.
+   can be specified in any hostssl record.
+   This option can be set to verify-ca or
+   verify-full. Both options require the client
+   to present a valid (trusted) SSL certificate, while
+   verify-full additionally enforces that the
+   CN (Common Name) in the certificate matches
+   the username or an applicable mapping.
+   This behaviour is similar to the cert autentication method
+   ( See  ) but enables pairing
+   the verification of client certificates with any authentication
+   method that supports hostssl entries.
   
  
 
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 587b430..a3eb180 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -2263,13 +2263,24 @@ pg_dumpall -p 5432 | psql -d postgres -p 5433
(CAs) you trust in a file in the data
directory, set the parameter  in
postgresql.conf to the new file name, and add the
-   authentication option clientcert=1 to the appropriate
+   authentication option clientcert=verify-ca or
+   clientcert=verify-full to the appropriate
hostssl line(s) in pg_hba.conf.
A certificate will then be requested from the client during SSL
connection startup.  (See  for a description
-   of how to set up certificates on the client.)  The server will
-   verify that the client's certificate is signed by one of the trusted
-   certificate authorities.
+   of how to set up certificates on the client.)
+  
+
+  
+   For a hostssl entry with 
+   clientcert=verify-ca, the server will verify
+   that the client's certificate is signed by one of the trusted
+   certificate authorities. If clientcert=verify-full
+   is specified, the server will not only verify the certificate
+   chain, but it will also check whether the username or it's
+   mapping match the common name (CN) of the provided certificate.
+   Note that this is always ensured when cert
+   authentication method is used (See ).
   
 
   
@@ -2293,14 +2304,33 @@ pg_dumpall -p 5432 | psql -d postgres -p 5433
client certificates against its CA file, if one is configured  but
it will not insist that a client 

Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

2018-03-20 Thread Julian Markwort
I just realized I made a whitespace error when I modified the comments.
I hope I don't make a habit of sending erroneous mails.

Please accept my apologies, as well as a guaranteed whitespace-error-free patch 
(updated to version 5 for clarity).

Julian

Julian Markwort schrieb am 2018-03-20:
> To anyone who followed along with this for so long, I'd like to present my 
> newest version of this patch.

> As suggested by Arthur Zakirov, there is now only a single GUC ( 
> pg_stat_statements.plan_track ) responsible for the selection of the plans 
> that should be tracked. Possible values are: all, none, good, or bad.
> I've mostly copied functionality from pl_handler.c . This resulted in the 
> need to include varlena.h so I could use the SplitIdentifierString() function 
> to parse the values, of which several (e.g. 
> pg_stat_statements.plan_track='good, bad') could be used.

> I've also added a new GUC:
> pg_stat_statements.plan_fence_factor
> This GUC can be used to scale the fences of the interval, outside of which a 
> plan might be updated.
> Right now, it is set to 1.5 (common factor for the definition of outliers in 
> boxplots) and you can see through additional colums in the pg_stat_statements 
> view, how often these fences are surpassed by execution times and how often 
> the plans are updated. (The colums are: good_plan_outliers, 
> good_plan_updates, bad_plan_outliers, bad_plan_updates and are primarily here 
> for testing and review purposes and are not essential to this patch, they 
> probably don't add any value for the average user)

> Similarly to the first suggestion by Arthur, I've also changed the plan_reset 
> functionality - there is now only one function, 
> pg_stat_statements_plan_reset(queryid bigint), overloaded with (queryid 
> bigint, plantype cstring) args, that can be used to remove both plans (when 
> omitting the cstring) or either of them. The cstring argument accepts 'good' 
> or 'bad'.

> I also added more comments to the estimations of the quartiles and the 
> calculation of the fences.

> The performance impact lies now at 139312 vs 141841 tps, so roughly 1.78% 
> slower than default pg_stat_statements.
> The fact that these results are a little worse than the previous iteration is 
> due to some changes in the definition of the fences which mistakenly 
> calculated by adding the scaled interquartile distance to the mean, instead 
> of adding it to the respective quartiles, which means that plan updates are 
> triggered a little more often.
> For 4259631 transactions however, only 11 updates for the bad plans where 
> triggered.



> I'm looking forward to your opinions!
> Julian
diff --git a/contrib/pg_stat_statements/Makefile b/contrib/pg_stat_statements/Makefile
index 39b368b70e..49bb462d10 100644
--- a/contrib/pg_stat_statements/Makefile
+++ b/contrib/pg_stat_statements/Makefile
@@ -4,7 +4,8 @@ MODULE_big = pg_stat_statements
 OBJS = pg_stat_statements.o $(WIN32RES)
 
 EXTENSION = pg_stat_statements
-DATA = pg_stat_statements--1.4.sql pg_stat_statements--1.4--1.5.sql \
+DATA = pg_stat_statements--1.4.sql \
+	pg_stat_statements--1.5--1.6.sql pg_stat_statements--1.4--1.5.sql \
 	pg_stat_statements--1.3--1.4.sql pg_stat_statements--1.2--1.3.sql \
 	pg_stat_statements--1.1--1.2.sql pg_stat_statements--1.0--1.1.sql \
 	pg_stat_statements--unpackaged--1.0.sql
diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index 5318c3550c..2ca549686f 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -395,4 +395,40 @@ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
  SELECT pg_stat_statements_reset() | 1 |1
 (8 rows)
 
+-- test to see if any plans have been recorded.
+SELECT
+  CASE WHEN good_plan_time > 0 THEN 1 ELSE 0 END,
+  CASE WHEN bad_plan_time > 0 THEN 1 ELSE 0 END,
+  CASE WHEN good_plan_timestamp >= timestamp '1970-01-01 00:00:00' THEN 1 ELSE 0 END,
+  CASE WHEN bad_plan_timestamp >= timestamp '1970-01-01 00:00:00' THEN 1 ELSE 0 END
+FROM pg_stat_statements ORDER BY query COLLATE "C";
+ case | case | case | case 
+--+--+--+--
+0 |0 |0 |0
+0 |0 |0 |0
+0 |0 |0 |0
+0 |0 |0 |0
+0 |0 |0 |0
+0 |0 |0 |0
+1 |1 |1 |1
+1 |1 |1 |1
+1 |1 |1 |1
+(9 rows)
+
+-- test if there is some text in the recorded plans.
+SELECT substr(good_plan, 0, 11), substr(bad_plan, 0, 11) FROM pg_stat_statements ORDER BY query COLLATE "C";
+   substr   |   substr   
++
+| 
+| 
+| 
+| 
+| 
+   

Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

2018-03-20 Thread Julian Markwort
To anyone who followed along with this for so long, I'd like to present my 
newest version of this patch.

As suggested by Arthur Zakirov, there is now only a single GUC ( 
pg_stat_statements.plan_track ) responsible for the selection of the plans that 
should be tracked. Possible values are: all, none, good, or bad.
I've mostly copied functionality from pl_handler.c . This resulted in the need 
to include varlena.h so I could use the SplitIdentifierString() function to 
parse the values, of which several (e.g. pg_stat_statements.plan_track='good, 
bad') could be used.

I've also added a new GUC:
pg_stat_statements.plan_fence_factor
This GUC can be used to scale the fences of the interval, outside of which a 
plan might be updated.
Right now, it is set to 1.5 (common factor for the definition of outliers in 
boxplots) and you can see through additional colums in the pg_stat_statements 
view, how often these fences are surpassed by execution times and how often the 
plans are updated. (The colums are: good_plan_outliers, good_plan_updates, 
bad_plan_outliers, bad_plan_updates and are primarily here for testing and 
review purposes and are not essential to this patch, they probably don't add 
any value for the average user)

Similarly to the first suggestion by Arthur, I've also changed the plan_reset 
functionality - there is now only one function, 
pg_stat_statements_plan_reset(queryid bigint), overloaded with (queryid bigint, 
plantype cstring) args, that can be used to remove both plans (when omitting 
the cstring) or either of them. The cstring argument accepts 'good' or 'bad'.

I also added more comments to the estimations of the quartiles and the 
calculation of the fences.

The performance impact lies now at 139312 vs 141841 tps, so roughly 1.78% 
slower than default pg_stat_statements.
The fact that these results are a little worse than the previous iteration is 
due to some changes in the definition of the fences which mistakenly calculated 
by adding the scaled interquartile distance to the mean, instead of adding it 
to the respective quartiles, which means that plan updates are triggered a 
little more often.
For 4259631 transactions however, only 11 updates for the bad plans where 
triggered.



I'm looking forward to your opinions!
Julian
diff --git a/contrib/pg_stat_statements/Makefile b/contrib/pg_stat_statements/Makefile
index 39b368b70e..49bb462d10 100644
--- a/contrib/pg_stat_statements/Makefile
+++ b/contrib/pg_stat_statements/Makefile
@@ -4,7 +4,8 @@ MODULE_big = pg_stat_statements
 OBJS = pg_stat_statements.o $(WIN32RES)
 
 EXTENSION = pg_stat_statements
-DATA = pg_stat_statements--1.4.sql pg_stat_statements--1.4--1.5.sql \
+DATA = pg_stat_statements--1.4.sql \
+	pg_stat_statements--1.5--1.6.sql pg_stat_statements--1.4--1.5.sql \
 	pg_stat_statements--1.3--1.4.sql pg_stat_statements--1.2--1.3.sql \
 	pg_stat_statements--1.1--1.2.sql pg_stat_statements--1.0--1.1.sql \
 	pg_stat_statements--unpackaged--1.0.sql
diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index 5318c3550c..2ca549686f 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -395,4 +395,40 @@ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
  SELECT pg_stat_statements_reset() | 1 |1
 (8 rows)
 
+-- test to see if any plans have been recorded.
+SELECT
+  CASE WHEN good_plan_time > 0 THEN 1 ELSE 0 END,
+  CASE WHEN bad_plan_time > 0 THEN 1 ELSE 0 END,
+  CASE WHEN good_plan_timestamp >= timestamp '1970-01-01 00:00:00' THEN 1 ELSE 0 END,
+  CASE WHEN bad_plan_timestamp >= timestamp '1970-01-01 00:00:00' THEN 1 ELSE 0 END
+FROM pg_stat_statements ORDER BY query COLLATE "C";
+ case | case | case | case 
+--+--+--+--
+0 |0 |0 |0
+0 |0 |0 |0
+0 |0 |0 |0
+0 |0 |0 |0
+0 |0 |0 |0
+0 |0 |0 |0
+1 |1 |1 |1
+1 |1 |1 |1
+1 |1 |1 |1
+(9 rows)
+
+-- test if there is some text in the recorded plans.
+SELECT substr(good_plan, 0, 11), substr(bad_plan, 0, 11) FROM pg_stat_statements ORDER BY query COLLATE "C";
+   substr   |   substr   
++
+| 
+| 
+| 
+| 
+| 
+| 
+ Query Text | Query Text
+ Query Text | Query Text
+ Query Text | Query Text
+ Query Text | Query Text
+(10 rows)
+
 DROP EXTENSION pg_stat_statements;
diff --git a/contrib/pg_stat_statements/pg_stat_statements--1.5--1.6.sql b/contrib/pg_stat_statements/pg_stat_statements--1.5--1.6.sql
new file mode 100644
index 00..6c8f743ee5
--- /dev/null
+++ b/contrib/pg_stat_statements/pg_stat_statements--1.5--1.6.sql
@@ -0,0 +1,82 @@
+/* contrib/pg_stat_statements/pg_stat_statements--1.5--1.6.sql */
+
+-- complain if script 

Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

2018-03-12 Thread Julian Markwort
Arthur Zakirov wrote on 2018-03-09:
> I'd like to review the patch and leave a feedback. I tested it with
> different indexes on same table and with same queries and it works fine.

Thanks for taking some time to review my patch, Arthur!

> First of all, GUC variables and functions. How about union
> 'pg_stat_statements.good_plan_enable' and
> 'pg_stat_statements.bad_plan_enable' into
> 'pg_stat_statements.track_plan' with GUC_LIST_INPUT flag? It may accept
> comma separated values 'good' and 'bad'. It lets to add another tracking
> type in future without adding new variable.

This sounds like a good idea to me; Somebody already suggested that tracking an 
"average plan" would be interesting as well, however I don't have any clever 
ideas on how to identify such a plan.

> In same manner pg_stat_statements_good_plan_reset() and
> pg_stat_statements_bad_plan_reset() functions can be combined in
> function:

> pg_stat_statements_plan_reset(in queryid bigint, in good boolean, in bad
> boolean)

This is also sensible, however if any more kinds of plans would be added in the 
future, there would be a lot of flags in this function. I think having varying 
amounts of flags (between extension versions) as arguments to the function also 
makes any upgrade paths difficult. Thinking more about this, we could also user 
function overloading to avoid this.
An alternative would be to have the caller pass the type of plan he wants to 
reset. Omitting the type would result in the deletion of all plans, maybe?
pg_stat_statements_plan_reset(in queryid bigint, in plan_type text)

I'm not sure if there are any better ways to do this?

> Further comments on the code.
 You're right about all of those, thanks!



Re: [PATCH] pg_hba.conf : new auth option : clientcert=verify-full

2018-03-09 Thread Julian Markwort
Hello Magnus,

> I think this makes a lot of sense, and can definitely be a useful
> option.

I was hesistant to write a long and elaborate patch as I wasn't certain
if there was any interest for such an addition, but I'm thankful for
your input.

> However, the patch is completely lacking documentation, which
> obviously make it a no-starter. 

I'll write the missing documentation shortly.

> Also if I read it right, if the CN is not correct, it will give the
> error message "certificate authentication failed for user ...". I
> realize this comes from the re-use of the code, but I don't think
> this makes it very useful. We  need to separate these two things.

The error message "certificate authentication failed for user XYZ:
client certificate contains no user name" is the result of calling
CheckCertAuth when the user presented a certificate without a CN in it.

The error message that is presented to the user upon trying to connect
with a certificate containing a CN other than the username is:

-
psql: FATAL: password authentication failed for user "nottestuser"
-

The server's log contains the lines:

-
2018-03-09 13:06:43.111 CET [3310] LOG:  provided user name
(nottestuser) and authenticated user name (testuser) do not match
2018-03-09 13:06:43.111 CET [3310] FATAL:  password authentication
failed for user "nottestuser"
2018-03-09 13:06:43.111 CET [3310] DETAIL:  Connection matched
pg_hba.conf line 97: "hostssl all nottestuser 127.0.0.1/32 password
clientcert=verify-full"
-

I'd argue that the message in the log file is consistent and useful,
however the message given by psql (or any libpq application for that
matter) leaves uncertainty regarding the correctness of a provided
password, for example.
I could attach the log message of CheckCertAuth to the logdetail,
however then I'd have issues if there is already something written to
the logdetail.
I could also use an additional ereport() call whenever clientcert was
set to verify-full and the user name didn't match the CN.

Kind regards
Julian

smime.p7s
Description: S/MIME cryptographic signature


Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

2018-03-08 Thread Julian Markwort
I've goofed up now, sorry for failing to attach my updated patch.

Am Donnerstag, den 08.03.2018, 14:55 +0100 schrieb Julian Markwort:
> Tom Lane wrote on 2018-03-02:
> > You need to make your changes in a 1.5--1.6
> > upgrade file.  Otherwise there's no clean path for existing
> > installations
> > to upgrade to the new version.
> 
> I've adressed all the issues that were brought up so far:
> 1. there is now only an added 1.5--1.6.sql file.
> 2. the overhead, as previously discussed (as much as a 12% decrease
> in
> TPS during read-only tests), is now gone, the problem was that I was
> collecting the plan String before checking if it needed to be stored
> at
> all.
> The patched version is now 99.95% as fast as unmodified
> pg_stat_statements.
> 3. I've cleaned up my own code and made sure it adheres to GNU C
> coding
> style, I was guilty of some // comments and curly brackets were
> sometimes in the same line as my control structures.
> 
> I'd love to hear more feedback, here are two ideas to polish this
> patch:
> a) Right now, good_plan and bad_plan collection can be activated and
> deactivated with separate GUCs. I think it would be sensible to
> collect
> either both or none. (This would result in fewer convoluted
> conditionals.)
> b) Would you like to be able to tune the percentiles yourself, to
> adjust for the point at which a new plan is stored?
> 
> Greetings
> Juliandiff --git a/contrib/pg_stat_statements/Makefile b/contrib/pg_stat_statements/Makefile
index 39b368b..49bb462 100644
--- a/contrib/pg_stat_statements/Makefile
+++ b/contrib/pg_stat_statements/Makefile
@@ -4,7 +4,8 @@ MODULE_big = pg_stat_statements
 OBJS = pg_stat_statements.o $(WIN32RES)
 
 EXTENSION = pg_stat_statements
-DATA = pg_stat_statements--1.4.sql pg_stat_statements--1.4--1.5.sql \
+DATA = pg_stat_statements--1.4.sql \
+	pg_stat_statements--1.5--1.6.sql pg_stat_statements--1.4--1.5.sql \
 	pg_stat_statements--1.3--1.4.sql pg_stat_statements--1.2--1.3.sql \
 	pg_stat_statements--1.1--1.2.sql pg_stat_statements--1.0--1.1.sql \
 	pg_stat_statements--unpackaged--1.0.sql
diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index 5318c35..3e79890 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -395,4 +395,40 @@ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
  SELECT pg_stat_statements_reset() | 1 |1
 (8 rows)
 
+-- test to see if any plans have been recorded.
+SELECT
+  CASE WHEN good_plan_time > 0 THEN 1 ELSE 0 END,
+  CASE WHEN bad_plan_time > 0 THEN 1 ELSE 0 END,
+  CASE WHEN good_plan_timestamp >= timestamp '1970-01-01 00:00:00' THEN 1 ELSE 0 END,
+  CASE WHEN good_plan_timestamp >= timestamp '1970-01-01 00:00:00' THEN 1 ELSE 0 END
+FROM pg_stat_statements ORDER BY query COLLATE "C";
+ case | case | case | case 
+--+--+--+--
+0 |0 |0 |0
+0 |0 |0 |0
+0 |0 |0 |0
+0 |0 |0 |0
+0 |0 |0 |0
+0 |0 |0 |0
+1 |1 |1 |1
+1 |1 |1 |1
+1 |1 |1 |1
+(9 rows)
+
+-- test if there is some text in the recorded plans.
+select substr(good_plan, 0, 11), substr(bad_plan, 0, 11) from pg_stat_statements ORDER BY query COLLATE "C";
+   substr   |   substr   
++
+| 
+| 
+| 
+| 
+| 
+| 
+ Query Text | Query Text
+ Query Text | Query Text
+ Query Text | Query Text
+ Query Text | Query Text
+(10 rows)
+
 DROP EXTENSION pg_stat_statements;
diff --git a/contrib/pg_stat_statements/pg_stat_statements--1.5--1.6.sql b/contrib/pg_stat_statements/pg_stat_statements--1.5--1.6.sql
new file mode 100644
index 000..5302d01
--- /dev/null
+++ b/contrib/pg_stat_statements/pg_stat_statements--1.5--1.6.sql
@@ -0,0 +1,78 @@
+/* contrib/pg_stat_statements/pg_stat_statements--1.5--1.6.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "ALTER EXTENSION pg_stat_statements UPDATE TO '1.6'" to load this file. \quit
+
+/* First we have to remove them from the extension */
+ALTER EXTENSION pg_stat_statements DROP VIEW pg_stat_statements;
+ALTER EXTENSION pg_stat_statements DROP FUNCTION pg_stat_statements(boolean);
+ALTER EXTENSION pg_stat_statements DROP FUNCTION pg_stat_statements_reset();
+
+/* Then we can drop them */
+DROP VIEW pg_stat_statements;
+DROP FUNCTION pg_stat_statements(boolean);
+DROP FUNCTION pg_stat_statements_reset();
+
+-- Register functions.
+CREATE FUNCTION pg_stat_statements_reset()
+RETURNS void
+AS 'MODULE_PATHNAME'
+LANGUAGE C PARALLEL SAFE;
+
+CREATE FUNCTION pg_stat_statement

Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

2018-03-08 Thread Julian Markwort
Tom Lane wrote on 2018-03-02:
> You need to make your changes in a 1.5--1.6
> upgrade file.  Otherwise there's no clean path for existing
> installations
> to upgrade to the new version.

I've adressed all the issues that were brought up so far:
1. there is now only an added 1.5--1.6.sql file.
2. the overhead, as previously discussed (as much as a 12% decrease in
TPS during read-only tests), is now gone, the problem was that I was
collecting the plan String before checking if it needed to be stored at
all.
The patched version is now 99.95% as fast as unmodified
pg_stat_statements.
3. I've cleaned up my own code and made sure it adheres to GNU C coding
style, I was guilty of some // comments and curly brackets were
sometimes in the same line as my control structures.

I'd love to hear more feedback, here are two ideas to polish this
patch:
a) Right now, good_plan and bad_plan collection can be activated and
deactivated with separate GUCs. I think it would be sensible to collect
either both or none. (This would result in fewer convoluted
conditionals.)
b) Would you like to be able to tune the percentiles yourself, to
adjust for the point at which a new plan is stored?

Greetings
Julian

smime.p7s
Description: S/MIME cryptographic signature


Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

2018-03-02 Thread Julian Markwort
Andres Freund wrote on 2018-03-02:
> Yea, I misread the diff to think you added a conflicting version. Due
> to:
> -DATA =3D pg_stat_statements--1.4.sql pg_stat_statements--1.4--1.5.sql \
> +DATA =3D pg_stat_statements--1.5.sql pg_stat_statements--1.4--1.5.sql \

> and I'd checked that 1.5 already exists. But you just renamed the file,
> presumably because it's essentially rewriting the whole file?  I'm not
> sure I'm a big fan of doing so, because that makes testing the upgrade
> path more work.

You're right about 1.5 already existing. I wasn't sure about the versioning 
policy for extensions and seeing as version 1.5 only changed a few grants I 
quasi reused 1.5 for my intentions.

> What workload did you run? read/write or readonly? This seems like a
> feature were readonly makes a lot more sense. But ~1800 tps strongly
> suggests that's not what you did?

I'm sorry I forgot to mention this; I ran all tests as read-write.

> > With pg_stat_statements on, the latter test (10 minutes) resulted in 1833 
> > tps, while the patched version resulted in 1700 tps, so a little over 7% 
> > overhead? Well, the "control run", without pg_stat_statements delivered 
> > only 1806 tps, so variance seems to be quite high.

> That's quite some overhead, I'd say.

Yes, but I wouldn't give a warranty that it is neither more nor less overhead 
than 7%, seeing as for my testing, the tps were higher for (unmodified) pgss 
enabled vs no pgss at all.

> > If anybody has any recommendations for a setup that generates less 
> > variance, I'll try this again.

> I'd suggest disabling turboost, in my experience that makes tests
> painful to repeat, because it'll strongly depend on the current HW
> temperature.
This might be a problem for average systems but I'm fairly certain this isn't 
the issue here.

I might try some more benchmarks and will in particular look into running 
read-only tests, as the aforementioned 840 EVO SSD ist -comparatively speaking- 
pretty slow.
Do you have any recommendations as to what constitutes adequate testing times 
regarding pgbench?

Kind regards
Julian



Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

2018-03-02 Thread Julian Markwort
Andres Freund wrote on 2018-03-01:
> I think the patch probably doesn't apply anymore, due to other changes
> to pg_stat_statements since its posting. Could you refresh?

pgss_plans_v02.patch applies cleanly to master, there were no changes to 
pg_stat_statements since the copyright updates at the beginning of January.
(pgss_plans_v02.patch is attached to message 
1bd396a9-4573-55ad-7ce8-fe7adffa1...@uni-muenster.de and can be found in the 
current commitfest as well.)

> I've not done any sort of review. Scrolling through I noticed //
> comments which aren't pg coding style.

I'll fix that along with any other problems that might be found in a review.


> I'd like to see a small benchmark showing the overhead of the feature.
> Both in runtime and storage size.

I've tried to gather some meaningful results, however either my testing 
methodology was flawed (as variance between all my passes of pgbench was rather 
high) or the takeaway is that the feature only generates little overhead.
This is what I've run on my workstation using a Ryzen 1700 and 16GB of RAM and 
an old Samsung 840 Evo as boot drive, which also held the database:
The database used for the tests was dropped and pgbench initialized anew for 
each test (pgss off, pgss on, pgss on with plan collection) using a scaling of 
16437704*0.003~=50 (roughly what the phoronix test suite uses for a buffer 
test).
Also similar to the phoronix test suite, I used 8 jobs and 32 connections for a 
normal multithreaded load.

I then ran 10 passes, each for 60 seconds, with a 30 second pause between them, 
as well as another test which ran for 10 minutes.

With pg_stat_statements on, the latter test (10 minutes) resulted in 1833 tps, 
while the patched version resulted in 1700 tps, so a little over 7% overhead? 
Well, the "control run", without pg_stat_statements delivered only 1806 tps, so 
variance seems to be quite high.

The results of the ten successive tests, each running 60 seconds and then 
waiting for 30 seconds, are displayed in the attached plot.
I've tinkered with different settings with pgbench for quite some time now and 
all I can come up with are runs with high variance between them.

If anybody has any recommendations for a setup that generates less variance, 
I'll try this again.

Finally, the more interesting metric regarding this patch is the size of the 
pg_stat_statements.stat file, which stores all the metrics while the database 
is shut down. I reckon that the size of pgss_query_texts.stat (which holds only 
the query strings and plan strings while the database is running) will be 
similar, however it might fluctuate more as new strings are simply appended to 
the file until the garbagecollector decides that it has to be cleaned up.
After running the aforementioned tests, the file was 8566 bytes in size for 
pgss in it's unmodified form, while the tests resulted in 32607 bytes for the 
pgss that collects plans as well. This seems reasonable as plans strings are 
usually longer than the statements from which they result. Worst case, the 
pg_stat_statements.stat holds two plans for each type of statement.
I've not tested the length of the file with different encodings, such as JSON, 
YAML, or XML, however I do not expect any hugely different results.

Greetings
Julian


pgss_plans_pgbench.pdf
Description: Adobe PDF document


Re: Sample values for pg_stat_statements

2018-02-26 Thread Julian Markwort
Hi Vik,

this is my review of your patch. I hope I've ticked all the necessary
boxes.


Submission review:
Patch has context, applies cleanly, make and make check run
successfully, patch contains tests for the added functionality.
The patch doesn't seem to contain any documentation regarding the new
columns.

I think the patch could be shorter due to some not really necessary
whitespace changes, e.g. lines 414, ff. in the pgss.1.patch file.
Modifying the first test for '!entry' to '!entry || need_params' in
line 628 and adding another test for '!entry' later in the file leads
to many unneccessarily changed lines, because they are simply indented
one step further (Prominently noticeable with comments, eg. line 672-
678 and 733-739.
I'd like to see the check for 'need_params' have it's own block,
leaving the existing block largely intact.
This could happen after the original 'if(!entry)'' block, at which
point you can be sure that an entry already exists, so there is no need
for the duplicated code that stores params and consts in the query text
file and their references in the entry.


Usability review:
The patch fulfills it's goal. The new columns consist of arrays of text
as well as an array of regtypes. Unfortunately, this makes the
pg_stat_statements view even more cluttered and confusing. (The view
was very cluttered before your patch, the best solution is probably to
not 'SELECT * FROM pg_stat_statements;'...)

Regarding the security implications that I can think of, this patch
behaves in similar fashion to the rest of pg_stat_statements, showing
the consts, params, and param_types only to users with proper access
rights and if the showtext flag is set.


Feature test:
The feature works as advertised and does not seem to lead to any assert
failures or memory management errors.
Manual testing indicates that data is properly persisted through
database shutdowns and restarts.




If the intended purpose is to have some basic idea of the kinds of
values that are used with certain statements, I'd like to suggest that
you take a look at my own patch, which implements the tracking of good
and bad plans in pg_stat_statements, in the current commitfest. My
approach not only shows the values that where used when the statement
was executed for the first time (regarding the lifetime of the
pg_stat_statements tracked data), but it shows values of possibly more
current executions of the statements and offers the possibility to
identify values leading to very good or very poor performance.

Maybe we can combine our efforts; Here is one idea that came to my
mind:
- Store the parameters of a statement if the execution led to a new
slower or faster plan.
- Provide a function that allows users to take the jumbled query
expression and have the database explain it, based on the parameters
that were recorded previously.

Kind regards
Julian Markwort

smime.p7s
Description: S/MIME cryptographic signature


[PATCH] pg_hba.conf : new auth option : clientcert=verify-full

2018-02-16 Thread Julian Markwort
Dear Postgresql Hackers,

as of now, pg_hba.conf allows us to enable authentification by
certificate through the auth-method "cert", in which case the user must
provide a valid certificate with a certificate common name(CN) matching
the database user's name or an entry in a pg_ident map.

Additionaly, for every other auth-method it is possible to set the
auth-option "clientcert=1", so clients must present a valid certificate
at login. The logic behind this only checks the validity of the
certificate itself, but the certificate common name(CN) is not
relevant.

I wrote a very small patch that adds another auth-option:
- clientcert=verify-full (analogous to server certificates; you could
also use 2 instead of verify-full for backwards compatibility, or
verify-ca instead of 1) 
which also checks the certificate common name,
so all 3 factors get checked:
1.) auth-method, e.g. scram or md5 password passes 
2.) client cert is in truststore 
3.) CN is correct.

(The patch simply makes use of the function that is used for auth-
method "cert" to avoid code duplication).

Have a nice weekend,
Julian Markwortdiff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index 3014b17a..5757aa99 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -347,6 +347,7 @@ void
 ClientAuthentication(Port *port)
 {
 	int			status = STATUS_ERROR;
+	int			status_verify_cert_full = STATUS_ERROR;
 	char	   *logdetail = NULL;
 
 	/*
@@ -600,10 +601,23 @@ ClientAuthentication(Port *port)
 			break;
 	}
 
+	if(port->hba->clientcert_verify_full)
+	{
+#ifdef USE_SSL
+			status_verify_cert_full = CheckCertAuth(port);
+#else
+			Assert(false);
+#endif
+	}
+	else
+	{
+		status_verify_cert_full = STATUS_OK;
+	}
+
 	if (ClientAuthentication_hook)
 		(*ClientAuthentication_hook) (port, status);
 
-	if (status == STATUS_OK)
+	if (status == STATUS_OK && status_verify_cert_full == STATUS_OK)
 		sendAuthRequest(port, AUTH_REQ_OK, NULL, 0);
 	else
 		auth_failed(port, status, logdetail);
diff --git a/src/backend/libpq/hba.c b/src/backend/libpq/hba.c
index acf625e4..a5b0683d 100644
--- a/src/backend/libpq/hba.c
+++ b/src/backend/libpq/hba.c
@@ -1675,10 +1675,17 @@ parse_hba_auth_opt(char *name, char *val, HbaLine *hbaline,
 			*err_msg = "clientcert can only be configured for \"hostssl\" rows";
 			return false;
 		}
-		if (strcmp(val, "1") == 0)
+		if (strcmp(val, "1") == 0
+			|| strcmp(val, "verify-ca") == 0)
 		{
 			hbaline->clientcert = true;
 		}
+		else if(strcmp(val, "2") == 0
+|| strcmp(val, "verify-full") == 0)
+		{
+			hbaline->clientcert = true;
+			hbaline->clientcert_verify_full = true;
+		}
 		else
 		{
 			if (hbaline->auth_method == uaCert)
diff --git a/src/include/libpq/hba.h b/src/include/libpq/hba.h
index 5f68f4c6..309db47d 100644
--- a/src/include/libpq/hba.h
+++ b/src/include/libpq/hba.h
@@ -87,6 +87,7 @@ typedef struct HbaLine
 	char	   *ldapprefix;
 	char	   *ldapsuffix;
 	bool		clientcert;
+	bool		clientcert_verify_full;
 	char	   *krb_realm;
 	bool		include_realm;
 	bool		compat_realm;


smime.p7s
Description: S/MIME cryptographic signature


Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

2018-01-15 Thread Julian Markwort

On 01/11/2018 03:43 PM, David Fetter wrote:

Is the assumption of a normal distribution reasonable for outlier
plans as you've seen them?

This is a difficult but fair question.
First of all, I'd like to clarify that the normal distribution is 
assumed for the set of all execution times matching a queryid; No 
assumptions are made about the distribution of the outliers themselves.
The primary goal of this approach was the limitation of plan updates, to 
avoid unnecessary IO operations.
When query performance does not vary much, no updates of the plans 
should be necessary, but as soon as query performance varies too much, 
the new plan should be stored.
For the purpose of distinguishing reasonable variance between execution 
times and great variance due to changing conditions which ultimately 
might result in a different plan, the assumption of a normal 
distribution for all execution times suits well.


Based on some early testing, this results in only a few percent of 
updates (1-3%) in relation to the total number of calls, when running 
some short pgbench tests.
As the sample size grows, the assumption of a normal distribution 
becomes increasingly accurate and the (unnecessary) sampling of plans 
decreases.
In a different test, I ran several queries with identical table sizes, 
the queries were fairly simple,  and the statistical evaluation led to 
few updates during these tests. When I increased the table size 
significantly, the database switched to a different plan. Because the 
execution time differed significantly, this new bad plan was stored. 
Similarly, after running a certain query a couple of times, I created an 
index on my test data, which resulted in a speedup which was significant 
enough to result in an update of the good plan.


Now, if a change to the data or the index situation only resulted in an 
insignificant performance increase or decrease (one that falls into the 
interval defined as [mean - 1.5*IQD, mean + 1-5*IQD] ), I think it might 
be possible to assume that we are not interested in an updated plan for 
this scenario.




smime.p7s
Description: S/MIME Cryptographic Signature


Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)

2018-01-10 Thread Julian Markwort

Hello hackers,

I'd like to follow up to my previous proposition of tracking (some) best 
and worst plans for different queries in the pg_stat_statements extension.


Based on the comments and suggestions made towards my last endeavour, 
I've taken the path of computing the interquartile distance (by means of 
an adapted z-test, under the assumption of normal distribution, based on 
the mean_time and stddev_time already used by the extension).


A bad plan is recorded, if there is no previously recorded plan, or if 
the current execution time is greater than the maximum of the previously 
recorded plan's time and the query's mean+1.5*interquartile_distance.
A good plan is recorded on a similar condition; The execution time needs 
to be shorter than the minimum of the previously recorded good plan's 
time and the query's mean-1.5*interquartile_distance.


The boundaries are chosen to resemble the boundaries for whiskers in 
boxplots.
Using these boundaries, plans will be updated very seldomly, as long as 
they are more or less normally distributed.
Changes in the plans (for example the use of indices) used for each kind 
of query will most likely result in execution times exceeding these 
boundaries, so such changes are (very probably) recorded.


The ideal solution would be to compare the current plan with the last 
plan and only update when there is a difference between them, however I 
think this is unreasonably complex and a rather expensive task to 
compute on the completion of every query.


The intent of this patch is to provide a quick insight into the plans 
currently used by the database for the execution of certain queries. The 
tracked plans only represent instances of queries with very good or very 
poor performance.


I've (re)submitted this patch for the next commitfest as well.

Kind regards
Julian


On 03/04/2017 02:56 PM, Julian Markwort wrote:
Alright, for the next version of this patch I'll look into standard 
deviation (an implementation of Welfords' algorithm already exists in 
pg_stat_statements).


On 3/4/17 14:18, Peter Eisentraut wrote:


The other problem is that this measures execution time, which can vary
for reasons other than plan.  I would have expected that the cost
numbers are tracked somehow.
I've already thought of tracking specific parts of the explanation, 
like the cost numbers, instead of the whole string, I'll think of 
something, but if anybody has any bright ideas in the meantime, I'd 
gladly listen to them.



There is also the issue of generic vs specific plans, which this
approach might be papering over.
Would you be so kind and elaborate a little bit on this? I'm not sure 
if I understand this correctly. This patch only tracks specific plans, 
yes. The inital idea was that there might be some edge-cases that are 
not apparent when looking at generalized plans or queries.


kind regards
Julian


diff --git a/contrib/pg_stat_statements/Makefile b/contrib/pg_stat_statements/Makefile
index 39b368b70e..4d658d0ec7 100644
--- a/contrib/pg_stat_statements/Makefile
+++ b/contrib/pg_stat_statements/Makefile
@@ -4,7 +4,7 @@ MODULE_big = pg_stat_statements
 OBJS = pg_stat_statements.o $(WIN32RES)
 
 EXTENSION = pg_stat_statements
-DATA = pg_stat_statements--1.4.sql pg_stat_statements--1.4--1.5.sql \
+DATA = pg_stat_statements--1.5.sql pg_stat_statements--1.4--1.5.sql \
 	pg_stat_statements--1.3--1.4.sql pg_stat_statements--1.2--1.3.sql \
 	pg_stat_statements--1.1--1.2.sql pg_stat_statements--1.0--1.1.sql \
 	pg_stat_statements--unpackaged--1.0.sql
diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index 5318c3550c..3e79890d50 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -395,4 +395,40 @@ SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
  SELECT pg_stat_statements_reset() | 1 |1
 (8 rows)
 
+-- test to see if any plans have been recorded.
+SELECT
+  CASE WHEN good_plan_time > 0 THEN 1 ELSE 0 END,
+  CASE WHEN bad_plan_time > 0 THEN 1 ELSE 0 END,
+  CASE WHEN good_plan_timestamp >= timestamp '1970-01-01 00:00:00' THEN 1 ELSE 0 END,
+  CASE WHEN good_plan_timestamp >= timestamp '1970-01-01 00:00:00' THEN 1 ELSE 0 END
+FROM pg_stat_statements ORDER BY query COLLATE "C";
+ case | case | case | case 
+--+--+--+--
+0 |0 |0 |0
+0 |0 |0 |0
+0 |0 |0 |0
+0 |0 |0 |0
+0 |0 |0 |0
+0 |0 |0 |0
+1 |1 |1 |1
+1 |1 |1 |1
+1 |1 |1 |1
+(9 rows)
+
+-- test if there is some text in the recorded plans.
+select substr(good_plan, 0, 11), substr(bad_plan, 0, 11) from pg_stat_statements ORDER BY query COLL