Re: [HACKERS] New CF app deployment

2015-02-14 Thread Magnus Hagander
On Mon, Feb 9, 2015 at 4:56 PM, Robert Haas robertmh...@gmail.com wrote:

 On Mon, Feb 9, 2015 at 5:38 AM, Magnus Hagander mag...@hagander.net
 wrote:
  On Mon, Feb 9, 2015 at 11:09 AM, Marco Nenciarini
  marco.nenciar...@2ndquadrant.it wrote:
 
  Il 08/02/15 17:04, Magnus Hagander ha scritto:
  
   Filenames are now shown for attachments, including a direct link to
 the
   attachment itself.  I've also run a job to populate all old threads.
  
 
  I wonder what the algorithm is to detect when an attachment is a patch.
 
  If you look at https://commitfest.postgresql.org/4/94/ all the
  attachments are marked as Patch: no, but many of them are
  clearly a patch.
 
  It uses the magic module, same as the file command. And that one
 claims:
 
  mha@mha-laptop:/tmp$ file 0003-File-based-incremental-backup-v9.patch
  0003-File-based-incremental-backup-v9.patch: ASCII English text, with
 very
  long lines
 
  I think it doesn't consider it a patch because it's not actually a patch
 -
  it looks like a git-format actual email message that *contains* a patch.
 It
  even includes the unix From separator line. So if anything it should have
  detected that it's an email message, which it apparently doesn't.
 
  Picking from the very top patch on the cf, an actual patch looks like
 this:
 
  mha@mha-laptop:/tmp$ file psql_fix_uri_service_004.patch
  psql_fix_uri_service_004.patch: unified diff output, ASCII text, with
 very
  long lines

 Can we make it smarter, so that the kinds of things people produce
 intending for them to be patches are thought by the CF app to be
 patches?


Doing it wouldn't be too hard, as the code right now is simply:

# Attempt to identify the file using magic information
mtype = mag.buffer(contents)
if mtype.startswith('text/x-diff'):
    a.ispatch = True
else:
    a.ispatch = False


(where mag is the API call into the magic module)

So we could easily add, for example, our own regexp parsing. The
question is whether we want to - because we'll have to maintain it as well. But
I guess if we have a restricted enough set of rules, we can probably live
with that.



-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Patch to support SEMI and ANTI join removal

2015-02-14 Thread David Rowley
On 13 February 2015 at 20:52, Michael Paquier michael.paqu...@gmail.com
wrote:



 On Sun, Nov 23, 2014 at 8:23 PM, David Rowley dgrowle...@gmail.com
 wrote:


 As the patch stands there's still a couple of FIXMEs in there, so there's
 still a bit of work to do yet.
 Comments are welcome


 Hm, if there is still work to do, we may as well mark this patch as
 rejected as-is, also because it stands in this state for a couple of months.


My apologies, I'd not realised that the thread link on the commitfest app
was pointing to the wrong thread.

I see now that the patch has been bumped off the December commitfest and that it's
now a duplicate on the February commitfest, as I'd hastily added it to Feb
before I rushed off on my summer holidays 2 weeks ago.

I've now changed the duplicate to returned with feedback as there was no
option that I could find to delete it. (If anyone has extra rights to do
this, could that be done instead?)

The state of the patch is currently ready for review. The FIXME stuff that
I had talked about above is old news.

Please can we use the
http://www.postgresql.org/message-id/flat/CAApHDvocUEYdt1uT+DLDPs2xEu=v3qjgt6hexkonqm4ry_o...@mail.gmail.com#CAApHDvocUEYdt1uT+DLDPs2xEu=v3qjgt6hexkonqm4ry_o...@mail.gmail.com
thread for further communications about this patch. I'm trying to kill this
one off due to the out-of-date subject.

Regards

David Rowley


Re: [HACKERS] New CF app deployment

2015-02-14 Thread Magnus Hagander
On Sat, Feb 7, 2015 at 1:59 PM, Magnus Hagander mag...@hagander.net wrote:

 On Sat, Feb 7, 2015 at 1:53 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Magnus Hagander mag...@hagander.net writes:
  On Sat, Feb 7, 2015 at 1:02 AM, Jeff Janes jeff.ja...@gmail.com
 wrote:
  I'd like the ability to add a comment which does not get turned into an
  email.

  I really don't ;)

  The reason I really don't like that is that this now makes it
 impossible to
  track the review status by just reading through the mail thread. You
 have
  to context-switch back and forth between the app and the archives. We
 had
  this problem  in the old system every now and then where reviews were
  posted entirely in the old system...

 Yeah, people did that sometimes and it sucked.  At the same time I see
 Jeff's point: 300-email threads tend to contain a lot of dross.  Could we
 address it by allowing only *very short* annotations?  The limiting case
 would be 1-bit annotations, ie you could star the important messages in a
 thread; but that might be too restrictive.


 Right - to me that's the difference between an annotation (per Robert's email
 earlier, just tagging won't be enough, and I think I agree with that -
 but a limited-length one) and a comment.

 It could be that I'm reading too much into Jeff's suggestion though -
 maybe that's actually what he is suggesting.

 The annotation would then highlight the email in the archives with a
 direct link (haven't figured out exactly how to implement that part yet but
 I have some ideas and I think it's going to work out well).


Ok, I've pushed an attempt at doing this.

For each mailthread, you can now create annotations. Each annotation is
connected to a mail in the thread, and has a free text comment field. The
message will then automatically be highlighted out of the archives and a
direct link to the message included, alongside the text comment.

I *think* this is approximately what people wanted, so let's give it a
shot. If you have a chance, please test it out and comment as soon as you
can - if I'm completely off track with how it's done, we should back it out
and try a different way before we start putting actual valuable data into
it, so we don't end up with multiple different ways of doing it in the
end...

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Patch to support SEMI and ANTI join removal

2015-02-14 Thread David Rowley
There does not seem to be a delete button, so marking as rejected due to this 
now being a duplicate entry for this patch.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Manipulating complex types as non-contiguous structures in-memory

2015-02-14 Thread Tom Lane
Martijn van Oosterhout klep...@svana.org writes:
 On Thu, Feb 12, 2015 at 08:52:56AM -0500, Robert Haas wrote:
 BTW, I'm not all that thrilled with the deserialized object terminology.
 I found myself repeatedly tripping up on which form was serialized and
 which de-.  If anyone's got a better naming idea I'm willing to adopt it.

 My first thought is that we should form some kind of TOAST-like
 backronym, like Serialization Avoidance Loading and Access Device
 (SALAD) or Break-up, Read, Edit, Assemble, and Deposit (BREAD).  I
 don't think there is anything per se wrong with the terms
 serialization and deserialization; indeed, I used the same ones in the
 parallel-mode stuff.  But they are fairly general terms, so it might
 be nice to have something more specific that applies just to this
 particular usage.

 The words that sprung to mind for me were: packed/unpacked.

Trouble is that we're already using packed with a specific connotation
in that same area of the code, namely for short-header varlena values.
(See pg_detoast_datum_packed() etc.)  So I don't think this will work.
But maybe a different adjective?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] multiple backends attempting to wait for pincount 1

2015-02-14 Thread Andres Freund
On 2015-02-14 17:25:00 +, Kevin Grittner wrote:
 Andres Freund and...@2ndquadrant.com wrote:
  Imagine what happens in LockBufferForCleanup() when
  ProcWaitForSignal() returns spuriously - something it's
  documented to possibly do (and which got more likely with the new
  patches). In the normal case UnpinBuffer() will have unset
  BM_PIN_COUNT_WAITER - but in a spurious return it'll still be set
  and LockBufferForCleanup() will see it still set.
 
 That analysis makes sense to me.
 
  I think we should simply move the
  buf->flags &= ~BM_PIN_COUNT_WAITER (Inside LockBuffer)
 
 I think you meant inside UnpinBuffer?

No, LockBufferHdr. What I meant was that the pincount can only be
manipulated while the buffer header spinlock is held.

  to LockBufferForCleanup, besides the PinCountWaitBuf = NULL.
  Afaics, that should do the trick.
 
 I tried that on the master branch (33e879c) (attached) and it
 passes `make check-world` with no problems.  I'm reviewing the
 places that BM_PIN_COUNT_WAITER appears, to see if I can spot any
 flaw in this.  Does anyone else see a problem with it?  Even though
 it appears to be a long-standing bug, there don't appear to have
 been any field reports, so it doesn't seem like something to
 back-patch.

I was wondering about that as well. But I don't think I agree. The most
likely scenario for this to fail is in full table vacuums that have to
freeze rows - those are primarily triggered by autovacuum. I don't think
it's likely that such an error message would be discovered in the logs
unless it happens very regularly.

 --
 Kevin Grittner
 EDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company

 diff --git a/src/backend/storage/buffer/bufmgr.c 
 b/src/backend/storage/buffer/bufmgr.c
 index e1e6240..40b2194 100644
 --- a/src/backend/storage/buffer/bufmgr.c
 +++ b/src/backend/storage/buffer/bufmgr.c
 @@ -1548,7 +1548,6 @@ UnpinBuffer(volatile BufferDesc *buf, bool fixOwner)
   /* we just released the last pin other than the 
 waiter's */
   int wait_backend_pid = 
 buf->wait_backend_pid;
  
 - buf->flags &= ~BM_PIN_COUNT_WAITER;
   UnlockBufHdr(buf);
   ProcSendSignal(wait_backend_pid);
   }
 @@ -3273,6 +3272,7 @@ LockBufferForCleanup(Buffer buffer)
   else
   ProcWaitForSignal();
  
 + bufHdr->flags &= ~BM_PIN_COUNT_WAITER;
   PinCountWaitBuf = NULL;
   /* Loop back and try again */
   }

You can't manipulate flags without holding the spinlock. Otherwise you
(or the other writer) can easily cancel the other side's effects.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] multiple backends attempting to wait for pincount 1

2015-02-14 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:
 On 2015-02-14 17:25:00 +, Kevin Grittner wrote:

 I think we should simply move the
   buf->flags &= ~BM_PIN_COUNT_WAITER (Inside LockBuffer)

 I think you meant inside UnpinBuffer?

 No, LockBufferHdr. What I meant was that the pincount can only be
 manipulated while the buffer header spinlock is held.

Oh, I see what you were saying -- I had read that a different way
entirely.  Got it.

 Even though it appears to be a long-standing bug, there don't
 appear to have been any field reports, so it doesn't seem like
 something to back-patch.

 I was wondering about that as well. But I don't think I agree.
 The most likely scenario for this to fail is in full table
 vacuums that have to freeze rows - those are primarily triggered
 by autovacuum. I don't think it's likely that such an error
 message would be discovered in the logs unless it happens very
 regularly.

I guess we have some time before the next minor release to find any
problems with this; perhaps the benefit would outweigh the risk.
Anyone else want to weigh in on that?

 You can't manipulate flags without holding the spinlock.
 Otherwise you (or the other writer) can easily cancel the other
 side's effects.

So is the attached more like what you had in mind?  If not, feel
free to post a patch.  :-)

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index e1e6240..6640172 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1548,7 +1548,6 @@ UnpinBuffer(volatile BufferDesc *buf, bool fixOwner)
 			/* we just released the last pin other than the waiter's */
 			int			wait_backend_pid = buf->wait_backend_pid;
 
-			buf->flags &= ~BM_PIN_COUNT_WAITER;
 			UnlockBufHdr(buf);
 			ProcSendSignal(wait_backend_pid);
 		}
@@ -3273,6 +3272,13 @@ LockBufferForCleanup(Buffer buffer)
 		else
 			ProcWaitForSignal();
 
+		/*
+		 * Clear the flag unconditionally here, because otherwise a spurious
+		 * signal (which is allowed) could make it look like an error.
+		 */
+		LockBufHdr(bufHdr);
+		bufHdr->flags &= ~BM_PIN_COUNT_WAITER;
+		UnlockBufHdr(bufHdr);
 		PinCountWaitBuf = NULL;
 		/* Loop back and try again */
 	}

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Manipulating complex types as non-contiguous structures in-memory

2015-02-14 Thread Robert Haas
On Sat, Feb 14, 2015 at 10:45 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Martijn van Oosterhout klep...@svana.org writes:
 On Thu, Feb 12, 2015 at 08:52:56AM -0500, Robert Haas wrote:
 BTW, I'm not all that thrilled with the deserialized object terminology.
 I found myself repeatedly tripping up on which form was serialized and
 which de-.  If anyone's got a better naming idea I'm willing to adopt it.

 My first thought is that we should form some kind of TOAST-like
 backronym, like Serialization Avoidance Loading and Access Device
 (SALAD) or Break-up, Read, Edit, Assemble, and Deposit (BREAD).  I
 don't think there is anything per se wrong with the terms
 serialization and deserialization; indeed, I used the same ones in the
 parallel-mode stuff.  But they are fairly general terms, so it might
 be nice to have something more specific that applies just to this
 particular usage.

 The words that sprung to mind for me were: packed/unpacked.

 Trouble is that we're already using packed with a specific connotation
 in that same area of the code, namely for short-header varlena values.
 (See pg_detoast_datum_packed() etc.)  So I don't think this will work.
 But maybe a different adjective?

expanded?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] gcc5: initdb produces gigabytes of _fsm files

2015-02-14 Thread Andres Freund
On 2015-02-12 18:16:37 -0500, Tom Lane wrote:
 I wrote:
  Christoph Berg c...@df7cb.de writes:
  gcc5 is lurking in Debian experimental, and it's breaking initdb.
 
  Yeah, I just heard the same about Red Hat as well:
  https://bugzilla.redhat.com/show_bug.cgi?id=1190978
  Not clear if it's an outright compiler bug or they've just found some
  creative new way to make an optimization assumption we're violating.
 
 Apparently, it's the former.  See
 
 https://bugzilla.redhat.com/show_bug.cgi?id=1190978#c3
 
 I will be unamused if the gcc boys try to make an argument that they
 did some valid optimization here.

Fixed in gcc upstream https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65053

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] chkpass with RANDOMIZE_ALLOCATED_MEMORY

2015-02-14 Thread Tom Lane
Asif Naeem anaeem...@gmail.com writes:
 It has been observed on a RANDOMIZE_ALLOCATED_MEMORY-enabled PG95 build that
 chkpass is failing because of uninitialized memory and seems to be showing a
 false alarm.

It's not a false alarm, unfortunately, because chkpass_in actually does
give different results from one call to the next.  We could fix the aspect
of that involving failing to zero out unused bytes (which it appears was
introduced by sloppy replacement of strncpy with strlcpy).  But we can't
really do anything about the dependency on random(), because that's part
of the fundamental specification of the data type.  It was a bad idea,
no doubt, to design the input function to do this; but we're stuck with
it now.
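
The padding difference at issue here: strncpy() zero-fills the destination out
to the requested length, while strlcpy() writes only the copied bytes plus a
terminating NUL.  A small standalone demonstration (not chkpass code;
my_strlcpy below is a local stand-in with BSD strlcpy semantics, since strlcpy
is not in ISO C):

#include <stdio.h>
#include <string.h>

/*
 * Minimal strlcpy work-alike: copies at most siz-1 bytes, always
 * NUL-terminates, and -- unlike strncpy -- does not zero-fill the
 * remainder of the destination buffer.
 */
static size_t
my_strlcpy(char *dst, const char *src, size_t siz)
{
	size_t		len = strlen(src);

	if (siz > 0)
	{
		size_t		n = (len >= siz) ? siz - 1 : len;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}

int
main(void)
{
	char		a[16];
	char		b[16];

	memset(a, 0xAA, sizeof(a));		/* simulate randomized palloc'd memory */
	memset(b, 0xAA, sizeof(b));

	strncpy(a, "abc", sizeof(a));	/* bytes 3..15 become zero */
	my_strlcpy(b, "abc", sizeof(b));	/* bytes 4..15 keep the 0xAA garbage */

	printf("strncpy trailing byte: 0x%02X\n", (unsigned char) a[10]);
	printf("strlcpy trailing byte: 0x%02X\n", (unsigned char) b[10]);
	return 0;
}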

 I think these uninitialized memory areas are false alarms. Along with this,
 there seems to be another issue troubling the chkpass
 RANDOMIZE_ALLOCATED_MEMORY comparison logic: the crypt() function is
 being passed a random salt value for the DES-based result. PFA a patch; to
 generate consistent results, it uses a constant salt value.

This is a seriously awful idea.  You can't break the fundamental behavior
of a data type (especially not in a way that introduces a major security
hole) just for the convenience of some debugging option or other.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] multiple backends attempting to wait for pincount 1

2015-02-14 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:

 I don't think it's actually 675333 at fault here. I think it's a
 long standing bug in LockBufferForCleanup() that can just much
 easier be hit with the new interrupt code.

The patches I'll be posting soon make it even easier to hit, which
is why I was trying to sort this out when Tom noticed the buildfarm
issues.

 Imagine what happens in LockBufferForCleanup() when
 ProcWaitForSignal() returns spuriously - something it's
 documented to possibly do (and which got more likely with the new
 patches). In the normal case UnpinBuffer() will have unset
 BM_PIN_COUNT_WAITER - but in a spurious return it'll still be set
 and LockBufferForCleanup() will see it still set.

That analysis makes sense to me.

 I think we should simply move the
   buf-flags = ~BM_PIN_COUNT_WAITER (Inside LockBuffer)

I think you meant inside UnpinBuffer?

 to LockBufferForCleanup, besides the PinCountWaitBuf = NULL.
 Afaics, that should do the trick.

I tried that on the master branch (33e879c) (attached) and it
passes `make check-world` with no problems.  I'm reviewing the
places that BM_PIN_COUNT_WAITER appears, to see if I can spot any
flaw in this.  Does anyone else see a problem with it?  Even though
it appears to be a long-standing bug, there don't appear to have
been any field reports, so it doesn't seem like something to
back-patch.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index e1e6240..40b2194 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1548,7 +1548,6 @@ UnpinBuffer(volatile BufferDesc *buf, bool fixOwner)
 			/* we just released the last pin other than the waiter's */
 			int			wait_backend_pid = buf-wait_backend_pid;
 
-			buf-flags = ~BM_PIN_COUNT_WAITER;
 			UnlockBufHdr(buf);
 			ProcSendSignal(wait_backend_pid);
 		}
@@ -3273,6 +3272,7 @@ LockBufferForCleanup(Buffer buffer)
 		else
 			ProcWaitForSignal();
 
+		bufHdr-flags = ~BM_PIN_COUNT_WAITER;
 		PinCountWaitBuf = NULL;
 		/* Loop back and try again */
 	}

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New CF app deployment

2015-02-14 Thread Robert Haas
On Sat, Feb 14, 2015 at 7:29 AM, Magnus Hagander mag...@hagander.net wrote:
 Ok, I've pushed an attempt at doing this.

 For each mailthread, you can now create annotations. Each annotation is
 connected to a mail in the thread, and has a free text comment field. The
 message will then automatically be highlighted out of the archives and a
 direct link to the message included, alongside the text comment. The

 I *think* this is approximately what people wanted, so let's give it a shot.
 If you have a chance, please test it out and comment as soon as you can - if
 I'm completely off track with how it's done, we should back it out and try a
 different way before we start putting actual valuable data into it, so we
 don't end up with multiple different ways of doing it in the end...

Hmm.  This kind of looks like the right idea, but it's hard to use,
because all you've got to work with is the subject of the message
(which is the same for all), the time it was sent, and the author.  In
the old system, you could look at the message in the archives and then
copy-and-paste in the message ID.  That's obviously a bit manual, but
it worked.

I suppose what would be really spiffy is if the integration of the
archives and the CF app could be such that you could see the whole
message and then click "annotate this message".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Manipulating complex types as non-contiguous structures in-memory

2015-02-14 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Sat, Feb 14, 2015 at 10:45 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Martijn van Oosterhout klep...@svana.org writes:
 The words that sprung to mind for me were: packed/unpacked.

 Trouble is that we're already using packed with a specific connotation
 in that same area of the code, namely for short-header varlena values.
 (See pg_detoast_datum_packed() etc.)  So I don't think this will work.
 But maybe a different adjective?

 expanded?

That seems to work from the standpoint of not conflicting with other
nearby usages in our code, and it's got the right semantics I think.

Any other suggestions out there?  Otherwise I'll probably go with this.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] reducing our reliance on MD5

2015-02-14 Thread Henry B (Hank) Hotz, CISSP
SASL was done by many of the same people who did GSSAPI. Its main practical 
advantages are that it supports password-based mechanisms (in addition to 
GSSAPI/krb5), and that it's more explicitly pluggable than GSSAPI is. 

The password mechanism is simple enough that it's frequently implemented 
without a general library in e.g. older Jabber clients. We could likewise 
provide that as a configure fallback if availability of client libraries turns 
out to be a problem.

Cyrus SASL is bundled with a saslauthd and other utilities that handle the 
on-disk storage of hashed passwords. SASLauthd can be configured to use PAM, 
Kerberos 5, MySQL, custom plugin, or local BDB files for password 
verification/storage. (Instead of our own, we can provide someone else's gun to 
shoot yourself with! ;-))

For myself, I think getting rid of MD5 is a low priority. If it's replaced with 
SASL, then that has the advantage (?) of replacing GSSAPI as well, so the 
number of user options increases while the number of mechanisms in PG itself 
decreases.

Personal email.  hbh...@oxy.edu





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] why does find_my_exec resolve symlinks?

2015-02-14 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 Here is a scenario:
 ./configure --prefix=/usr/local/pgsql/9.4.1
 make
 make install
 ln -s 9.4.1 /usr/local/pgsql/9.4
 PATH=/usr/local/pgsql/9.4/bin:$PATH

 And then when 9.4.2 comes out, the symlink is updated.

 I think this sort of setup in variations is not uncommon.

If it were all that common, we'd have heard people complaining before.

 The reason for this behavior is

 commit 336969e490d71c316a42fabeccda87f798e562dd
 Author: Tom Lane t...@sss.pgh.pa.us
 Date:   Sat Nov 6 23:06:29 2004 +

 Add code to find_my_exec() to resolve a symbolic link down to the
 actual executable location.  This allows people to continue to use
 setups where, eg, postmaster is symlinked from a convenient place.
 Per gripe from Josh Berkus.

 I don't quite understand what setup Josh was using there.

IIRC, the idea was that /usr/bin/postgres might be a symlink to
/usr/local/pgsql/9.4.1/bin/postgres.  If we stop resolving symlinks,
that type of arrangement will break :-(.  Given that it's been like
this for ten years without previous complaints, I'm disinclined to
mess with it ...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] multiple backends attempting to wait for pincount 1

2015-02-14 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 I don't think it's actually 675333 at fault here. I think it's a
 long standing bug in LockBufferForCleanup() that can just much easier be
 hit with the new interrupt code.

 Imagine what happens in LockBufferForCleanup() when ProcWaitForSignal()
 returns spuriously - something it's documented to possibly do (and which
 got more likely with the new patches). In the normal case UnpinBuffer()
 will have unset BM_PIN_COUNT_WAITER - but in a spurious return it'll
 still be set and LockBufferForCleanup() will see it still set.

Yeah, you're right: LockBufferForCleanup has never coped with the
possibility that ProcWaitForSignal returns prematurely.  I'm not sure
if that was possible when this code was written, but we've got it
documented as being possible at least back to 8.2.  So this needs to
be fixed in all branches.

 If you just gdb into the VACUUM process with 6647248e370884 checked out,
 and do a PGSemaphoreUnlock(MyProc-sem) you'll hit it as well. I think
  we should simply move the buf->flags &= ~BM_PIN_COUNT_WAITER (Inside
 LockBuffer) to LockBufferForCleanup, besides the PinCountWaitBuf =
 NULL. Afaics, that should do the trick.

If we're moving the responsibility for clearing that flag from the waker
to the wakee, I think it would be smarter to duplicate all the logic
that's currently in UnlockBuffers(), just to make real sure we don't
drop somebody else's waiter flag.  So the bottom of the loop would
look more like this:
   
		LockBufHdr(bufHdr);
		if ((bufHdr->flags & BM_PIN_COUNT_WAITER) != 0 &&
			bufHdr->wait_backend_pid == MyProcPid)
		{
			/* Release hold on the BM_PIN_COUNT_WAITER bit */
			bufHdr->flags &= ~BM_PIN_COUNT_WAITER;
			PinCountWaitBuf = NULL;
			// optionally, we could check for pin count 1 here ...
		}
		UnlockBufHdr(bufHdr);
		/* Loop back and try again */


Also we should rethink at least the comment in UnlockBuffers().
I'm not sure what the failure conditions are with this reassignment
of responsibility, but the described case couldn't occur anymore.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] RangeType internal use

2015-02-14 Thread Jim Nasby

On 2/13/15 3:34 PM, David Fetter wrote:

On Fri, Feb 13, 2015 at 03:13:11PM -0600, Jim Nasby wrote:

On 2/10/15 2:04 PM, David Fetter wrote:

Yeah, but people expect to be able to partition on ranges that are not
all of equal width.  I think any proposal that we shouldn't support
that is the kiss of death for a feature like this - it will be so
restricted as to eliminate 75% of the use cases.


Well, that's debatable IMO (especially your claim that variable-size
partitions would be needed by a majority of users).

It's ubiquitous.

Time range partition sets almost always have some sets with finite
range and at least one range with infinity in it: current end to
infinity, and somewhat less frequently in my experience, -infinity
to some arbitrary start.


We could instead handle that with a generic "this doesn't fit in any
other partition" capability. Presumably that would be easy if we're
building this on top of inheritance features.

If we exclude the issue of needing one or two oddball partitions for
+/- infinity, I expect that fixed sized partitions would actually
cover 80-90% of cases.


Is partition the domain really that big an ask?

Since this debate has been running for a few months, perhaps it is. I'd 
rather see limited partitioning get in sooner and come back to handle 
the less common cases (as long as we don't paint ourselves into a corner).

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] restrict global access to be readonly

2015-02-14 Thread Robert Haas
On Fri, Feb 13, 2015 at 3:32 AM, happy times guangzhouzh...@qq.com wrote:
 I didn't find any convenient way to restrict access to PostgreSQL databases
 to be read-only for all users. I need it in the following scenarios:

 A) Planned switch-over from master to slave. We want to minimize impact
 within the planned switch-overs. So during the process we switch from master
 to slave, we would like to allow read-only transactions to be run on the
 original master until the switch-over completes and the new master starts
 taking user connections (we do the switch with virtual IP mechanism). I
 didn't find a way to do this on the database server side. Sure, we can utilize
 the runtime parameter default_transaction_read_only; however, it does not
 restrict users from changing the transaction attribute to non-readonly mode, so
 it is not safe.

 B) Blocking writing access when storage constraint is reached. We have
 massive PostgreSQL instances which are sold to external users with specific
 storage constraints and prices. When storage constraint for a specific
 instance is reached, we would rather change the instance to be readonly
 (then notify user to cleanup data or buy more storage) than shutdown the
 instance. Our current solution is putting a recovery.conf file to the
 instance (killing all existing connections) and restart the instance to get
 it into recovery mode (which is readonly), which is not pretty.

 C) Blocking writing access when an instance has expired. Similar to B),
 when the user's contract with us for his/her instance expires, we want to
 firstly block the write access rather than shut down the instance completely.

 Having that said, it would be very nice if there is a command like "SET
 GLOBAL_ACCESS TO READONLY | READWRITE" which does the job for the whole
 instance. I guess there could be others who want this feature too.

 So, has anyone considered or discussed adding such a command? Is
 there anyone working on it (I would like to work on it if no)?

I think this would be a useful feature and have thought about it
myself.  I suggest that it be spelled like this:

ALTER SYSTEM [ READ ONLY | READ WRITE ];

Although I like the idea, it's not clear to me how to implement it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] why does enum_endpoint call GetTransactionSnapshot()?

2015-02-14 Thread Robert Haas
The comments say that we must use an MVCC snapshot here, but
they don't explain why it should be one retrieved via
GetTransactionSnapshot() rather than GetActiveSnapshot() or
GetLatestSnapshot() or GetCatalogSnapshot().  The comments also refer
the reader to catalog/pg_enum.c for more details, but the only
information I can find in pg_enum.c just says that if reading multiple
values for the same enum, the same snapshot should be used for all of
them.  That's not an issue here, since we're only reading one value.
The behavior can't be chalked up to a desire for MVCC compliance,
because it updates the transaction snapshot mid-query:

rhaas=# create type color as enum ('red');
CREATE TYPE
rhaas=# select distinct enum_last(NULL::color) from
generate_series(1,100) g;
 enum_last
---
 red
(1 row)
rhaas=# select distinct enum_last(NULL::color) from
generate_series(1,100) g;
-- while this is running, do alter type color add value 'green'; in
another window
 enum_last
---
 red
 green
(2 rows)

I think this is probably a holdover from before the death of
SnapshotNow, and that we should just pass NULL to
systable_beginscan_ordered() here, the same as we do for other catalog
accesses.  Barring objections, I'll go make that change.
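
As a sketch, the change would amount to just the snapshot argument of that call
(illustrative fragment with abbreviated context, not a verbatim diff of
src/backend/utils/adt/enum.c):

	enum_scan = systable_beginscan_ordered(enum_rel, enum_idx,
										   NULL,	/* was GetTransactionSnapshot() */
										   1, &skey);

Passing NULL lets systable_beginscan_ordered() fall back to a catalog
snapshot, which is what the other catalog accesses mentioned above end up
using.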

If there's some remaining reason for doing it this way, we should add
comments explaining what it is.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New CF app deployment

2015-02-14 Thread Jim Nasby

On 2/14/15 11:42 AM, Robert Haas wrote:

On Sat, Feb 14, 2015 at 7:29 AM, Magnus Hagander mag...@hagander.net wrote:

Ok, I've pushed an attempt at doing this.

For each mailthread, you can now create annotations. Each annotation is
connected to a mail in the thread, and has a free text comment field. The
message will then automatically be highlighted out of the archives and a
direct link to the message included, alongside the text comment.

I *think* this is approximately what people wanted, so let's give it a shot.
If you have a chance, please test it out and comment as soon as you can - if
I'm completely off track with how it's done, we should back it out and try a
different way before we start putting actual valuable data into it, so we
don't end up with multiple different ways of doing it in the end...


Hmm.  This kind of looks like the right idea, but it's hard to use,
because all you've got to work with is the subject of the message
(which is the same for all), the time it was sent, and the author.  In
the old system, you could look at the message in the archives and then
copy-and-paste in the message ID.  That's obviously a bit manual, but
it worked.

I suppose what would be really spiffy is if the integration of the
archives and the CF app could be such that you could see the whole
message and then click "annotate this message".


It would be nice if CF-related emails had a link at the bottom that took 
you to that email in the CF app. Another possibility is some kind of 
special markup you could add to an email to create an annotation.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Reduce pinning in btree indexes

2015-02-14 Thread Kevin Grittner
We have a customer who was unable to migrate from a well-known
commercial database product to pg because they have a very large
software base that holds cursors open for very long periods of
time, preventing GlobalXmin values from advancing, leading to
bloat.  They have a standard test that exercises hundreds of client
connections for days at a time, and saw disk space and CPU usage
climbing to unacceptable levels[1].  The other database product had
no such problems.  We suggested the obvious solutions that we
always suggest on the community lists, and they had perfectly good
reasons why those were not viable for them -- we either needed to
prevent bloat under their current software or they could not
migrate.

What they wanted was what happened in the other database product --
if a snapshot got sufficiently old that cleaning up the MVCC data
was a problem *and* the snapshot was used again *and* it read a
page which had been modified far enough back that it was not
possible to return correct data, then they wanted to receive a
"snapshot too old" error. I wrote a patch to do that, but they
didn't see much improvement, because all (auto)vacuum processes
blocked indefinitely on the btree index pages where a cursor was
parked.  So they still got massive bloat and consequent problems.

It seemed odd to me that a btree implementation based on the
Lehman & Yao techniques would block anything, because the concept
is that it reads everything it needs off the index page and pauses
between pages.  We sorta do that, but the interlock between
vacuum processes and other index usages basically involves holding
a pin on the page we just read until we are done using the values
we read off that page, and treating the pin as a lighter lock than
a shared (or READ) lock -- which only conflicts with a super
exclusive lock, which consists of getting an exclusive lock only
once there are no other processes holding a pin on the page.  This
ensures that at the end of a vacuum pass over a btree index there
are no TIDs in process-local memory from before the start of the
pass, and consequently no scan can read a new tuple assigned to the
same TID value after the rest of the vacuum phases run.  So a pin
on a btree page blocks a vacuum process indefinitely.

Interestingly, the btree README points out that using the old TID
with a new tuple poses no hazard for a scan using an MVCC snapshot,
because the new tuple would not be visible to a snapshot created
that long ago.  The customer's cursors which were causing the
problems all use MVCC snapshots, so I went looking to see whether
we couldn't take advantage of this fact.  For the most part it
seemed we were OK if we dropped pins with the READ lock for a btree
scan which used an MVCC snapshot.  I found that the LP_DEAD hinting
would be a problem with an old TID, but I figured we could work
around that by storing the page LSN into the scan position structure
when the page contents were read, and only doing hinting if that
matched when we were ready to do the hinting.  That wouldn't work
for an index which was not WAL-logged, so the patch still holds
pins for those.  Robert pointed out that the visibility information
for an index-only scan wasn't checked while the index page READ
lock was held, so those scans also still hold the pins.
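
That LSN-match rule is easy to model in isolation.  A self-contained toy
sketch (made-up names; not the actual nbtree structures or functions):

#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

typedef uint64_t ToyLSN;

typedef struct ToyPage
{
	ToyLSN		lsn;			/* advances on every WAL-logged change */
} ToyPage;

typedef struct ToyScanPos
{
	int			blkno;			/* block the index items were read from */
	ToyLSN		lsn;			/* page LSN at the time the items were read */
} ToyScanPos;

/* Copy out what we need from the page; after this, no pin is held. */
static void
toy_read_items(const ToyPage *page, int blkno, ToyScanPos *pos)
{
	pos->blkno = blkno;
	pos->lsn = page->lsn;
}

/* Apply LP_DEAD hints only if the page is provably unchanged. */
static bool
toy_try_mark_items_dead(const ToyPage *page, const ToyScanPos *pos)
{
	if (page->lsn == pos->lsn)
	{
		printf("block %d unchanged: safe to set LP_DEAD hints\n", pos->blkno);
		return true;
	}
	printf("block %d modified since read: skip hinting\n", pos->blkno);
	return false;
}

int
main(void)
{
	ToyPage		page = {100};
	ToyScanPos	pos;

	toy_read_items(&page, 42, &pos);
	toy_try_mark_items_dead(&page, &pos);	/* hints are applied */

	page.lsn = 200;				/* someone (e.g. vacuum) modified the page */
	toy_try_mark_items_dead(&page, &pos);	/* hinting is skipped */
	return 0;
}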

There was also a problem with the fact that the code conflated the
notion of having a valid scan position with holding a pin on a
block in the index.  Those two concepts needed to be teased apart,
which I did using some new macros and figuring out which one to use
where.  Without a pin on the block, we also needed to remember what
block we had been processing to be able to do the LP_DEAD hinting;
for that the block number was added to the structure which tracks
scan position.

Finally, there was an "optimization" for marking buffer position
for possible restore that was incompatible with releasing the pin.
I use quotes because the optimization adds overhead to every move
to the next page in order to set some variables in a structure when a
mark is requested instead of running two fairly small memcpy()
statements.  The two-day benchmark of the customer showed no
performance hit, and looking at the code I would be amazed if the
optimization yielded a measurable benefit.  In general, optimization
by adding overhead to moving through a scan to save time in a mark
operation seems dubious.

At some point we could consider building on this patch to recheck
index conditions for heap access when a non-MVCC snapshot is used,
check the visibility map for referenced heap pages when the TIDs
are read for an index-only scan, and/or skip LP_DEAD hinting for
non-WAL-logged indexes.  But all those are speculative future work;
this is a conservative implementation that just didn't modify
pinning where there were any confounding factors.

In combination with the snapshot-too-old patch the 2-day customer
benchmark ran without problem levels of bloat, controlling disk
space growth and 

Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE} 2.0

2015-02-14 Thread Heikki Linnakangas

On 02/10/2015 02:21 AM, Peter Geoghegan wrote:

On Fri, Feb 6, 2015 at 1:51 PM, Bruce Momjian br...@momjian.us wrote:

Other than the locking part, the biggest part of this patch is adjusting
things so that an INSERT can change into an UPDATE.


Thanks for taking a look at it. That's somewhat cleaned up in the
attached patchseries - V2.2. This has been rebased to repair the minor
bit-rot pointed out by Thom.


I don't really have the energy to review this patch in one batch, so I'd 
really like to split this into two:


1. Solve the existing problem with exclusion constraints that if two 
insertions happen concurrently, one of them might be aborted with 
a "deadlock detected" error, rather than a "conflicting key violation" 
error. That really wouldn't be worth fixing otherwise, but it happens to 
be a useful stepping stone for this upsert patch.


2. All the rest.

I took a stab at extracting the parts needed to do 1. See attached. I 
didn't modify ExecUpdate to use speculative insertions, so the issue 
remains for UPDATEs, but that's easy to add.


I did not solve the potential for livelocks (discussed at 
http://www.postgresql.org/message-id/cam3swztftt_fehet3tu3ykcpcypynnauquz3q+naasnh-60...@mail.gmail.com). 
The patch always super-deletes the already-inserted tuple on conflict, 
and then waits for the other inserter. That would be nice to fix, using 
one of the ideas from that thread, or something else.


We never really debated the options for how to do the speculative 
insertion and super-deletion. This patch is still based on the quick & 
dirty demo patches I posted about a year ago, in response to issues you 
found with earlier versions. That might be OK - maybe I really hit the 
nail on designing those things and got them right on first try - but 
more likely there are better alternatives.


Are we happy with acquiring the SpeculativeInsertLock heavy-weight lock 
for every insertion? That seems bad for performance reasons. Also, are 
we happy with adding the new fields to the proc array? Another 
alternative that was discussed was storing the speculative insertion 
token on the heap tuple itself. (See 
http://www.postgresql.org/message-id/52d00d2d.6030...@vmware.com)


Are we happy with the way super-deletion works? Currently, the xmin 
field is set to invalid to mark the tuple as super-deleted. That 
required checks in HeapTupleSatisfies* functions. One alternative would 
be to set xmax equal to xmin, and teach the code that currently calls 
XactLockTableWait() to check if xmax=xmin, and not consider the tuple as 
conflicting.


- Heikki
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 46060bc1..0aa3e575 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2048,6 +2048,9 @@ FreeBulkInsertState(BulkInsertState bistate)
  * This causes rows to be frozen, which is an MVCC violation and
  * requires explicit options chosen by user.
  *
+ * If HEAP_INSERT_SPECULATIVE is specified, the MyProc->specInsert fields
+ * are filled.
+ *
  * Note that these options will be applied when inserting into the heap's
  * TOAST table, too, if the tuple requires any out-of-line data.
  *
@@ -2196,6 +2199,13 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
 
 	END_CRIT_SECTION();
 
+	/*
+	 * Let others know that we speculatively inserted this tuple, before
+	 * releasing the buffer lock.
+	 */
+	if (options & HEAP_INSERT_SPECULATIVE)
+		SetSpeculativeInsertionTid(relation->rd_node, heaptup->t_self);
+
 	UnlockReleaseBuffer(buffer);
 	if (vmbuffer != InvalidBuffer)
 		ReleaseBuffer(vmbuffer);
@@ -2616,11 +2626,17 @@ xmax_infomask_changed(uint16 new_infomask, uint16 old_infomask)
  * (the last only for HeapTupleSelfUpdated, since we
  * cannot obtain cmax from a combocid generated by another transaction).
  * See comments for struct HeapUpdateFailureData for additional info.
+ *
+ * If 'killspeculative' is true, caller requires that we super-delete a tuple
+ * we just inserted in the same command. Instead of the normal visibility
+ * checks, we check that the tuple was inserted by the current transaction and
+ * given command id.  Also, instead of setting its xmax, we set xmin to
+ * invalid, making it immediately appear as dead to everyone.
  */
 HTSU_Result
 heap_delete(Relation relation, ItemPointer tid,
 			CommandId cid, Snapshot crosscheck, bool wait,
-			HeapUpdateFailureData *hufd)
+			HeapUpdateFailureData *hufd, bool killspeculative)
 {
 	HTSU_Result result;
 	TransactionId xid = GetCurrentTransactionId();
@@ -2678,7 +2694,18 @@ heap_delete(Relation relation, ItemPointer tid,
 	tp.t_self = *tid;
 
 l1:
-	result = HeapTupleSatisfiesUpdate(tp, cid, buffer);
+	if (!killspeculative)
+	{
+		result = HeapTupleSatisfiesUpdate(tp, cid, buffer);
+	}
+	else
+	{
+		if (tp.t_data->t_choice.t_heap.t_xmin != xid ||
+			tp.t_data->t_choice.t_heap.t_field3.t_cid != cid)
+			elog(ERROR, "attempted to super-delete a tuple from other CID");
+		result = 

Re: [HACKERS] why does enum_endpoint call GetTransactionSnapshot()?

2015-02-14 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I think this is probably a holdover from before the death of
 SnapshotNow, and that we should just pass NULL to
 systable_beginscan_ordered() here, the same as we do for other catalog
 accesses.  Barring objections, I'll go make that change.

Seems reasonable to me, but why are you only looking at that one and
not the identical code in enum_range_internal()?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] why does enum_endpoint call GetTransactionSnapshot()?

2015-02-14 Thread Robert Haas
On Sat, Feb 14, 2015 at 5:12 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 I think this is probably a holdover from before the death of
 SnapshotNow, and that we should just pass NULL to
 systable_beginscan_ordered() here, the same as we do for other catalog
 accesses.  Barring objections, I'll go make that change.

 Seems reasonable to me, but why are you only looking at that one and
 not the identical code in enum_range_internal()?

I noticed that one after hitting send.  I think we should change them both.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] restrict global access to be readonly

2015-02-14 Thread Jim Nasby

On 2/14/15 3:14 PM, Robert Haas wrote:

On Fri, Feb 13, 2015 at 3:32 AM, happy times guangzhouzh...@qq.com wrote:

 I didn't find any convenient way to restrict access to PostgreSQL databases
 to be read-only for all users. I need it in the following scenarios:

A) Planned switch-over from master to slave. We want to minimize impact
within the planned switch-overs. So during the process we switch from master
to slave, we would like to allow read-only transactions to be run on the
 original master until the switch-over completes and the new master starts
 taking user connections (we do the switch with virtual IP mechanism). I
 didn't find a way to do this on the database server side. Sure, we can utilize
 the runtime parameter default_transaction_read_only; however, it does not
 restrict users from changing the transaction attribute to non-readonly mode, so
 it is not safe.

B) Blocking writing access when storage constraint is reached. We have
massive PostgreSQL instances which are sold to external users with specific
storage constraints and prices. When storage constraint for a specific
instance is reached, we would rather change the instance to be readonly
(then notify user to cleanup data or buy more storage) than shutdown the
instance. Our current solution is putting a recovery.conf file to the
instance (killing all existing connections) and restart the instance to get
it into recovery mode (which is readonly), which is not pretty.

 C) Blocking writing access when an instance has expired. Similar to B),
 when the user's contract with us for his/her instance expires, we want to
 firstly block the write access rather than shut down the instance completely.

 Having that said, it would be very nice if there is a command like "SET
 GLOBAL_ACCESS TO READONLY | READWRITE" which does the job for the whole
instance. I guess there could be others who want this feature too.

 So, has anyone considered or discussed adding such a command? Is
there anyone working on it (I would like to work on it if no)?


I think this would be a useful feature and have thought about it
myself.  I suggest that it be spelled like this:

ALTER SYSTEM [ READ ONLY | READ WRITE ];

Although I like the idea, it's not clear to me how to implement it.


Throw an error in AssignTransactionId/GetNewTransactionId? I see 4 calls 
to Get*TransactionId in logical replication, though arguably if we're 
fixing that we should look at doing something special for Slony and the 
likes.


Related to this, a lot of people have expressed desire for read only 
tables. That would presumably be trickier to accomplish.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] restrict global access to be readonly

2015-02-14 Thread Tom Lane
Jim Nasby jim.na...@bluetreble.com writes:
 On 2/14/15 3:14 PM, Robert Haas wrote:
 Although I like the idea, it's not clear to me how to implement it.

 Throw an error in AssignTransactionId/GetNewTransactionId?

A whole lot depends on what you choose to mean by "read only".  If it
means the same thing as "all transactions are READ ONLY", that would be
useful for some purposes, and it would have the advantage of having a
visible relationship to a long-established feature with the same name.
If you want it to mean "no writes to disk at all", that's something
totally different, and possibly not all that useful (eg, do you really
want to make sorts fail if they'd spill to disk? How about hint bit
updates?).  Jim's suggestion above would be somewhere in the middle,
as it would successfully block use of temp tables but not eg. VACUUM.
Another possibility that would be attractive for replication-related
use-cases would be "nothing that generates WAL, thank you very much".

I'm inclined to think that we should agree on the desired semantics
before jumping to implementation.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Allow snapshot too old error, to prevent bloat

2015-02-14 Thread Kevin Grittner
This patch is related to the Reduce pinning in btree indexes
patch submitted here:

http://www.postgresql.org/message-id/721615179.3351449.1423959585771.javamail.ya...@mail.yahoo.com

That describes how they evolved and how they relate; I won't
duplicate that here.

Unlike the other patch, this one is more at the proof of concept
phase, because it requires support in the heap and each index AM to
work correctly; so far I have only had time to cover the heap and
btree indexes.  In spite of that, I have thrown the worst test
cases I could think of at it (and only succeeded in uncovering a
bug which was already out there in production), and it has shown
its value in a two-day test simulating a 300-user load with
complex real-world applications (although the only indexes it used
were btree indexes).  Without the patches the database growth was
39GB per day; with the patches it was 28.5GB per day.  (The test
does involve more inserts than deletes, so some growth is
expected.)  At the end of the tests, pgstattuple reported eight
times as many dead tuples in the database without the patches.
More importantly, without the patches the CPU load started at 60%
and showed linear growth to 92% over the course of the first day;
with the patches it stayed at a stable 60% throughout the test.

What this patch does is add a GUC called old_snapshot_threshold.  It
defaults to -1, which leaves behavior matching unpatched code.
Above that it allows tuples to be vacuumed away after the number of
transaction IDs specified by the GUC have been consumed.  It also
saves the current insertion LSN into every snapshot when it is
created.  When reading from the heap or any index, if the snapshot
is vulnerable to showing incorrect data because the threshold has
been crossed since it was generated, reading any page with an LSN
past the snapshot LSN causes a snapshot too old error to be
thrown.  Since this is LSN-based, the new logic is not used for any
relation which is not WAL-logged.

Note that if you don't read data from a page which has been
modified after your snapshot was taken, the threshold doesn't
matter.
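
As a toy model of that rule (made-up names and types, not the patch's actual
code), the read-side check amounts to something like this:

#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

typedef uint64_t ToyLSN;

typedef struct ToySnapshot
{
	ToyLSN		lsn;			/* insertion LSN when the snapshot was taken */
	uint32_t	taken_at_xid;	/* next transaction ID at that time */
} ToySnapshot;

static int	old_snapshot_threshold = 10;	/* -1 disables the check */
static uint32_t current_xid = 0;
static ToyLSN current_lsn = 0;

/* Returns false (and reports the error) if the page cannot be trusted. */
static bool
toy_test_for_old_snapshot(const ToySnapshot *snap, ToyLSN page_lsn)
{
	if (old_snapshot_threshold >= 0 &&
		current_xid - snap->taken_at_xid > (uint32_t) old_snapshot_threshold &&
		page_lsn > snap->lsn)
	{
		fprintf(stderr, "ERROR: snapshot too old\n");
		return false;
	}
	return true;
}

int
main(void)
{
	ToySnapshot snap = {current_lsn, current_xid};

	/* plenty of write activity after the snapshot was taken */
	current_xid += 100;
	current_lsn += 100;

	toy_test_for_old_snapshot(&snap, 0);			/* unmodified page: OK */
	toy_test_for_old_snapshot(&snap, current_lsn);	/* modified page: error */
	return 0;
}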

All `make installcheck` tests succeed with any setting.  With a
setting of 0 (the most extreme), `make installcheck-world` sees
four isolation tests fail.  Those all pass if you raise the
setting to 2.  The postgres_fdw test needs a setting of 4 to
succeed.  I would expect most shops would want to tune this to
something in the six-digit to eight-digit range.  In the tests
mentioned above it was set to 15 (which corresponded to just
under 4 minutes of txid consumption) and there were no "snapshot
too old" errors, even though some cursors were left open for the
entire two-day run.

The patch still lacks (as mentioned above) support for index AMs
other than btree, and lacks documentation for the new GUC.  I'm
sure that there are some comments and README files that need
adjustment, too.  As I said, this is still POC.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
*** a/src/backend/access/heap/heapam.c
--- b/src/backend/access/heap/heapam.c
***
*** 366,371  heapgetpage(HeapScanDesc scan, BlockNumber page)
--- 366,372 
  	LockBuffer(buffer, BUFFER_LOCK_SHARE);
  
  	dp = (Page) BufferGetPage(buffer);
+ 	TestForOldSnapshot(snapshot, scan->rs_rd, dp);
  	lines = PageGetMaxOffsetNumber(dp);
  	ntup = 0;
  
***
*** 496,501  heapgettup(HeapScanDesc scan,
--- 497,503 
  		LockBuffer(scan->rs_cbuf, BUFFER_LOCK_SHARE);
  
  		dp = (Page) BufferGetPage(scan->rs_cbuf);
+ 		TestForOldSnapshot(snapshot, scan->rs_rd, dp);
  		lines = PageGetMaxOffsetNumber(dp);
  		/* page and lineoff now reference the physically next tid */
  
***
*** 538,543  heapgettup(HeapScanDesc scan,
--- 540,546 
  		LockBuffer(scan->rs_cbuf, BUFFER_LOCK_SHARE);
  
  		dp = (Page) BufferGetPage(scan->rs_cbuf);
+ 		TestForOldSnapshot(snapshot, scan->rs_rd, dp);
  		lines = PageGetMaxOffsetNumber(dp);
  
  		if (!scan-rs_inited)
***
*** 696,701  heapgettup(HeapScanDesc scan,
--- 699,705 
  		LockBuffer(scan->rs_cbuf, BUFFER_LOCK_SHARE);
  
  		dp = (Page) BufferGetPage(scan->rs_cbuf);
+ 		TestForOldSnapshot(snapshot, scan->rs_rd, dp);
  		lines = PageGetMaxOffsetNumber((Page) dp);
  		linesleft = lines;
  		if (backward)
***
*** 1573,1578  heap_fetch(Relation relation,
--- 1577,1583 
  	 */
  	LockBuffer(buffer, BUFFER_LOCK_SHARE);
  	page = BufferGetPage(buffer);
+ 	TestForOldSnapshot(snapshot, relation, page);
  
  	/*
  	 * We'd better check for out-of-range offnum in case of VACUUM since the
***
*** 1902,1907  heap_get_latest_tid(Relation relation,
--- 1907,1913 
  		buffer = ReadBuffer(relation, ItemPointerGetBlockNumber(&ctid));
  		LockBuffer(buffer, BUFFER_LOCK_SHARE);
  		page = BufferGetPage(buffer);
+ 		TestForOldSnapshot(snapshot, relation, page);
  
  		/*
  		 * Check for bogus item number.  This is not 

Re: [HACKERS] Allow snapshot too old error, to prevent bloat

2015-02-14 Thread Tom Lane
Kevin Grittner kgri...@ymail.com writes:
 What this patch does is add a GUC call old_snapshot_threshold.  It
 defaults to -1, which leaves behavior matching unpatched code.
 Above that it allows tuples to be vacuumed away after the number of
 transaction IDs specified by the GUC have been consumed.

TBH, I'm not sure why we'd wish to emulate Oracle's single worst
operational feature.

 Unlike the other patch, this one is more at the proof of concept
 phase, because it requires support in the heap and each index AM to
 work correctly; so far I have only had time to cover the heap and
 btree indexes.

But, having said that, why would the index AMs care?  Seems like what
you are describing should be strictly a matter for VACUUM's removal
rules.  If we're going to have something as ugly as this, I would much
rather it had a very small code footprint.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] improving speed of make check-world

2015-02-14 Thread Peter Eisentraut
On 8/31/14 5:36 AM, Fabien COELHO wrote:
 Running make -j2 check-world does not work because initdb is not
 found by pg_regress, but make -j1 check-world does work fine. It
 seems that some dependencies might be missing and there is a race
 condition between temporary install and running some checks?? Maybe it
 is not expected to work anyway? See below suggestions to make it work.

Here is an updated patch that fixes this problem.

The previous problem was simply a case where the make rules were not
parallel-safe.

For recursive make, we (through magic) set up targets like this:

check: check-subdir1-check check-subdir2-check

And with my old patch we added

check: temp-install

So the aggregate prerequisites were in effect something like

check: check-subdir1-check check-subdir2-check temp-install

And so there was nothing stopping a parallel make from descending into
the subdirectories before the temp install was set up.

What we need is additional prerequisites like

check-subdir1-check check-subdir2-check: temp-install

I have hacked this directly into the $(recurse) function, which is ugly.
 This could possibly be improved somehow, but the effect would be the
same in any case.

With this, I can now run things like

make -C src/pl check -j3
make -C src/bin check -j8

A full make check-world -j$X is still prone to fail because some test
suites can't run in parallel with others, but that's a separate problem
to fix.


Note:  We have in the meantime added logic to pg_regress to clean up the
temporary installation after the run.  This patch changes that, because
pg_regress is no longer responsible for the temporary installation.
pg_regress still cleans up the temporary data directory, so you still
get quite a bit of space savings, but the temporary installation itself
is no longer cleaned up.  Since we would now use only a single temporary
installation, the disk space usage still stays in the same order of
magnitude.

diff --git a/.gitignore b/.gitignore
index 715f817..77feb4c 100644
--- a/.gitignore
+++ b/.gitignore
@@ -35,3 +35,4 @@ lib*.pc
 /pgsql.sln.cache
 /Debug/
 /Release/
+/tmp_install/
diff --git a/GNUmakefile.in b/GNUmakefile.in
index 69e0824..361897a 100644
--- a/GNUmakefile.in
+++ b/GNUmakefile.in
@@ -47,6 +47,7 @@ $(call recurse,distprep,doc src config contrib)
 # it's not built by default
 $(call recurse,clean,doc contrib src config)
 clean:
+	rm -rf tmp_install/
 # Garbage from autoconf:
 	@rm -rf autom4te.cache/
 
diff --git a/contrib/earthdistance/Makefile b/contrib/earthdistance/Makefile
index 93dcbe3..cde1ae6 100644
--- a/contrib/earthdistance/Makefile
+++ b/contrib/earthdistance/Makefile
@@ -7,7 +7,7 @@ DATA = earthdistance--1.0.sql earthdistance--unpackaged--1.0.sql
 PGFILEDESC = earthdistance - calculate distances on the surface of the Earth
 
 REGRESS = earthdistance
-REGRESS_OPTS = --extra-install=contrib/cube
+EXTRA_INSTALL = contrib/cube
 
 LDFLAGS_SL += $(filter -lm, $(LIBS))
 
diff --git a/contrib/pg_upgrade/test.sh b/contrib/pg_upgrade/test.sh
index 75b6357..3012ed0 100644
--- a/contrib/pg_upgrade/test.sh
+++ b/contrib/pg_upgrade/test.sh
@@ -87,7 +87,7 @@ if [ $1 = '--install' ]; then
 	# use psql from the proper installation directory, which might
 	# be outdated or missing. But don't override anything else that's
 	# already in EXTRA_REGRESS_OPTS.
-	EXTRA_REGRESS_OPTS="$EXTRA_REGRESS_OPTS --psqldir='$bindir'"
+	EXTRA_REGRESS_OPTS="$EXTRA_REGRESS_OPTS --bindir='$bindir'"
 	export EXTRA_REGRESS_OPTS
 fi
 
diff --git a/contrib/test_decoding/Makefile b/contrib/test_decoding/Makefile
index 438be44..613e9c3 100644
--- a/contrib/test_decoding/Makefile
+++ b/contrib/test_decoding/Makefile
@@ -39,35 +39,33 @@ submake-test_decoding:
 
 REGRESSCHECKS=ddl rewrite toast permissions decoding_in_xact decoding_into_rel binary prepared
 
-regresscheck: all | submake-regress submake-test_decoding
+regresscheck: all | submake-regress submake-test_decoding temp-install
 	$(MKDIR_P) regression_output
 	$(pg_regress_check) \
 	--temp-config $(top_srcdir)/contrib/test_decoding/logical.conf \
-	--temp-install=./tmp_check \
-	--extra-install=contrib/test_decoding \
+	--temp-instance=./tmp_check \
 	--outputdir=./regression_output \
 	$(REGRESSCHECKS)
 
-regresscheck-install-force: | submake-regress submake-test_decoding
+regresscheck-install-force: | submake-regress submake-test_decoding temp-install
 	$(pg_regress_installcheck) \
-	--extra-install=contrib/test_decoding \
 	$(REGRESSCHECKS)
 
 ISOLATIONCHECKS=mxact delayed_startup ondisk_startup concurrent_ddl_dml
 
-isolationcheck: all | submake-isolation submake-test_decoding
+isolationcheck: all | submake-isolation submake-test_decoding temp-install
 	$(MKDIR_P) isolation_output
 	$(pg_isolation_regress_check) \
 	--temp-config $(top_srcdir)/contrib/test_decoding/logical.conf \
-	--extra-install=contrib/test_decoding \
 	--outputdir=./isolation_output \
 	$(ISOLATIONCHECKS)
 

Re: [HACKERS] improving speed of make check-world

2015-02-14 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On 8/31/14 5:36 AM, Fabien COELHO wrote:
 Running make -j2 check-world does not work because initdb is not
 found by pg_regress, but make -j1 check-world does work fine. It
 seems that some dependencies might be missing and there is a race
 condition between temporary install and running some checks?? Maybe it
 is not expected to work anyway? See below suggestions to make it work.

 Here is an updated patch that fixes this problem.

Doesn't the Windows side of the house still depend on that functionality
you removed from pg_regress?  Perhaps that's not a big deal to fix, but
it seems like a commit-blocker from here.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Allow snapshot too old error, to prevent bloat

2015-02-14 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kgri...@ymail.com writes:
 What this patch does is add a GUC called old_snapshot_threshold.  It
 defaults to -1, which leaves behavior matching unpatched code.
 Above that it allows tuples to be vacuumed away after the number of
 transaction IDs specified by the GUC have been consumed.

 TBH, I'm not sure why we'd wish to emulate Oracle's single worst
 operational feature.

I've run into cases where people have suffered horribly bloated
databases because of one ill-behaved connection.  Some companies
don't want to be vulnerable to that and the disruption that
recovery from that bloat causes.

Note that the default is to maintain legacy PostgreSQL behavior; 
the postgresql.conf file must be modified to see this error.

 Unlike the other patch, this one is more at the proof of concept
 phase, because it requires support in the heap and each index AM to
 work correctly; so far I have only had time to cover the heap and
 btree indexes.

 But, having said that, why would the index AMs care?  Seems like what
 you are describing should be strictly a matter for VACUUM's removal
 rules.  If we're going to have something as ugly as this, I would much
 rather it had a very small code footprint.

If a tuple is vacuumed away, you would not notice that if you were
running an index scan because you would not visit the page it used
to be on to see that the LSN had changed.  We tried to sell the
idea that *any* scan using a snapshot past the threshold should
error out (which is the only way I can see to avoid making the
index AMs aware of this), but they insisted that there was no
reason to have scans fail when reading an unmodified table using an
old snapshot, or even an unmodified page of a modified table.

Due to the cost to this company of modifying their software to deal
with different behavior, they will not move without the behavior
I'm shooting for with this patch.  Due to the value of this
customer to us, we will put this (or something to accomplish this
behavior) into our fork.  We're happy to share it with the
community.  If the community does not feel they need this behavior,
or wants to generate the error more aggressively to avoid touching
the index AMs, I understand.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Auditing extension for PostgreSQL (Take 2)

2015-02-14 Thread David Steele
I've posted a couple of messages over the last few weeks about the work
I've been doing on the pg_audit extension.  The lack of response could
be due to either universal acclaim or complete apathy, but in any case I
think this is a very important topic so I want to give it another try.

I've extensively reworked the code that was originally submitted by
2ndQuadrant.  This is not an indictment of their work, but rather an
attempt to redress concerns that were expressed by members of the
community.  I've used core functions to determine how audit events
should be classified, and simplified and tightened the code wherever
possible.  I've removed deparse and event triggers and opted for methods
that rely only on existing hooks.  In my last message I provided
numerous examples of configuration, usage, and output which I hoped
would alleviate concerns of complexity.  I've also written a ton of unit
tests to make sure that the code works as expected.
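
As a rough sketch of the intended usage (the class and setting names
below are my shorthand for what is described here and may not match the
patch exactly; the module also has to be loaded, e.g. via
shared_preload_libraries):

CREATE EXTENSION pg_audit;

-- hypothetical configuration, e.g. in postgresql.conf:
--   pg_audit.log = 'ddl, write'

CREATE TABLE accounts (id int, balance numeric);  -- would be logged as DDL
UPDATE accounts SET balance = 0;                  -- would be logged as WRITE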

Auditing has been a concern everywhere I've used or introduced
PostgreSQL.  Over time I've developed a reasonably comprehensive (but
complex) system of triggers, tables and functions to provide auditing
for customers.  While this has addressed most requirements, there are
always questions of auditing aborted transactions, DDL changes,
configurability, etc. which I have been unable to satisfy.

There has been some discussion of whether or not this module needs to be
in contrib.  One reason customers trust contrib is that the modules are
assumed to be held to a higher standard than code found on GitHub.  This
has already been pointed out.  But I believe another important reason is
that they know packages will be made available for a variety of
platforms, and bugs are more likely to be experienced by many users and
not just a few (or one).  For this reason my policy is not to run
custom-compiled code in production.  This is especially true for
something as mission critical as a relational database.

I realize this is not an ideal solution.  Everybody (including me) wants
something that is in core with real policies and more options.  It's
something that I am personally really eager to work on.  But the reality
is that's not going to happen for 9.5 and probably not for 9.6 either.
Meanwhile, I believe the lack of some form of auditing is harming
adoption of PostgreSQL, especially in the financial and government sectors.

The patch I've attached satisfies the requirements that I've had from
customers in the past.  I'm confident that if it gets out into the wild
it will bring all kinds of criticism and comments which will be valuable
in designing a robust in-core solution.

I'm submitting this patch to the Commitfest.  I'll do everything I can
to address the concerns of the community and I'm happy to provide more
examples as needed.  I'm hoping the sgml docs I've provided with the
patch will cover any questions, but of course feedback is always
appreciated.

--
- David Steele
da...@pgmasters.net
diff --git a/contrib/Makefile b/contrib/Makefile
index 195d447..d8e75f4 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -29,6 +29,7 @@ SUBDIRS = \
pageinspect \
passwordcheck   \
pg_archivecleanup \
+   pg_audit\
pg_buffercache  \
pg_freespacemap \
pg_prewarm  \
diff --git a/contrib/pg_audit/Makefile b/contrib/pg_audit/Makefile
new file mode 100644
index 000..32bc6d9
--- /dev/null
+++ b/contrib/pg_audit/Makefile
@@ -0,0 +1,20 @@
+# pg_audit/Makefile
+
+MODULE = pg_audit
+MODULE_big = pg_audit
+OBJS = pg_audit.o
+
+EXTENSION = pg_audit
+
+DATA = pg_audit--1.0.0.sql
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/pg_audit
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/pg_audit/pg_audit--1.0.0.sql 
b/contrib/pg_audit/pg_audit--1.0.0.sql
new file mode 100644
index 000..2eee3b9
--- /dev/null
+++ b/contrib/pg_audit/pg_audit--1.0.0.sql
@@ -0,0 +1,4 @@
+/* pg_audit/pg_audit--1.0.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION pg_audit" to load this file. \quit
diff --git a/contrib/pg_audit/pg_audit.c b/contrib/pg_audit/pg_audit.c
new file mode 100644
index 000..b3914ac
--- /dev/null
+++ b/contrib/pg_audit/pg_audit.c
@@ -0,0 +1,1099 @@
+/*--
+ * pg_audit.c
+ *
+ * An auditing extension for PostgreSQL. Improves on standard statement logging
+ * by adding more logging classes, object level logging, and providing
+ * fully-qualified object names for all DML and many DDL statements.
+ *
+ * Copyright (c) 2014-2015, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *   contrib/pg_audit/pg_audit.c
+ 

[HACKERS] mogrify and indent features for jsonb

2015-02-14 Thread Andrew Dunstan


Attached is a patch to provide a number of very useful facilities to 
jsonb that people have asked for. These are based on work by Dmitry 
Dolgov in his jsonbx extension, but I take responsibility for any bugs.


The facilities are:

new operations:

concatenation: jsonb || jsonb -> jsonb
deletion: jsonb - text -> jsonb
deletion: jsonb - int -> jsonb

new functions:

produce indented text: jsonb_indent(jsonb) -> text
change an element at a path:  jsonb_replace(jsonb, text[], jsonb) -> jsonb.


It would be relatively trivial to add:

delete an element at a path: jsonb_delete(jsonb, text[]) -> jsonb

and I think we should do that for the sake of completeness.
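
For quick reference, a few illustrative calls (the expected results are
my reading of the descriptions above, so treat them as approximate
rather than authoritative):

SELECT '["a", "b"]'::jsonb || '["c", "d"]'::jsonb;      -- ["a", "b", "c", "d"]
SELECT '{"a": "b"}'::jsonb - 'a';                       -- {}
SELECT '["a", "b", "c"]'::jsonb - 1;                    -- ["a", "c"]
SELECT jsonb_replace('{"a": {"b": 1}}', '{a,b}', '2');  -- {"a": {"b": 2}}
SELECT jsonb_indent('{"a": 1, "b": null}');             -- indented, multi-line text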

The docs might need a little extra work, and the indent code definitely 
needs work, which I hope to complete in the next day or two, but I 
wanted to put a stake in the ground.



cheers

andrew
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index d57243a..9fdaee7 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10256,6 +10256,24 @@ table2-mapping
 entryDo all of these key/element emphasisstrings/emphasis exist?/entry
 entryliteral'[a, b]'::jsonb ?amp; array['a', 'b']/literal/entry
/row
+   row
+entryliteral||/literal/entry
+entrytypejsonb/type/entry
+entryConcatenate these two values to make a new value/entry
+entryliteral'[a, b]'::jsonb || '[c, d]'::jsonb/literal/entry
+   /row
+   row
+entryliteral-/literal/entry
+entrytypetext/type/entry
+entryDelete the object field with this key/entry
+entryliteral'{a: b}'::jsonb - 'a' /literal/entry
+   /row
+   row
+entryliteral-/literal/entry
+entrytypeinteger/type/entry
+entryDelete the element with this index (Negative integers count from the end of the array.)/entry
+entryliteral'[a, b]'::jsonb - 1 /literal/entry
+   /row
   /tbody
  /tgroup
/table
@@ -10760,6 +10778,42 @@ table2-mapping
entryliteraljson_strip_nulls('[{f1:1,f2:null},2,null,3]')/literal/entry
entryliteral[{f1:1},2,null,3]/literal/entry
/row
+  row
+   entryparaliteraljsonb_replace(target jsonb, path text[], replacement jsonb)/literal
+ /para/entry
+   entryparatypejsonb/type/para/entry
+   entry
+ Returns replaceabletarget/replaceable
+ with the section designated by  replaceablepath/replaceable
+ replaced by replaceablereplacement/replaceable.
+   /entry
+   entryliteraljsonb_replace('[{f1:1,f2:null},2,null,3]', '{0,f1},'[2,3,4]')/literal/entry
+   entryliteral[{f1:[2,3,4],f2:null},2,null,3]/literal
+/entry
+   /row
+  row
+   entryparaliteraljsonb_indent(from_json jsonb)/literal
+ /para/entry
+   entryparatypetext/type/para/entry
+   entry
+ Returns replaceablefrom_json/replaceable
+ as indented json text.
+   /entry
+   entryliteraljsonb_indent('[{f1:1,f2:null},2,null,3]')/literal/entry
+   entry
+programlisting
+ [  
+ {  
+ f1: 1,   
+ f2: null 
+ }, 
+ 2, 
+ null,  
+ 3  
+ ]
+/programlisting
+/entry
+   /row
  /tbody
 /tgroup
/table
diff --git a/src/backend/utils/adt/jsonb.c b/src/backend/utils/adt/jsonb.c
index 644ea6d..4ba4951 100644
--- a/src/backend/utils/adt/jsonb.c
+++ b/src/backend/utils/adt/jsonb.c
@@ -77,6 +77,8 @@ static void datum_to_jsonb(Datum val, bool is_null, JsonbInState *result,
 static void add_jsonb(Datum val, bool is_null, JsonbInState *result,
 		  Oid val_type, bool key_scalar);
 static JsonbParseState * clone_parse_state(JsonbParseState * state);
+static char *JsonbToCStringWorker(StringInfo out, JsonbContainer *in, int estimated_len, bool indent);
+static void add_indent(StringInfo out, bool indent, int level);
 
 /*
  * jsonb type input function
@@ -414,12 +416,25 @@ jsonb_in_scalar(void *pstate, char *token, JsonTokenType tokentype)
 char *
 JsonbToCString(StringInfo out, JsonbContainer *in, int estimated_len)
 {
+	return JsonbToCStringWorker(out, in, estimated_len, false);
+}
+
+char *
+JsonbToCStringIndent(StringInfo out, JsonbContainer *in, int estimated_len)
+{
+	return JsonbToCStringWorker(out, in, estimated_len, true);
+}
+
+static char *
+JsonbToCStringWorker(StringInfo out, JsonbContainer *in, int estimated_len, bool indent)
+{
 	bool		first = true;
 	JsonbIterator *it;
 	int			type = 0;
 	JsonbValue	v;
 	int			level = 0;
 	bool		redo_switch = false;
+	int			ispaces = indent ? 1 : 2;
 
 	if (out == NULL)
 		out = makeStringInfo();
@@ -436,26 +451,34 @@ JsonbToCString(StringInfo out, JsonbContainer *in, int estimated_len)
 		{
 			case WJB_BEGIN_ARRAY:
 if (!first)
-	appendBinaryStringInfo(out, ", ", 2);
+	appendBinaryStringInfo(out, ", ", ispaces);
 first = true;
 
 if 

[HACKERS] Add pg_settings.pending_restart column

2015-02-14 Thread Peter Eisentraut
When managing configuration changes through automatic systems like Chef
or Puppet, there is a problem: How do you manage changes requiring a
restart?

Generally, you'd set it up so that when a configuration file is changed,
the server is reloaded.  But for settings that require a restart, well,
I don't know.  From discussions with others, it emerged that a way to
ask the server whether a restart is necessary would be useful.  Then you
can either automate the restart, or have a monitoring system warn you
about it, and possibly schedule a restart separately or undo the
configuration file change.

So here is a patch for that.  It adds a column pending_restart to
pg_settings that is true when the configuration file contains a changed
setting that requires a restart.  We already had the logic to detect
such changes, for producing the log entry.  I have also set it up so
that if you change your mind and undo the setting and reload the server,
the pending_restart flag is reset to false.
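
A monitoring check or configuration management agent can then simply
poll for pending restarts, for example (the column name is the one added
by this patch; the rest is ordinary SQL):

SELECT name, setting
  FROM pg_settings
 WHERE pending_restart;
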
From dd477dba05c5fc3a6e51535bb8280a67d22271f5 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut pete...@gmx.net
Date: Sat, 14 Feb 2015 21:58:59 -0500
Subject: [PATCH] Add pg_settings.pending_restart column

---
 doc/src/sgml/catalogs.sgml  |  6 ++
 src/backend/utils/misc/guc.c| 21 -
 src/include/catalog/catversion.h|  2 +-
 src/include/catalog/pg_proc.h   |  2 +-
 src/include/utils/guc_tables.h  |  1 +
 src/test/regress/expected/rules.out |  5 +++--
 6 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 515a40e..4d1a401 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -8841,6 +8841,12 @@ titlestructnamepg_settings/ Columns/title
   or when examined by a non-superuser)
   /entry
  /row
+ row
+  entrystructfieldpending_restart/structfield/entry
+  entrytypeboolean/type/entry
+  entrySet to true if the value has been changed in the configuration
+   file but needs a restart./entry
+ /row
 /tbody
/tgroup
   /table
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 9572777..44a2fc2 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -5834,12 +5834,15 @@ set_config_option(const char *name, const char *value,
 {
 	if (*conf->variable != newval)
 	{
+		record->status |= GUC_PENDING_RESTART;
 		ereport(elevel,
 (errcode(ERRCODE_CANT_CHANGE_RUNTIME_PARAM),
  errmsg("parameter \"%s\" cannot be changed without restarting the server",
 		name)));
 		return 0;
 	}
+	else
+		record->status &= ~GUC_PENDING_RESTART;
 	return -1;
 }
 
@@ -5922,12 +5925,15 @@ set_config_option(const char *name, const char *value,
 {
 	if (*conf->variable != newval)
 	{
+		record->status |= GUC_PENDING_RESTART;
 		ereport(elevel,
 (errcode(ERRCODE_CANT_CHANGE_RUNTIME_PARAM),
  errmsg("parameter \"%s\" cannot be changed without restarting the server",
 		name)));
 		return 0;
 	}
+	else
+		record->status &= ~GUC_PENDING_RESTART;
 	return -1;
 }
 
@@ -6010,12 +6016,15 @@ set_config_option(const char *name, const char *value,
 {
 	if (*conf->variable != newval)
 	{
+		record->status |= GUC_PENDING_RESTART;
 		ereport(elevel,
 (errcode(ERRCODE_CANT_CHANGE_RUNTIME_PARAM),
  errmsg("parameter \"%s\" cannot be changed without restarting the server",
 		name)));
 		return 0;
 	}
+	else
+		record->status &= ~GUC_PENDING_RESTART;
 	return -1;
 }
 
@@ -6116,12 +6125,15 @@ set_config_option(const char *name, const char *value,
 	if (*conf->variable == NULL || newval == NULL ||
 		strcmp(*conf->variable, newval) != 0)
 	{
+		record->status |= GUC_PENDING_RESTART;
 		ereport(elevel,
 (errcode(ERRCODE_CANT_CHANGE_RUNTIME_PARAM),
  errmsg("parameter \"%s\" cannot be changed without restarting the server",
 		name)));
 		return 0;
 	}
+	else
+		record->status &= ~GUC_PENDING_RESTART;
 	return -1;
 }
 
@@ -6209,12 +6221,15 @@ set_config_option(const char *name, const char *value,
 {
 	if (*conf->variable != newval)
 	{
+		record->status |= GUC_PENDING_RESTART;
 		ereport(elevel,
 (errcode(ERRCODE_CANT_CHANGE_RUNTIME_PARAM),
  errmsg("parameter \"%s\" cannot be changed without restarting the server",
 		name)));
 		return 0;
 	}
+	else
+		record->status &= ~GUC_PENDING_RESTART;
 	return -1;
 }
 
@@ -7907,6 +7922,8 @@ GetConfigOptionByNum(int varnum, const char **values, bool *noshow)
 		values[14] = NULL;
 		values[15] = NULL;
 	}
+
+	values[16] = conf->status & GUC_PENDING_RESTART ? "t" : "f";
 }
 
 /*
@@ -7942,7 +7959,7 @@ show_config_by_name(PG_FUNCTION_ARGS)
  * show_all_settings - equiv to SHOW ALL command but implemented as
  * a Table Function.
  

[HACKERS] forward vs backward slashes in msvc build code

2015-02-14 Thread Peter Eisentraut
I understand that on Windows, you can use forward slashes in path names
interchangeably with backward slashes (except possibly in the shell).  I
have developed the attached patch to change the msvc build code to use
forward slashes consistently.  Together with the other attached patch,
which is an unpolished hack, this allows me to run the build.pl script
on not-Windows.  It won't actually build, but it will create all the
project files.  That way, I can check whether some makefile hackery
would break the Windows build.  Could someone verify this please?
From 46c7e31ee49f32fb373a8bd10c3bd5e777359053 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut pete...@gmx.net
Date: Sat, 14 Feb 2015 22:42:41 -0500
Subject: [PATCH 1/2] Workarounds for lack of Win32 module

---
 src/tools/msvc/Mkvcbuild.pm | 2 +-
 src/tools/msvc/Project.pm   | 2 +-
 src/tools/msvc/Solution.pm  | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index dba9b63..879589d 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -6,7 +6,7 @@ package Mkvcbuild;
 # src/tools/msvc/Mkvcbuild.pm
 #
 use Carp;
-use Win32;
+#use Win32;
 use strict;
 use warnings;
 use Project;
diff --git a/src/tools/msvc/Project.pm b/src/tools/msvc/Project.pm
index b9b5a23..6d84d89 100644
--- a/src/tools/msvc/Project.pm
+++ b/src/tools/msvc/Project.pm
@@ -21,7 +21,7 @@ sub _new
 	my $self = {
 		name  => $name,
 		type  => $type,
-		guid  => Win32::GuidGen(),
+		guid  => $^O eq 'MSWin32' ? Win32::GuidGen() : '{FIXME}',
 		files => {},
 		references => [],
 		libraries => [],
diff --git a/src/tools/msvc/Solution.pm b/src/tools/msvc/Solution.pm
index 39e41f6..9bd864c 100644
--- a/src/tools/msvc/Solution.pm
+++ b/src/tools/msvc/Solution.pm
@@ -562,7 +562,7 @@ EOF
 		}
 		if ($fld ne "")
 		{
-			$flduid{$fld} = Win32::GuidGen();
+			$flduid{$fld} = $^O eq 'MSWin32' ? Win32::GuidGen() : '{FIXME}';
 			print SLN <<EOF;
Project("{2150E333-8FDC-42A3-9474-1A3956D46DE8}") = "$fld", "$fld", "$flduid{$fld}"
 EndProject
-- 
2.3.0

From 1898bb7812920c64d19476a77db8adaaa54cba46 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut pete...@gmx.net
Date: Sat, 14 Feb 2015 22:43:03 -0500
Subject: [PATCH 2/2] Use forward slash instead of backward slash in msvc build
 code

---
 src/tools/msvc/MSBuildProject.pm |   2 +-
 src/tools/msvc/Mkvcbuild.pm  | 322 +++
 src/tools/msvc/Project.pm|  41 +++--
 src/tools/msvc/Solution.pm   | 138 -
 4 files changed, 250 insertions(+), 253 deletions(-)

diff --git a/src/tools/msvc/MSBuildProject.pm b/src/tools/msvc/MSBuildProject.pm
index 37958f9..a16f9ac 100644
--- a/src/tools/msvc/MSBuildProject.pm
+++ b/src/tools/msvc/MSBuildProject.pm
@@ -127,7 +127,7 @@ EOF
 	foreach my $fileNameWithPath (sort keys %{ $self->{files} })
 	{
 		confess "Bad format filename '$fileNameWithPath'\n"
-		  unless ($fileNameWithPath =~ /^(.*)\\([^\\]+)\.(c|cpp|y|l|rc)$/);
+		  unless ($fileNameWithPath =~ m!^(.*)/([^/]+)\.(c|cpp|y|l|rc)$!);
 		my $dir  = $1;
 		my $fileName = $2;
 		if ($fileNameWithPath =~ /\.y$/ or $fileNameWithPath =~ /\.l$/)
diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index 879589d..8820d3a 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -56,9 +56,9 @@ sub mkvcbuild
 {
 	our $config = shift;
 
-	chdir('..\..\..') if (-d '..\msvc' && -d '..\..\..\src');
+	chdir('../../..') if (-d '../msvc' && -d '../../../src');
 	die 'Must run from root or msvc directory'
-	  unless (-d 'src\tools\msvc' && -d 'src');
+	  unless (-d 'src/tools/msvc' && -d 'src');
 
 	my $vsVersion = DetermineVisualStudioVersion();
 
@@ -85,36 +85,36 @@ sub mkvcbuild
 
 	$libpgport = $solution->AddProject('libpgport', 'lib', 'misc');
 	$libpgport->AddDefine('FRONTEND');
-	$libpgport->AddFiles('src\port', @pgportfiles);
+	$libpgport->AddFiles('src/port', @pgportfiles);
 
 	$libpgcommon = $solution->AddProject('libpgcommon', 'lib', 'misc');
 	$libpgcommon->AddDefine('FRONTEND');
-	$libpgcommon->AddFiles('src\common', @pgcommonfrontendfiles);
+	$libpgcommon->AddFiles('src/common', @pgcommonfrontendfiles);
 
-	$postgres = $solution->AddProject('postgres', 'exe', '', 'src\backend');
-	$postgres->AddIncludeDir('src\backend');
-	$postgres->AddDir('src\backend\port\win32');
-	$postgres->AddFile('src\backend\utils\fmgrtab.c');
+	$postgres = $solution->AddProject('postgres', 'exe', '', 'src/backend');
+	$postgres->AddIncludeDir('src/backend');
+	$postgres->AddDir('src/backend/port/win32');
+	$postgres->AddFile('src/backend/utils/fmgrtab.c');
 	$postgres->ReplaceFile(
-		'src\backend\port\dynloader.c',
-		'src\backend\port\dynloader\win32.c');
-	$postgres->ReplaceFile('src\backend\port\pg_sema.c',
-		'src\backend\port\win32_sema.c');
-	$postgres->ReplaceFile('src\backend\port\pg_shmem.c',
-		'src\backend\port\win32_shmem.c');
-