Re: [HACKERS] smart shutdown at end of transaction (was: Default mode for shutdown)

2012-05-07 Thread Albe Laurenz
Fujii Masao wrote:
 I'm not necessarily opposed to commandeering the name smart for the
 new behavior, so that what we have to find a name for is the old smart
 behavior.  How about

        slow    - allow existing sessions to finish (old smart)
        smart   - allow existing transactions to finish (new)
        fast    - kill active queries
        immediate - unclean shutdown

 I could live with that.  Really, I'd like to have fast just be the
 default.  But the above compromise would still be a big improvement
 over what we have now, assuming the new smart becomes the default.
 
 Should this new shutdown mode wait for online backup like old smart does?

I think it shouldn't; I like to think of it as some kind of quite fast
shutdown (provided there are no long-running transactions).

And I still think that we should choose a name different from smart
to indicate that something has changed, even if it is the new default.

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unexpected EOF messages

2012-05-07 Thread Magnus Hagander
On Thu, May 3, 2012 at 9:26 PM, Magnus Hagander mag...@hagander.net wrote:
 On Thu, May 3, 2012 at 7:48 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 Heh - we already used ERRCODE_CONNECTION_FAILURE on the errors in
 copy.c. Since COPY can only happen when there is a transaction
 (right?), I just changed those error messages for consistency.

 Agreed on changing the message texts to match, but I wonder whether
 we ought not switch all those SQLSTATEs to something different.  Per my
 comment to Kevin, I think the whole 08 class is meant to be issued on
 the client side.  Maybe it's okay to conflate a server-detected
 connection loss with client-detected loss, but I'm not convinced.

 Sure,that's a simple search and replace of course... If we can come to
 a decision about what codes to actually use. I'm not sure I have much
 input other than that I agree they need to be different :-)

Any further suggestoins for which codes to use? If not, I think I'm
going to commit the patch as I had it, because it's not any worse than
what we had before (but fixes the annoying messages), and we can
always revisit the actual errorcodes later.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Temporary tables under hot standby

2012-05-07 Thread Merlin Moncure
On Tue, Apr 24, 2012 at 10:55 PM, Noah Misch n...@leadboat.com wrote:
 A key barrier to migrations from trigger-based replication to WAL-based
 replication is the lack of temporary tables under hot standby.  I'd like to
 close that gap; the changes needed will also reduce the master-side cost of
 temporary table usage.  Here is a high-level design for your advice and
 comments.  Much of this builds on ideas from past postings noted below.

On Mon, May 7, 2012 at 8:52 AM, Michael Nolan htf...@gmail.com wrote:
 To cross-pollinate with another thread, if temporary tables (and
 insert/delete/update transactions to them) are to be supported on a slave,
 will the applications using those temporary tables expect to be able to use
 'nextval' on inserts to temporary tables as well?

That's a very good question.  I'm moving it  -hackers...how do non
table temporary objects work?  Do you have CREATE GLOBAL TEMPORARY
SEQUENCE?  etc.

My understanding of the current proposal is that the sequence (along
with the rest of the table) has to be defined in the master as a
global temporary table. It seems that it wouldn't be possible or
desirable to serialize sequence fetches between the master and standby
servers, but I'm not sure about that.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] 9.2 Beta release notes

2012-05-07 Thread Bruce Momjian
FYI, I am planning to complete the 9.2 beta release notes by this
Wednesday night, America-time, so developers will have Thursday to make
adjustments before we ship the release notes as part of the beta.

I wanted to complete them sooner, but I also wanted to be current on
email before I started.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] more possible dead ports cleanup

2012-05-07 Thread Robert Haas
On Sun, May 6, 2012 at 9:04 AM, Peter Eisentraut pete...@gmx.net wrote:
 I think a few more things could removed/simplified after the recent
 round of port removal:

 - Remove definition of offsetof() in c.h

I see no particular virtue to getting rid of this.

 - (Side point, the definition of endof() in the same place isn't used
 anywhere, and doesn't look safe to me, because it can go one past the
 end of memory.)

That I think we could remove.

 - Remove strtol, strtoul, strdup from libpgport.  I seem to recall these
 were for nextstep at some point.  Before we applied libpgport
 everywhere, we were actually already going fine without these in some
 parts of the source tree.

+1 for removing these and seeing if anyone complains.  We can always
put 'em back if there's a problem.

 Some more risky things that can only be tested at run time;

 - We could use fflush(NULL) in src/backend/postmaster/fork_process.c.

 - There is some business about realloc(NULL) not working in
 src/interfaces/libpq/fe-exec.c.

I doubt that either of these things is worth changing.  It seems
unlikely it would save anything material, and it might break again in
some future port.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unexpected EOF messages

2012-05-07 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 Any further suggestoins for which codes to use? If not, I think I'm
 going to commit the patch as I had it, because it's not any worse than
 what we had before (but fixes the annoying messages), and we can
 always revisit the actual errorcodes later.

I'm still a bit uncomfortable about using the 08 codes on the backend
side; but on reflection it's hard to see how it could cause any real
confusion, so maybe we should just go with that.

Another point is that the patch would be shorter and more reliable
if you just forced whereToSendOutput = DestNone, without trying to save
and restore it.  Once the connection is known busted, there is no point
in sending any future I/O towards the client, either.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] smart shutdown at end of transaction (was: Default mode for shutdown)

2012-05-07 Thread Robert Haas
On Sat, May 5, 2012 at 12:41 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Sat, Apr 28, 2012 at 4:00 AM, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Apr 27, 2012 at 2:48 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm not necessarily opposed to commandeering the name smart for the
 new behavior, so that what we have to find a name for is the old smart
 behavior.  How about

        slow    - allow existing sessions to finish (old smart)
        smart   - allow existing transactions to finish (new)
        fast    - kill active queries
        immediate - unclean shutdown

 I could live with that.  Really, I'd like to have fast just be the
 default.  But the above compromise would still be a big improvement
 over what we have now, assuming the new smart becomes the default.

 Should this new shutdown mode wait for online backup like old smart does?

I think it had better not, because what happens when all the
connections are gone, no new ones can be made, and yet online backup
mode is still active?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unexpected EOF messages

2012-05-07 Thread Magnus Hagander
On Mon, May 7, 2012 at 5:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 Any further suggestoins for which codes to use? If not, I think I'm
 going to commit the patch as I had it, because it's not any worse than
 what we had before (but fixes the annoying messages), and we can
 always revisit the actual errorcodes later.

 I'm still a bit uncomfortable about using the 08 codes on the backend
 side; but on reflection it's hard to see how it could cause any real
 confusion, so maybe we should just go with that.

 Another point is that the patch would be shorter and more reliable
 if you just forced whereToSendOutput = DestNone, without trying to save
 and restore it.  Once the connection is known busted, there is no point
 in sending any future I/O towards the client, either.

Makes sense, will change and commit.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Latch for the WAL writer - further reducing idle wake-ups.

2012-05-07 Thread Tom Lane
Peter Geoghegan pe...@2ndquadrant.com writes:
 This latest revision also covers the checkpointer. The code for that
 is far simpler than that for the WAL Writer, so it doesn't
 particularly feel like I'm pushing my luck by slipping that into
 something to be slipped in.

Well ... maybe, or maybe not, or maybe you are just poking at a sore
spot that was already created by the patch to make a separate
checkpointer process.  What bothers me in looking at this is that the
main loop of the checkpointer includes an AbsorbFsyncRequests() call,
which is now the only wakeup condition that isn't covered by latch
logic or a predictable time delay.  A long sleep period could easily
result in overflow of the fsync request queue, which is not good for
performance.  I'm inclined to think that we'd better add logic to
ForwardFsyncRequest() to set the latch once the queue is, say, more
than half full.

I also notice that the separate-checkpointer patch failed to rename
assorted things like BgWriterCommLock, BgWriterRequest,
BgWriterShmemStruct, which are all 100% inappropriately named now.
And it still contains various obsolete comments referring to itself
as the background writer.  Will see about cleaning that up.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unexpected EOF messages

2012-05-07 Thread Robert Haas
On Mon, May 7, 2012 at 12:39 PM, Magnus Hagander mag...@hagander.net wrote:
 Makes sense, will change and commit.

Since the following hunk is repeated 3x, maybe it should be stuffed
into a function that is then called in three places:

+   if (IsTransactionState())
+   ereport(COMMERROR,
+   (errcode(ERRCODE_CONNECTION_FAILURE),
+errmsg(unexpected EOF on
client connection with an open transaction)));
+   else
+   {
+   /*
+* Can't send DEBUG log messages to client at
this point.
+* Since we're disconnecting right away, we
don't need to
+* restore whereToSendOutput.
+*/
+   whereToSendOutput = DestNone;
+   ereport(DEBUG1,
+
(errcode(ERRCODE_CONNECTION_DOES_NOT_EXIST),
+errmsg(unexpected EOF on
client connection)));
+   }

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unexpected EOF messages

2012-05-07 Thread Magnus Hagander
On Mon, May 7, 2012 at 7:18 PM, Robert Haas robertmh...@gmail.com wrote:
 On Mon, May 7, 2012 at 12:39 PM, Magnus Hagander mag...@hagander.net wrote:
 Makes sense, will change and commit.

 Since the following hunk is repeated 3x, maybe it should be stuffed
 into a function that is then called in three places:

I considered it trivial enough not to do that for it. I can perhaps be
convinced otherwise, but I doubt it's worth it..

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Uppercase tab completion keywords in psql?

2012-05-07 Thread Robert Haas
On Sat, May 5, 2012 at 9:03 AM, Bruce Momjian br...@momjian.us wrote:
 On Fri, May 04, 2012 at 08:46:28PM +0300, Peter Eisentraut wrote:
 On tor, 2012-05-03 at 15:47 -0400, Bruce Momjian wrote:
  Peter, where are we on this?

 I hadn't received any clear feedback, but if no one objects, I can
 commit it.

 I think there were enough people that wanted some kind of control in
 this area.  I did give you feedback on the patch.

Yes, there were significantly more votes for reverting this than
keeping it.  So we at least need to have a setting.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unexpected EOF messages

2012-05-07 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 On Mon, May 7, 2012 at 7:18 PM, Robert Haas robertmh...@gmail.com wrote:
 Since the following hunk is repeated 3x, maybe it should be stuffed
 into a function that is then called in three places:

 I considered it trivial enough not to do that for it. I can perhaps be
 convinced otherwise, but I doubt it's worth it..

I had considered suggesting the same, but decided not to on the grounds
that if we fold these into a subroutine, it will no longer be possible
to tell from the file-and-line-number info which call site reported the
error.  I'm not sure that there would be cases where we'd want to tell
that, but I'm not sure there wouldn't be, either.  So on the whole I
agree with the way Magnus coded it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Latch for the WAL writer - further reducing idle wake-ups.

2012-05-07 Thread Simon Riggs
On 7 May 2012 18:09, Tom Lane t...@sss.pgh.pa.us wrote:
 Peter Geoghegan pe...@2ndquadrant.com writes:
 This latest revision also covers the checkpointer. The code for that
 is far simpler than that for the WAL Writer, so it doesn't
 particularly feel like I'm pushing my luck by slipping that into
 something to be slipped in.

 Well ... maybe, or maybe not, or maybe you are just poking at a sore
 spot that was already created by the patch to make a separate
 checkpointer process.  What bothers me in looking at this is that the
 main loop of the checkpointer includes an AbsorbFsyncRequests() call,
 which is now the only wakeup condition that isn't covered by latch
 logic or a predictable time delay.  A long sleep period could easily
 result in overflow of the fsync request queue, which is not good for
 performance.  I'm inclined to think that we'd better add logic to
 ForwardFsyncRequest() to set the latch once the queue is, say, more
 than half full.

OK

 I also notice that the separate-checkpointer patch failed to rename
 assorted things like BgWriterCommLock, BgWriterRequest,
 BgWriterShmemStruct, which are all 100% inappropriately named now.
 And it still contains various obsolete comments referring to itself
 as the background writer.  Will see about cleaning that up.

For want of a better name, keeping them the same seemed best.

If you have a suggested name change, I'd be happy to oblige.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Latch for the WAL writer - further reducing idle wake-ups.

2012-05-07 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 On 7 May 2012 18:09, Tom Lane t...@sss.pgh.pa.us wrote:
 I also notice that the separate-checkpointer patch failed to rename
 assorted things like BgWriterCommLock, BgWriterRequest,
 BgWriterShmemStruct, which are all 100% inappropriately named now.
 And it still contains various obsolete comments referring to itself
 as the background writer.  Will see about cleaning that up.

 For want of a better name, keeping them the same seemed best.

I was just thinking s/BgWriter/Checkpointer/, do you think that's too
long?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Latch for the WAL writer - further reducing idle wake-ups.

2012-05-07 Thread Simon Riggs
On 7 May 2012 19:44, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 On 7 May 2012 18:09, Tom Lane t...@sss.pgh.pa.us wrote:
 I also notice that the separate-checkpointer patch failed to rename
 assorted things like BgWriterCommLock, BgWriterRequest,
 BgWriterShmemStruct, which are all 100% inappropriately named now.
 And it still contains various obsolete comments referring to itself
 as the background writer.  Will see about cleaning that up.

 For want of a better name, keeping them the same seemed best.

 I was just thinking s/BgWriter/Checkpointer/, do you think that's too
 long?

CheckpointerCommLock
CheckpointerShmemStruct
work OK

CheckpointerRequest
sounds a little vague, but can be tweaked

It also leaves the situation that we have a catalog view called
pg_stat_bgwriter that would be accessing checkpointer things. That's
really the thorny one that I wasn't sure how to handle. Good example
of why we shouldn't expose internals too much.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Latch for the WAL writer - further reducing idle wake-ups.

2012-05-07 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 It also leaves the situation that we have a catalog view called
 pg_stat_bgwriter that would be accessing checkpointer things. That's
 really the thorny one that I wasn't sure how to handle. Good example
 of why we shouldn't expose internals too much.

Yeah, that's a bit unfortunate but changing it doesn't seem like a good
idea.  The names I intended to change are all internal.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Latch for the WAL writer - further reducing idle wake-ups.

2012-05-07 Thread Simon Riggs
On 7 May 2012 20:06, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 It also leaves the situation that we have a catalog view called
 pg_stat_bgwriter that would be accessing checkpointer things. That's
 really the thorny one that I wasn't sure how to handle. Good example
 of why we shouldn't expose internals too much.

 Yeah, that's a bit unfortunate but changing it doesn't seem like a good
 idea.  The names I intended to change are all internal.

OK, I'll change just the internal names.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] c-function variants running time

2012-05-07 Thread Robert Haas
On Fri, May 4, 2012 at 12:41 PM, Armando
armando.mirag...@stud-inf.unibz.it wrote:
 Hi everybody.

 First of all I have to thank you for your wonderful job! PostgreSQL rocks!

 I am writing you because I am interested in understanding some specifics 
 related
 to PostgreSQL internals. More precisely, I am investigating the running time
 of the different function implementation approaches, which is part of my BSc
 thesis.

I would suggest that you run perf or oprofile to figure out where the
time is being spent.

It's a bit hard to tell what these functions are intended to do.  It's
not obvious that you're doing anything that couldn't be done using
straight SQL.

How fast do you need this to run?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] CLOG background writing

2012-05-07 Thread Robert Haas
I spent a significant chunk of my time last week, and also a whole lot
of machine time, trying to evaluate the effectiveness of flushing CLOG
pages to disk in the background.  Simon made the last effort in this
area:

http://archives.postgresql.org/pgsql-hackers/2012-01/msg00571.php

...but we weren't able to demonstrate that it improved performance.
However, commit 3ae5133b1cf478d51f2003bc68ba0edb84c7 improved the
SLRU eviction logic in a way that seems to give background writing a
better chance of actually helping, so I thought it worthwhile to
revisit the topic.  That commit conflicted heavily with Simon's patch,
so I drummed up a couple of patches of my own.  They are a bit
different in detail from what Simon did: his patch cleaned the next
buffer to be evicted, while my patch will clean any dirty buffer
provided that it's old enough.   In the attached
background-write-clog.patch, old enough means before RecentXmin -
that is, we clean pages as soon as we know they won't be written
again.  Unfortunately the approach taken there can't work during
recovery (I think), so I tried another approach in the attached
background-write-clog-2p.patch, which cleans pages are more than two
pages before the page where nextXid lives.  I then benchmarked these
using pgbench at scale factor 300.

I thought this approach would be better than cleaning only the oldest
buffer because it's fairly easy to thrash the cache, so even a
recently-used buffer may get evicted within a very short period of
time.  But it turns out that it's not: even if you aggressively clean
the CLOG pool on every background-writer tick, the backends still end
up doing all the dirty-page eviction.  There's just so much cache
pressure that things end up getting booted out of the cache more or
less randomly, and a background writing process that comes along every
200ms is far too slow.  I didn't try cranking down the bgwriter delay,
but I doubt it would help much.

Now, potentially, the fix here is to tweak the buffer replacement
algorithm so that it prefers to a newer clean buffer over an older
dirty buffer.  But there's danger lurking in the weeds there, because
now you really need background writing for *all* of the SLRUs, not
just CLOG.  Otherwise, you can get really pathological situations
where, say, all the pages but one are dirty, and you sit there and
replace the last remaining non-dirty buffer over and over again.  Or,
alternatively, all the buffers become dirty, and now suddenly every
backend in the system starts a buffer I/O and you get a system-wide
stall of exactly the type we're trying to avoid by doing background
writing in the first place.  And it's not enough to just have *some*
kind of background writing for every SLRU - it's actually got to be
aggressive enough to keep up, which is probably not too hard for CLOG
but may be trickier for some of the others: you only need to clean a
CLOG buffer every couple of seconds at current peak transaction rates,
but you need to clean pg_subtrans buffers 16 times as fast, which is
starting to push the limits of what we can expect the bgwriter to keep
up with as a side task.  Also, if you clean *too* aggressively, you'll
end up increasing the total write volume, which isn't good either.

We could add another background task just to do background cleaning of
SLRU buffers of all sorts, but I think it might be time to consider
whether there are other reasonable approaches to the problem.  I have
a couple of thoughts in mind.

1. Instrumentation reveals that ExtendCLOG() causes much longer stalls
than ExtendSUBTRANS(), and it appears that those stalls happen mostly
as a result of ExtendCLOG() needing to evict a dirty buffer.  But
ExtendSUBTRANS() also evicts dirty buffers, yet the stalls that it
causes are much less severe.  Of course, this is because clog is
fsync'd and subtrans is not.  I previously suggested the idea of
passing off fsync requests for SLRU buffers just as we do for
shared_buffers, and I think that's one angle that we should
investigate here.  Backends might still end up writing dirty pages,
but not having to fsync them would ease the pain quite a bit.  And
even if we figure out a way to make background writing safe and
useful, off-loading the fsyncs is still a good back-stop against the
possibility that a backend might somehow end up writing a dirty page
anyway.  So I'm going to see if I can work something up for this.

2. ExtendSUBTRANS() seems ripe for optimization.  Many pg_subtrans
pages will never contain anything but zeros.  So ExtendSUBTRANS() is
mostly guarding against XID wrapround: we need to make sure we clobber
any pg_subtrans data left over from previous use of the XID space -
but if we kept track of the age of the oldest pg_subtrans page, we
could know that there's no problem there, as will normally be the
case.  I think we could then arrange to create pg_subtrans pages
lazily, rather than repeatedly writing out a dirty page of all zeros
to make room for a new 

[HACKERS] problem/bug in drop tablespace?

2012-05-07 Thread Michael Nolan
While researching a problem reported on the -general list by a user who
lost a disk containing his index tablespace, I ran into something, but  I'm
not sure is a serious bug or just an inconsistency in how \d shows tables.

Here are the steps I took.

1.  Create a new database 'MYDB' and connect to it.
2.  Create a new tablespace 'MYTBLSP'
3.  Create a table 'MYTABLE' and populate it.
4.  Create an index 'MYIND' on that table, with the index in the new
tablespace, MYTBLSP.

Now, exit psql and delete the files in the tablespace directory created in
step 2, simulating the problem the user had.

Trying to execute an SQL command on the table MYTABLE will, as expected,
generate an error.

Now, drop tablespace MYTBLSP.  This will produce an error, but it will
delete the tablespace according to \db.

Recreate tablespace MYTBLSP.

Regenerate the index on MYTABLE.  Queries will work on this table again, as
expected.

Now, here's the problem I ran into:

The index will be rebuilt in tablespace MYTBLSP, but \d on table MYTABLE
will not show the index as being in that tablespace.
--
Mike Nolan