Re: [HACKERS] Fast promotion failure

2013-05-16 Thread Kyotaro HORIGUCHI
Hello,

   Is the point of this discussion that the patch may leave some glitch
   in the timing of timeline-related changes, and that Heikki saw a
   symptom of that?
  
   AFAIU, the committed patch has some gap in the overall scenario,
   which is the fast promotion issue.
  
  Right, the fast promotion issue is still there.
  
  Just to get us all on the same page again: Does anyone see a problem
  with a fresh git checkout, with the fast-promotion-quick-fix.patch
  applied?
  (http://www.postgresql.org/message-id/51894942.4080...@vmware.com). If
  you do, please speak up. As far as I know, the already-committed patch,
  together with fast-promotion-quick-fix.patch, should fix all known
  issues (*).

Shared XLogCtl->ThisTimeLineID is written and read without fencing by
a spinlock, unlike some other XLogCtl members. Can this break the
coherency of its memory between different processors?  If that is the
cause, it is quite understandable that I cannot find the trouble. I
didn't see the issue even without fast-promotion-quick-fix.patch.
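
(For illustration, a minimal sketch of the kind of spinlock-fenced read
being referred to, assuming the info_lck spinlock and field names used
for other XLogCtl members in xlog.c; this is not the committed code:)

    /* use volatile pointer to prevent code rearrangement */
    volatile XLogCtlData *xlogctl = XLogCtl;
    TimeLineID  tli;

    SpinLockAcquire(&xlogctl->info_lck);
    tli = xlogctl->ThisTimeLineID;      /* read the shared value under the lock */
    SpinLockRelease(&xlogctl->info_lck);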

 The patch provided will unnecessarily call InitXLOGAccess() two times
 for the end-of-recovery checkpoint. It doesn't matter w.r.t.
 performance, but the purpose of calling LocalSetXLogInsertAllowed()
 and InitXLOGAccess() will be almost the same, or am I missing
 something?
 
 One more thing: I think that after fast promotion, it should either
 set the timeline or raise an error in CreateCheckPoint() before it
 reaches the check mentioned by you in your initial mail:
 if (RecoveryInProgress() && (flags & CHECKPOINT_END_OF_RECOVERY) == 0)
 elog(ERROR, "can't create a checkpoint during recovery");
 Shouldn't it set the timeline in the above check (RecoveryInProgress())
 or when RecoveryInProgress() is called before CreateCheckPoint()?

Thinking of the checkpointer, it calls RecoveryInProgress() far
earlier than that, in the wait loop in CheckpointerMain() where it
decides whether to do a checkpoint or a restartpoint. So
InitXLOGAccess() has already been done by the time a checkpoint is
chosen there for the first time. And before that, ThisTimeLineID in
the startup process gets incremented and is copied onto
XLogCtl->ThisTimeLineID before xlogctl->SharedRecoveryInProgress
becomes false in StartupXLOG().  I don't think it is possible for the
checkpointer to run on an older timeline ID, on the condition that
all processes see exactly the same memory image.
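
(For reference, a simplified sketch of the hand-off being described,
in the shape RecoveryInProgress() had in xlog.c around that time;
details abbreviated:)

    if (!LocalRecoveryInProgress)
        return false;
    else
    {
        /* use volatile pointer to prevent code rearrangement */
        volatile XLogCtlData *xlogctl = XLogCtl;

        LocalRecoveryInProgress = xlogctl->SharedRecoveryInProgress;

        /* initialize timeline access once recovery is seen as finished */
        if (!LocalRecoveryInProgress)
            InitXLOGAccess();

        return LocalRecoveryInProgress;
    }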

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [HACKERS] Logging of PAM Authentication Failure

2013-05-16 Thread Amit Langote
Attached herewith is a patch based on description in my previous mail.
This patch would need revision since the error situation in case of
authentication timeout on the server needs to be handled; probably in
simple_prompt()?


--
Amit Langote




Re: [HACKERS] Fast promotion failure

2013-05-16 Thread Simon Riggs
On 16 May 2013 07:02, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote:

   fast promotion issue.

Excuse me for not joining the thread earlier. I'm not available today,
but will join in later in my evening.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Logging of PAM Authentication Failure

2013-05-16 Thread Amit Langote
On Thu, May 16, 2013 at 3:53 PM, Amit Langote amitlangot...@gmail.com wrote:
 Attached herewith is a patch based on description in my previous mail.
 This patch would need revision since the error situation in case of
 authentication timeout on the server needs to be handled; probably in
 simple_prompt()?

Forgot attaching the patch in the last mail; find it with this one.


--
Amit Langote


psql-password-over-existing-conn.patch
Description: Binary data



Re: [HACKERS] Fast promotion failure

2013-05-16 Thread Amit Kapila
On Thursday, May 16, 2013 11:33 AM Kyotaro HORIGUCHI wrote:
 Hello,
 
    Is the point of this discussion that the patch may leave some glitch
    in the timing of timeline-related changes, and that Heikki saw a
    symptom of that?
   
    AFAIU, the committed patch has some gap in the overall scenario,
    which is the fast promotion issue.
  
   Right, the fast promotion issue is still there.
  
   Just to get us all on the same page again: Does anyone see a problem
   with a fresh git checkout, with the fast-promotion-quick-fix.patch
   applied?
   (http://www.postgresql.org/message-id/51894942.4080...@vmware.com). If
   you do, please speak up. As far as I know, the already-committed patch,
   together with fast-promotion-quick-fix.patch, should fix all known
   issues (*).
 
 Shared XLogCtl->ThisTimeLineID is written and read without fencing by
 a spinlock, unlike some other XLogCtl members. Can this break the
 coherency of its memory between different processors?  If that is the
 cause, it is quite understandable that I cannot find the trouble. I
 didn't see the issue even without fast-promotion-quick-fix.patch.
 
  The patch provided will unnecessarily call InitXLOGAccess() two times
  for the end-of-recovery checkpoint. It doesn't matter w.r.t.
  performance, but the purpose of calling LocalSetXLogInsertAllowed()
  and InitXLOGAccess() will be almost the same, or am I missing
  something?
 
  One more thing: I think that after fast promotion, it should either
  set the timeline or raise an error in CreateCheckPoint() before it
  reaches the check mentioned by you in your initial mail:
  if (RecoveryInProgress() && (flags & CHECKPOINT_END_OF_RECOVERY) == 0)
  elog(ERROR, "can't create a checkpoint during recovery");
  Shouldn't it set the timeline in the above check (RecoveryInProgress())
  or when RecoveryInProgress() is called before CreateCheckPoint()?
 
 Thinking of the checkpointer, it calls RecoveryInProgress() far
 earlier than that, in the wait loop in CheckpointerMain() where it
 decides whether to do a checkpoint or a restartpoint. So
 InitXLOGAccess() has already been done by the time a checkpoint is
 chosen there for the first time. And before that, ThisTimeLineID in
 the startup process gets incremented and is copied onto
 XLogCtl->ThisTimeLineID before xlogctl->SharedRecoveryInProgress
 becomes false in StartupXLOG().  I don't think it is possible for the
 checkpointer to run on an older timeline ID, on the condition that
 all processes see exactly the same memory image.

I also think the same, but one difference with fast promotion is that the
request for a checkpoint is made after recovery; so some operations can
happen before the checkpoint on the new timeline.

With Regards,
Amit Kapila.





Re: [HACKERS] PostgreSQL 9.3 beta breaks some extensions make install

2013-05-16 Thread Dimitri Fontaine
Andrew Dunstan and...@dunslane.net writes:
 DATA_built = sql/$(EXTENSION)--$(EXTVERSION).sql
 DATA = $(filter-out sql/$(EXTENSION)--$(EXTVERSION).sql, $(wildcard sql/*--*.sql))

 Is that right?

I think that's still breaking VPATH builds because the wildcard call
happens in the current tree, not in the VPATH (source) tree.

And VPATH building of extension is crucially important for me, as the
easiest way I've found to build and package a given extension against
all currently supported version of PostgreSQL.

I couldn't figure out how to allow for wildcard and whatnot in the
extension's Makefile and have it work in all cases, including VPATH, so
my current advice is to get rid of any dynamic code here and just
plainly copy and paste the file names you're interested in.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




[HACKERS] proposal 9.4. Explain on signal

2013-05-16 Thread Pavel Stehule
Hello

I proposed some months ago to log plans of cancelled queries
http://www.postgresql.org/message-id/cafj8pra-duzkmdtu52ciugb0p7tvri_b8ltjmjfwcnr1lpt...@mail.gmail.com
. After discussion the proposal was changed to getting the plan of any
running query.

I have a proof-of-concept patch now and I think it can work well.

So I propose the following concept:

1. function pg_explain_backend(PID int, loglevel int default 'log',
explain_top_level boolean default true);

Execution of this function ensures sending a SIGUSR1 signal to the PID
process.

2. The SIGUSR1 handler will be enhanced for a PROCSIG_EXPLAIN_MESSAGES
message and it will write the EXPLAIN result to the log.


It shares a lot of code with the auto_explain module, so I am thinking
we should move the auto_explain functionality to core. Then EXPLAIN ON
SIGNAL can be used for monitoring query evaluation.
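
(A rough sketch, in the style of procsignal_sigusr1_handler(), of where
such a PROCSIG_EXPLAIN_MESSAGES reason could be handled; the handler
function name is hypothetical and this is not the actual patch:)

    /* in procsignal_sigusr1_handler(), alongside the existing checks */
    if (CheckProcSignal(PROCSIG_EXPLAIN_MESSAGES))
        HandleExplainInterrupt();   /* hypothetical: set a flag; the plan
                                     * would be written out later at a safe
                                     * point in the executor, not inside
                                     * the signal handler itself */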

Regards

Pavel

comments?




Re: [HACKERS] proposal 9.4. Explain on signal

2013-05-16 Thread Thom Brown
On 16 May 2013 11:09, Pavel Stehule pavel.steh...@gmail.com wrote:
 Hello

 I proposed some months ago to log plans of cancelled queries
 http://www.postgresql.org/message-id/cafj8pra-duzkmdtu52ciugb0p7tvri_b8ltjmjfwcnr1lpt...@mail.gmail.com
 . After discussion the proposal was changed to getting the plan of any
 running query.

 I have a proof-of-concept patch now and I think it can work well.

 So I propose the following concept:

 1. function pg_explain_backend(PID int, loglevel int default 'log',
 explain_top_level boolean default true);

 Execution of this function ensures sending a SIGUSR1 signal to the PID
 process.

 2. The SIGUSR1 handler will be enhanced for a PROCSIG_EXPLAIN_MESSAGES
 message and it will write the EXPLAIN result to the log.


 It shares a lot of code with the auto_explain module, so I am thinking
 we should move the auto_explain functionality to core. Then EXPLAIN ON
 SIGNAL can be used for monitoring query evaluation.

What a neat idea.  So the original plan of EXPLAINing cancelled
queries... does this cater for that?  Can cancelled queries
automatically invoke the EXPLAIN functionality as part of this
feature?

--
Thom




Re: [HACKERS] proposal 9.4. Explain on signal

2013-05-16 Thread Pavel Stehule
2013/5/16 Thom Brown t...@linux.com:
 On 16 May 2013 11:09, Pavel Stehule pavel.steh...@gmail.com wrote:
 Hello

 I proposed some months ago to log plans of cancelled queries
 http://www.postgresql.org/message-id/cafj8pra-duzkmdtu52ciugb0p7tvri_b8ltjmjfwcnr1lpt...@mail.gmail.com
 . After discussion the proposal was changed to getting the plan of any
 running query.

 I have a proof-of-concept patch now and I think it can work well.

 So I propose the following concept:

 1. function pg_explain_backend(PID int, loglevel int default 'log',
 explain_top_level boolean default true);

 Execution of this function ensures sending a SIGUSR1 signal to the PID
 process.

 2. The SIGUSR1 handler will be enhanced for a PROCSIG_EXPLAIN_MESSAGES
 message and it will write the EXPLAIN result to the log.


 It shares a lot of code with the auto_explain module, so I am thinking
 we should move the auto_explain functionality to core. Then EXPLAIN ON
 SIGNAL can be used for monitoring query evaluation.

 What a neat idea.  So the original plan of EXPLAINing cancelled
 queries... does this cater for that?  Can cancelled queries
 automatically invoke the EXPLAIN functionality as part of this
 feature?


I would like to get the EXPLAIN of long queries without waiting for
them to end.

So it is possible for manual cancellation (not for timeouts):

SELECT pg_explain_backend(xx);
SELECT pg_cancel_backend(xx);

Regards

Pavel

 --
 Thom




Re: [HACKERS] Logging of PAM Authentication Failure

2013-05-16 Thread Andres Freund
On 2013-05-16 17:35:10 +0900, Amit Langote wrote:
 On Thu, May 16, 2013 at 3:53 PM, Amit Langote amitlangot...@gmail.com wrote:
  Attached herewith is a patch based on description in my previous mail.
  This patch would need revision since the error situation in case of
  authentication timeout on the server needs to be handled; probably in
  simple_prompt()?
 
 Forgot attaching the patch in the last mail; find it with this one.

The patch seems to have Windows line endings...

 --- a/src/interfaces/libpq/libpq-fe.h
 +++ b/src/interfaces/libpq/libpq-fe.h
 @@ -62,7 +62,11 @@ typedef enum
  					* backend startup. */
  	CONNECTION_SETENV,	/* Negotiating environment. */
  	CONNECTION_SSL_STARTUP,	/* Negotiating SSL. */
 -	CONNECTION_NEEDED	/* Internal state: connect() needed */
 +	CONNECTION_NEEDED,	/* Internal state: connect() needed */
 +	CONNECTION_SENDING_PASSWORD	/* An intermediate state to help client send a password
 +					 * over an existing connection
 +					 */
 +
  } ConnStatusType;
 
  typedef enum
 @@ -258,6 +262,9 @@ extern PGconn *PQsetdbLogin(const char *pghost, const char *pgport,
  #define PQsetdb(M_PGHOST,M_PGPORT,M_PGOPT,M_PGTTY,M_DBNAME)  \
  	PQsetdbLogin(M_PGHOST, M_PGPORT, M_PGOPT, M_PGTTY, M_DBNAME, NULL, NULL)
 
 +/* send a password that the server asked for halfway between a connection sequence */
 +extern void PQsendPassword(PGconn *conn, char *password);
 +

I unfortunately have to say I don't really see the point of this. The
cost of the additional connection attempt is rather low and we have to
deal with the superfluous attempts anyway since there will be old libpqs
around for years. Why is this worth the effort?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] PostgreSQL 9.3 beta breaks some extensions make install

2013-05-16 Thread Andrew Dunstan


On 05/16/2013 05:41 AM, Dimitri Fontaine wrote:

Andrew Dunstan and...@dunslane.net writes:

DATA_built = sql/$(EXTENSION)--$(EXTVERSION).sql
DATA = $(filter-out sql/$(EXTENSION)--$(EXTVERSION).sql, $(wildcard sql/*--*.sql))

Is that right?

I think that's still breaking VPATH builds because the wildcard call
happens in the current tree, not in the VPATH (source) tree.

And VPATH building of extension is crucially important for me, as the
easiest way I've found to build and package a given extension against
all currently supported version of PostgreSQL.

I couldn't figure out how to allow for wildcard and whatnot in the
extension's Makefile and have it work in all cases, including VPATH, so
my current advice is to get rid of any dynamic code here and just
plainly copy and paste the file names you're interested in.





Is there documented support for VPATH builds? I know of an external 
script that promises such support, but AFAIK there is nothing in our 
code that supports it. If we're going to support this I suggest we do it 
properly. I don't consider myself on the hook to support some external 
build tool.


As for supporting multiple versions, I have given up supporting multiple 
Postgres versions from a single set of sources, for FDWs at least. For 
those I use a git branch that mirrors the corresponding Postgres branch.


cheers

andrew




Re: [HACKERS] Logging of PAM Authentication Failure

2013-05-16 Thread Amit Langote
On Thu, May 16, 2013 at 8:01 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-05-16 17:35:10 +0900, Amit Langote wrote:
 On Thu, May 16, 2013 at 3:53 PM, Amit Langote amitlangot...@gmail.com 
 wrote:
  Attached herewith is a patch based on description in my previous mail.
  This patch would need revision since the error situation in case of
  authentication timeout on the server needs to be handled; probably in
  simple_prompt()?

 Forgot attaching the patch in the last mail; find it with this one.

 The patch seems to have windows line endings...

My bad. I will reupload the proper patch later.

  --- a/src/interfaces/libpq/libpq-fe.h
  +++ b/src/interfaces/libpq/libpq-fe.h
  @@ -62,7 +62,11 @@ typedef enum
   					* backend startup. */
   	CONNECTION_SETENV,	/* Negotiating environment. */
   	CONNECTION_SSL_STARTUP,	/* Negotiating SSL. */
  -	CONNECTION_NEEDED	/* Internal state: connect() needed */
  +	CONNECTION_NEEDED,	/* Internal state: connect() needed */
  +	CONNECTION_SENDING_PASSWORD	/* An intermediate state to help client send a password
  +					 * over an existing connection
  +					 */
  +
   } ConnStatusType;
 
   typedef enum
  @@ -258,6 +262,9 @@ extern PGconn *PQsetdbLogin(const char *pghost, const char *pgport,
   #define PQsetdb(M_PGHOST,M_PGPORT,M_PGOPT,M_PGTTY,M_DBNAME)  \
   	PQsetdbLogin(M_PGHOST, M_PGPORT, M_PGOPT, M_PGTTY, M_DBNAME, NULL, NULL)
 
  +/* send a password that the server asked for halfway between a connection sequence */
  +extern void PQsendPassword(PGconn *conn, char *password);
  +

 I unfortunately have to say I don't really see the point of this. The
 cost of the additional connection attempt is rather low and we have to
 deal with the superflous attempts anyway since there will be old libpqs
 around for years. Why is this worth the effort?

While the full connection sequence (with proper authentication
exchanges) appears to go smoothly for other authentication methods, it
doesn't quite do so in this case, probably because accounting for such
a case was not considered as important. But while investigating the
PAM issue (the original subject of this thread), it turned out that
the occurrence of that minor issue was due to this behavior in libpq.
Adding this one more state (viz. inputting a password in the middle of
an ongoing connect sequence) to the possible connection states helps
account for instances where this kind of password exchange has to
happen (as in psql for md5 and password). Also, others using libpq can
either use it if they wish to, or just do without it and not worry
about this state. This patch does not change which connection state
applications can expect to be in after they return from
connectDBComplete() or PQconnectPoll(). On the other hand, we can now
enter these functions with one more possible connection state, which
PQconnectPoll() is now able to handle. As a side effect, it also helps
avoid drop-and-reconnect occurrences at times.

That said, it is up to the application (using libpq) whether to go via
this new alternate path or stick to drop-and-reconnect, should the need
to input a password in the middle of a connect sequence arise. We could
consider having such an option, probably just for the sake of
completeness (even if only to account for a possibly rare method of
authentication exchange).
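
(For illustration, a minimal sketch of how a client might use the
proposed state; CONNECTION_SENDING_PASSWORD and PQsendPassword() come
from the patch under discussion, not from released libpq, and the
surrounding logic is only an assumption about its intended use:)

    /* after PQconnectPoll() reports the new intermediate state */
    if (PQstatus(conn) == CONNECTION_SENDING_PASSWORD)
    {
        char   *password = simple_prompt("Password: ", 100, false);

        PQsendPassword(conn, password);     /* reuse the existing connection */
        /* ... then keep driving PQconnectPoll() as usual ... */
    }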


--
Amit Langote




Re: [HACKERS] Better LWLocks with compare-and-swap (9.4)

2013-05-16 Thread Stephen Frost
* Heikki Linnakangas (hlinnakan...@vmware.com) wrote:
 My theory is that after that point all the cores are busy,
 and processes start to be sometimes context switched while holding
 the spinlock, which kills performance. Has anyone else seen that
 pattern? 

Isn't this the same issue which has prompted multiple people to propose
(sometimes with code, as I recall) to rip out our internal spinlock
system and replace it with kernel-backed calls which do it better,
specifically by dealing with issues like the above?  Have you seen those
threads in the past?  Any thoughts about moving in that direction?

 Curiously, I don't see that when connecting pgbench via TCP
 over localhost, only when connecting via unix domain sockets.
 Overall performance is higher over unix domain sockets, so I guess
 the TCP layer adds some overhead, hurting performance, and also
 affects scheduling somehow, making the steep drop go away.

I wonder if the kernel locks around unix domain sockets are helping us
out here, while it's not able to take advantage of such knowledge about
the process that's waiting when it's a TCP connection?  Just a hunch.

Thanks,

Stephen




Re: [HACKERS] plperl segfault in plperl_trusted_init() on kfreebsd

2013-05-16 Thread Stephen Frost
* Christoph Berg (c...@df7cb.de) wrote:
 That was because the plain ./configure version (for a minimal way to
 reproduce) didn't build with debug symbols. The original gcc line from the
 Debian build log is:

It did, but Debian (and Ubuntu and friends) pull the debugging symbols
out of the binaries and stick them into independent packages, allowing
you to only install them if you need/want to.

Try installing postgresql-9.3-dbg.  At least the postgresql-9.2-dbg
package on my system also includes the plperl.so debugging symbols.

Thanks,

Stephen




Re: [HACKERS] plperl segfault in plperl_trusted_init() on kfreebsd

2013-05-16 Thread Christoph Berg
Re: Stephen Frost 2013-05-16 20130516123344.gt4...@tamriel.snowman.net
 * Christoph Berg (c...@df7cb.de) wrote:
  That was because the plain ./configure version (for a minimal way to
  reproduce) didn't build with debug symbols. The original gcc line from the
  Debian build log is:
 
 It did, but Debian (and Ubuntu and friends) pull the debugging symbols
 out of the binaries and stick them into independent packages, allowing
 you to only install them if you need/want to.
 
 Try installing postgresql-9.3-dbg.  At least the postgresql-9.2-dbg
 package on my system also includes the plperl.so debugging symbols.

This wasn't the Debian build, but just plain ./configure --with-perl
without any other arguments. As said in the previous mail, both this
and the Debian build segfault here. (Which means there is no -dbg
package because the build fails.)

I don't think this makes a difference anyway - I can pull more
information out from that core, but that needs someone to say which
info, because that's where I'm lost.

(I'll be away until monday.)

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/




Re: [HACKERS] fallocate / posix_fallocate for new WAL file creation (etc...)

2013-05-16 Thread Jon Nelson
On Wed, May 15, 2013 at 10:36 PM, Jon Nelson jnelson+pg...@jamponi.net wrote:
 On Wed, May 15, 2013 at 10:17 PM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
 Jon Nelson escribió:
 On Wed, May 15, 2013 at 4:46 PM, Jon Nelson jnelson+pg...@jamponi.net 
 wrote:

  That's true. I originally wrote the patch using fallocate(2). What
  would be appropriate here? Should I switch on the return value and the
  six (6) or so relevant error codes?

 I addressed this, hopefully in a reasonable way.

 Would it work to just assign the value you got from posix_fallocate (if
 nonzero) to errno and then use %m in the errmsg() call in ereport()?

 That strikes me as a better way. I'll work something up soon.
 Thanks!
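
(A minimal sketch of the errno-plus-%m approach being discussed; the
file descriptor, path variable, and segment size here are assumed names
for illustration, not the actual patch:)

    int     rc = posix_fallocate(fd, 0, XLogSegSize);

    if (rc != 0)
    {
        /* posix_fallocate() returns the error code instead of setting errno */
        errno = rc;
        ereport(ERROR,
                (errcode_for_file_access(),
                 errmsg("could not allocate file \"%s\": %m", path)));
    }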

Please find attached version 3.
Am I doing this the right way? Should I be posting the full patch each
time, or incremental patches?


--
Jon


fallocate-v3.patch
Description: Binary data



Re: [HACKERS] counting algorithm for incremental matview maintenance

2013-05-16 Thread Kevin Grittner
Josh Berkus j...@agliodbs.com wrote:

 It's fairly common for matviews to be constructed such that
 updates to them are strictly appends.  For example, a matview
 which has a daily summary would just get appended to each day,
 and existing rows would not change barring a major historical
 database cleanup.

 It seems like we could ... and ought to ... optimize for this
 pattern somehow for incremental updates.  That is, if the user
 knows that we're going to be only appending new rows and not
 modifying any old ones, s/he ought to be able to tell the
 database that somehow and avoid the overhead of checking.  While
 the overhead of checking a count wouldn't be that high for a few
 hundred rows, I've dealt with matviews which were thousands to
 millions of rows.

Thanks for the suggestion; I will keep an eye out for ways this
insight might allow an optimization.  I think there might be some
misunderstanding of the counting algorithm, though -- there is no
need to do a sequential pass through the matview examining the
counts.

I don't want to replicate the content of a fairly dense (in the
sense of having a lot of content per page) 10 page computer science
paper in this email, but for purposes of illustration I will take a
very simple case and show how it works.  (This is not geared toward
your particular case, because that could get kinda long to explain
here and now, but hopefully this will give an insight into the
technique overall.)

Let's say there is a table and matview like this:

create table foo (fooid int primary key, val int not null);
create materialized view bar as select distinct val from foo;

Let's say there are millions of rows in both, and that we have
flagged the view for incremental maintenance.  (Syntax omitted to
avoid distracting bikeshedding on that when the point is the
algorithm.)

Now, someone runs this:

update foo set val = val + 1 where fooid between 1 and 10;

What will happen is this:

Since foo will be flagged as a relation which is referenced by an
incrementally maintained matview, a delta relation will be
materialized for this update, which will contain the net change to
the underlying table in the count_t system column.  Before tuples
will have a count of -1; after tuples will have a count of 1.
Then the query defining the view will be run *against the delta*,
resulting in a relation with a count_t column reflecting the net
change for each val.  Anything with a zero for the net change will
be dropped.  We will run a logical UNION of this relation and the
bar matview.  In this case, we obviously want this to be done in a
way that for each row in this net change relation, we do an index
scan against the bar matview; if not found, we insert the new row
into the matview with its count from the net change relation.
(That had better be positive or we have a bug -- so elog ERROR if
negative.)  If bar does contain a matching row, update count_t in
that row with the sum of its old value and the value from the net
change relation.  Of course, that new value also had better be
positive or zero -- if zero we delete the old row rather than
updating count_t.

The count_t column saves us from having to scan foo for all the old
val values.  It does not require any scan of the entire bar
matview.  It allows us to zero in on exactly the right rows, and
lets us know what needs doing.

Clearly we want the planner to find the best plans for the interim
steps rather than hard-coding particular rule-based plans; I just
used an example where it's pretty clear what a reasonable plan
would be.

Hopefully this makes it fairly clear that the only thing an
optimization around the append-only assertion for a matview would buy
is the ability to skip the probe for an existing record *related to
rows which are in the delta*.  As long as there is reasonable
indexing on the matview, maintenance for the append-only case would
not involve scanning the entire matview.

-- 
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [HACKERS] Parallel Sort

2013-05-16 Thread Noah Misch
On Wed, May 15, 2013 at 08:49:00PM +0200, Andres Freund wrote:
 On 2013-05-13 10:28:59 -0400, Noah Misch wrote:
  Each worker needs to make SnapshotNow visibility decisions coherent with the
  master.  For sorting, this allows us to look up comparison functions, even
  when the current transaction created or modified those functions.  This will
  also be an essential building block for any parallelism project that 
  consults
  user tables.  Implementing this means copying the subtransaction stack and 
  the
  combocid hash to each worker.  For the sake of completeness, we should also
  copy the global MVCC snapshot data (sorting probably won't care).  It also
  means forbidding, while a parallel task is in flight, operations that affect
  the transaction state:
 
 Btw, if you assume you can simply copy a snapshot from the normal
 backend to the worker backend to make visibility decisions in the
 general case: You're wrong. Unfortunately you need in-memory state to
 make sense of combocids...

Correct.  If you think of any required state information that I did not list
above, please let me know.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] 9.3 Beta1 status report

2013-05-16 Thread 'Bruce Momjian'
On Tue, May  7, 2013 at 10:23:48AM +0530, Amit Kapila wrote:
   2. I am not able to figure out which item of release notes cover the
  below
   feature commit
   Avoid inserting Result nodes that only compute identity projections.
   http://www.postgresql.org/message-id/E1UGCBh-0006P3-
  a...@gemulon.postgresql.org
  
  I did not think that warranted a mention in the release notes.  Was I
  wrong?
 
 This was a performance improvement for a quite usable scenario, so I thought
 it would be useful for users to know about it.
 Performance data for simple cases I have posted:
 http://www.postgresql.org/message-id/007e01ce08ff$dc0a2c60$941e8520$@kapila@
 huawei.com

I usually mention items that have a user-visible change, or are easy to
explain, or apply to most queries.  I am not sure this falls into any of
those categories.

Can you suggest some release note text for this item?

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +




Re: [HACKERS] PostgreSQL 9.3 beta breaks some extensions make install

2013-05-16 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 05/16/2013 05:41 AM, Dimitri Fontaine wrote:
 And VPATH building of extension is crucially important for me, as the
 easiest way I've found to build and package a given extension against
 all currently supported version of PostgreSQL.

 Is there documented support for VPATH builds?

The core code certainly builds okay in VPATH mode, and I would agree
with Dimitri that pgxs builds should as well.  But this is more of an
autoconf+make feature than ours, so I'm not sure why you'd expect us
to document it.

Having said that, if a particular extension's makefile contains features
that cause it to not build VPATH, then presumably that extension's
author doesn't care.

regards, tom lane




Re: [HACKERS] plperl segfault in plperl_trusted_init() on kfreebsd

2013-05-16 Thread Stephen Frost
* Christoph Berg (c...@df7cb.de) wrote:
 This wasn't the Debian build, but just plain ./configure --with-perl
 without any other arguments. As said in the previous mail, both this
 and the Debian build segfault here. (Which means there is no -dbg
 package because the build fails.)

Oh, I see what you mean- this is the actual *build* failing.

Interesting..

Thanks,

Stephen




Re: [HACKERS] Patch proposal: query result history in psql

2013-05-16 Thread Dimitri Fontaine
Maciej Gajewski maciej.gajews...@gmail.com writes:
 I find this feature quite useful, but I understand that my use case
 may be quite unique.

Just to say that I too find what you've done quite useful. Please add
your patch to the next commit fest for consideration in 9.4!

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] PostgreSQL 9.3 beta breaks some extensions make install

2013-05-16 Thread Cédric Villemain
Le jeudi 16 mai 2013 15:51:48, Tom Lane a écrit :
 Andrew Dunstan and...@dunslane.net writes:
  On 05/16/2013 05:41 AM, Dimitri Fontaine wrote:
  And VPATH building of extension is crucially important for me, as the
  easiest way I've found to build and package a given extension against
  all currently supported version of PostgreSQL.
  
  Is there documented support for VPATH builds?
 
 The core code certainly builds okay in VPATH mode, and I would agree
 with Dimitri that pgxs builds should as well.  But this is more of an
 autoconf+make feature than ours, so I'm not sure why you'd expect us
 to document it.

Extension does not support VPATH at 100% (well, pgxs.mk).
There is a minor hack for some REGRESS files but that's all.

I think at least DOCS, DATA and REGRESS need some additions in pgxs.mk to help 
extension authors allow building out-of-tree (whether PostgreSQL itself was 
built out-of-tree or not).

 I'll work on a patch for that.
-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation




Re: [HACKERS] 9.3 Beta1 status report

2013-05-16 Thread Amit Kapila
On Thursday, May 16, 2013 7:17 PM Bruce Momjian wrote:
 On Tue, May  7, 2013 at 10:23:48AM +0530, Amit Kapila wrote:
2. I am not able to figure out which item of release notes cover
 the
   below
feature commit
Avoid inserting Result nodes that only compute identity
 projections.
http://www.postgresql.org/message-id/E1UGCBh-0006P3-
   a...@gemulon.postgresql.org
  
   I did not think that warranted a mention in the release notes.  Was
 I
   wrong?
 
  This was a performance improvement for a quite usable scenario, so I
 thought
  it would be useful for users to know about it.
  Performance data for simple cases I have posted:
  http://www.postgresql.org/message-
 id/007e01ce08ff$dc0a2c60$941e8520$@kapila@
  huawei.com
 
 I usually mention items that have a user-visible change, or are easy to
 explain, or apply to most queries.  I am not sure this falls into any
 of
 those categories.
 
 Can you suggest some release note text for this item?

I can think of the below text:

Reduce query processing overhead by avoiding insertion of useless plan nodes
OR
Improve performance of certain kinds of queries by avoiding the extra
processing of doing a projection

This applies to queries doing identity projection (SELECT * FROM ...) for
partitioned tables.


With Regards,
Amit Kapila.






Re: [HACKERS] PostgreSQL 9.3 beta breaks some extensions make install

2013-05-16 Thread Andrew Dunstan


On 05/16/2013 10:39 AM, Cédric Villemain wrote:

Le jeudi 16 mai 2013 15:51:48, Tom Lane a écrit :

Andrew Dunstan and...@dunslane.net writes:

On 05/16/2013 05:41 AM, Dimitri Fontaine wrote:

And VPATH building of extension is crucially important for me, as the
easiest way I've found to build and package a given extension against
all currently supported version of PostgreSQL.

Is there documented support for VPATH builds?

The core code certainly builds okay in VPATH mode, and I would agree
with Dimitri that pgxs builds should as well.  But this is more of an
autoconf+make feature than ours, so I'm not sure why you'd expect us
to document it.

Extension does not support VPATH at 100% (well, pgxs.mk).
There is a minor hack for some REGRESS files but that's all.


Right. My impression is that pgxs.mk actively works against vpath builds.



I think at least DOCS, DATA and REGRESS need some additions in pgxs.mk to help
extension authors allow building out-of-tree (whether PostgreSQL itself was built
out-of-tree or not).

  I'll work on a patch for that.


OK, great.

cheers

andrew




Re: [HACKERS] [BUGS] BUG #8167: false EINVAL -22 for opening a file

2013-05-16 Thread Tom Lane
nel...@monkey.org writes:
 RhodiumToad asked me to report pg in PathNameOpenFile / mdopen is
 assuming that errno is preserved over calls to free() which is not
 required by the spec

Good catch!  The proximate problem appears to be this bit in
PathNameOpenFile:

    vfdP->fd = BasicOpenFile(fileName, fileFlags, fileMode);

    if (vfdP->fd < 0)
    {
        FreeVfd(file);
        free(fnamecopy);
        return -1;
    }

The other code layers in this stack seem to be careful about preserving
errno where needed, but this bit isn't.
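
(A minimal sketch of the kind of fix being suggested, so the caller
still sees the original open() failure; this is an illustration, not a
committed patch:)

    vfdP->fd = BasicOpenFile(fileName, fileFlags, fileMode);

    if (vfdP->fd < 0)
    {
        int     save_errno = errno;     /* FreeVfd()/free() may clobber errno */

        FreeVfd(file);
        free(fnamecopy);
        errno = save_errno;             /* restore before returning failure */
        return -1;
    }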

I've not really reviewed fd.c for other instances of the same bug,
but I do see some places where there are DO_DB(...) debug calls in
places where errno had better not change.  I'm thinking maybe we should
improve that macro to include save/restore errno.  Or else rip that
stuff out entirely --- I've sure never built this code with FDDEBUG
set, has anyone else?

regards, tom lane




Re: [HACKERS] Logging of PAM Authentication Failure

2013-05-16 Thread Tom Lane
Amit Langote amitlangot...@gmail.com writes:
 On Thu, May 16, 2013 at 8:01 PM, Andres Freund and...@2ndquadrant.com wrote:
 I unfortunately have to say I don't really see the point of this. The
 cost of the additional connection attempt is rather low and we have to
 deal with the superflous attempts anyway since there will be old libpqs
 around for years. Why is this worth the effort?

 While full connection sequence (with proper authentication exchanges)
 appears  to go smoothly for other cases (authentication methods), it
 doesn't quite in this case probably because accounting for such a case
 was not considered to be as important. But while investigating about
 the PAM issue (original subject of this thread), it turned out that
 the occurrence of that minor issue was due to this behavior in libpq.

I have to agree with Andres that it's not clear this is a reasonable
fix.  To get rid of extra reconnections this way will require not merely
upgrading libpq, but upgrading every single application that uses libpq
and is capable of prompting its user for a password.  The odds are
pretty good that that won't ever happen.

The real complaint here is that the server-side PAM auth code path is
losing the information that the client chose to disconnect rather than
offer a password, and thus logging a message that we could do without.
What's wrong with just fixing that?

regards, tom lane




Re: [HACKERS] Better LWLocks with compare-and-swap (9.4)

2013-05-16 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 Isn't this the same issue which has prompted multiple people to propose
 (sometimes with code, as I recall) to rip out our internal spinlock
 system and replace it with kernel-backed calls which do it better,
 specifically by dealing with issues like the above?  Have you seen those
 threads in the past?  Any thoughts about moving in that direction?

All of the proposals of that sort that I've seen had a flavor of
"my OS is the only one that matters".  While I don't object to
platform-dependent implementations of spinlocks as such, they're not
much of a cure for a generic performance issue.

regards, tom lane




Re: [HACKERS] counting algorithm for incremental matview maintenance

2013-05-16 Thread Kevin Grittner
I wrote:

 Let's say there is a table and matview like this:
 
 create table foo (fooid int primary key, val int not null);
 create materialized view bar as select distinct val from foo;

Some of the subsequent text doesn't make sense unless that
materialized view has an index, like this:

create unique index bar_val on bar (val);

Without such an index, it would need to use a plan which scanned
the entire bar relation for any maintenance.  Apologies for the
omission and any confusion it caused.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Logging of PAM Authentication Failure

2013-05-16 Thread Amit Langote
On Fri, May 17, 2013 at 1:05 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Amit Langote amitlangot...@gmail.com writes:
 On Thu, May 16, 2013 at 8:01 PM, Andres Freund and...@2ndquadrant.com 
 wrote:
 I unfortunately have to say I don't really see the point of this. The
 cost of the additional connection attempt is rather low and we have to
 deal with the superflous attempts anyway since there will be old libpqs
 around for years. Why is this worth the effort?

 While full connection sequence (with proper authentication exchanges)
 appears  to go smoothly for other cases (authentication methods), it
 doesn't quite in this case probably because accounting for such a case
 was not considered to be as important. But while investigating about
 the PAM issue (original subject of this thread), it turned out that
 the occurrence of that minor issue was due to this behavior in libpq.

 I have to agree with Andres that it's not clear this is a reasonable
 fix.  To get rid of extra reconnections this way will require not merely
 upgrading libpq, but upgrading every single application that uses libpq
 and is capable of prompting its user for a password.  The odds are
 pretty good that that won't ever happen.

Can this stay in the future releases for new users of libpq to
consider using it (saving them a reconnection, however small a benefit
that is) or at least psql which is being changed to use it anyway? I
only think it makes libpq take into account a connection state that
could be used.

 The real complaint here is that the server-side PAM auth code path is
 losing the information that the client chose to disconnect rather than
 offer a password, and thus logging a message that we could do without.
 What's wrong with just fixing that?

Back in this thread, Horiguchi-san has posted a fix. It seems to fix
the original issue. Attaching his patch here again.

-- 
Amit Langote


pamauth_duplog_quickfix.patch
Description: Binary data



Re: [HACKERS] Logging of PAM Authentication Failure

2013-05-16 Thread Andres Freund
On 2013-05-17 01:29:25 +0900, Amit Langote wrote:
 On Fri, May 17, 2013 at 1:05 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  Amit Langote amitlangot...@gmail.com writes:
  On Thu, May 16, 2013 at 8:01 PM, Andres Freund and...@2ndquadrant.com 
  wrote:
  I unfortunately have to say I don't really see the point of this. The
  cost of the additional connection attempt is rather low and we have to
  deal with the superflous attempts anyway since there will be old libpqs
  around for years. Why is this worth the effort?
 
  While full connection sequence (with proper authentication exchanges)
  appears  to go smoothly for other cases (authentication methods), it
  doesn't quite in this case probably because accounting for such a case
  was not considered to be as important. But while investigating about
  the PAM issue (original subject of this thread), it turned out that
  the occurrence of that minor issue was due to this behavior in libpq.
 
  I have to agree with Andres that it's not clear this is a reasonable
  fix.  To get rid of extra reconnections this way will require not merely
  upgrading libpq, but upgrading every single application that uses libpq
  and is capable of prompting its user for a password.  The odds are
  pretty good that that won't ever happen.
 
 Can this stay in the future releases for new users of libpq to
 consider using it (saving them a reconnection, however small a benefit
 that is) or at least psql which is being changed to use it anyway? I
 only think it makes libpq take into account a connection state that
 could be used.

Which basically is an API & ABI break since it's not handled in existing
callers. So you would need to make it conditional. At that point the
complexity really doesn't seem warranted.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Logging of PAM Authentication Failure

2013-05-16 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2013-05-17 01:29:25 +0900, Amit Langote wrote:
 Can this stay in the future releases for new users of libpq to
 consider using it (saving them a reconnection, however small a benefit
 that is) or at least psql which is being changed to use it anyway? I
 only think it makes libpq take into account a connection state that
 could be used.

 Which basically is an API & ABI break since it's not handled in existing
 callers. So you would need to make it conditional.

Yeah, there would need to be a way for the caller to indicate that it's
prepared to handle this new connection state; else you risk actively
breaking existing code that doesn't know it needs to do something here.

Another point worth considering is that, if you assume that what's going
to happen is manual entry of a password (probably requiring at least a
couple of seconds), the actual benefit of avoiding a second fork() is
really completely negligible.  It could even be argued that the benefit
is negative, since we're tying up a postmaster child process slot that
might be better used for something else.

So, while I wouldn't have objected to this if it'd been included in the
original design for PQconnectPoll-style connections, it's really unclear
that it's worth the work to add it now.

regards, tom lane




Re: [HACKERS] PostgreSQL 9.3 beta breaks some extensions make install

2013-05-16 Thread Alvaro Herrera
Andrew Dunstan wrote:
 
 On 05/16/2013 10:39 AM, Cédric Villemain wrote:
 Le jeudi 16 mai 2013 15:51:48, Tom Lane a écrit :
 Andrew Dunstan and...@dunslane.net writes:
 On 05/16/2013 05:41 AM, Dimitri Fontaine wrote:
 And VPATH building of extension is crucially important for me, as the
 easiest way I've found to build and package a given extension against
 all currently supported version of PostgreSQL.
 Is there documented support for VPATH builds?
 The core code certainly builds okay in VPATH mode, and I would agree
 with Dimitri that pgxs builds should as well.  But this is more of an
 autoconf+make feature than ours, so I'm not sure why you'd expect us
 to document it.
 Extension does not support VPATH at 100% (well, pgxs.mk).
 There is a minor hack for some REGRESS files but that's all.
 
 Right. My impression is that pgxs.mk actively works against vpath builds.

That's my experience, yes.  It would be great to get it fixed.

Then there's the outright weird stuff using ancient makefiles ..
*grumble* pg_filedump *grumble*

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] fallocate / posix_fallocate for new WAL file creation (etc...)

2013-05-16 Thread Alvaro Herrera
Jon Nelson escribió:

 Am I doing this the right way? Should I be posting the full patch each
 time, or incremental patches?

Full patch each time is okay.  Context-format patch is even better.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Logging of PAM Authentication Failure

2013-05-16 Thread Amit Langote
On Fri, May 17, 2013 at 1:46 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Andres Freund and...@2ndquadrant.com writes:
 On 2013-05-17 01:29:25 +0900, Amit Langote wrote:
 Can this stay in the future releases for new users of libpq to
 consider using it (saving them a reconnection, however small a benefit
 that is) or at least psql which is being changed to use it anyway? I
 only think it makes libpq take into account a connection state that
 could be used.

 Which basically is an API  ABI break since its not handled in existing
 callers. So you would need to make it conditional.

 Yeah, there would need to be a way for the caller to indicate that it's
 prepared to handle this new connection state; else you risk actively
 breaking existing code that doesn't know it needs to do something here.

 Another point worth considering is that, if you assume that what's going
 to happen is manual entry of a password (probably requiring at least a
 couple of seconds), the actual benefit of avoiding a second fork() is
 really completely negligible.  It could even be argued that the benefit
 is negative, since we're tying up a postmaster child process slot that
 might be better used for something else.

I agree it's a pretty valid point. We'd better just fix the original
issue and leave it at that. :)

--
Amit Langote




Re: [HACKERS] Heap truncation without AccessExclusiveLock (9.4)

2013-05-16 Thread Robert Haas
On Wed, May 15, 2013 at 8:24 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 I've been thinking for a while that we need some other system for
 managing other kinds of invalidations.  For example, suppose we want
 to cache relation sizes in blocks.  So we allocate 4kB of shared
 memory, interpreted as an array of 512 8-byte entries.  Whenever you
 extend a relation, you hash the relfilenode and take the low-order 9
 bits of the hash value as an index into the array.  You increment that
 value either under a spinlock or perhaps using fetch-and-add where
 available.

 I'm not sure I believe the details of that.

 1. 4 bytes is not enough to store the exact identity of the table that
 the cache entry belongs to, so how do you disambiguate?

You don't.  The idea is that it's inexact.  When a relation is
extended, every backend is forced to recheck the length of every
relation whose relfilenode hashes to the same array slot as the one
that was actually extended.  So if you happen to be repeatedly
scanning relation A, and somebody else is repeatedly scanning relation
B, you'll *probably* not have to invalidate anything.  But if A and B
happen to hash to the same slot, then you'll keep getting bogus
invalidations.  Fortunately, that isn't very expensive.

The fast-path locking code uses a similar trick to detect conflicting
strong locks, and it works quite well.  In that case, as here, you can
reduce the collision probability as much as you like by increasing the
number of slots, at the cost of increased shared memory usage.
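
(For illustration, a rough sketch of the shared array being described;
the structure, names, and locking here are assumptions for the sake of
the example, not an actual patch:)

    #define REL_EXTENSION_SLOTS 512     /* 4kB: 512 slots of 8 bytes */

    typedef struct RelExtensionCounters
    {
        slock_t     mutex;                          /* protects the counters */
        uint64      counter[REL_EXTENSION_SLOTS];   /* bumped on extension */
    } RelExtensionCounters;

    /* low-order 9 bits of the relfilenode hash pick one of the 512 slots */
    #define RelExtensionSlot(rnode) \
        (DatumGetUInt32(hash_uint32((uint32) (rnode))) & (REL_EXTENSION_SLOTS - 1))

    /* On extension: bump the slot, forcing other backends to recheck any
     * cached size for relations hashing to it.  On use: compare the slot's
     * value with the one remembered alongside the cached size; if it moved,
     * redo the lseek() and remember the new value. */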

 2. If you don't find an entry for your target rel in the cache, aren't
 you still going to have to do an lseek?

Don't think of it as a cache.  The caching happens inside each
backend's relcache; the shared memory structure is just a tool to
force those caches to be revalidated when necessary.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Heap truncation without AccessExclusiveLock (9.4)

2013-05-16 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 2. If you don't find an entry for your target rel in the cache, aren't
 you still going to have to do an lseek?

 Don't think of it as a cache.  The caching happens inside each
 backend's relcache; the shared memory structure is just a tool to
 force those caches to be revalidated when necessary.

Hmm.  Now I see: it's not a cache, it's a Bloom filter.  The failure
mode I was thinking of is inapplicable, but there's a different one:
you have to be absolutely positive that *any* operation that extends the
file will update the relevant filter entry.  Still, I guess that we're
already assuming that any such op will take the relation's extension
lock, so it should be easy enough to find all the places to fix.

regards, tom lane




Re: [HACKERS] Patch proposal: query result history in psql

2013-05-16 Thread David E. Wheeler
On May 16, 2013, at 7:02 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:

 I find this feature quite useful, but I understand that my use case
 may be quite unique.
 
 Just to say that I too find what you've done quite useful. Please add
 your patch to the next commit fest for consideration in 9.4!

FYI, you can add it here:

  https://commitfest.postgresql.org/action/commitfest_view?id=18

Best,

David





Re: [HACKERS] Heap truncation without AccessExclusiveLock (9.4)

2013-05-16 Thread Robert Haas
On Thu, May 16, 2013 at 1:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 2. If you don't find an entry for your target rel in the cache, aren't
 you still going to have to do an lseek?

 Don't think of it as a cache.  The caching happens inside each
 backend's relcache; the shared memory structure is just a tool to
 force those caches to be revalidated when necessary.

 Hmm.  Now I see: it's not a cache, it's a Bloom filter.

Yes.

 The failure
 mode I was thinking of is inapplicable, but there's a different one:
 you have to be absolutely positive that *any* operation that extends the
 file will update the relevant filter entry.  Still, I guess that we're
 already assuming that any such op will take the relation's extension
 lock, so it should be easy enough to find all the places to fix.

I would think so.  The main thing that's held me back from actually
implementing this is the fact that lseek is so darn cheap on Linux,
and I don't have reliable data one way or the other for any other
platform.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] PostgreSQL 9.3 beta breaks some extensions make install

2013-05-16 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Then there's the outright weird stuff using ancient makefiles ..
 *grumble* pg_filedump *grumble*

I've never made any effort to improve the original makefile for that.
Want to send a patch?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL 9.3 beta breaks some extensions make install

2013-05-16 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
  Then there's the outright weird stuff using ancient makefiles ..
  *grumble* pg_filedump *grumble*
 
 I've never made any effort to improve the original makefile for that.
 Want to send a patch?

Not right away, but I will get to it eventually, thanks.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Allowing parallel pg_restore from pipe

2013-05-16 Thread Shaun Thomas

On 05/16/2013 12:16 PM, Timothy Garnett wrote:


I think you'd have to be real careful around foreign key constraints
for that to work.


Not especially. All you need to do is bootstrap the database with a 
bunch of empty table targets (no constraints, keys, etc), then restore 
with the xargs command. Then you can apply the constraints, keys, and 
indexes later by doing a schema-only parallel pg_restore.


--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-676-8870
stho...@optionshouse.com

__

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to 
this email


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Allowing parallel pg_restore from pipe

2013-05-16 Thread Timothy Garnett
 If you need something like this short term, we actually found a way to do it
 ourselves for a migration we performed back in October. The secret is xargs
 with the -P option:

 xargs -I{} -P 8 -a table-list.txt \
 bash -c "pg_dump -Fc -t {} my_db | pg_restore -h remote -d my_db"

 Fill table-list.txt with as many, or as few tables as you want. The above
 example would give you 8 parallel threads. Well equipped systems may be able
 to increase this.

 Admittedly it's a gross hack, but it works. :)

I think you'd have to be real careful around foreign key constraints
for that to work.

Tim


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] streaming replication, frozen snapshot backup on it and missing relfile (postgres 9.2.3 on xfs + LVM)

2013-05-16 Thread David Powers
I'll try to get the primary upgraded over the weekend when we can afford a
restart.

In the meantime I have a single test showing that a shutdown, snapshot,
restart produces a backup that passes the vacuum analyze test.  I'm going
to run a full vacuum today.

-David


On Wed, May 15, 2013 at 3:53 PM, Heikki Linnakangas hlinnakan...@vmware.com
 wrote:

 On 15.05.2013 22:50, Benedikt Grundmann wrote:

 On Wed, May 15, 2013 at 2:50 PM, Heikki Linnakangas hlinnakan...@vmware.com

 The subject says 9.2.3. Are you sure you're running 9.2.4 on all the

 servers? There was a fix to a bug related to starting a standby server
 from
 a filesystem snapshot. I don't think it was quite the case you have, but
 pretty close.


 So this is delightfully embarrassing. I just went back to double check and

 - primary box is 9.2.3
 - standby is 9.2.4
 - testing is 9.2.4

 I guess that alone could possibly explain it?


 Hmm, no, it should still work. There haven't been any changes in the WAL
 format. I do recommend upgrading the primary, of course, but I don't really
 see how that would explain what you're seeing.

 - Heikki



Re: [HACKERS] Better handling of archive_command problems

2013-05-16 Thread Robert Haas
On Wed, May 15, 2013 at 6:40 PM, Peter Geoghegan p...@heroku.com wrote:
 On Wed, May 15, 2013 at 3:46 AM, Robert Haas robertmh...@gmail.com wrote:
 One possible objection to this line of attack is that, IIUC, waits to
 acquire a LWLock are non-interruptible.  If someone tells PostgreSQL
 to wait for some period of time before performing each WAL write,
 other backends that grab the WALWriteLock will not respond to query
 cancels during that time.

 I don't see any reasonable way to make LWLocks care about interrupts
 (using all 3 possible underlying semaphore implementations, no less).

It couldn't be done across the board, but that doesn't mean that
certain cases couldn't get special treatment.

 As it says within LWLockAcquire:

 /*
  * Lock out cancel/die interrupts until we exit the code section protected
  * by the LWLock.  This ensures that interrupts will not interfere with
  * manipulations of data structures in shared memory.
  */
 HOLD_INTERRUPTS();

 We've been pretty judicious about placing CHECK_FOR_INTERRUPTS() calls
 in the right places, but it's still quite possible to see the server
 take multiple seconds - perhaps even as many as 10 - to respond to an
 interrupt (by psql SIGINT). Now, I didn't have enough of an interest
 at the times I noticed this to figure out exactly why that may have
 been or to somehow characterize it, but I don't accept that it's a
 violation of some Postgres precept that this setting could result in
 interrupts taking multiple seconds, and maybe even as many as 10. I'd
 go so far as to let the user make the throttling sleep take as long as
 they like, even though this admittedly would sort of break such a
 precept.

Well, I think it IS a Postgres precept that interrupts should get a
timely response.  You don't have to agree, but I think that's
important.

 There is a setting called zero_damaged_pages, and enabling it causes
 data loss. I've seen cases where it was enabled within postgresql.conf
 for years.

That is both true and bad, but it is not a reason to do more bad things.

 Now despite all that, I can see this being useful enough that Heroku
 might want to insert a very small patch into their version of
 PostgreSQL to do it this way, and just live with the downsides.  But
 anything that can propagate non-interruptible waits across the entire
 system does not sound to me like a feature that is sufficiently
 polished that we want to expose it to users less sophisticated than
 Heroku (i.e. nearly all of them).  If we do this, I think we ought to
 find a way to make the waits interruptible, and to insert them in
 places where they really don't interfere with read-only backends.

 It would be nice to be able to be sure that CLogControlLock could not
 be held for multiple seconds as a result of this. However, I don't see
 any reasons to let the perfect be the enemy of the good, or at least
 the better. Just how likely is it that the scenario you describe will
 affect reads in the real world? In any case, this is a problem in its
 own right.

That's true, but you're proposing to add a knob which would make it
much easier for users to notice the bad behavior that already exists,
and to prolong it for unbounded periods of time even when the system
is not under load.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Better handling of archive_command problems

2013-05-16 Thread Peter Geoghegan
On Thu, May 16, 2013 at 11:16 AM, Robert Haas robertmh...@gmail.com wrote:
 Well, I think it IS a Postgres precept that interrupts should get a
 timely response.  You don't have to agree, but I think that's
 important.

Well, yes, but the fact of the matter is that it is taking high single
digit numbers of seconds to get a response at times, so I don't think
that there is any reasonable expectation that that be almost
instantaneous. I don't want to make that worse, but then it might be
worth it in order to ameliorate a particular pain point for users.

 There is a setting called zero_damaged_pages, and enabling it causes
 data loss. I've seen cases where it was enabled within postgresql.conf
 for years.

 That is both true and bad, but it is not a reason to do more bad things.

I don't think it's bad. I think that we shouldn't be paternalistic
towards our users. If anyone enables a setting like zero_damaged_pages
(or, say, wal_write_throttle) within their postgresql.conf
indefinitely for no good reason, then they're incompetent. End of
story.

Would you feel better about it if the setting had a time-out? Say, the
user had to explicitly re-enable it after one hour at the most?

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [9.3] Automatically updatable views vs writable foreign tables

2013-05-16 Thread Dean Rasheed
Hi,

I've just started 9.3 beta testing and I noticed that a simple view
defined on top of a writable foreign table is not automatically
updatable.

Given that these are both new-to-9.3 features, I think it would be a
shame if they don't work together. It's basically a 1-line patch to
make such views automatically updatable, plus a small extra code block
in relation_is_updatable() to reflect the change in the
information_schema views.

The attached patch does that and adds a couple of extra regression tests.


The tests, however, reveal a separate issue with writable foreign
tables --- the information_schema views haven't been updated to
reflect the fact that foreign tables may now be updatable.
Specifically, for foreign tables
information_schema.tables.is_insertable_into and
information_schema.columns.is_updatable always say 'NO' even if the
foreign table is writable. Fixing that would require new C functions
along the same lines as pg_view_is_insertable/updatable(), or those
functions could just be renamed and repurposed to do the check for all
relation kinds, except those known to be always/never updatable.

Regards,
Dean


writable-fdw-view.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [9.3] Automatically updatable views vs writable foreign tables

2013-05-16 Thread Tom Lane
Dean Rasheed dean.a.rash...@gmail.com writes:
 I've just started 9.3 beta testing and I noticed that a simple view
 defined on top of a writable foreign table is not automatically
 updatable.

 Given that these are both new-to-9.3 features, I think it would be a
 shame if they don't work together. It's basically a 1-line patch to
 make such views automatically updatable, plus a small extra code block
 in relation_is_updatable() to reflect the change in the
 information_schema views.

Meh.  This is assuming that an FDW that defines, say, ExecForeignDelete
is thereby promising that *all* tables it supports are deletable.  That
is not required by the current FDW API spec.

If we want to do something about this, I'd be a bit inclined to say that
we should add a new FDW callback function to let the FDW say whether
a particular rel is updatable or not.

I think it would be a good idea to get that done for 9.3, since all this
support is new in 9.3, and it's not too late to adjust the API now.
If we wait, there will be compatibility headaches.

 Specifically, for foreign tables
 information_schema.tables.is_insertable_into and
 information_schema.columns.is_updatable always say 'NO' even if the
 foreign table is writable. Fixing that would require new C functions
 along the same lines as pg_view_is_insertable/updatable(), or those
 functions could just be renamed and repurposed to do the check for all
 relation kinds, except those known to be always/never updatable.

I'd vote to rename/extend them to be pg_relation_is_updatable I think.
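
For illustration, one possible shape for such a callback; the member name
IsForeignRelUpdatable and the CmdType bitmask convention are assumptions
of this sketch rather than a settled API:

#include "postgres.h"
#include "fmgr.h"
#include "foreign/fdwapi.h"
#include "nodes/nodes.h"
#include "utils/rel.h"

/*
 * Report which operations this particular foreign table supports, as a
 * bitmask over CmdType.  The FDW can consult per-table options here, so
 * implementing ExecForeignInsert and friends no longer has to mean that
 * every table served by the wrapper is writable.
 */
static int
my_fdw_is_updatable(Relation rel)
{
    /* e.g. inspect GetForeignTable(RelationGetRelid(rel))->options first */
    return (1 << CMD_INSERT) | (1 << CMD_UPDATE) | (1 << CMD_DELETE);
}

PG_FUNCTION_INFO_V1(my_fdw_handler);

Datum
my_fdw_handler(PG_FUNCTION_ARGS)
{
    FdwRoutine *routine = makeNode(FdwRoutine);

    /* ... the existing scan/modify callbacks are assigned here ... */
    routine->IsForeignRelUpdatable = my_fdw_is_updatable;  /* hypothetical member */

    PG_RETURN_POINTER(routine);
}

An FDW that does implement ExecForeignInsert and friends could still return
0 here for tables that are read-only on the remote side, which is exactly
the distinction the current API cannot express.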

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [9.3] Automatically updatable views vs writable foreign tables

2013-05-16 Thread Andrew Dunstan


On 05/16/2013 05:16 PM, Tom Lane wrote:

Dean Rasheed dean.a.rash...@gmail.com writes:

I've just started 9.3 beta testing and I noticed that a simple view
defined on top of a writable foreign table is not automatically
updatable.
Given that these are both new-to-9.3 features, I think it would be a
shame if they don't work together. It's basically a 1-line patch to
make such views automatically updatable, plus a small extra code block
in relation_is_updatable() to reflect the change in the
information_schema views.

Meh.  This is assuming that an FDW that defines, say, ExecForeignDelete
is thereby promising that *all* tables it supports are deletable.  That
is not required by the current FDW API spec.

If we want to do something about this, I'd be a bit inclined to say that
we should add a new FDW callback function to let the FDW say whether
a particular rel is updatable or not.

I think it would be a good idea to get that done for 9.3, since all this
support is new in 9.3, and it's not too late to adjust the API now.
If we wait, there will be compatibility headaches.


+1



cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [9.3] Automatically updatable views vs writable foreign tables

2013-05-16 Thread Dean Rasheed
On 16 May 2013 22:16, Tom Lane t...@sss.pgh.pa.us wrote:
 This is assuming that an FDW that defines, say, ExecForeignDelete
 is thereby promising that *all* tables it supports are deletable.  That
 is not required by the current FDW API spec.


Ah OK, I didn't appreciate that distinction.

 If we want to do something about this, I'd be a bit inclined to say that
 we should add a new FDW callback function to let the FDW say whether
 a particular rel is updatable or not.

 I think it would be a good idea to get that done for 9.3, since all this
 support is new in 9.3, and it's not too late to adjust the API now.
 If we wait, there will be compatibility headaches.


+1. That seems like something that should be part of the API, even if
we didn't have an immediate use for it.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PLJava for Postgres 9.2.

2013-05-16 Thread Paul Hammond
Hi all,

I've downloaded PLJava, the latest version, which doesn't seem to have a binary 
distribution at all for 9.2, so I'm trying to build it from the source for 
Postgres 9.2. I have the DB itself installed on Windows 7 64 bit as a binary 
install. I've had to do a fair bit of hacking with the makefiles on cygwin to 
get PLJava to build, but I have succeeded in compiling the Java and JNI code, 
the pljava_all and deploy_all targets effectively.

But I'm coming unstuck at the next target where it's doing the target c_all. 
It's trying to find the following makefile in the Postgres dist:

<my postgres installation dir>/lib/pgxs/src/makefiles/pgxs.mk: No such file or 
directory

What do I need to do to obtain the required files? And given that Postgres 9.2 
has been out for some time, and 9.3 is in beta, does anybody know why no 
prebuilt PL/Java binaries exist for 9.2?

Thanks,

Paul


Re: [HACKERS] 9.3 Beta1 status report

2013-05-16 Thread 'Bruce Momjian'
On Thu, May 16, 2013 at 08:38:59PM +0530, Amit Kapila wrote:
  I usually mention items that have a user-visible change, or are easy to
  explain, or apply to most queries.  I am not sure this falls into any
  of
  those categories.
  
  Can you suggest some release note text for this item?
 
 I can think of below text:
 
 Reduce query processing overhead by avoiding insertion of useless plan nodes
 OR
 Improve performance of certain kind of queries by avoiding extra processing
 of doing projection
 
 This applies to queries doing identity projection (SELECT * FROM ...) for
 partitioned tables.

Uh, that's pretty complex for our release notes, and hard to understand
for most users.  All they will know is that PG is faster --- we don't
document every speedup.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.3 Beta1 status report

2013-05-16 Thread Tom Lane
'Bruce Momjian' br...@momjian.us writes:
 On Thu, May 16, 2013 at 08:38:59PM +0530, Amit Kapila wrote:
 Reduce query processing overhead by avoiding insertion of useless plan nodes
 OR
 Improve performance of certain kind of queries by avoiding extra processing
 of doing projection
 
 This applies to queries doing identity projection (SELECT * FROM ...) for
 partitioned tables.

 Uh, that's pretty complex for our release notes, and hard to understand
 for most users.  All they will know is that PG is faster --- we don't
 document every speedup.

No, but this is user-visible if they look at EXPLAIN output, and people
might wonder why they were getting different results.

Possibly text like

Omit unnecessary Result and Limit nodes from query plans.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9.3 Beta1 status report

2013-05-16 Thread 'Bruce Momjian'
On Thu, May 16, 2013 at 06:49:33PM -0400, Tom Lane wrote:
 'Bruce Momjian' br...@momjian.us writes:
  On Thu, May 16, 2013 at 08:38:59PM +0530, Amit Kapila wrote:
  Reduce query processing overhead by avoiding insertion of useless plan 
  nodes
  OR
  Improve performance of certain kind of queries by avoiding extra processing
  of doing projection
  
  This applies to queries doing identity projection (SELECT * FROM ...) for
  partitioned tables.
 
  Uh, that's pretty complex for our release notes, and hard to understand
  for most users.  All they will know is that PG is faster --- we don't
  document every speedup.
 
 No, but this is user-visible if they look at EXPLAIN output, and people
 might wonder why they were getting different results.
 
 Possibly text like
 
   Omit unnecessary Result and Limit nodes from query plans.

Yes, that would be user-visible, though we rarely add details like that.
What queries are faster, that users would understand?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PLJava for Postgres 9.2.

2013-05-16 Thread Andrew Dunstan


On 05/16/2013 05:59 PM, Paul Hammond wrote:

Hi all,

I've downloaded PLJava, the latest version, which doesn't seem to have 
a binary distribution at all for 9.2, so I'm trying to build it from 
the source for Postgres 9.2. I have the DB itself installed on Windows 
7 64 bit as a binary install. I've had to do a fair bit of hacking 
with the makefiles on cygwin to get PLJava to build, but I have 
succeeded in compiling the Java and JNI code, the pljava_all and 
deploy_all targets effectively.



Cygwin is not a recommended build platform for native Windows builds. 
See the docs for the recommended ways to build Postgres.





But I'm coming unstuck at the next target where it's doing the target 
c_all. It's trying to find the following makefile in the Postgres dist:


<my postgres installation dir>/lib/pgxs/src/makefiles/pgxs.mk: No such 
file or directory


What do I need to do to obtain the required files? And given that Postgres 
9.2 has been out for some time, and 9.3 is in beta, does anybody know why 
no prebuilt PL/Java binaries exist for 9.2?



Because nobody has built them?


FYI, PL/Java is not maintained by the PostgreSQL project.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fallocate / posix_fallocate for new WAL file creation (etc...)

2013-05-16 Thread Greg Smith

On 5/16/13 9:16 AM, Jon Nelson wrote:

Am I doing this the right way? Should I be posting the full patch each
time, or incremental patches?


There are guidelines for getting your patch in the right format at 
https://wiki.postgresql.org/wiki/Working_with_Git#Context_diffs_with_Git 
that would improve this one.  You have some formatting issues with tab 
spacing at lines 120 through 133 in your v3 patch.  And it looks like 
there was a formatting change on line 146 that is making the diff larger 
than it needs to be.


The biggest thing missing from this submission is information about what 
performance testing you did.  Ideally performance patches are submitted 
with enough information for a reviewer to duplicate the same test the 
author did, as well as hard before/after performance numbers from your 
test system.  It often turns tricky to duplicate a performance gain, and 
being able to run the same test used for initial development eliminates 
a lot of the problems.


Second bit of nitpicking.  There are already some GUC values that appear 
or disappear based on compile time options.  They're all debugging 
related things though.  I would prefer not to see this one go away when 
its implementation isn't available.  That's going to break any scripts 
that SHOW the setting to see if it's turned on or not as a first 
problem.  I think the right model to follow here is the IFDEF setup used 
for effective_io_concurrency.  I wouldn't worry about this too much 
though.  Having a wal_use_fallocate GUC is good for testing.  But if it 
works out well, when it's ready for commit I don't see why anyone would 
want it turned off on platforms where it works.  There are already too 
many performance tweaking GUCs.  Something has to be very likely to be 
changed from the default before it's worth adding one for it.
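
For what it's worth, here is a sketch of what the effective_io_concurrency
style of arrangement might look like here: the setting stays visible, and
the code falls back to the existing zero-fill path when posix_fallocate()
is unavailable.  The helper name, a configure-provided HAVE_POSIX_FALLOCATE
symbol, the 16MB constant and the 8kB buffer are assumptions of this
sketch, not taken from the patch.

#include <errno.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>

#define WAL_SEGMENT_BYTES   (16 * 1024 * 1024)

static int
preallocate_wal_segment(int fd, bool wal_use_fallocate)
{
#ifdef HAVE_POSIX_FALLOCATE
    if (wal_use_fallocate)
    {
        /* posix_fallocate returns an errno value rather than setting errno */
        int         rc = posix_fallocate(fd, 0, WAL_SEGMENT_BYTES);

        if (rc != 0)
        {
            errno = rc;
            return -1;
        }
        return 0;
    }
#endif
    /* Fallback: zero-fill the segment much as xlog.c does today. */
    {
        char        zbuf[8192];
        off_t       written = 0;

        memset(zbuf, 0, sizeof(zbuf));
        while (written < WAL_SEGMENT_BYTES)
        {
            ssize_t     nwritten = write(fd, zbuf, sizeof(zbuf));

            if (nwritten <= 0)
                return -1;
            written += nwritten;
        }
        return 0;
    }
}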


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extent Locks

2013-05-16 Thread Robert Haas
On Wed, May 15, 2013 at 8:54 PM, Stephen Frost sfr...@snowman.net wrote:
   Starting a new thread to avoid hijacking Heikki's original, but..

 * Heikki Linnakangas (hlinnakan...@vmware.com) wrote:
 Truncating a heap at the end of vacuum, to release unused space back to
 the OS, currently requires taking an AccessExclusiveLock. Although
 it's only held for a short duration, it can be enough to cause a
 hiccup in query processing while it's held. Also, if there is a
 continuous stream of queries on the table, autovacuum never succeeds
 in acquiring the lock, and thus the table never gets truncated.

   Extent locking suffers from very similar problems and we really need
   to improve this situation.  With today's fast i/o systems, and massive
   numbers of CPUs in a single system, it's absolutely trivial to have a
   whole slew of processes trying to add data to a single relation and
   that access getting nearly serialized due to everyone waiting on the
   extent lock.

   Perhaps one really simple approach would be to increase the size of
   the extent which is created in relation to the size of the relation.
   I've no clue what level of effort is involved there but I'm hoping
   such an approach would help.  I've long thought that it'd be very neat
   if we could simply give each bulk-inserter process their own 1G chunk
   to insert directly into w/o having to talk to anyone else.  The
   creation of the specific 1G piece could, hopefully, be made atomic
   easily (either thanks to the OS or with our own locking), etc, etc.

   I'm sure it's many bricks shy of a load, but I wanted to raise the
   issue, again, as I've seen it happening on yet another high-volume
   write-intensive system.

I think you might be confused, or else I'm confused, because I don't
believe we have any such thing as an extent lock.  What we do have is
a relation extension lock, but the size of the segment on disk has
nothing to do with that: there's only one for the whole relation, and
you hold it when adding a block to the relation.  The organization of
blocks into 1GB segments happens at a much lower level of the system,
and is completely disconnected from the locking subsystem.  So
changing the segment size wouldn't help with this problem, and would
actually be quite difficult to do, because everything in the system
except at the very lowermost layer just knows about block numbers and
has no idea what extent the block is in.

But that having been said, it just so happens that I was recently
playing around with ways of trying to fix the relation extension
bottleneck.  One thing I tried was: every time a particular backend
extends the relation, it extends the relation by more than 1 block at
a time before releasing the relation extension lock.  Then, other
backends can find those blocks in the free space map instead of having
to grab the relation extension lock, so the number of acquire/release
cycles on the relation extension lock goes down.  This does help...
but at least in my tests, extending by 2 blocks instead of 1 was the
big winner, and after that you didn't get much further relief.
Another thing I tried was pre-extending the relation to the estimated
final size.  That worked a lot better, and might be worth doing (e.g.
ALTER TABLE zorp SET MINIMUM SIZE 1GB) but a less manual solution
would be preferable if we can come up with one.

After that, I ran out of time for investigation.
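
For concreteness, a hedged sketch of how extending by several blocks while
holding the extension lock could be structured (this is not the
experimental patch itself); the function name is made up, and WAL-logging,
error handling and hio.c's real bookkeeping are deliberately glossed over:

#include "postgres.h"
#include "storage/bufmgr.h"
#include "storage/bufpage.h"
#include "storage/freespace.h"
#include "storage/lmgr.h"
#include "utils/rel.h"

static Buffer
extend_relation_in_bulk(Relation rel, int extra_blocks)
{
    Buffer      first = InvalidBuffer;
    int         i;

    LockRelationForExtension(rel, ExclusiveLock);

    for (i = 0; i <= extra_blocks; i++)
    {
        Buffer      buf = ReadBuffer(rel, P_NEW);
        BlockNumber blkno = BufferGetBlockNumber(buf);
        Page        page;
        Size        freespace;

        if (i == 0)
        {
            /* the caller keeps the first new block for its own insertion */
            first = buf;
            continue;
        }

        /* initialize the extra block and publish it in the free space map */
        LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
        page = BufferGetPage(buf);
        PageInit(page, BufferGetPageSize(buf), 0);
        MarkBufferDirty(buf);
        freespace = PageGetHeapFreeSpace(page);
        UnlockReleaseBuffer(buf);

        RecordPageWithFreeSpace(rel, blkno, freespace);
    }

    UnlockRelationForExtension(rel, ExclusiveLock);

    return first;               /* pinned; caller initializes and releases it */
}

Other backends then pick up the pre-initialized blocks through the FSM
without ever taking the extension lock themselves.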

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] counting algorithm for incremental matview maintenance

2013-05-16 Thread Robert Haas
On Tue, May 14, 2013 at 3:52 PM, Kevin Grittner kgri...@ymail.com wrote:
 We could drive the triggering of incremental maintenance off of the
 dependency information which is already stored, but for performance
 we probably want to add a new pg_class flag to indicate that the
 relation is referenced by a matview definition which specifies
 incremental update.  That would allow a fast path for skipping
 other tests for DML on non-referenced relations, at the expense of
 some additional catalog updates on some DDL.

I'm afraid this might require creating a matview or updating the
definition of a matview to refer to different relations to take
AccessExclusiveLock on those relations, in order to avoid SnapshotNow
problems while updating this flag for those relations, and I think
that's probably unacceptable.  Some thought may be needed here to come
up with a good solution.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] BUG #8167: false EINVAL -22 for opening a file

2013-05-16 Thread Robert Haas
On Thu, May 16, 2013 at 11:55 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I've not really reviewed fd.c for other instances of the same bug,
 but I do see some places where there are DO_DB(...) debug calls in
 places where errno had better not change.  I'm thinking maybe we should
 improve that macro to include save/restore errno.  Or else rip that
 stuff out entirely --- I've sure never built this code with FDDEBUG
 set, has anyone else?

Not me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logging of PAM Authentication Failure

2013-05-16 Thread Robert Haas
On Thu, May 16, 2013 at 7:01 AM, Andres Freund and...@2ndquadrant.com wrote:
 I unfortunately have to say I don't really see the point of this. The
 cost of the additional connection attempt is rather low and we have to
 deal with the superflous attempts anyway since there will be old libpqs
 around for years. Why is this worth the effort?

I have always considered this a wart, and I think we've had customers
complain about these kinds of things, too.  So +1 from me for fixing
it.  If not everyone updates their client to take advantage of the new
APIs, so be it.  If psql and pgAdmin do, it'll solve 90% of the
problem.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Proposal to add connection request Wait-time in PSQL client.

2013-05-16 Thread amul sul
Hello,

I have observed the following situation in PG 9.3beta1:
Multiple psql clients are connected to the server, some of them running a
transaction and some of them idle.


When one of the backends is killed or crashes (using kill -9 backend-pid),
the connection reset attempt from the active clients (that is, those which
were running a transaction when the crash happened) fails, since they make
the attempt immediately, while the server is still in its startup phase.
 
 
Having looked into it, I found the following:
 
1. When a backend crashes, the server goes into recovery mode; it takes a
little time before it returns to the normal state in which it accepts
connections.
2. A busy client (one which was running a transaction before the crash)
immediately tries to reconnect to the server while it is still in the
startup phase, so it gets a negative reply and fails to reconnect.
 
So I thought that, before sending the reconnect request, the client should
wait for the server to reach a state in which it can accept connections,
with some timeout on the wait.
 
I am not sure whether this is the correct way to modify the code, or
whether it has any other impact.
I made the client wait before sending the reconnect request to the server,
by adding some sleep time in src/bin/psql/common.c (that is, it changes
things only for psql clients).
 
Please check the attached patch for the modification.
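
A minimal sketch of the waiting idea using libpq's PQping(), which reports
whether the server would accept a connection without actually opening a
session; the helper name, the one-second polling interval and the timeout
are illustrative assumptions, not what the attached patch does:

#include <stdbool.h>
#include <unistd.h>
#include "libpq-fe.h"

/*
 * Poll the server with PQping() until it is ready to accept connections,
 * or give up after timeout_secs.
 */
static bool
wait_for_server_ready(const char *conninfo, int timeout_secs)
{
    int         elapsed;

    for (elapsed = 0; elapsed < timeout_secs; elapsed++)
    {
        if (PQping(conninfo) == PQPING_OK)
            return true;        /* server is out of its startup phase */
        sleep(1);               /* still starting up / recovering; retry */
    }
    return false;
}

/* usage: if (wait_for_server_ready(conninfo, 30)) PQreset(conn); */

psql could call something like this when it notices the connection was
lost, before attempting PQreset().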
 

Regards,
Amul Sul

0001-psql-connection-reset-wait.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] counting algorithm for incremental matview maintenance

2013-05-16 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 Kevin Grittner kgri...@ymail.com wrote:
 We could drive the triggering of incremental maintenance off of the
 dependency information which is already stored, but for performance
 we probably want to add a new pg_class flag to indicate that the
 relation is referenced by a matview definition which specifies
 incremental update.  That would allow a fast path for skipping
 other tests for DML on non-referenced relations, at the expense of
 some additional catalog updates on some DDL.

 I'm afraid this might require creating a matview or updating the
 definition of a matview to refer to different relations to take
 AccessExclusiveLock on those relations, in order to avoid SnapshotNow
 problems while updating this flag for those relations, and I think
 that's probably unacceptable.  Some thought may be needed here to come
 up with a good solution.

Thanks for the feedback.

I had been thinking that such a flag would be the moral equivalent
of such existing flags as relhaspkey, relhasrules, relhastriggers,
and relhassubclass.  Those all require owner rights to change, and
perhaps we don't want to require that a user be the owner of a
table to define a materialized view which references that table and
specifies incremental update.  On the other hand, we might not want
to let just any old user who has SELECT permission on a table to
specify that it feeds an incrementally updated matview, since there
is no escaping the fact that extra work will be needed for DML
against that table if it is doing that.  I seem to recall that at
least one other product requires the owner of a table to ALTER it
to set a flag specifying that the table is allowed to be used to
back incrementally updated matviews; perhaps that's the way to go?

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Better handling of archive_command problems

2013-05-16 Thread Robert Haas
On Thu, May 16, 2013 at 2:42 PM, Peter Geoghegan p...@heroku.com wrote:
 On Thu, May 16, 2013 at 11:16 AM, Robert Haas robertmh...@gmail.com wrote:
 Well, I think it IS a Postgres precept that interrupts should get a
 timely response.  You don't have to agree, but I think that's
 important.

 Well, yes, but the fact of the matter is that it is taking high single
 digit numbers of seconds to get a response at times, so I don't think
 that there is any reasonable expectation that that be almost
 instantaneous. I don't want to make that worse, but then it might be
 worth it in order to ameliorate a particular pain point for users.

At times, like when the system is under really heavy load?  Or at
times, like depending on what the backend is doing?  We can't do a
whole lot about the fact that it's possible to beat a system to death
so that, at the OS level, it stops responding.  Linux is unfriendly
enough to put processes into non-interruptible kernel wait states when
they're waiting on the disk, a decision that I suspect to have been
made by a sadomasochist.  But if there are times when a system that is
not responding to cancels in under a second when not particularly
heavily loaded, I would consider that a bug, and we should fix it.

 There is a setting called zero_damaged_pages, and enabling it causes
 data loss. I've seen cases where it was enabled within postgresql.conf
 for years.

 That is both true and bad, but it is not a reason to do more bad things.

 I don't think it's bad. I think that we shouldn't be paternalistic
 towards our users. If anyone enables a setting like zero_damaged_pages
 (or, say, wal_write_throttle) within their postgresql.conf
 indefinitely for no good reason, then they're incompetent. End of
 story.

That's a pretty user-hostile attitude.  Configuration mistakes are a
very common user error.  If those configuration changes hose the system, users
expect to be able to change them back, hit reload, and get things back
on track.  But you're proposing a GUC that, if set to a bad value,
will very plausibly cause the entire system to freeze up in such a way
that it won't respond to a reload request - or for that matter a fast
shutdown request.  I think that's 100% unacceptable.  Despite what you
seem to think, we've put a lot of work into ensuring interruptibility,
and it does not make sense to abandon that principle for this or any
other feature.

 Would you feel better about it if the setting had a time-out? Say, the
 user had to explicitly re-enable it after one hour at the most?

No, but I'd feel better about it if you figured out a way to avoid
creating a scenario where it might lock up the entire database
cluster.  I am convinced that it is possible to avoid that, and that
without that this is not a feature worthy of being included in
PostgreSQL.  Yeah, it's more work that way.  But that's the difference
between a quick hack that is useful in our shop and a
production-quality feature ready for a general audience.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Fwd: request a new feature in fuzzystrmatch

2013-05-16 Thread Liming Hu
-- Forwarded message --
From: Liming Hu dawnin...@gmail.com
Date: Thu, May 16, 2013 at 5:41 PM
Subject: request a new feature in fuzzystrmatch
To: pgsql-hackers@postgresql.org


Hi,

I am studying your levenshtein edit distance contribution in Postgresql.

Can we add the Levenshtein-Damerau edit distance, since that is what
spelling checkers use, rather than the plain Levenshtein edit distance?

Levenshtein-Damerau edit distance C source code:
http://stackoverflow.com/questions/10727174/dameraulevenshtein-distance-edit-distance-with-transposition-c-implementation.
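
For reference, a self-contained C sketch of the restricted
Damerau-Levenshtein ("optimal string alignment") distance: Levenshtein
plus adjacent transpositions.  It is byte-oriented for simplicity; a
contrib-quality version would also need the multibyte handling the
existing levenshtein() code has, and the function name here is just
illustrative.

#include <stdlib.h>
#include <string.h>

int
damerau_levenshtein(const char *s, const char *t)
{
    int         m = (int) strlen(s);
    int         n = (int) strlen(t);
    int         i, j, result;
    int        *d;

    d = (int *) malloc((size_t) (m + 1) * (size_t) (n + 1) * sizeof(int));
    if (d == NULL)
        return -1;              /* out of memory */

#define CELL(i, j)  d[(i) * (n + 1) + (j)]

    for (i = 0; i <= m; i++)
        CELL(i, 0) = i;         /* delete i leading characters of s */
    for (j = 0; j <= n; j++)
        CELL(0, j) = j;         /* insert j leading characters of t */

    for (i = 1; i <= m; i++)
    {
        for (j = 1; j <= n; j++)
        {
            int         cost = (s[i - 1] == t[j - 1]) ? 0 : 1;
            int         best;

            best = CELL(i - 1, j) + 1;              /* deletion */
            if (CELL(i, j - 1) + 1 < best)
                best = CELL(i, j - 1) + 1;          /* insertion */
            if (CELL(i - 1, j - 1) + cost < best)
                best = CELL(i - 1, j - 1) + cost;   /* substitution */

            /* the Damerau extension: adjacent transposition */
            if (i > 1 && j > 1 &&
                s[i - 1] == t[j - 2] && s[i - 2] == t[j - 1] &&
                CELL(i - 2, j - 2) + 1 < best)
                best = CELL(i - 2, j - 2) + 1;

            CELL(i, j) = best;
        }
    }

    result = CELL(m, n);
#undef CELL
    free(d);
    return result;
}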



Thanks,

Liming


--
Liming Hu
cell: (435)-512-4190
Seattle Washington




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extent Locks

2013-05-16 Thread Stephen Frost
Robert,

  For not understanding me, we seem to be in violent agreement. ;)

* Robert Haas (robertmh...@gmail.com) wrote:
 I think you might be confused, or else I'm confused, because I don't
 believe we have any such thing as an extent lock.

The relation extension lock is what I was referring to.  Apologies for
any confusion there.

 What we do have is
 a relation extension lock, but the size of the segment on disk has
 nothing to do with that: there's only one for the whole relation, and
 you hold it when adding a block to the relation.  

Yes, which is farrr too small.  I'm certainly aware that the segments on
disk are dealt with in the storage layer- currently.  My proposal was to
consider how we might change that, a bit, to allow improved throughput
when there are multiple writers.

Consider this, for example- when we block on the relation extension
lock, rather than sit and wait or continue to compete with the other
threads, simply tell the storage layer to give us a dedicated file to
work with.  Once we're ready to commit, move that file into place as the
next segment (through some command to the storage layer), using an
atomic command to ensure that it either works and doesn't overwrite
anything, or fails and we try again by moving the segment number up.

We would need to work out, at the storage layer, how to handle cases
where the file is less than 1G and realize that we should just skip over
those blocks on disk as being known-to-be-empty.  Those blocks would
also be then put on the free space map and used for later processes
which need to find somewhere to put new data, etc.

 But that having been said, it just so happens that I was recently
 playing around with ways of trying to fix the relation extension
 bottleneck.  One thing I tried was: every time a particular backend
 extends the relation, it extends the relation by more than 1 block at
 a time before releasing the relation extension lock.  

Right, exactly.  One idea that I was discussing w/ Greg was to do this
using some log(relation-size) approach or similar.

 This does help...
 but at least in my tests, extending by 2 blocks instead of 1 was the
 big winner, and after that you didn't get much further relief.

How many concurrent writers did you have and what kind of filesystem was
backing this?  Was it a temp filesystem where writes are essentially to
memory, causing this relation extension lock to be much more
contentious?

 Another thing I tried was pre-extending the relation to the estimated
 final size.  That worked a lot better, and might be worth doing (e.g.
 ALTER TABLE zorp SET MINIMUM SIZE 1GB) but a less manual solution
 would be preferable if we can come up with one.

Slightly confused here- above you said that '2' was way better than '1',
but you implied that more than 2 wasn't really much better- yet wayyy
more than 2 is much better?  Did I follow that right?  I can certainly
understand such a case, just want to understand it and make sure it's
what you meant.  What small-number options did you try?

 After that, I ran out of time for investigation.

Too bad!  Thanks much for the work in this area, it'd really help if we
could improve this for our data warehouse, in particular, users.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] counting algorithm for incremental matview maintenance

2013-05-16 Thread Robert Haas
On Thu, May 16, 2013 at 8:33 PM, Kevin Grittner kgri...@ymail.com wrote:
 Robert Haas robertmh...@gmail.com wrote:
 Kevin Grittner kgri...@ymail.com wrote:
 We could drive the triggering of incremental maintenance off of the
 dependency information which is already stored, but for performance
 we probably want to add a new pg_class flag to indicate that the
 relation is referenced by a matview definition which specifies
 incremental update.  That would allow a fast path for skipping
 other tests for DML on non-referenced relations, at the expense of
 some additional catalog updates on some DDL.

 I'm afraid this might require creating a matview or updating the
 definition of a matview to refer to different relations to take
 AccessExclusiveLock on those relations, in order to avoid SnapshotNow
 problems while updating this flag for those relations, and I think
 that's probably unacceptable.  Some thought may be needed here to come
 up with a good solution.

 Thanks for the feedback.

 I had been thinking that such a flag would be the moral equivalent
 of such existing flags as relhaspkey, relhasrules, relhastriggers,
 and relhassubclass.  Those all require owner rights to change, and
 perhaps we don't want to require that a user be the owner of a
 table to define a materialized view which references that table and
 specifies incremental update.  On the other hand, we might not want
 to let just any old user who has SELECT permission on a table to
 specify that it feeds an incrementally updated matview, since there
 is no escaping the fact that extra work will be needed for DML
 against that table if it is doing that.  I seem to recall that at
 least one other product requires the owner of a table to ALTER it
 to set a flag specifying that the table is allowed to be used to
 back incrementally updated matviews; perhaps that's the way to go?

Possibly.  That at least has the advantage of transparency: if you do
ALTER TABLE wunk ENABLE DELTA QUEUE or somesuch syntax, it's very
clear that you're buying an AccessExclusiveLock.  And while
AccessExclusiveLocks are not a lot of fun, one that you know is coming
is a lot better than one that comes as a surprise.

I feel like it would be nicer, though, to come up with some trick that
avoids the need to update the referenced table's pg_class entry
altogether.  I don't immediately have a good idea, but I'll mull it
over and see if I come up with anything.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Better handling of archive_command problems

2013-05-16 Thread Peter Geoghegan
On Thu, May 16, 2013 at 5:43 PM, Robert Haas robertmh...@gmail.com wrote:
 At times, like when the system is under really heavy load?  Or at
 times, like depending on what the backend is doing?  We can't do a
 whole lot about the fact that it's possible to beat a system to death
 so that, at the OS level, it stops responding.  Linux is unfriendly
 enough to put processes into non-interruptible kernel wait states when
 they're waiting on the disk, a decision that I suspect to have been
 made by a sadomasochist.

I find it more plausible that the decision was made by someone making
an engineering trade-off.

 But if there are times when a system that is
 not responding to cancels in under a second when not particularly
 heavily loaded, I would consider that a bug, and we should fix it.

It's not as if the DBA is going to have a hard time figuring out why
that is. It's taking a long time to respond because they've throttled
the entire server. Clearly, if that's something they're doing very
casually, they have bigger problems.

 I don't think it's bad. I think that we shouldn't be paternalistic
 towards our users. If anyone enables a setting like zero_damaged_pages
 (or, say, wal_write_throttle) within their postgresql.conf
 indefinitely for no good reason, then they're incompetent. End of
 story.

 That's a pretty user-hostile attitude.

I think paternalism is user-hostile. Things should be easy to use
correctly and hard to use incorrectly. I certainly think we should be
novice friendly, but not if that implies being expert hostile.

The fact that a PANIC shutdown can occur when the pg_xlog filesystem
runs out of space is pretty user-hostile. It's hostile to both novices
and experts.

 Configuration mistakes are a
 very common user error.  If those configuration changes hose the system, users
 expect to be able to change them back, hit reload, and get things back
 on track.  But you're proposing a GUC that, if set to a bad value,
 will very plausibly cause the entire system to freeze up in such a way
 that it won't respond to a reload request - or for that matter a fast
 shutdown request.  I think that's 100% unacceptable.  Despite what you
 seem to think, we've put a lot of work into ensuring interruptibility,
 and it does not make sense to abandon that principle for this or any
 other feature.

 Would you feel better about it if the setting had a time-out? Say, the
 user had to explicitly re-enable it after one hour at the most?

 No, but I'd feel better about it if you figured out a way to avoid
 creating a scenario where it might lock up the entire database
 cluster.  I am convinced that it is possible to avoid that, and that
 without that this is not a feature worthy of being included in
 PostgreSQL.

What if the WALWriter slept on its proc latch within
XLogBackgroundFlush(), rather than calling pg_usleep? That way,
WalSigHupHandler() will set the process latch on a reload, and the
sleep will end immediately if the user determines that they've made a
mistake in setting the sleep. Ditto all other signals. As with all
extant latch sleeps, we wake on postmaster death, so that an
inordinately long sleep doesn't create a denial-of-service that
prevents a restart if, say, the postmaster receives SIGKILL.
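
Concretely, the sleep itself might look something like the sketch below;
the helper name and the wal_write_throttle GUC (in milliseconds) are
placeholders here, and the real change would sit inside
XLogBackgroundFlush():

#include "postgres.h"
#include "storage/latch.h"
#include "storage/proc.h"

/*
 * Throttle sleep between WAL writes.  Unlike pg_usleep(), WaitLatch()
 * returns immediately if the process latch is set -- for instance by
 * WalSigHupHandler() when a reload arrives -- and it also wakes up if the
 * postmaster dies, so a long throttle cannot block reload or shutdown.
 */
static void
throttle_wal_write(int wal_write_throttle_ms)
{
    if (wal_write_throttle_ms <= 0)
        return;

    (void) WaitLatch(&MyProc->procLatch,
                     WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
                     wal_write_throttle_ms);

    /*
     * A real patch would have to make sure that resetting the latch here
     * cannot swallow a wakeup the walwriter's main loop is waiting for.
     */
    ResetLatch(&MyProc->procLatch);
}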

Maybe it wouldn't even be much additional work to figure out a way of
making LWLocks care about interrupts. I think an upper limit on the
relevant GUC is sufficient given the nature of what I propose to do,
but I might be convinced if a better approach came to light. Do you
have one?

 Yeah, it's more work that way.  But that's the difference
 between a quick hack that is useful in our shop and a
 production-quality feature ready for a general audience.

It is certainly the case that it would be far easier for me to just
deploy this on our own customer instances. If I'm expected to solve
the problem of this throttling conceivably affecting backends
executing read queries due to the CLogControlLock scenario you
describe, just so users can have total assurance read queries are
unaffected for a couple of hours or less once in a blue moon when
they're fighting off a PANIC shutdown, then the bar is set almost
impossibly high. This is unfortunate, because there is plenty of
evidence that archive_command issues cause serious user pain all the
time.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Better handling of archive_command problems

2013-05-16 Thread Daniel Farina
On Thu, May 16, 2013 at 5:43 PM, Robert Haas robertmh...@gmail.com wrote:
 On Thu, May 16, 2013 at 2:42 PM, Peter Geoghegan p...@heroku.com wrote:
 On Thu, May 16, 2013 at 11:16 AM, Robert Haas robertmh...@gmail.com wrote:
 Well, I think it IS a Postgres precept that interrupts should get a
 timely response.  You don't have to agree, but I think that's
 important.

 Well, yes, but the fact of the matter is that it is taking high single
 digit numbers of seconds to get a response at times, so I don't think
 that there is any reasonable expectation that that be almost
 instantaneous. I don't want to make that worse, but then it might be
 worth it in order to ameliorate a particular pain point for users.

 At times, like when the system is under really heavy load?  Or at
 times, like depending on what the backend is doing?  We can't do a
 whole lot about the fact that it's possible to beat a system to death
 so that, at the OS level, it stops responding.  Linux is unfriendly
 enough to put processes into non-interruptible kernel wait states when
 they're waiting on the disk, a decision that I suspect to have been
 made by a sadomasochist.  But if there are times when a system that is
 not responding to cancels in under a second when not particularly
 heavily loaded, I would consider that a bug, and we should fix it.

 There is a setting called zero_damaged_pages, and enabling it causes
 data loss. I've seen cases where it was enabled within postgresql.conf
 for years.

 That is both true and bad, but it is not a reason to do more bad things.

 I don't think it's bad. I think that we shouldn't be paternalistic
 towards our users. If anyone enables a setting like zero_damaged_pages
 (or, say, wal_write_throttle) within their postgresql.conf
 indefinitely for no good reason, then they're incompetent. End of
 story.

 That's a pretty user-hostile attitude.  Configuration mistakes are a
 very common user error.  If those configuration changes hose the system, users
 expect to be able to change them back, hit reload, and get things back
 on track.  But you're proposing a GUC that, if set to a bad value,
 will very plausibly cause the entire system to freeze up in such a way
 that it won't respond to a reload request - or for that matter a fast
 shutdown request.  I think that's 100% unacceptable.  Despite what you
 seem to think, we've put a lot of work into ensuring interruptibility,
 and it does not make sense to abandon that principle for this or any
 other feature.

The inability to shut down in such a situation is not happy at all, as
you say, and the problems with whacking the GUC around due to
non-interruptibility are pretty bad too.

 Would you feel better about it if the setting had a time-out? Say, the
 user had to explicitly re-enable it after one hour at the most?

 No, but I'd feel better about it if you figured out a way to avoid
 creating a scenario where it might lock up the entire database
 cluster.  I am convinced that it is possible to avoid that

Do you have a sketch of a mechanism that would avoid that problem?

 and that without that this is not a feature worthy of being included
 in PostgreSQL.  Yeah, it's more work that way.  But that's the
 difference between a quick hack that is useful in our shop and a
 production-quality feature ready for a general audience.

However little it may matter, I would like to disagree with your
opinion on this one: the current situation, as I imagine it is encountered
by *all* users of archiving, is really unpleasant, 'my' shop or no.  It
would probably not be inaccurate to say that well over 99% of archiving
users have to live with hazy control over the amount of data loss,
bounded only by how long it takes for the system to fill up the WAL
file system and then for PostgreSQL to PANIC and crash (hence, no more
writes are processed, and no more data can be lost).

Once one factors in the human cost of having to deal with that down
time or monitor it to circumvent this, I feel as though the bar for
quality should be lowered.  As you see, we've had to resort to
horrific techniques to get around this problem.

I think this is something serious enough that it is worth doing
better, but the bind that people doing archiving find themselves in is
much worse at the margins -- involving data loss and loss of
availability -- and accordingly, I think the bar for some kind of
solution should be lowered, so long as the interface is at least right
enough not to become an albatross later (a bar this proposal itself may
not meet).

That said, there is probably a way to please everyone and do something
better.  Any ideas?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers