Re: [HACKERS] gaussian distribution pgbench

2014-07-28 Thread Heikki Linnakangas

On 07/17/2014 11:13 PM, Fabien COELHO wrote:



However, ISTM that it is not the purpose of the pgbench documentation to be a
primer on what an exponential or gaussian distribution is, so the idea
would still be to have a relatively compact explanation, and that the
interested but clueless reader would educate him- or herself from Wikipedia,
a text book, or a friend or a math teacher (who could be a friend as well :-).


Well, I think it's a balance.  I agree that the pgbench documentation
shouldn't try to substitute for a text book or a math teacher, but I
also think that you shouldn't necessarily need to refer to a text book
or a math teacher in order to figure out how to use pgbench.  Saying
"it's complicated, so we don't have to explain it" would be a cop-out;
we need to *make* it simple.  And if there's no way to do that, then
IMHO we should reject the patch in favor of some future patch that
implements something that will be easy for users to understand.


  [nttcom@localhost postgresql]$ contrib/pgbench/pgbench --exponential=10
starting vacuum...end.
transaction type: Exponential distribution TPC-B (sort of)
scaling factor: 1
exponential threshold: 10.0

decile percents: 63.2% 23.3% 8.6% 3.1% 1.2% 0.4% 0.2% 0.1% 0.0% 0.0%
highest/lowest percent of the range: 9.5% 0.0%


I don't have a clue what that means.  None.


Maybe we could add in front of the decile/percent

distribution of increasing account key values selected by pgbench:


I still wouldn't know what that meant.  And it misses the point
anyway: if the documentation is good, this will be unnecessary.  If
the documentation is bad, a printout that tries to illustrate it by
example is not an acceptable substitute.


The decile description is quite classic when discussing statistics.
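
For concreteness, the figures above follow directly from an exponential
distribution truncated to the key range: with a threshold of 10, the first
decile receives 1 - e^-1 = 63.2% of the draws, and the most frequent 1% of
the range receives 9.5%. Below is a small standalone sketch, purely as an
illustration under the assumption that the patch draws keys from such a
truncated exponential (the exact formula in the patch may differ):

/*
 * Decile probabilities of an exponential distribution truncated to [0, 1)
 * with "threshold" t, i.e. CDF F(x) = (1 - exp(-t*x)) / (1 - exp(-t)).
 * With t = 10.0 this prints approximately
 *   63.2% 23.3% 8.6% 3.1% 1.2% 0.4% 0.2% 0.1% 0.0% 0.0%
 * which matches the decile line shown above.
 */
#include <math.h>
#include <stdio.h>

int
main(void)
{
    const double t = 10.0;              /* the --exponential threshold */
    const double norm = 1.0 - exp(-t);  /* truncation normalizer */

    for (int d = 0; d < 10; d++)
    {
        double lo = d / 10.0;
        double hi = (d + 1) / 10.0;
        double p = (exp(-t * lo) - exp(-t * hi)) / norm;

        printf("%.1f%% ", 100.0 * p);
    }
    printf("\nfirst 1%% of the range: %.1f%%\n",
           100.0 * (1.0 - exp(-t * 0.01)) / norm);
    return 0;
}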


IMHO we should include a diagram for each distribution. A diagram would 
be much easier to understand than a decile or verbal explanation.


The only problem is that the build infrastructure doesn't currently 
support including images in the docs. That's been discussed before, and 
I think we even used to have a couple of images there a long time ago. 
Now would be a good time to bite the bullet and add the support.
We got fairly close to a consensus on how to do it in this thread: 
www.postgresql.org/message-id/flat/20120712181636.gc11...@momjian.us. 
The biggest problem was choosing an editor that has a fairly stable file 
format, so that we don't get huge diffs every time someone moves a line 
in a diagram. One work-around for that is to use graphviz and/or gnuplot 
as the source format, instead of a graphical editor.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Introducing coarse grain parallelism by postgres_fdw.

2014-07-28 Thread Kyotaro HORIGUCHI
Hello, thank you for the comment.

 Hi Kyotaro,
 fetch_more_rows() always runs FETCH 100 FROM cursor_name on the foreign
 server to get the next set of rows. The changes you have made seem to run
 only the first FETCHes from all the nodes but not the subsequent ones. The
 optimization will be helpful only when there are fewer than 100 rows per
 postgres connection in the query. If there are more than 100 rows from a
 single foreign server, the second and subsequent FETCHes will be serialized.
 
 Is my understanding correct?

Yes, you're right. So I wrote it as follows.

Me> it almost halves the response time because the remote queries
Me> take far longer startup time than running time.

Parallelizing all FETCHes would be effective if the connection
transfers bytes at a speed close to the row fetch speed, but I
excluded that case on the assumption that the chance of a gain
there is relatively low, and for simplicity as a PoC. If this
approach is not inappropriate and draws no objections, I will
work on a more complete implementation, including cost
estimation.

 On Fri, Jul 25, 2014 at 2:05 PM, Kyotaro HORIGUCHI 
 horiguchi.kyot...@lab.ntt.co.jp wrote:
 
  Hello,
 
  I noticed that postgres_fdw can run in parallel with a very small
  change. The attached patch lets scans by postgres_fdw on
  different foreign servers run simultaneously. This seems a
  convenient entry point to parallel execution.
 
  For the testing configuration which the attached sql script makes,
  it almost halves the response time because the remote queries
  take far longer startup time than running time. The foreign
  tables fvs1, fvs2 and fvs1_2 are defined on the same table, but
  fvs1 and fvs1_2 are on the same foreign server pgs1 while fvs2 is
  on another foreign server, pgs2.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Introducing coarse grain parallelism by postgres_fdw.

2014-07-28 Thread Kyotaro HORIGUCHI
Hello,

 In order to minimize the impact, what can be done is to execute
 fetch_more_data() in asynchronous mode every time there are only a few rows
 left to be consumed. So in the current code below
 1019 /*
 1020  * Get some more tuples, if we've run out.
 1021  */
 1022 if (fsstate->next_tuple >= fsstate->num_tuples)
 1023 {
 1024     /* No point in another fetch if we already detected EOF, though. */
 1025     if (!fsstate->eof_reached)
 1026         fetch_more_data(node, false);
 1027     /* If we didn't get any tuples, must be end of data. */
 1028     if (fsstate->next_tuple >= fsstate->num_tuples)
 1029         return ExecClearTuple(slot);
 1030 }
 
 replace the condition on line 1022, if (fsstate->next_tuple >= fsstate->num_tuples),
 with if (fsstate->next_tuple >= fsstate->num_tuples - SOME_BUFFER_NUMBER_ROWS).
 Another possibility is to call PQsendQuery(conn, sql) after line 2100, if
 eof_reached is false.
 
 2096     /* Must be EOF if we didn't get as many tuples as we asked for. */
 2097     fsstate->eof_reached = (numrows < fetch_size);
 2098
 2099     PQclear(res);
 2100     res = NULL;

I see, I'll consider it. If late (lazy) error detection is
acceptable, single-row mode seems usable, too.
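
For reference, the libpq single-row mode pattern mentioned here looks
roughly like the sketch below (an illustration only; PQsetSingleRowMode()
has to be called right after the query is dispatched, and an error can
arrive only after some rows have already been returned, which is the lazy
error detection referred to above):

/* Consume a query result one row at a time instead of buffering the batch. */
#include <stdio.h>
#include <libpq-fe.h>

static void
consume_single_rows(PGconn *conn, const char *sql)
{
    PGresult   *res;

    if (!PQsendQuery(conn, sql))
        return;
    PQsetSingleRowMode(conn);   /* switch this result set to single-row mode */

    while ((res = PQgetResult(conn)) != NULL)
    {
        if (PQresultStatus(res) == PGRES_SINGLE_TUPLE)
            printf("got a row with %d columns\n", PQnfields(res));
        else if (PQresultStatus(res) != PGRES_TUPLES_OK)
            fprintf(stderr, "error: %s", PQerrorMessage(conn));
        PQclear(res);
    }
}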

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PL/PgSQL: EXIT USING ROLLBACK

2014-07-28 Thread Simon Riggs
On 26 July 2014 18:14, Marko Tiikkaja ma...@joh.to wrote:

 Today I'd like to present a way to get rid of code like this:

You haven't explained this very well... there is nothing that explains
WHY you want this.

In the absence of a good explanation and a viable benefit, I would
vote -1 for this feature suggestion.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PL/PgSQL: EXIT USING ROLLBACK

2014-07-28 Thread Marko Tiikkaja

On 7/28/14 11:27 AM, Simon Riggs wrote:

On 26 July 2014 18:14, Marko Tiikkaja ma...@joh.to wrote:


Today I'd like to present a way to get rid of code like this:


You haven't explained this very well... there is nothing that explains
WHY you want this.

In the absence of a good explanation and a viable benefit, I would
vote -1 for this feature suggestion.


Yes, I did a poor job in the original email, but I did explain my 
reasoning later:


 Yes, you can already do this with RAISE but that seems more like an
 accident than anything else.  I feel a dedicated syntax is less error
 prone and makes the intent clearer to people reading the code.  But I
 realize I might be in the minority with this.

I guess -3, +0 is enough that I'll be dropping the patch.  Thanks to 
everyone who had a look.



.marko


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Use unique index for longer pathkeys.

2014-07-28 Thread Kyotaro HORIGUCHI
Hello,

   I think there is one more disadvantage in the way the current patch is
   done, which is that you need to collect index pathkeys for all relations
   irrespective of whether they will be of any use to eliminate useless
   pathkeys from query_pathkeys.  One trivial case that comes to mind is
   when there are multiple relations involved in a query and the ORDER BY is
   based on columns of only part of the tables involved in the query.
 
  Like this?
 
  select x.a, x.b, y.b from x, y where x.a = y.a order by x.a, x.b;
 
  The equivalence class consists of (x.a=y.a) and (x.b), so the index
  pathkeys for i_y are (y.a=x.a). As a result, no common primary
  pathkeys are found.
 
 I think it will find a common pathkey in case you have a unique index
 on x.a (please see the example below), but currently I am not clear
 why there is a need for a common index pathkey in such a case to
 eliminate useless keys in ORDER BY; why can't we do it based
 on an individual table's pathkey?
 
 Example:
 
 create table t (a int not null, b int not null, c int, d text);
 create unique index i_t_pkey on t(a, b);
 insert into t (select a % 10, a / 10, a, 't' from generate_series(0,
 10) a);
 analyze;
 
 create table t1 (a int not null, b int not null, c int, d text);
 create unique index i_t1_pkey_1 on t1(a);
 create unique index i_t1_pkey_2 on t1(a, b);
 insert into t1 (select a * 2, a / 10, a, 't' from generate_series(0,
 10) a);
 explain (costs off, analyze off) select * from t,t1 where t.a=t1.a order by
 t1.a,t1.b,t1.c,t1.d;
 
          QUERY PLAN
 ------------------------------------------
  Merge Join
    Merge Cond: (t.a = t1.a)
    ->  Index Scan using i_t_pkey on t
    ->  Index Scan using i_t1_pkey_1 on t1
 (4 rows)
 
 Here we can notice that there is no separate sort key in plan.

Sure, 

 Now drop the i_t1_pkey_1 and check the query plan again.
 
 drop index i_t1_pkey_1;
 explain (costs off, analyze off) select * from t,t1 where t.a=t1.a order by
 t1.a,t1.b,t1.c,t1.d;
             QUERY PLAN
 ------------------------------------------------
  Sort
    Sort Key: t.a, t1.b, t1.c, t1.d
    ->  Merge Join
          Merge Cond: (t.a = t1.a)
          ->  Index Scan using i_t_pkey on t
          ->  Index Scan using i_t1_pkey_2 on t1
 (6 rows)
 
 Can't the above plan eliminate the Sort Key even after dropping the index
 (i_t1_pkey_1)?

My patch doesn't, since there are no longer 'common primary
pathkeys' in this query. Perhaps the query doesn't allow the sort
to be eliminated. Since a is no longer a primary key, t1 can have
duplicate rows for the same a, so the joined relation may also have
duplicate values in column a. Therefore the joined relation may be
only partially sorted by column a, so the sort pathkeys cannot be
trimmed.


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PL/PgSQL: EXIT USING ROLLBACK

2014-07-28 Thread Simon Riggs
On 28 July 2014 10:34, Marko Tiikkaja ma...@joh.to wrote:
 On 7/28/14 11:27 AM, Simon Riggs wrote:

 On 26 July 2014 18:14, Marko Tiikkaja ma...@joh.to wrote:

 Today I'd like to present a way to get rid of code like this:


 You haven't explained this very well... there is nothing that explains
 WHY you want this.

 In the absence of a good explanation and a viable benefit, I would
 vote -1 for this feature suggestion.


 Yes, I did a poor job in the original email, but I did explain my reasoning
 later:

With respect, I think you did a poor job the second time too. I can't
find a clearly explained reasoning behind the proposal, nor do I
understand what the problem was.

One of the things I do is work hard on my initial explanations and
reasoning. This helps me because I frequently end up not proposing
something because my reasoning was poor, but it also helps me focus on
whether I am solving a real problem by sharpening my understanding of
the actual problem. And it also helps Tom (or others) demolish things
more quickly with a well-placed "indeed" ;-)

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should smgrtruncate() avoid sending sinval message for temp relations

2014-07-28 Thread MauMau

From: MauMau maumau...@gmail.com
I must add one thing.  After some client processes closed the connection 
without any hang, their server processes were stuck with a stack trace 
like this (I'll look for and show the exact stack trace tomorrow):


I found two kinds of stack traces:

#0  0x003199ec488f in poll () from /lib64/libc.so.6
#1  0x00609f24 in WaitLatchOrSocket ()
#2  0x0063ad92 in SyncRepWaitForLSN ()
#3  0x004ad474 in CommitTransaction ()
#4  0x004aef53 in CommitTransactionCommand ()
#5  0x0064b547 in shmem_exit ()
#6  0x0064b625 in proc_exit_prepare ()
#7  0x0064b6a8 in proc_exit ()
#8  0x00668a94 in PostgresMain ()
#9  0x00617f2c in ServerLoop ()
#10 0x0061ae96 in PostmasterMain ()
#11 0x005b2ccf in main ()

#0  0x003f4badf258 in poll () from /lib64/libc.so.6
#1  0x00619b94 in WaitLatchOrSocket ()
#2  0x00640c4c in SyncRepWaitForLSN ()
#3  0x00491c18 in RecordTransactionCommit ()
#4  0x00491d98 in CommitTransaction ()
#5  0x00493135 in CommitTransactionCommand ()
#6  0x00653fc5 in ProcessCatchupEvent ()
#7  0x006540ed in HandleCatchupInterrupt ()
#8  0x006533e3 in procsignal_sigusr1_handler ()
#9  signal handler called
#10 0x003f4bae96b0 in recv () from /lib64/libc.so.6
#11 0x005b75f6 in secure_read ()
#12 0x005c223b in pq_recvbuf ()
#13 0x005c263b in pq_getbyte ()
#14 0x0066e081 in PostgresMain ()
#15 0x00627d81 in PostmasterMain ()
#16 0x005c4803 in main ()


I'll try the fix tomorrow if possible.  What kind of problems do you think 
of for back-patching?


I could reproduce the problem with 9.2.8, but have not yet with 9.5dev. 
I'll try with 9.2.9, and create the fix.


Regards
MauMau




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] config.sgml referring to unix_socket_directories on older releases

2014-07-28 Thread Fujii Masao
On Sun, Jul 27, 2014 at 12:09 AM, Guillaume Lelarge
guilla...@lelarge.info wrote:
 Hi,

 While updating the french translation of the latest releases, I stumbled
 upon a small issue on the config.sgml file.

 It talks about unix_socket_directories whereas this parameter only appears
 with the 9.3 release. It should probably be replaced with
 unix_socket_directory for all releases where this has been committed (8.4 to
 9.2). The patch attached does this. It applies cleanly on all releases (with
 a hunk though).

Thanks for the patch! Applied it to 9.2, 9.1 and 9.0. I didn't apply it to 8.4
because it's already EOL.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Audit of logout

2014-07-28 Thread Fujii Masao
On Mon, Jul 28, 2014 at 12:22 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Thu, Jul 3, 2014 at 1:13 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On Wed, Jul 2, 2014 at 11:39 PM, Joe Conway m...@joeconway.com wrote:

 No. If we change it to PGC_SIGHUP, the SHOW command does display
 the changed value after a reload. It's the same behavior as for other
 PGC_SIGHUP parameters. The attached patch just changes it to PGC_SIGHUP.
 You can test that by using the patch.

 As this patch is marked as Needs Review, I went ahead and
 picked it up for review; however, after reading the mail chain, it seems to
 me that there is a general inclination to have a new category in
 GucContext for this feature.  I don't see a patch implementing that
 in this thread, so I think it is better to move it to the next CF
 (2014-08).

Yep, agreed. I just moved this to next CF.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [GENERAL] pg_dump behaves differently for different archive formats

2014-07-28 Thread Albe Laurenz
Tom Lane wrote on Dec 16, 2013:
 Albe Laurenz laurenz.a...@wien.gv.at writes:
 Restoring a plain format dump and a custom format dump of
 the same database can lead to different results:

 pg_dump organizes the SQL statements it creates in TOC entries.
 If a custom format dump is restored with pg_restore, all
 SQL statements in a TOC entry will be executed as a single command
 and thus in a single transaction.

 Yeah, this is a bug I think.  pg_dump was designed around the idea
 that the output would be executed as a simple script, and in a
 number of places there's an expectation that one SQL statement
 can fail without affecting following ones.  So if pg_restore can't
 provide that behavior it's not good.
 
 On the other hand, I'm not sure how much enthusiasm there'd be for
 complex or fragile changes to fix this.  A lot of people invariably
 run restores in single-transaction mode and don't really care about
 fault-tolerant restores.  Also, it's easy enough to dodge the problem
 if you must: just pipe the output into psql rather than
 direct-to-database.
 
 So to me the question is can we fix this without doing something like
 duplicating psql's lexer?  If we have to parse out the statements
 contained in each text blob, it's probably going to be too painful.
 Some cautionary history about this sort of thing can be read at
 http://www.postgresql.org/message-id/flat/18006.1325700...@sss.pgh.pa.us

I thought that changing the dump format for this would be too
much trouble, so I came up with the attached.

It assumes that custom- or tar-format archives are written by pg_dump
and cannot contain arbitrary SQL statements, which allows me to get away
with very simple parsing.

If this is not shot down immediately on account of fragility, I'd
add it to the next commitfest page.

The problem has been a pain point for my co-workers in the past;
using single-transaction mode doesn't work for us, since we have custom objects
in our template database that cause expected errors when a dump is restored.

Yours,
Laurenz Albe


pg_restore.patch
Description: pg_restore.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [GENERAL] pg_dump behaves differently for different archive formats

2014-07-28 Thread Tom Lane
Albe Laurenz laurenz.a...@wien.gv.at writes:
 I thought that changing the dump format for this would be too
 much trouble, so I came up with the attached.

 It assumes that custom- or tar-format archives are written by pg_dump
 and cannot contain arbitrary SQL statements, which allows me to get away
 with very simple parsing.

I don't think this can be trusted in the least.  To begin with, where'd
you get the idea dumps cannot contain arbitrary SQL statements?  CREATE
RULE at least could contain some pretty weird stuff.  This thing doesn't
look like it's even bothering to count nested parentheses, so it will
certainly fail on a multi-statement rule.  I believe you're also at risk
of SQL injection attacks from failing to account for multibyte characters
in non-ASCII-safe client encodings.

While those specific problems could no doubt be fixed, I object to the
entire concept of assuming that what pg_dump emits is always going to be
trivially parsable.  If we are to go down this path, I think we have to
replicate what psql is doing to identify statement boundaries ... and
as I mentioned upthread, that's rather a lot of code :-(

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] building pdfs

2014-07-28 Thread Andrew Dunstan


On 07/27/2014 11:28 AM, Tom Lane wrote:



Personally I find the PDF docs to be an anachronism: surely nobody
is printing them on dead trees any more, and for on-computer usage,
what do they offer that the HTML format doesn't?  So I'm unexcited
about making them slightly prettier.





If they are then maybe there's no point in trying to build them in the 
buildfarm constantly.


One advantage that they have over the HTML docs is that they encapsulate 
the docs in a single file. But then, so does the epub format, which, 
unlike PDFs, can adapt to display dimensions.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [GENERAL] pg_dump behaves differently for different archive formats

2014-07-28 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 Albe Laurenz laurenz.a...@wien.gv.at writes:
  I thought that changing the dump format for this would be too
  much trouble, so I came up with the attached.

If we're going to change this, it seems to me that the only option would
be to change the dump format...  Just off-the-cuff, I'm wondering if we
could actually not change the real 'format' but simply promote each ACL
entry (and similar cases..) to top-level objects and declare that TOC
entries should be single statements.

 While those specific problems could no doubt be fixed, I object to the
 entire concept of assuming that what pg_dump emits is always going to be
 trivially parsable.  If we are to go down this path, I think we have to
 replicate what psql is doing to identify statement boundaries ... and
 as I mentioned upthread, that's rather a lot of code :-(

Agreed.  If we want this, we should handle it on the pg_dump side, not
try and work it out on the pg_restore side.

Thanks,

Stephen




Re: [HACKERS] delta relations in AFTER triggers

2014-07-28 Thread Marti Raudsepp
On Sat, Jul 5, 2014 at 5:38 PM, Kevin Grittner kgri...@ymail.com wrote:
 it seems to me that we need the full tuple to support triggers on
 FDWs, so the TID approach would be an optimization for a subset of
 the cases, and would probably be more appropriate, if we do it at
 all, in a follow-on patch
 If you disagree with that assessment, now would be a good
 time to explain your reasoning.

Maybe I just have a limited imagination because I've never found a use
for FDWs personally. But recording changes from a trigger on an FDW
table doesn't seem that useful, since you can only capture changes
done by the local node. I expect that in many situations there are
multiple writers accessing the same underlying remote table. Certainly
it can't guarantee the consistency of materialized views.

 I took a look at whether I could avoid making OLD and NEW
 non-reserved keywords, but I didn't see how to do that without
 making FOR at least partially reserved.  If someone sees a way to
 do this without creating three new unreserved keywords
 (REFERENCING, OLD, and NEW) I'm all ears.

Sorry, I know I am very late to make this point, so feel free to ignore this.

I'm not a fan of the SQL standard syntax for this feature. One nice
thing about PostgreSQL's triggers is that you can declare the trigger
function once and re-use it on many tables. It would make more sense
if the same function declaration could say what variable/relation
names it wants to use. They're more like function argument names, not
some metadata about a table-function relationship.

Putting these in the CREATE TRIGGER command means you have to repeat
them for each table you want to apply the trigger to. It introduces
the possibility of making more mistakes without any gain in
flexibility.

But then again, I understand that there's value in supporting standard syntax.

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] building pdfs

2014-07-28 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 07/27/2014 11:28 AM, Tom Lane wrote:
 Personally I find the PDF docs to be an anachronism: surely nobody
 is printing them on dead trees any more, and for on-computer usage,
 what do they offer that the HTML format doesn't?  So I'm unexcited
 about making them slightly prettier.

 If they are then maybe there's no point in trying to build them in the 
 buildfarm constantly.

 One advantage that they have over the HTML docs is that they encapsulate 
 the docs in a single file. But then, so does the epub format, which, 
 unlike PDFs, can adapt to display dimensions.

Hm.  I recall Peter was experimenting with building epub format.  Maybe we
could start moving over to that, if there's a mature toolchain.  I keep
fearing we are going to run up against document-size limits in the
jadetex toolchain (and unlike the last time, it may not be so easy to
modify the limits --- there are hard-wired limits inside TeX).

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [GENERAL] pg_dump behaves differently for different archive formats

2014-07-28 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 If we're going to change this, it seems to me that the only option would
 be to change the dump format...  Just off-the-cuff, I'm wondering if we
 could actually not change the real 'format' but simply promote each ACL
 entry (and similar cases..) to top-level objects and declare that TOC
 entries should be single statements.

I don't think we want even more TOC entries, but it would not be
unreasonable to insist that the statement(s) within a TOC entry be
subdivided somehow.  Essentially the payload of a TOC entry becomes
a list of strings rather than just one string.

That would mean that the problem could not be fixed for existing archive
files; but that seems OK, given the rather small number of complaints
so far.

If we had something like that, I'd be strongly inclined to get rid of
the existing convention whereby comments and ACL commands are separate
TOC entries, and make them part of the parent object's TOC entry (which'd
mean we'd want to label the sub-strings so we can tell whether they are
main object, comment, or ACL).  The fewer TOC entries we can have, the
better; there is no reason why comments/ACLs should be independently
sortable.
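
In structural terms, that suggestion amounts to something like the sketch
below (hypothetical type and field names, purely to illustrate the shape of
the idea, not pg_dump's actual data structures):

/*
 * Sketch: a TOC entry whose payload is a list of labeled statements instead
 * of one big SQL string, so pg_restore can run and error-handle each
 * statement separately and knows which ones are the comment/ACL add-ons.
 */
typedef enum TocStatementKind
{
    TOC_STMT_MAIN,      /* the object definition itself */
    TOC_STMT_COMMENT,   /* COMMENT ON ... for the object */
    TOC_STMT_ACL        /* GRANT/REVOKE for the object */
} TocStatementKind;

typedef struct TocStatement
{
    TocStatementKind kind;
    char       *sql;            /* exactly one SQL statement */
} TocStatement;

typedef struct TocEntryPayload
{
    int             nstatements;
    TocStatement   *statements; /* ordered list, restored one at a time */
} TocEntryPayload;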

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GSoC2014] Patch ALTER TABLE ... SET LOGGED

2014-07-28 Thread Fabrízio de Royes Mello
On Wed, Jul 23, 2014 at 5:48 PM, Fabrízio de Royes Mello 
fabriziome...@gmail.com wrote:


 On Tue, Jul 22, 2014 at 3:29 PM, Fabrízio de Royes Mello 
fabriziome...@gmail.com wrote:
 
  On Tue, Jul 22, 2014 at 12:01 PM, Fabrízio de Royes Mello 
fabriziome...@gmail.com wrote:
  
  
  
   On Thu, Jul 17, 2014 at 8:02 PM, Andres Freund and...@2ndquadrant.com
wrote:
   
 That means, should I FlushRelationBuffers(rel) before changing the
 relpersistence?
   
I don't think that'd help. I think what this means is that you simply
cannot change the relpersistence of the old relation before the heap
swap is successful. So I guess it has to be something like
(pseudocode):
   
OIDNewHeap = make_new_heap(..);
newrel = heap_open(OIDNewHeap, AEL);

/*
 * Change the temporary relation to be unlogged/logged. We have to do
 * that here so buffers for the new relfilenode will have the right
 * persistency set while the original filenode's buffers won't get read
 * in with the wrong (i.e. new) persistency setting. Otherwise a
 * rollback after the rewrite would possibly result with buffers for the
 * original filenode having the wrong persistency setting.
 *
 * NB: This relies on swap_relation_files() also swapping the
 * persistency. That wouldn't work for pg_class, but that can't be
 * unlogged anyway.
 */
AlterTableChangeCatalogToLoggedOrUnlogged(newrel);
FlushRelationBuffers(newrel);
/* copy heap data into newrel */
finish_heap_swap();

And then in swap_relation_files() also copy the persistency.

That's the best I can come up with right now at least.
   
  
 Isn't it better if we can set the relpersistence as an argument to
 make_new_heap()?

 I'm thinking of changing make_new_heap:

 From:

   make_new_heap(Oid OIDOldHeap, Oid NewTableSpace, bool forcetemp,
                 LOCKMODE lockmode)

 To:

   make_new_heap(Oid OIDOldHeap, Oid NewTableSpace, char relpersistence,
                 LOCKMODE lockmode)

 That way we can create the new heap with the appropriate
 relpersistence, so swap_relation_files() also copies the persistency,
 of course.

 And after copying the heap data to the new table (ATRewriteTable), change
 the relpersistence of the OldHeap's indexes, because in finish_heap_swap
 they'll be rebuilt.

 Thoughts?
  
 
  The attached patch implements my previous idea based on Andres' thoughts.
 

 I didn't like the last version of the patch, especially this part:

 +    /* check if SetUnlogged or SetLogged exists in subcmds */
 +    for (pass = 0; pass < AT_NUM_PASSES; pass++)
 +    {
 +        List       *subcmds = tab->subcmds[pass];
 +        ListCell   *lcmd;
 +
 +        if (subcmds == NIL)
 +            continue;
 +
 +        foreach(lcmd, subcmds)
 +        {
 +            AlterTableCmd *cmd = (AlterTableCmd *) lfirst(lcmd);
 +
 +            if (cmd->subtype == AT_SetUnLogged || cmd->subtype == AT_SetLogged)
 +            {
 +                /*
 +                 * Change the temporary relation to be unlogged/logged. We
 +                 * have to do that here so buffers for the new relfilenode
 +                 * will have the right persistency set while the original
 +                 * filenode's buffers won't get read in with the wrong
 +                 * (i.e. new) persistency setting. Otherwise a rollback
 +                 * after the rewrite would possibly result with buffers
 +                 * for the original filenode having the wrong persistency
 +                 * setting.
 +                 *
 +                 * NB: This relies on swap_relation_files() also swapping
 +                 * the persistency. That wouldn't work for pg_class, but
 +                 * that can't be unlogged anyway.
 +                 */
 +                if (cmd->subtype == AT_SetUnLogged)
 +                    newrelpersistence = RELPERSISTENCE_UNLOGGED;
 +
 +                isSetLoggedUnlogged = true;
 +            }
 +        }
 +    }


 So I did a refactoring adding new items to AlteredTableInfo to pass the
information through the phases.
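
A rough idea of what carrying that information on AlteredTableInfo could
look like is sketched below (field and function names here are illustrative,
not necessarily those in the posted patch):

/*
 * Sketch: record the requested persistence change while the subcommands are
 * first examined, instead of re-scanning tab->subcmds in a later phase.
 */
typedef struct AlteredTableInfo
{
    /* ... existing fields ... */
    bool        chgPersistence;     /* does this ALTER change relpersistence? */
    char        newrelpersistence;  /* RELPERSISTENCE_PERMANENT or _UNLOGGED */
} AlteredTableInfo;

static void
note_persistence_change(AlteredTableInfo *tab, AlterTableCmd *cmd)
{
    if (cmd->subtype == AT_SetLogged || cmd->subtype == AT_SetUnLogged)
    {
        tab->chgPersistence = true;
        tab->newrelpersistence = (cmd->subtype == AT_SetLogged)
            ? RELPERSISTENCE_PERMANENT
            : RELPERSISTENCE_UNLOGGED;
    }
}

/*
 * The rewrite phase can then hand tab->newrelpersistence straight to
 * make_new_heap(), as discussed above.
 */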


Hi all,

Is there something I should still do on this patch?

Regards

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] parametric block size?

2014-07-28 Thread Robert Haas
On Sat, Jul 26, 2014 at 1:37 PM, Fabien COELHO coe...@cri.ensmp.fr wrote:
 And I don't see that as being warranted at this point. But further
 benchmarks sound like a good idea.

 Yep. A 10% potential performance impact looks worth the investigation.

I wonder, though, whether this isn't using a crowbar where some finer
instrument is called for.  If, for example, bigger heap blocks give
better performance because a bigger I/O size just plain works better,
well then that's interesting in its own right.  But if a bigger or
smaller block size yields better results on index scans, the right
solution might be to change the internal page structure used by that
index.  For example, I remember reading a paper a few years back where
the authors found that large page sizes were inefficient because you
had to do a linear scan of all the items on the page; so they added
some kind of btree-like structure within the page and got great
results.  So the page size itself wasn't the fundamental issue; it had
more to do with what kind of page layout made sense at one page size
vs. another page size.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Making joins involving ctid work for the benefit of UPSERT

2014-07-28 Thread Robert Haas
On Wed, Jul 23, 2014 at 7:32 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Because nobody wants an operation to either insert 1 tuple or update
 n>=1 tuples.  The intention is that the predicate should probably be
 something like WHERE unique_key = 'some_value', but you can use
 something else.  So it's kinda like saying which index you care about
 for a particular operation, except without having to explicitly name
 an index.  But in any event you should use a predicate that uniquely
 identifies the tuple you want to update.

 This seemed a nice idea when I first read it earlier today, but now I'm
 not so sure.  Are you saying that it wouldn't be allowed to use an
 UPSERT with some sort of join, such that each joined row would produce
 either one insert or one update?  To clarify: suppose I import some
 external data into a temp table, then run UPSERT USING that table so
 that the rows end up in a permanent table; some of the rows might be
 already in the permanent table, some others might not.  I would hope
 that by the time UPSERT is done, all the rows are in the permanent
 table.  Would that raise an error, with your proposed design?

Yeah, my syntax didn't have a mechanism for that.  I agree we should
have one.  I was just brainstorming.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] delta relations in AFTER triggers

2014-07-28 Thread Kevin Grittner
Marti Raudsepp ma...@juffo.org wrote:
 On Sat, Jul 5, 2014 at 5:38 PM, Kevin Grittner kgri...@ymail.com wrote:
 it seems to me that we need the full tuple to support triggers on
 FDWs, so the TID approach would be an optimization for a subset of
 the cases, and would probably be more appropriate, if we do it at
 all, in a follow-on patch
 If you disagree with that assessment, now would be a good
 time to explain your reasoning.

 Maybe I just have a limited imagination because I've never found a use
 for FDWs personally. But recording changes from a trigger on a FDW
 table doesn't seem that useful,

It's a matter of whether AFTER triggers on an FDW can see the 
modified data in table form.  We just recently added the ability 
for FDW triggers to see the data in *row* form; it seemed odd to 
immediately follow that with a new way to get at the data and 
cripple FDW triggers for it.

 since you can only capture changes
 done by the local node. I expect that in many situations there are
 multiple writers accessing the same underlying remote table. Certainly
 it can't guarantee the consistency of materialized views.

While I expect the techniques used here to help with development of 
incremental maintenance of materialized views, this seems like a 
useful feature in its own right.  I think the question is what the 
basis would be for supporting access to the changes in row format 
but not table format for FDWs, if we're supporting both formats for 
other tables.

 I took a look at whether I could avoid making OLD and NEW
 non-reserved keywords, but I didn't see how to do that without
 making FOR at least partially reserved.  If someone sees a way to
 do this without creating three new unreserved keywords
 (REFERENCING, OLD, and NEW) I'm all ears.

 Sorry, I know I am very late to make this point, so feel free to ignore this.

 I'm not a fan of the SQL standard syntax for this feature. One nice
 thing about PostgreSQL's triggers is that you can declare the trigger
 function once and re-use it on many tables. It would make more sense
 if the same function declaration could say what variable/relation
 names it wants to use. They're more like function argument names, not
 some metadata about a table-function relationship.

 Putting these in the CREATE TRIGGER command means you have to repeat
 them for each table you want to apply the trigger to. It introduces
 the possibility of making more mistakes without any gain in
 flexibility.

 But then again, I understand that there's value in supporting standard
 syntax.

Do you have some other suggestion?  Keep in mind that it must allow 
the code which will *generate* the transition tables to know 
whether any of the attached triggers use a given transition table 
for the specific operation, regardless of the language of the 
trigger function.  Using the standard syntax has the advantage of 
making it pretty easy to put the information exactly where it is 
needed for easy access at run time.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Making joins involving ctid work for the benefit of UPSERT

2014-07-28 Thread Robert Haas
On Wed, Jul 23, 2014 at 7:35 PM, Peter Geoghegan p...@heroku.com wrote:
 It's certainly arguable whether you should INSERT and then turn failures
 into an update, or try to UPDATE and then turn failures into an INSERT;
 we might even want to have both options available, though that smells
 a little like airing too much of our dirty laundry.  But I think I
 generally favor optimizing for the UPDATE case for more or less the
 same reasons Kevin articulated.

 I don't see the connection between this and Kevin's remarks. And FWIW,
 I don't see a reason to favor inserts or updates. Fortunately, what I
 have balances both cases very well, and doesn't cause bloat. The work
 of descending the index to lock it isn't wasted if an update is
 required. My implementation decides to either insert or update at
 literally the latest possible moment.

AFAIUI, this is because your implementation uses lwlocks in a way that
Andres and I both find unacceptable.  My suspicion is that any version
of this that ends up getting committed is going to involve a risk of
bloat in cases involving retries, and I think it will be easier to
minimize bloat in an update-driven implementation.  But I suppose
that's speculative.

 Here you seem to be suggesting that I intended to propose your existing
 design rather than something else, which I didn't.  In this design,
 you find the conflict (at most one) by scanning for the tuple to be
 updated.

 Yes, but what if you don't see a conflict because it isn't visible to
 your snapshot, and then you insert, and only then (step 5), presumably
 with a dirty snapshot, you find a conflict? How does the loop
 terminate if that brings you back to step 1 with the same MVCC
 snapshot feeding the update?

Good point.  Maybe the syntax should be something like:

UPSERT table (keycol [, keycol] ...) { VALUES (val [, val] ...) [,
...] | select_query }

That would address both the concern about being able to pipe multiple
tuples through it and the point you just raised.  We look for a row
that matches each given tuple on the key columns; if one is found, we
update it; if none is found, we insert.

 I agree that you want to uniquely identify each tuple. What I meant
 was, why should we not be able to upsert multiple rows in a single
 command? What's wrong with that?

Nothing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] gaussian distribution pgbench -- splits v4

2014-07-28 Thread Robert Haas
On Wed, Jul 23, 2014 at 12:39 PM, Fabien COELHO coe...@cri.ensmp.fr wrote:
 3. Similarly, I suggest that the use of "gaussian" or "uniform" be an
 error when argc < 6 OR argc > 6.  I also suggest that the
 parenthesized distribution type be dropped from the error message in
 all cases.

 I wish to agree, but my interpretation of the previous code is that they
 were ignored before, so ISTM that we are stuck with keeping the same
 unfortunate behavior.

I don't agree.  I'm not in a huge hurry to fix all the places where
pgbench currently lacks error checks just because I don't have enough
to do (hint: I do have enough to do), but when we're adding more
complicated syntax in one particular place, bringing the error checks
in that portion of the code up to scratch is an eminently sensible
thing to do, and we should do it.

Also, please stop changing the title of this thread every other post.
It breaks threading for me (and anyone else using gmail), and that
makes the thread hard to follow.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Least Active Transaction ID function

2014-07-28 Thread Robert Haas
On Thu, Jul 24, 2014 at 3:42 PM, Rohit Goyal rhtgyl...@gmail.com wrote:
 This was really -2 helpful.

I'm not sure what it means to be -2 helpful.  Hopefully it's a good thing.

 1. Can I use this xmin variable directly anytime anywhere in my code as it
 is a global variable.

I don't really know what you're asking.  If you want a tutorial on how
global variables work in C, this is the wrong mailing list to ask
about that.

 2. What is the difference b/w recentXmin and RecentGlobalXmin. I read the
 description but any small detail  can clear my mind. :)

RecentXmin is the oldest transaction ID that was still running as of
the last time it was updated.  RecentGlobalXmin is the oldest
transaction ID that was part of somebody's snapshot as of the last
time it was updated.  Transaction IDs older than RecentXmin can be
assumed not to be running, but there could be still-running
transactions that can't see the effects of some committed transaction
whose ID precedes RecentXmin.  Transaction IDs older than
RecentGlobalXmin are no longer running, and furthermore the
effects of any such transactions which went on to commit are
guaranteed to be visible to the snapshots of all currently-running
transactions, and all future transactions.
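
To make the distinction concrete, here is a hedged sketch in backend terms
(TransactionIdPrecedes() is the usual XID comparison; RecentXmin and
RecentGlobalXmin are backend globals, assumed visible here; this is an
illustration, not code from any patch):

#include "postgres.h"
#include "access/transam.h"     /* TransactionIdPrecedes() */

/* True if the XID was certainly finished as of the last horizon update. */
static bool
xid_surely_not_running(TransactionId xid)
{
    return TransactionIdPrecedes(xid, RecentXmin);
}

/*
 * True if, in case the XID committed, its effects are visible to every
 * currently-running and future snapshot; this is the horizon used to decide
 * that dead tuples are safely removable.
 */
static bool
xid_visible_to_all_if_committed(TransactionId xid)
{
    return TransactionIdPrecedes(xid, RecentGlobalXmin);
}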

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW

2014-07-28 Thread Robert Haas
On Fri, Jul 25, 2014 at 3:39 AM, Albe Laurenz laurenz.a...@wien.gv.at wrote:
 Shigeru Hanada wrote:
 * Naming of new behavior
 You named this optimization Direct Update, but I'm not sure that
 this is intuitive enough to express this behavior.  I would like to
 hear opinions of native speakers.

 How about batch foreign update or batch foreign modification?
 (Disclaimer: I'm not a native speaker either.)

I think direct update sounds pretty good.  Batch does not sound as
good to me, since it doesn't clearly describe what makes this patch
special as opposed to some other grouping of updates that happens to
produce a speedup.

Another term that might be used is update pushdown, since we are
pushing the whole update to the remote server instead of having the
local server participate.  Without looking at the patch, I don't have
a strong opinion on whether that's better than direct update in this
context.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add a QNX 6.5 port to PostgreSQL

2014-07-28 Thread Merlin Moncure
On Fri, Jul 25, 2014 at 3:16 PM, Baker, Keith [OCDUS Non-JJ]
kbak...@its.jnj.com wrote:
 I propose that a QNX 6.5 port be introduced to PostgreSQL.

 I am new to PostgreSQL development, so please bear with me.



 I have made good progress (with 1 outstanding issue, details below):

 · I created a QNX 6.5 port of PostgreSQL 9.3.4 which passes
 regression tests.

 · I merged my changes into 9.4beta2, and with a few minor changes,
 it passes regression tests.

 · QNX support states that QNX 6.5 SP1 binaries run on QNX 6.6
 without modification, which I confirmed with a few quick tests.



 Summary of changes required for PostgreSQL 9.3.4 on QNX 6.5:

 · Typical changes required for any new port (template, configure.in,
 dynloader, etc.)

 · QNX lacks System V shared memory: I created
 “src/backend/port/posix_shmem.c” which replaces System V calls (shmget,
 shmat, shmdt, …) with POSIX calls (shm_open, mmap, munmap, shm_unlink)

 · QNX lacks sigaction SA_RESTART: I modified “src/include/port.h” to
 define macros to retry system calls upon EINTR (open, read, write, …) when
 compiled on QNX (a sketch of the retry idea appears just after this list)

 · A few files required addition of #include <sys/select.h> on QNX
 (for fd_set).
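
A minimal sketch of the EINTR-retry idea from the list above (the general
technique only, not the actual port.h macros from the proposed port):

/*
 * Without SA_RESTART, an interrupted system call fails with EINTR and has to
 * be retried by hand.  One simple shape for a retrying wrapper; the QNX port
 * reportedly wraps the affected calls with macros of this flavor.
 */
#include <errno.h>
#include <unistd.h>

static ssize_t
read_retry(int fd, void *buf, size_t count)
{
    ssize_t     rc;

    do
        rc = read(fd, buf, count);
    while (rc < 0 && errno == EINTR);

    return rc;
}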



 Additional changes required for PostgreSQL9.4beta2 on QNX 6.5:

 · “DSM” changes introduced in 9.4 (R. Haas) required that I make
 minor updates to my new “posix_shmem.c” code.

 · src/include/replication/logical.h: struct LogicalDecodingContext
 field “write” interferes with my “write” retry macro.  Renaming field
 “write” to “do_write” solved this problem.



 Outstanding Issue #1:

 src/backend/commands/dbcommands.c :: createdb() complains when copying
 template1 to template0 (apparently a locale issue)

 “FATAL:  22023: new LC_CTYPE (C;collate:POSIX;ctype:POSIX) is incompatible
 with the LC_CTYPE of the template database (POSIX;messages:C)”

 I would appreciate help from an experienced PostgreSQL hacker to address
 this.

 I have temporarily disabled this check on QNX (I can live with the
 assumption/limitation that template0 and template1 contain strictly ASCII).

 I can work toward setting up a build farm member should this proposal be
 accepted.

Maybe step #1 is to get a buildfarm member set up.  Is there any
policy against unsupported environments in the buildfarm? (I hope not)

You're going to have to run it against a git repository containing
your custom patches.  It's a long and uncertain road to getting a new
port (re-) accepted, but demonstrated commitment to support is a
necessary first step. It will also advertise support for the platform.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add a QNX 6.5 port to PostgreSQL

2014-07-28 Thread Andres Freund
On 2014-07-28 11:19:48 -0500, Merlin Moncure wrote:
 Maybe step #1 is to get a buildfarm member set up.  Is there any
 policy against unsupported environments in the buildfarm? (I hope not)
 
 You're going to have to run it against a git repository containing
 your custom patches.  It's a long and uncertain road to getting a new
 port (re-) accepted, but demonstrated commitment to support is a
 necessary first step. It will also advertise support for the platform.

I don't think a buildfarm animal that doesn't run the actual upstream
code is a good idea. That'll make it a lot harder to understand what's
going on when something breaks after a commit.  It'd also require the
custom patches being rebased on top of $branch before every run...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parametric block size?

2014-07-28 Thread Thomas Kellerer
 Possibly stopping at the tablespace level might be more straightforward. 
 To avoid messing up the pages in shared buffers we'd perhaps need 
 something like several shared buffer pools - each with either its own 
 blocksize or associated with a (set of) tablespace(s).

This is exactly how Oracle does it. You can specify the blocksize when
creating a tablespace.

For each blocksize a separate buffer cache (shared buffers in Postgres
terms) can be configured. So the cache is not maintained at the tablespace
level but at the blocksize level.



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/parametric-block-size-tp5812350p5813060.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add a QNX 6.5 port to PostgreSQL

2014-07-28 Thread Merlin Moncure
On Mon, Jul 28, 2014 at 11:22 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-07-28 11:19:48 -0500, Merlin Moncure wrote:
 Maybe step #1 is to get a buildfarm member set up.  Is there any
 policy against unsupported environments in the buildfarm? (I hope not)

 You're going to have to run it against a git repository containing
 your custom patches.  It's a long and uncertain road to getting a new
 port (re-) accepted, but demonstrated commitment to support is a
 necessary first step. It will also advertise support for the platform.

 I don't think a buildfarm animal that doesn't run the actual upstream
 code is a good idea. That'll make it a lot harder to understand what's
 going on when something breaks after a commit.  It'd also require the
 custom patches being rebased ontop of $branch before every run...

hm. oh well.  maybe if there was a separate page for custom builds
(basically, an unsupported section).

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GSoC2014] Patch ALTER TABLE ... SET LOGGED

2014-07-28 Thread Christoph Berg
Re: Fabrízio de Royes Mello 2014-07-28 
CAFcNs+pctx4Q2UYsLOvVFWaznO3U0XhPpkMx5DRhR=jw8w3...@mail.gmail.com
 There are something that should I do on this patch yet?

I haven't got around to having a look at the newest incarnation yet, but
I plan to do that soonish. (Of course that shouldn't stop others from
doing that as well if they wish.)

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GSoC2014] Patch ALTER TABLE ... SET LOGGED

2014-07-28 Thread Fabrízio de Royes Mello
On Mon, Jul 28, 2014 at 1:41 PM, Christoph Berg c...@df7cb.de wrote:

 Re: Fabrízio de Royes Mello 2014-07-28
CAFcNs+pctx4Q2UYsLOvVFWaznO3U0XhPpkMx5DRhR=jw8w3...@mail.gmail.com
  There are something that should I do on this patch yet?

 I haven't got around to have a look at the newest incarnation yet, but
 I plan to do that soonish. (Of course that shouldn't stop others from
 doing that as well if they wish.)


Thanks!

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] [RFC] Should smgrtruncate() avoid sending sinval message for temp relations

2014-07-28 Thread Andres Freund
On 2014-07-26 20:20:05 +0200, Andres Freund wrote:
 On 2014-07-26 13:58:38 -0400, Tom Lane wrote:
 
  Andres Freund and...@2ndquadrant.com writes:
   That'd require either reneging on SA_RESTART or
   using WaitLatchOrSocket() and nonblocking send/recv.
  
  Yeah, I was wondering about using WaitLatchOrSocket for client I/O too.
  We already have a hook that lets us do the actual recv even when using
  OpenSSL, and in principle that function could do interrupt-service-like
  functions if it got kicked off the recv().
 
 I've started playing with this. Looks clearly worthwhile.
 
 I think if we do it right we pretty much can get rid of the whole
 prepare_for_client_read() machinery and handle everything via
 ProcessInterrupts(). EnableCatchupInterrupt() et al don't really fill me
 with joy.

One thing I am wondering about around this is: Why are we only
processing catchup events when DoingCommandRead? There are other paths
where we can wait for data from the client for a long time. Obviously we
don't want to process async.c stuff from inside copy, but I don't see
why that's the case for sinval.c.
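
To illustrate the pattern under discussion, here is a rough sketch of a
client read that also wakes on the process latch, so interrupt-like work
could be serviced outside DoingCommandRead as well. It assumes the 9.4-era
latch API (WaitLatchOrSocket(), ResetLatch(), MyProc->procLatch, and the
WL_* flags); service_pending_interrupts() is a placeholder stub, not an
existing backend function:

#include "postgres.h"
#include <errno.h>
#include <sys/socket.h>
#include "storage/latch.h"
#include "storage/proc.h"

static void
service_pending_interrupts(void)
{
    /* placeholder: ProcessInterrupts()-style work would happen here */
}

static ssize_t
recv_with_latch(pgsocket sock, char *buf, size_t len)
{
    for (;;)
    {
        ssize_t     n = recv(sock, buf, len, 0);    /* socket is nonblocking */

        if (n >= 0 || (errno != EWOULDBLOCK && errno != EAGAIN))
            return n;       /* data, EOF, or a hard error */

        /* nothing readable yet: sleep until the socket or the latch fires */
        if (WaitLatchOrSocket(&MyProc->procLatch,
                              WL_LATCH_SET | WL_SOCKET_READABLE,
                              sock, -1L) & WL_LATCH_SET)
        {
            ResetLatch(&MyProc->procLatch);
            service_pending_interrupts();
        }
    }
}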

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] delta relations in AFTER triggers

2014-07-28 Thread Marti Raudsepp
On Mon, Jul 28, 2014 at 6:24 PM, Kevin Grittner kgri...@ymail.com wrote:
 Do you have some other suggestion?  Keep in mind that it must allow
 the code which will *generate* the transition tables to know
 whether any of the attached triggers use a given transition table
 for the specific operation, regardless of the language of the
 trigger function.

You will need to access the pg_proc record of the trigger function
anyway, so it's just a matter of coming up with syntax that makes
sense, right?

What I had in mind was that we could re-use function argument
declaration syntax. For instance, use the argmode specifier to
declare OLD and NEW. Shouldn't cause grammar conflicts because the
current OUT and INOUT aren't reserved keywords.

We could also re-use the refcursor type, which already has bindings in
some PLs, if that's not too much overhead. That would make the
behavior straightforward without introducing new constructs, plus you
can pass them around between functions. Though admittedly it's
annoying to integrate cursor results into queries.

Something like:

CREATE FUNCTION trig(OLD old_rows refcursor, NEW new_rows refcursor)
RETURNS trigger LANGUAGE plpgsql AS '...';

Or maybe if the grammar allows, we could spell out NEW TABLE, OLD
TABLE, but that's redundant since you can already deduce that from
the refcursor type.

It could also be extended for different types, like tid[], and maybe
record for the FOR EACH ROW variant (dunno if that can be made to
work).

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] sendLong in custom communication doesn't work

2014-07-28 Thread Robert Haas
On Fri, Jul 25, 2014 at 5:53 AM, Krystian Piećko
krystian.pie...@gmail.com wrote:
 I’m implementing functionality that will pass all the queries native to
 postgresql (those that ask about structures and versions) to the hidden
 postgresql, while other queries I would like to parse myself. I have a big
 problem with outputting a PG_TYPE_INT[248] value. I’m doing it because in my
 system I would like to provide a method to send XML messages over JDBC that
 would be resolved by my software, while other queries would be answered by
 PostgreSQL. I don’t mind the speed and the double processing, I just want to
 learn how to do it.
 Outputting text works fine.

 So when I receive a native query like (select * from pg_class) I have an
 active JDBC connection to the hidden postgresql (9.3) and this query is
 executed on that server. When I get a ResultSet from the hidden postgresql I
 try to output the results in postgres form to the output. Example
 communication that returns the current date as a long should (in my opinion)
 look like this:

 //Example code that outputs current date in long format
 public void sendExampleResponse() {
 server.trace(Query);
 String query = readString();
 System.out.println(query =  + query);
[ ...and there's more... ]

I don't think you're going to get much help writing Java code here; if
you want to discuss the pgsql-jdbc code, you should use that mailing
list rather than this one.  This mailing list would be the right place
for discussions about the wire protocol itself, though, so maybe
someone could give you some advice on that if you were more specific
about what problem you're having.  From the provided information it's
hard to be sure what's going wrong.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Making joins involving ctid work for the benefit of UPSERT

2014-07-28 Thread Peter Geoghegan
On Mon, Jul 28, 2014 at 8:37 AM, Robert Haas robertmh...@gmail.com wrote:
 AFAIUI, this is because your implementation uses lwlocks in a way that
 Andres and I both find unacceptable.

That's not the case. My implementation uses page-level heavyweight
locks. The nbtree AM used to use them for other stuff. Plenty of other
systems use index level locks managed by a heavyweight lock manager.

 Here you seem to be suggesting that I intended to propose your existing
 design rather than something else, which I didn't.  In this design,
 you find the conflict (at most one) by scanning for the tuple to be
 updated.

 Yes, but what if you don't see a conflict because it isn't visible to
 your snapshot, and then you insert, and only then (step 5), presumably
 with a dirty snapshot, you find a conflict? How does the loop
 terminate if that brings you back to step 1 with the same MVCC
 snapshot feeding the update?

 Good point.  Maybe the syntax should be something like:

 UPSERT table (keycol [, keycol] ...) { VALUES (val [, val] ...) [,
 ...] | select_query }

 That would address both the concern about being able to pipe multiple
 tuples through it and the point you just raised.  We look for a row
 that matches each given tuple on the key columns; if one is found, we
 update it; if none is found, we insert.

That basically is my design, except that (tangentially) yours risks
bloat in exchange for not having to use a real locking mechanism, and
has a different syntax. The parts of inserting into an index scan that
I stagger include an initial part that is more or less just an index
scan. With this design you'll have to set up things so that all
indexes can be directly accessed in the manner of ExecInsert() (get a
list of them from the relcache, open them in an order that avoids
possible deadlocks, etc). Why not just use ExecInsert()? I don't think
I'm alone in seeing things that way.

On a mostly unrelated note, I'll remind you of the reason that I felt
it was best to lock indexes. It wasn't so much about avoiding bloat as
it was about avoiding deadlocks. When I highlighted the issue,
Heikki's prototype, which did insert optimistically rather than
locking, was then made to go and physically super delete the
upsert-insert conflicting heap tuple (inserted optimistically before
its index tuples), before going to lock a row, in order to avoid
unprincipled deadlocking. In contrast, my design just used a callback
that released page level heavyweight locks before going to lock a row.
Heikki's prototype involved making it so that *even someone else's
dirty snapshot* didn't see our dead speculatively-inserted heap tuple.

Anyway, making all that happen is fairly invasive to a bunch of places
that are just as critical as the nbtree code. I'm not saying it can't
be done, or even that it definitely shouldn't be, but taking an
approach that produces bloat, rather than locking values the way other
systems do (and, to a limited extent Postgres already does) is at
least way more invasive than it first appears. Plus,  I ask you to
consider that.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Making joins involving ctid work for the benefit of UPSERT

2014-07-28 Thread Peter Geoghegan
On Mon, Jul 28, 2014 at 10:43 AM, Peter Geoghegan p...@heroku.com wrote:
 Plus,  I ask you to
 consider that.

Excuse me. I meant "Plus, you avoid bloat. I ask you to consider that."

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_background (and more parallelism infrastructure patches)

2014-07-28 Thread Andres Freund
On 2014-07-26 12:20:34 -0400, Robert Haas wrote:
 On Sat, Jul 26, 2014 at 4:37 AM, Andres Freund and...@2ndquadrant.com wrote:
  On 2014-07-25 14:11:32 -0400, Robert Haas wrote:
  Attached is a contrib module that lets you launch arbitrary command in
  a background worker, and supporting infrastructure patches for core.
 
  Cool.
 
  I assume this 'fell out' of the work towards parallelism? Do you think
  all of the patches (except the contrib one) are required for that or is
  some, e.g. 3), only required to demonstrate the others?

 I'm fairly sure that patches 3, 4, and 5 are all required in some form
 as building blocks for parallelism.  Patch 1 contains two functions,
 one of which (shm_mq_set_handle) I think is generally useful for
 people using background workers, but not absolutely required; and one
 of which is infrastructure for patch 3 which might not be necessary
 with different design choices (shm_mq_sendv).  Patch 2 is only
 included because pg_background can benefit from it; we could instead
 use an eoxact callback, at the expense of doing cleanup at
 end-of-transaction rather than end-of-query.  But it's a mighty small
 patch and seems like a reasonable extension to the API, so I lean
 toward including it.

Don't get me wrong, I don't object to anything in here. It's just that
the bigger picture can help in giving sensible feedback.

  Patch 3 adds the ability for a backend to request that the protocol
  messages it would normally send to the frontend get redirected to a
  shm_mq.  I did this by adding a couple of hook functions.  The best
  design is definitely arguable here, so if you'd like to bikeshed, this
  is probably the patch to look at.
 
  Uh. This doesn't sound particularly nice. Shouldn't this rather be
  clearly layered by making reading/writing from the client a proper API
  instead of adding hook functions here and there?

 I don't know exactly what you have in mind here.  There is an API for
 writing to the client that is used throughout the backend, but right
 now the client always has to be a socket.  Hooking a couple of parts
 of that API lets us write someplace else instead.  If you've got
 another idea how to do this, suggest away...

What I'm thinking of is providing an actual API for the writes instead
of hooking into the socket API in a couple places. I.e. have something
like

typedef struct DestIO DestIO;

struct DestIO
{
    /* flush any output buffered for this destination */
    void (*flush)(struct DestIO *io);
    /* write len bytes from s to the destination */
    int (*putbytes)(struct DestIO *io, const char *s, size_t len);
    /* read up to len bytes from the destination into s */
    int (*getbytes)(struct DestIO *io, char *s, size_t len);
    ...
};

and do everything through it. I haven't thought much about the specific
API we want, but abstracting the communication properly instead of
adding hooks here and there is imo much more likely to succeed in the
long term.

  Also, you seem to have only touched receiving from the client, and not
  sending back to the subprocess. Is that actually sufficient? I'd expect
  that for this facility to be fully useful it'd have to be two way
  communication. But perhaps I'm overestimating what it could be used for.

 Well, the basic shm_mq infrastructure can be used to send any kind of
 messages you want between any pair of processes that care to establish
 them.  But in general I expect that data is going to flow mostly in
 one direction - the user backend will launch workers and give them an
 initial set of instructions, and then results will stream back from
 the workers to the user backend.  Other messaging topologies are
 certainly possible, and probably useful for something, but I don't
 really know exactly what those things will be yet, and I'm not sure
 the FEBE protocol will be the right tool for the job anyway.

It's imo not particularly unreasonable to e.g. COPY to/from a bgworker,
which would require the ability to both read from and write to the other side.

 But
 error propagation, which is the main thrust of this, seems like a need
 that will likely be pretty well ubiquitous.

Agreed.

  This patch also adds a function to
  help you parse an ErrorResponse or NoticeResponse and re-throw the
  error or notice in the originating backend.  Obviously, parallelism is
  going to need this kind of functionality, but I suspect a variety of
  other applications people may develop using background workers may
  want it too; and it's certainly important for pg_background itself.
 
  I would have had use for it previously.

 Cool.  I know Petr was interested as well (possibly for the same project?).

Well, I was aware of Petr's project, but I also have my own pet project
I'd been playing with :). Nothing real.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgresql.auto.conf and reload

2014-07-28 Thread Fujii Masao
On Sat, Jul 26, 2014 at 1:07 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Fri, Jul 25, 2014 at 6:11 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Wed, Jul 9, 2014 at 11:05 PM, Amit Kapila amit.kapil...@gmail.com
 wrote:
  Okay. As mentioned upthread, I have fixed by ensuring that for duplicate
  config params, retain only which comes later during parsing.
  I think it might have been bit simpler to fix, if we try to fix after
  parsing
  is complete, but I think for that we might need to replicate the logic
  at multiple places.

 ISTM that the patch works fine. Only concern is that the logic needs
 O(n^2) comparison, which may cause performance problem. But
 n in O(n^2) is the number of uncommented parameters and I don't
 think it's so big, ISTM I can live with the logic...

 Thanks for reviewing the patch.  I also think that having O(n^2)
 comparisons should not be a problem in this logic as it will be processed
 only during load/parse of config file which we don't do in performance
 sensitive path.

Yep.

There is another side effect of this patch. With the patch, when reloading
the configuration file, the server cannot warn about an invalid setting value
if it's not the last setting of the parameter. This may cause a problematic
situation as follows.

1. ALTER SYSTEM SET work_mem TO '1024kB';
2. Reload the configuration file --- success
3. Then, a user accidentally adds work_mem = '2048KB' into postgresql.conf
 The setting value '2048KB' is invalid, and the unit should be
'kB' instead of 'KB'.
4. Reload the configuration file --- success
 The invalid setting is ignored because the setting of work_mem in
 postgresql.auto.conf is preferred. So a user cannot notice that
postgresql.conf
 has an invalid setting.
5. Failover on shared-disk HA configuration happens, then PostgreSQL fails to
start up because of such an invalid setting. When PostgreSQL
starts up, the last
setting is preferred. But all the settings are checked.

Can we live with this issue?

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgresql.auto.conf and reload

2014-07-28 Thread Josh Berkus
On 07/28/2014 11:03 AM, Fujii Masao wrote:
 On Sat, Jul 26, 2014 at 1:07 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Fri, Jul 25, 2014 at 6:11 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Wed, Jul 9, 2014 at 11:05 PM, Amit Kapila amit.kapil...@gmail.com
 wrote:
 Okay. As mentioned upthread, I have fixed by ensuring that for duplicate
 config params, retain only which comes later during parsing.
 I think it might have been bit simpler to fix, if we try to fix after
 parsing
 is complete, but I think for that we might need to replicate the logic
 at multiple places.

 ISTM that the patch works fine. Only concern is that the logic needs
 O(n^2) comparison, which may cause performance problem. But
 n in O(n^2) is the number of uncommented parameters and I don't
 think it's so big, ISTM I can live with the logic...

 Thanks for reviewing the patch.  I also think that having O(n^2)
 comparisons should not be a problem in this logic as it will be processed
 only during load/parse of config file which we don't do in performance
 sensitive path.
 
 Yep.
 
 There is another side effect of this patch. With the patch, when reloading
 the configuration file, the server cannot warn about an invalid setting value
 if it's not the last setting of the parameter. This may cause a problematic
 situation as follows.
 
 1. ALTER SYSTEM SET work_mem TO '1024kB';
 2. Reload the configuration file --- success
 3. Then, a user accidentally adds work_mem = '2048KB' into postgresql.conf
  The setting value '2048KB' is invalid, and the unit should be
 'kB' instead of 'KB'.
 4. Reload the configuration file --- success
  The invalid setting is ignored because the setting of work_mem in
  postgresql.auto.conf is preferred. So a user cannot notice that
 postgresql.conf
  has an invalid setting.
 5. Failover on shared-disk HA configuration happens, then PostgreSQL fails to
 start up because of such an invalid setting. When PostgreSQL
 starts up, the last
 setting is preferred. But all the settings are checked.
 
 Can we live with this issue?

I'd think so, yes.  That's a pretty extreme corner case.

Also, it's my perspective that users who change conf by concurrently
editing pg.conf *and* doing ALTER SYSTEM SET are hopeless anyway.
There's simply no way we can protect them from themselves.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should smgrtruncate() avoid sending sinval message for temp relations

2014-07-28 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 One thing I am wondering about around this is: Why are we only
 processing catchup events when DoingCommandRead? There's other paths
 where we can wait for data from the client for a long time. Obviously we
 don't want to process async.c stuff from inside copy, but I don't see
 why that's the case for sinval.c.

It might be all right to do it during copy, but I'd just as soon treat
that as a separate issue.  If you merge it into the basic patch then it
might be hard to get rid of if there are problems.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [RFC] Should smgrtruncate() avoid sending sinval message for temp relations

2014-07-28 Thread Andres Freund
On 2014-07-28 15:29:57 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  One thing I am wondering about around this is: Why are we only
  processing catchup events when DoingCommandRead? There's other paths
  where we can wait for data from the client for a long time. Obviously we
  don't want to process async.c stuff from inside copy, but I don't see
  why that's the case for sinval.c.
 
 It might be all right to do it during copy, but I'd just as soon treat
 that as a separate issue.  If you merge it into the basic patch then it
 might be hard to get rid of if there are problems.

Yea, not planning to merge it. Just wondering to make sure I understand
all the implications.

Another thing I'm wondering about - also not for the basic patch - is
accepting termination while writing to the client. It's rather annoying
that we currently don't honor pg_terminate_backend() while writing to
the client.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [w32] test_shm_mq test suite permanently burns connections slots

2014-07-28 Thread Robert Haas
On Fri, Jul 25, 2014 at 3:25 PM, Noah Misch n...@leadboat.com wrote:
 On a Windows or other EXEC_BACKEND build, the following eventually gets
 failures because all, or all but one, max_connections slot is consumed:

   for run in `seq 1 100`; do make -C contrib/test_shm_mq installcheck; done

 When I use max_connections=40, it fails on the sixth iteration.  Only the six
 basic processes are actually running at that time.

The tests start 7 workers each time, so that makes sense: 7 * 5 < 40
but 7 * 6 > 40.  What I'm not sure about is why they are leaking connection
slots, and why they're only doing it in EXEC_BACKEND mode.  A quick
code audit didn't uncover any obvious explanation, so I'll try to
reproduce and debug.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Fix src/backend/storage/smgr/README

2014-07-28 Thread Fabrízio de Royes Mello
Hi all,

Small fix in src/backend/storage/smgr/README about where fork numbers are
assigned.

Regards,

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello
diff --git a/src/backend/storage/smgr/README b/src/backend/storage/smgr/README
index 12df42a..37ed40b 100644
--- a/src/backend/storage/smgr/README
+++ b/src/backend/storage/smgr/README
@@ -50,7 +50,7 @@ independently of the main data file, while still treating it all as a single
 physical relation in system catalogs.
 
 It is assumed that the main fork, fork number 0 or MAIN_FORKNUM, always
-exists. Fork numbers are assigned in src/include/storage/relfilenode.h.
+exists. Fork numbers are assigned in src/include/common/relpath.h.
 Functions in smgr.c and md.c take an extra fork number argument, in addition
 to relfilenode and block number, to identify which relation fork you want to
 access. Since most code wants to access the main fork, a shortcut version of

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix src/backend/storage/smgr/README

2014-07-28 Thread Tom Lane
Fabrízio de Royes Mello fabriziome...@gmail.com writes:
 Small fix in src/backend/storage/smgr/README about where is assigned fork
 numbers.

Ah, looks like I missed that reference when I moved the enum :-(.
Thanks, will fix.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [w32] test_shm_mq test suite permanently burns connections slots

2014-07-28 Thread Robert Haas
On Mon, Jul 28, 2014 at 3:59 PM, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Jul 25, 2014 at 3:25 PM, Noah Misch n...@leadboat.com wrote:
 On a Windows or other EXEC_BACKEND build, the following eventually gets
 failures because all, or all but one, max_connections slot is consumed:

   for run in `seq 1 100`; do make -C contrib/test_shm_mq installcheck; done

 When I use max_connections=40, it fails on the sixth iteration.  Only the six
 basic processes are actually running at that time.

 The tests start 7 workers each time, so that makes sense: 7 * 5 < 40
 but 7 * 6 > 40.  What I'm not sure about is why they are leaking connection
 slots, and why they're only doing it in EXEC_BACKEND mode.  A quick
 code audit didn't uncover any obvious explanation, so I'll try to
 reproduce and debug.

OK, I think I see the problem.  In EXEC_BACKEND mode,
SubPostmasterMain() calls InitProcess() before IsBackgroundWorker has
been set.  InitProcess() therefore pulls the PGPROC for the worker
from freeProcs rather than bgworkerFreeProcs.  By exit time,
IsBackgroundWorker has been set, so the PGPROC gets put back on the
bgworkerFreeProcs list.  Eventually there are no regular PGPROCs left;
they've all been moved to the bgworkerFreeProcs list.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Making joins involving ctid work for the benefit of UPSERT

2014-07-28 Thread Peter Geoghegan
On Mon, Jul 28, 2014 at 10:43 AM, Peter Geoghegan p...@heroku.com wrote:
 On a mostly unrelated note, I'll remind you of the reason that I felt
 it was best to lock indexes. It wasn't so much about avoiding bloat as
 it was about avoiding deadlocks. When I highlighted the issue,
 Heikki's prototype, which did insert optimistically rather than
 locking, was then made to go and physically super delete the
 upsert-insert conflicting heap tuple (inserted optimistically before
 its index tuples), before going to lock a row, in order to avoid
 unprincipled deadlocking. In contrast, my design just used a callback
 that released page level heavyweight locks before going to lock a row.
 Heikki's prototype involved making it so that *even someone else's
 dirty snapshot* didn't see our dead speculatively-inserted heap tuple.

 Anyway, making all that happen is fairly invasive to a bunch of places
 that are just as critical as the nbtree code.

I think I should be more concrete about why this is more complicated
than it first appears. Andres said at pgCon that he would still go
with a design where promise tuples are inserted into indexes ahead
of any heap tuple (which differs from Heikki's prototype, where heap
tuple insertion occurs first). Accounting for deadlocking issues could
be particularly problematic with that design, since we must kill
tuples from each index in turn before row locking. In any case the
need to efficiently *release* locks must be weighed carefully. It
isn't obvious, but we must release locks if there is a conflict.

After Heikki acknowledged the problem [1], he produced a revision
addressing it. The details of the workaround and a patch were posted
[2]. I think it's fair to say that this area is a lot messier than it
first appears. If anyone wants to propose an alternative design, they
are of course quite welcome to, but I ask that the very real
difficulties with those designs be acknowledged. AFAICT, only Heikki
has independently acknowledged these issues on list.

In case it isn't obvious, let me be clear about what I care about
here. I feel it's important to get something that is easy to reason
about - you write a DML statement, and within certain fairly
reasonable parameters Postgres does the rest. I think we should not
accept something that may even occasionally throw deadlock errors,
or unique violations, or RC-level serialization failures through no
fault of the user. That would be inferior to the plpgsql looping
pattern we promote, which does the right thing and avoids all of this.
It would be awful to have to tell users who hit problems like this
that they should just stop doing so much upserting, or use the old
pattern.
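
(For reference, the looping pattern in question is roughly the merge_db
example from the plpgsql documentation; a sketch, assuming a table
db(a int primary key, b text):)

CREATE FUNCTION merge_db(key int, data text) RETURNS void
LANGUAGE plpgsql AS $$
BEGIN
    LOOP
        -- first try to update an existing row
        UPDATE db SET b = data WHERE a = key;
        IF found THEN
            RETURN;
        END IF;
        -- not there, so try to insert; a concurrent insert of the same
        -- key will raise unique_violation, in which case we loop and
        -- try the UPDATE again
        BEGIN
            INSERT INTO db (a, b) VALUES (key, data);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- do nothing, and loop to try the UPDATE again
        END;
    END LOOP;
END;
$$;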

[1] http://www.postgresql.org/message-id/52b4aaf0.5090...@vmware.com
[2] http://www.postgresql.org/message-id/52d00d2d.6030...@vmware.com

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SKIP LOCKED DATA (work in progress)

2014-07-28 Thread Thomas Munro
On 27 July 2014 23:19, Thomas Munro mu...@ip9.org wrote:
 On the subject of isolation tests, I think skip-locked.spec is only
 producing schedules that reach the third of the three 'return
 HeapTupleWouldBlock' statements in heap_lock_tuple.  I will follow up
 with some more thorough isolation tests in the next week or so to
 cover the other two, and some other scenarios and interactions with
 other features.

Now with extra isolation tests so that the three different code
branches that can skip rows are covered.  I temporarily added some
logging lines to double check that the expected branches are reached
by each permutation while developing the specs.  They change the
output and are not part of the patch -- attaching separately.
diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index 231dc6a..0469705 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -45,7 +45,7 @@ SELECT [ ALL | DISTINCT [ ON ( replaceable class=parameterexpression/replac
 [ LIMIT { replaceable class=parametercount/replaceable | ALL } ]
 [ OFFSET replaceable class=parameterstart/replaceable [ ROW | ROWS ] ]
 [ FETCH { FIRST | NEXT } [ replaceable class=parametercount/replaceable ] { ROW | ROWS } ONLY ]
-[ FOR { UPDATE | NO KEY UPDATE | SHARE | KEY SHARE } [ OF replaceable class=parametertable_name/replaceable [, ...] ] [ NOWAIT ] [...] ]
+[ FOR { UPDATE | NO KEY UPDATE | SHARE | KEY SHARE } [ OF replaceable class=parametertable_name/replaceable [, ...] ] [ NOWAIT | SKIP LOCKED ] [...] ]
 
 phrasewhere replaceable class=parameterfrom_item/replaceable can be one of:/phrase
 
@@ -1283,7 +1283,7 @@ FETCH { FIRST | NEXT } [ replaceable class=parametercount/replaceable ] {
 The locking clause has the general form
 
 synopsis
-FOR replaceablelock_strength/ [ OF replaceable class=parametertable_name/replaceable [, ...] ] [ NOWAIT ]
+FOR replaceablelock_strength/ [ OF replaceable class=parametertable_name/replaceable [, ...] ] [ NOWAIT | SKIP LOCKED ]
 /synopsis
 
 where replaceablelock_strength/ can be one of
@@ -1359,11 +1359,17 @@ KEY SHARE
 
para
 To prevent the operation from waiting for other transactions to commit,
-use the literalNOWAIT/ option.  With literalNOWAIT/, the statement
-reports an error, rather than waiting, if a selected row
-cannot be locked immediately.  Note that literalNOWAIT/ applies only
-to the row-level lock(s) mdash; the required literalROW SHARE/literal
-table-level lock is still taken in the ordinary way (see
+use either the literalNOWAIT/ or literalSKIP LOCKED/literal
+option.  With literalNOWAIT/, the statement reports an error, rather
+than waiting, if a selected row cannot be locked immediately.
+With literalSKIP LOCKED/literal, any selected rows that cannot be
+immediately locked are skipped.  Skipping locked rows provides an
+inconsistent view of the data, so this is not suitable for general purpose
+work, but can be used to avoid lock contention with multiple consumers
+accessing a queue-like table.  Note that literalNOWAIT/
+and literalSKIP LOCKED/literal apply only to the row-level lock(s)
+mdash; the required literalROW SHARE/literal table-level lock is
+still taken in the ordinary way (see
 xref linkend=mvcc).  You can use
 xref linkend=sql-lock
 with the literalNOWAIT/ option first,
@@ -1386,14 +1392,14 @@ KEY SHARE
/para
 
para
-Multiple locking
-clauses can be written if it is necessary to specify different locking
-behavior for different tables.  If the same table is mentioned (or
-implicitly affected) by more than one locking clause,
-then it is processed as if it was only specified by the strongest one.
-Similarly, a table is processed
-as literalNOWAIT/ if that is specified in any of the clauses
-affecting it.
+Multiple locking clauses can be written if it is necessary to specify
+different locking behavior for different tables.  If the same table is
+mentioned (or implicitly affected) by more than one locking clause, then
+it is processed as if it was only specified by the strongest one.
+Similarly, a table is processed as literalNOWAIT/ if that is specified
+in any of the clauses affecting it.  Otherwise, it is processed
+as literalSKIP LOCKED/literal if that is specified in any of the
+clauses affecting it.
/para
 
para
@@ -1930,9 +1936,9 @@ SELECT distributors.* WHERE distributors.name = 'Westward';
 productnamePostgreSQL/productname allows it in any commandSELECT/
 query as well as in sub-commandSELECT/s, but this is an extension.
 The literalFOR NO KEY UPDATE/, literalFOR SHARE/ and
-literalFOR KEY SHARE/ variants,
-as well as the literalNOWAIT/ option,
-do not appear in the standard.
+literalFOR KEY SHARE/ variants, as well as the literalNOWAIT/
+and literalSKIP LOCKED/literal options, do not appear in the
+standard.
  
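
(As an aside, the queue-consumer pattern the new wording alludes to would
look roughly like this; table and column names are made up:)

BEGIN;
SELECT id, payload
  FROM job_queue
 WHERE NOT done
 ORDER BY id
 LIMIT 1
 FOR UPDATE SKIP LOCKED;   -- rows locked by other workers are skipped
-- ... process the returned job and mark it done ...
COMMIT;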

Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-07-28 Thread Peter Geoghegan
On Sun, Jul 27, 2014 at 12:34 PM, Peter Geoghegan p...@heroku.com wrote:
 It's more or less testing for a primary weight level (i.e. the first
 part of the blob) that is no larger than the original characters of
 the string, and has no header bytes or other redundancies.  It also
 matches secondary and subsequent weight levels to ensure that they
 match, since the two strings tested have identical case, use of
 diacritics, etc (they're both lowercase ASCII-safe strings). I don't
 set a locale, but that shouldn't matter.

Actually, come to think of it that might not quite be true. Consider
this output from Robert's strxfrm test program:

pg@hamster:~/code$ ./strxfrm hu_HU.utf8 potyty
potyty - 2826303001090909090109090909 (14 bytes)
pg@hamster:~/code$ ./strxfrm hu_HU.utf8 potyta
potyta - 2826302e0c010909090909010909090909 (17 bytes)

This is a very esoteric Hungarian collation rule [1], which at one
point we found we had to plaster over within varstr_cmp() to prevent
indexes giving wrong answers [2]. It turns out that with this
collation, strcoll("potyty", "potty") == 0. The point specifically is
that in principle, collations can alter the number of weights that
appear in the primary level of the blob. This might imply that the
number of primary level bytes for the ASCII-safe string "abcdefgh"
might not equal those of "ijklmnop" for some collation, because of the
application of some similar esoteric rule. In principle, collations
are at liberty to make that happen, even though this hardly ever
occurs in practice (we first heard about it in 2005, although the
Unicode algorithm standard warns of this), and even though in any of the
cases where it does occur it probably happens not to affect my little
AC_TRY_RUN program. Even still, I'm not comfortable with the
deficiency of the program. I don't want my optimization to
accidentally not apply just because some hypothetical collation where
this is true was used when Postgres was built. It probably couldn't
happen, but I must admit guaranteeing that it can't is a mess.
I suppose I could fix this by no longer assuming that the number of
bytes that appear in the primary level are fixed at n for n original
ASCII code point strings. I think that in theory even that could
break, though, because we have no principled way of parsing out
different weight levels (the Unicode standard has some ideas about how
given strxfrm()'s no NULL bytes in blob restriction, but that's
clearly implementation defined).

Given that Mac OS X is the only platform that appears to have this
header byte problem at all, I think we'd be better off specifically
disabling it on Mac OS X. I was very surprised to learn of the problem
on Mac OS X. Clearly it's going against the grain by having the
problem.

[1] http://www.postgresql.org/message-id/43a16bb7.7030...@mage.hu
[2] 
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=656beff59033ccc5261a615802e1a85da68e8fad
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Reminder: time to stand down from 8.4 maintenance

2014-07-28 Thread Tom Lane
PG 8.4.x is EOL as of last week's releases, so it's time to remove that
branch from any auto-update scripts you might have, reconfigure buildfarm
members that are force-building it, etc.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-07-28 Thread Peter Geoghegan
On Mon, Jul 28, 2014 at 4:41 PM, Peter Geoghegan p...@heroku.com wrote:
 Actually, come to think of it that might not quite be true.

Another issue is that we might just happen to use the C locale when
the AC_TRY_RUN program is invoked, which probably doesn't exhibit the
broken behavior of Mac OS X, since at least with glibc on Linux that
leaves you with a blob exactly matching the original string. Then
again, who knows? The Mac OS X behavior seems totally arbitrary to me.
If I had to guess I'd say it has something to do with their providing
an open standard shim to a UTF-16 based proprietary API.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add a QNX 6.5 port to PostgreSQL

2014-07-28 Thread Peter Geoghegan
On Mon, Jul 28, 2014 at 9:41 AM, Merlin Moncure mmonc...@gmail.com wrote:
 I don't think a buildfarm animal that doesn't run the actual upstream
 code is a good idea. That'll make it a lot harder to understand what's
 going on when something breaks after a commit.  It'd also require the
 custom patches being rebased ontop of $branch before every run...

 hm. oh well.  maybe if there was a separate page for custom builds
 (basically, an unsupported section).

I think that's a bad idea. The QNX OS seems to be mostly used in
safety-critical systems; it has a microkernel design. I think it would
be particularly bad to have iffy support for something like that.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-07-28 Thread Wim Lewis
On 28 Jul 2014, at 4:57 PM, Peter Geoghegan wrote:
 [] Then
 again, who knows? The Mac OS X behavior seems totally arbitrary to me.
 If I had to guess I'd say it has something to do with their providing
 an open standard shim to a UTF-16 based proprietary API.

A quick glance at OSX's strxfrm() suggests they're using an implementation of 
strxfrm() from FreeBSD. You can find the source here:


http://www.opensource.apple.com/source/Libc/Libc-997.90.3/string/FreeBSD/strxfrm.c

(and a really quick glance at the contents of libc on OSX 10.9 reinforces 
this--- I don't see any calls into their CoreFoundation unicode string APIs.)





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-07-28 Thread Peter Geoghegan
On Mon, Jul 28, 2014 at 5:14 PM, Wim Lewis w...@omnigroup.com wrote:
 A quick glance at OSX's strxfrm() suggests they're using an implementation of 
 strxfrm() from FreeBSD. You can find the source here:

 
 http://www.opensource.apple.com/source/Libc/Libc-997.90.3/string/FreeBSD/strxfrm.c

 (and a really quick glance at the contents of libc on OSX 10.9 reinforces 
 this--- I don't see any calls into their CoreFoundation unicode string APIs.)

Something isn't quite accounted for, then. The FreeBSD behavior is to
append the primary weights only. That makes their returned blobs
smaller than those you'll see on Linux, but also appears to imply that
their implementation is substandard (The PostgreSQL port uses ICU on
FreeBSD for a reason, I suppose). But FreeBSD did not add extra,
redundant header bytes right in the primary level when I tested it,
but I'm told Mac OS X does. I guess it could be that the collations
shipped differ, but I can't think why that would be. It does seem
peculiar that the Mac OS X blobs are always printable, whereas that
isn't the case with Glibc (the only restriction like that is that
there are no NULL bytes), and the Unicode algorithm standard
specifically says that that's okay.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reminder: time to stand down from 8.4 maintenance

2014-07-28 Thread Andrew Dunstan


I have removed it from the buildfarm server's branches_of_interest.txt.

Buildfarm members that rely on this file won't need to take any action,
except possibly to clean up their build root.


cheers

andrew


On 07/28/2014 07:41 PM, Tom Lane wrote:

PG 8.4.x is EOL as of last week's releases, so it's time to remove that
branch from any auto-update scripts you might have, reconfigure buildfarm
members that are force-building it, etc.

regards, tom lane






--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SKIP LOCKED DATA (work in progress)

2014-07-28 Thread Alvaro Herrera
Tom Lane wrote:

 It might be better if we'd declared AclMode in a single-purpose header,
 say utils/aclmode.h, and then #include'd that into parsenodes.h.
 There's certainly plenty of other single-datatype headers laying about.

Do you mean src/include/datatype/aclmode.h?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SKIP LOCKED DATA (work in progress)

2014-07-28 Thread Alvaro Herrera
David Rowley wrote:

 I've also been looking at the isolation tests and I see that you've added a
 series of tests for NOWAIT. I was wondering why you did that as that's
 really existing code, probably if you thought the tests were a bit thin
 around NOWAIT then maybe that should be a separate patch?

The isolation tester is new so we don't have nearly enough tests for it.
Adding more meaningful tests is good even if they're unrelated to the
patch at hand.

FWIW you can use configure --enable-coverage and make coverage-html to
get coverage reports.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SKIP LOCKED DATA (work in progress)

2014-07-28 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Tom Lane wrote:
 It might be better if we'd declared AclMode in a single-purpose header,
 say utils/aclmode.h, and then #include'd that into parsenodes.h.
 There's certainly plenty of other single-datatype headers laying about.

 Do you mean src/include/datatype/aclmode.h?

I was thinking src/include/utils/, actually, but maybe datatype/ would
be a good choice.

OTOH, what we've got in there now is just timestamp.h, and IIRC it was put
there because it needed to be accessible from both frontend and backend
contexts.  That would not be true of aclmode.h, so perhaps aclmode.h
doesn't belong there.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [w32] test_shm_mq test suite permanently burns connections slots

2014-07-28 Thread Alvaro Herrera
Robert Haas wrote:

 OK, I think I see the problem.  In EXEC_BACKEND mode,
 SubPostmasterMain() calls InitProcess() before IsBackgroundWorker has
 been set.  InitProcess() therefore pulls the PGPROC for the worker
 from freeProcs rather than bgworkerFreeProcs.  By exit time,
 IsBackgroundWorker has been set, so the PGPROC gets put back on the
 bgworkerFreeProcs list.  Eventually there are no regular PGPROCs left;
 they've all been moved to the bgworkerFreeProcs list.

Doh.  I'm surprised -- I tested a worker that crashed over and over to
ensure PGPROCs were reused sanely.  I guess I forgot to run it under
EXEC_BACKEND.

Are you fixing it?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] whether I can see other alternative plantrees for one query?

2014-07-28 Thread 土卜皿
hi, all
  NOTE: Version is 8.4   Fedora 20 X86_64

  for understanding the optimizer's internals, I

set debug_print_plan=on

and created two tables as follows:

create table Reserves (sid integer, bid integer, day date, rname char(25));
create table Sailors (sid integer, sname char(25), rating integer, age real);

and added 1,000,000 records to each.

and executed the command:

select S.rating,count(*) from Sailors S where S.rating  5
and S.age = 20 group by S.rating;

but from the log, I only found the final selected planTree, so I want to
ask:
what should I do if I want to see the other alternative planTrees?  Any
advice will be appreciated!

Dillon Peng


Re: [HACKERS] 9.4 documentation: duplicate paragraph in logical decoding example

2014-07-28 Thread Fujii Masao
On Mon, Jul 14, 2014 at 2:33 PM, Christoph Moench-Tegeder
c...@burggraben.net wrote:
 ## Andres Freund (and...@2ndquadrant.com):

 Care to submit a patch for it Christoph?

 There it is.

Thanks! Applied.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] whether I can see other alternative plantrees for one query?

2014-07-28 Thread Tom Lane
土卜皿 pengcz.n...@gmail.com writes:
   NOTE: Version is 8.4   Fedora 20 X86_64

You do realize that's five years out of date?  Not sure why you're running
an end-of-life database on a bleeding-edge OS.

   for understanding optimizer's internals, I
 *set debug_print_plan=on*
 ...
 but from the log, I only found the final selected planTree, so I want to
 ask:
 what should I do if I want to see the other alternative planTrees?  any
 advice will be apprecitaed!

The printed plan tree is actually the only one that the planner follows to
completion; other alternatives are discarded as soon as possible to save
useless processing.

For a query as simple as you're showing here, there aren't all that many
alternatives.  You can probably explore all of them by fooling with the
planner parameters enable_seqscan etc, by repeatedly disallowing whatever
plan the planner thought was cheapest so that it will select and print
the next cheapest.
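
For instance, something along these lines (a sketch; the exact predicates
don't matter for the technique):

EXPLAIN SELECT S.rating, count(*) FROM Sailors S GROUP BY S.rating;

SET enable_hashagg = off;    -- forbid whatever the planner just picked
EXPLAIN SELECT S.rating, count(*) FROM Sailors S GROUP BY S.rating;

RESET enable_hashagg;        -- likewise enable_seqscan, enable_sort, ...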

If you want to look more closely than that, you could add some code to the
add_path() subroutine so that it prints rejected paths --- but they'll
necessarily be pretty abstract and not user-friendly (or perhaps I should
say even less user-friendly than EXPLAIN usually is :-() because the
details simply aren't there.

There's lots of previous discussion in the PG archives, e.g. here, here,
and here:
http://www.postgresql.org/message-id/flat/CAN3Hncy1X9Zm4gJjGPc4ApYQe0Qs_pjZe=vw0V_J=rma-cl...@mail.gmail.com
http://www.postgresql.org/message-id/flat/canp-bfaraah2f9a55wqsanh4trbeerfp_g3karwc-jlu-kx...@mail.gmail.com
http://www.postgresql.org/message-id/flat/CAFcOn2-9j4fTcJ39xvdCByF6dg3U_=tgumccp1-7sh_j9g+...@mail.gmail.com

I'm fairly sure that I remember seeing some more-completely-worked-out
patches for printing partial paths as they're considered, but my search fu
is failing me at the moment.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] whether I can see other alternative plantrees for one query?

2014-07-28 Thread Kyotaro HORIGUCHI
Hi,

 hi, all
   NOTE: Version is 8.4   Fedora 20 X86_64

Why don't you play with 9.3 or later? 8.4 is now at the edge of EOL.

   for understanding optimizer's internals, I set debug_print_plan=on
 and created two tables as follows :
 
 create table Reserves (sid integer, bid integer,day date,rname char(25));
 create table Sailors(sid integer,sname char(25),rating integer,age real);
 
 and add 1,000,000 records for each.
 
 and execute the cmd:
 
 select S.rating,count(*)
 from Sailors S
 where S.rating  5 and S.age = 20
 group by S.rating;
 
 but from the log, I only found the final selected planTree, so I want to
 ask:
 what should I do if I want to see the other alternative planTrees?  any
 advice will be apprecitaed!

Forcing another plan by configuration parameters would help.

http://www.postgresql.org/docs/9.3/static/runtime-config-query.html

For example, setting enable_hashagg to off makes the planner try
to avoid using HashAggregate for grouping. If you got a plan
using HashAggregate, you will get another one using GroupAggregate
that way.

What you can do otherwise would be building PG with
CFLAGS=-DOPTIMIZER_DEBUG. This will show you a bit more than
debug_print_plan, but the query you mentioned is so simple that
the planner has almost no alternatives. Creating some index (say,
on age) would give the planner some alternatives.
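
For example (only a sketch; the index is just one possibility):

create index sailors_age_idx on Sailors (age);
analyze Sailors;

set enable_hashagg to off;   -- or leave it on and compare the two plans
explain select S.rating, count(*) from Sailors S
        where S.age = 20 group by S.rating;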

Have a good day,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers