Re: [HACKERS] JSON NULLs

2013-02-10 Thread Yeb Havinga

On 2013-02-08 15:15, Andrew Dunstan wrote:




Revised patch attached. The problem also existed with the get*_as_text 
functions (and their operators). Some additional regression tests are 
added to test these cases.
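
As a hedged illustration of the behavior being fixed (json_get_as_text is
the function name from this patch series, not something in core at the time):

   -- a JSON null should come back as SQL NULL, not the string 'null'
   select json_get_as_text('{"a": null}'::json, 'a') is null;  -- expected: t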


Hi,

I did some minor things with the patch today.

1. Thanks for the work on the json type; it's great to see it in Postgres, 
and more functions for it too!


2.
During compilation on the system below (gcc -v output follows), I got this
warning:

jsonfuncs.c: In function `each_object_field_end':
jsonfuncs.c:1151:13: warning: assignment makes integer from pointer 
without a cast


yeb@unix:~/ff$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/x86_64-linux-gnu/gcc/x86_64-linux-gnu/4.5.2/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 
4.5.2-8ubuntu4' --with-bugurl=file:///usr/share/doc/gcc-4.5/README.Bugs 
--enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr 
--program-suffix=-4.5 --enable-shared --enable-multiarch 
--with-multiarch-defaults=x86_64-linux-gnu --enable-linker-build-id 
--with-system-zlib --libexecdir=/usr/lib/x86_64-linux-gnu 
--without-included-gettext --enable-threads=posix 
--with-gxx-include-dir=/usr/include/c++/4.5 
--libdir=/usr/lib/x86_64-linux-gnu --enable-nls --with-sysroot=/ 
--enable-clocale=gnu --enable-libstdcxx-debug 
--enable-libstdcxx-time=yes --enable-plugin --enable-gold 
--enable-ld=default --with-plugin-ld=ld.gold --enable-objc-gc 
--disable-werror --with-arch-32=i686 --with-tune=generic 
--enable-checking=release --build=x86_64-linux-gnu 
--host=x86_64-linux-gnu --target=x86_64-linux-gnu

Thread model: posix
gcc version 4.5.2 (Ubuntu/Linaro 4.5.2-8ubuntu4)

3. I was wondering how to access the first author from this json snippet:

{
 "id": "QZr82w_eSi8C",
 "etag": "KZ+JsrkCdqw",
 "volumeInfo": {
  "title": "Heads Up Software Construction",
  "authors": [
   "Dan Malone",
   "Dave Riles"
  ],


and played a bit with json_get_path_as_text(document, 'volumeInfo', 
'authors'), which accepts a list of keys as arguments. Have you thought 
about an implementation that would accept a single path argument like 
'volumeInfo.authors[0]'? This might be more powerful and easier to use, 
since the user does not need to call another function to get the first 
element from the authors array, and the function call does not need to be 
changed when path lengths change.


My apologies if this has been discussed before - I've gone through 
threads from Nov 2012 but did not find a previous discussion about this 
topic.


regards,
Yeb Havinga



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: Successful post to pgsql-hackers

2013-02-10 Thread Magnus Hagander
On Sun, Feb 10, 2013 at 3:25 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Magnus Hagander wrote:
 On Sat, Feb 9, 2013 at 4:31 PM, Peter Geoghegan
 peter.geoghega...@gmail.com wrote:
  On 9 February 2013 15:24, Magnus Hagander mag...@hagander.net wrote:
  It's in your personal majordomo settings. I don't think it's related
  to that at all, but it's a flag you set on your subscription, so
  perhaps you set it by mistake.
 
  Then I must have set it by mistake too, when I recently changed e-mail
  addresses.

 Hmm. I wonder if Alvaro may have accidentally switched the default, if
 it happened to more than one person. Alvaro, can you check?

 I hadn't touched this, but Andres Freund had already complained about
 this before.  I just checked and yes, it seems that the flag to send a
 confirmation for each post is set.  I have reset it.

 I also took the opportunity to set the flags to send confirmation emails
 when a posting is rejected or stalled for moderation.  We discussed this
 previously (Pavel Stehule complained about it).

 I changed the defaults for new subscribers, and also the flags values
 for all existing subscribers, note.

For *all* existing subscribers, or just those that had not changed their
defaults? And did you change just those flags, or all flags?


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: proposal: a width specification for s specifier (format function), fix behave when positional and ordered placeholders are used

2013-02-10 Thread Pavel Stehule
2013/2/10 Dean Rasheed dean.a.rash...@gmail.com:
 On 9 February 2013 18:30, Pavel Stehule pavel.steh...@gmail.com wrote:
 There are a new question

 what should be result of

 format('%2$*1$s', NULL, 'hello')

 ???

 My first thought is that a NULL width should be treated the same as no
 width at all (equivalent to width=0), rather than raising an
 exception.

 minor update - fix alignment of NULL for %L

 You need to do the same for a NULL value with %s, which currently
 produces an empty string regardless of the width.

 do others have the same opinion? Usually I don't like hiding NULLs, but
 this is a corner case (and a specific function) and I don't have a strong
 opinion on this issue.


 One use case for this might be something like

 SELECT format('%*s', minimum_width, value) FROM some_table;

 Throwing an exception when minimum_width is NULL doesn't seem
 particularly useful. Intuitively, it just means that row has no
 minimum width, so I think we should allow it.

 I think the case where the value is NULL is much more clear-cut.
 format('%s') produces '' (empty string). So format('%3s') should
 produce '   '.



ok - in this case I can accept NULL meaning "ignore the width"
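
For example, under the semantics agreed above (the %s width support itself
is from the patch under discussion, not yet in core):

   SELECT format('%*s', NULL, 'abc');  -- NULL width treated as no width: 'abc'
   SELECT format('%3s', NULL);         -- NULL value with width 3: '   '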




 The documentation also needs to be updated. I'm thinking perhaps
 format() should now have its own separate sub-section in the manual,
 rather than trying to cram its docs into a single table row. I can
 help with the docs if you like.

 please, if you can, write it. I am sure you will do it significantly
 better than me.


 Here is my first draft. I've also attached the generated HTML page,
 because it's not so easy to read an SGML patch.


nice

I have only one point - I think the format function should still appear in
table 9-6, as a small entry with a reference to the special section.

Regards

Pavel

 Regards,
 Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] JSON NULLs

2013-02-10 Thread Andrew Dunstan


On 02/10/2013 05:43 AM, Yeb Havinga wrote:

On 2013-02-08 15:15, Andrew Dunstan wrote:




Revised patch attached. The problem also existed with the 
get*_as_text functions (and their operators). Some additional 
regression tests are added to test these cases.


Hi,

I did some minor things with the patch today.

1. Thanks for the work on the json type; it's great to see it in Postgres 
and more functions for it too!


2.
During compilation I got this warning:

jsonfuncs.c: In function `each_object_field_end':
jsonfuncs.c:1151:13: warning: assignment makes integer from pointer 
without a cast



Thanks, I have fixed this in my code, and it will be included in the 
next patch I post.







3. I was wondering how to access the first author from this json snippet:

{
 "id": "QZr82w_eSi8C",
 "etag": "KZ+JsrkCdqw",
 "volumeInfo": {
  "title": "Heads Up Software Construction",
  "authors": [
   "Dan Malone",
   "Dave Riles"
  ],


and played a bit with json_get_path_as_text(document, 'volumeInfo', 
'authors'), which accepts a list of keys as arguments. Have you thought 
about an implementation that would accept a single path argument like 
'volumeInfo.authors[0]'? This might be more powerful and easier to use, 
since the user does not need to call another function to get the first 
element from the authors array, and the function call does not need to 
be changed when path lengths change.



try:

   json_get_path_as_text(document,  'volumeInfo', 'authors', '0')


There are other ways to spell this, too:

   json_get_path_as_text(document,  variadic
   '{volumeInfo,authors,0}'::text[])


or

   document->'{volumeInfo,authors,0}'::text[]
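
Applied to the snippet above, any of these return the first author; for
example (a sketch assuming a hypothetical table books holding the document):

   select json_get_path_as_text(document, 'volumeInfo', 'authors', '0')
   from books;
   -- 'Dan Malone'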

I'm actually wondering if we should use different operator names for the 
get_path* op functions so we wouldn't need to type-qualify the path 
argument. Maybe ?> and ?>>, although I'm reluctant to use ? in an 
operator given the recent JDBC discussion. Or perhaps #> and #>>.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Department of Redundancy Department: makeNode(FuncCall) division

2013-02-10 Thread Tom Lane
David Fetter da...@fetter.org writes:
 Per suggestions and lots of help from Andrew Gierth, please find
 attached a patch to clean up the call sites for FuncCall nodes, which
 I'd like to expand centrally rather than in each of the 37 (or 38, but
 I only redid 37) places where it's called.  The remaining one is in
 src/backend/nodes/copyfuncs.c, which has to be modified for any
 changes in that struct anyhow.

TBH, I don't think this is an improvement.

The problem with adding a new field to any struct is that you have to
run around and examine (and, usually, modify) every place that
manufactures that type of struct.  With a makeFuncCall defined like
this, you'd still have to do that; it would just become a lot easier
to forget to do so.

If the subroutine were defined like most other makeXXX subroutines,
ie you have to supply *all* the fields, that argument would go away,
but the notational advantage is then dubious.

The bigger-picture point is that you're proposing to make the coding
conventions for building FuncCalls different from what they are for
any other grammar node.  I don't think that's a great idea; it will
mostly foster confusion.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Department of Redundancy Department: makeNode(FuncCall) division

2013-02-10 Thread David Fetter
On Sun, Feb 10, 2013 at 10:09:19AM -0500, Tom Lane wrote:
 David Fetter da...@fetter.org writes:
  Per suggestions and lots of help from Andrew Gierth, please find
  attached a patch to clean up the call sites for FuncCall nodes, which
  I'd like to expand centrally rather than in each of the 37 (or 38, but
  I only redid 37) places where it's called.  The remaining one is in
  src/backend/nodes/copyfuncs.c, which has to be modified for any
  changes in that struct anyhow.
 
 TBH, I don't think this is an improvement.
 
 The problem with adding a new field to any struct is that you have to
 run around and examine (and, usually, modify) every place that
 manufactures that type of struct.  With a makeFuncCall defined like
 this, you'd still have to do that; it would just become a lot easier
 to forget to do so.

So you're saying that I've insufficiently put the choke point there?

It seems to me that needing to go back to each and every jot and
tittle of the code to add a new default parameter in each place is a
recipe for mishaps, whereas with this one, the new correct defaults will
just appear in all the places they need to.

 If the subroutine were defined like most other makeXXX subroutines,
 ie you have to supply *all* the fields, that argument would go away,
 but the notational advantage is then dubious.
 
 The bigger-picture point is that you're proposing to make the coding
 conventions for building FuncCalls different from what they are for
 any other grammar node.  I don't think that's a great idea; it will
 mostly foster confusion.

I find this a good bit less confusing than the litany of repeated
parameter settings, the vast majority of which in most cases are along
the lines of NULL/NIL/etc.

This way, anything set that's not the default stands out.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] A question about the psql \copy command

2013-02-10 Thread Robert Haas
On Thu, Feb 7, 2013 at 7:45 AM, Etsuro Fujita
fujita.ets...@lab.ntt.co.jp wrote:
 Through the work on the patch [1], I had a question about the psql \copy
 command.  1) is permitted, but 2) is not:
 1) \copy foo from stdin ;
 2) \copy foo from stdin;
 Is this intentional?  I think it would be better to allow for 2).
 Attached is a patch.

Sounds reasonable to me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] backup.sgml patch that adds information on custom format backups

2013-02-10 Thread Robert Haas
On Fri, Feb 8, 2013 at 1:56 PM, Ivan Lezhnjov IV i...@commandprompt.com wrote:
 Hello,

 I'd like to submit the following patch that extends backup.sgml with a bit of 
 practical but important information.

 Project: postgresql
 Patch filename: backup.sgml-cmd-v001.patch

 The patch extends backup.sgml and adds practical information on the custom 
 format backup approach. Basically, we believe that the plaintext backup 
 format is suitable for a very limited range of use cases, and that in the 
 real world people are usually better off with a custom format backup. This 
 is what we want PostgreSQL users to be aware of, and we provide some 
 hands-on examples of how to do backups using this approach.

 It is meant for application, and is against master branch.

 The patch does pass 'make check' and 'make html' successfully.

 PS: this is my first submission ever. So, if I'm missing something or not 
 doing it as expected, please, do let me know. Thank you.

Generally it's a good idea to discuss whatever you'd like to change
before you actually write the patch, so as to get consensus on it.
I'm not sure what others think, but the proposed wording seems a bit
hard on plain text dumps to me.  For many people, the advantage of
having a file in a format you can read may outweigh the disadvantages
of being unable to do a selective restore.  Not that custom-format
dumps aren't great; they certainly are.  But it isn't a bug that we
support more than one format.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BUG #7493: Postmaster messages unreadable in a Windows console

2013-02-10 Thread Noah Misch
On Wed, Jan 30, 2013 at 10:00:01AM +0400, Alexander Law wrote:
 30.01.2013 05:51, Noah Misch wrote:
 On Tue, Jan 29, 2013 at 09:54:04AM -0500, Tom Lane wrote:
 Alexander Law exclus...@gmail.com writes:
 Please look at the following l10n bug:
 http://www.postgresql.org/message-id/502a26f1.6010...@gmail.com
 and the proposed patch.

 Even then, I wouldn't be surprised to find problematic consequences beyond
 error display.  What if all the databases are EUC_JP, the platform encoding 
 is
 KOI8, and some postgresql.conf settings contain EUC_JP characters?  Does the
 postmaster not rely on its use of SQL_ASCII to allow those values?

 I would look at fixing this by making the error output machinery smarter in
 this area before changing the postmaster's notion of server_encoding.

With your proposed change, the problem will resurface in an actual SQL_ASCII
database.  At the problem's root is write_console()'s assumption that messages
are in the database encoding.  pg_bind_textdomain_codeset() tries to make that
so, but it only works for encodings with a pg_enc2gettext_tbl entry.  That
excludes SQL_ASCII, MULE_INTERNAL, and others.  write_console() needs to
behave differently in such cases.

 Maybe I still miss something, but I thought that
 postinit.c/CheckMyDatabase will switch the encoding of messages to
 EUC_JP via pg_bind_textdomain_codeset, so there will be no issues with it.
 But until then KOI8 should be used.
 Regarding postgresql.conf: as it has no explicit encoding specification,
 it should be interpreted as having the platform encoding. So in your
 example it should contain KOI8, not EUC_JP, characters.

Following some actual testing, I see that we treat postgresql.conf values as
byte sequences; any reinterpretation as encoded text happens later.  Hence,
contrary to my earlier suspicion, your patch does not make that situation
worse.  The present situation is bad; among other things, current_setting() is
a vector for injecting invalid text data.  But unconditionally validating
postgresql.conf values in the platform encoding would not be an improvement.
Suppose you have a UTF-8 platform encoding and KOI8R databases.  You may wish
to put KOI8R strings in a GUC, say search_path.  That's possible today; if we
required that postgresql.conf conform to the platform encoding and no other,
it would become impossible.  This area warrants improvement, but doing so will
entail careful design.
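
A hedged sketch of that injection vector (the setting value is illustrative):

   # postgresql.conf: the value's bytes are stored as-is, unvalidated
   search_path = 'schema_name_in_koi8r_bytes'

   -- later, in a UTF-8 database:
   SELECT current_setting('search_path');  -- may return bytes invalid in UTF-8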

Thanks,
nm


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] pg_isready (was: [WIP] pg_ping utility)

2013-02-10 Thread Phil Sorber
On Fri, Feb 8, 2013 at 1:07 PM, Phil Sorber p...@omniti.com wrote:
 On Fri, Feb 8, 2013 at 12:46 PM, Fujii Masao masao.fu...@gmail.com wrote:
 No maybe. But I think that all the client commands should follow the
 same rule. Otherwise a user would get confused when specifying
 options.

 I would consider the rest of the apps using it as a consensus. I will
 make sure it aligns in behavior.


I've done as you suggested, and made sure they align with other
command line utils. What I have found is that dbname is passed
(almost) last in the param array so that it clobbers all previous
values. I have made this patch as minimal as possible, basing it off of
master and not off of my previous attempt. For the record, I still like
the overall design of my previous attempt better, but I have not
included a new version based on that here so as not to confuse the
issue; however, I would gladly do so upon request.

Updated patch attached.


 Regards,

 --
 Fujii Masao


pg_isready_con_str_v4.diff
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] backup.sgml patch that adds information on custom format backups

2013-02-10 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:

 I'm not sure what others think, but the proposed wording seems a
 bit hard on plain text dumps to me.

Agreed.  I don't know how many times I've piped the output of
pg_dump to the input of psql.  Certainly that was very common
before pg_upgrade was available.  And for development purposes a
text script file is often exactly what is needed -- to store it in
custom format first would just be an extra step which would waste
time.  Since I got my head around PITR dumps, I've rarely had a
reason to use the pg_dump custom format, myself.
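
For example, a plain dump restores with nothing more than a pipe:

   pg_dump mydb | psql -h newhost mydb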

-Kevin



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] backup.sgml patch that adds information on custom format backups

2013-02-10 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I'm not sure what others think, but the proposed wording seems a bit
 hard on plain text dumps to me.

I wasn't terribly thrilled with labeling pg_dumpall "deprecated",
either.  It might be imperfect for some use cases, but that adjective
suggests that we're trying to get rid of it, which surely isn't the
case.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pgbench --startup option

2013-02-10 Thread Jeff Janes
While looking at some proposed patches and pondering some questions on
performance, I realized I desperately needed ways to run benchmarks with
different settings without needing to edit postgresql.conf and
restart/reload the server each time.

Most commonly, I want to run with synchronous_commit on or off at whim.  I
thought of writing a patch specifically for that, but decided a more
general approach would be better.  The attached patch allows an arbitrary
command to be executed upon the startup of each connection intended for
benchmarking (as opposed to utility connections, used either for -i or
for counting the rows in pgbench_branches and for vacuuming), and so
covers the need for changing any session-changeable GUCs, plus doing other
things as well.
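
With the proposed option that looks like, e.g. (a sketch; the option name
is from the attached patch):

   pgbench -c 8 -j 2 -T 60 --startup='SET synchronous_commit = off' bench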

I created doBenchMarkConnect() to segregate bench-marking connections from
utility connections.  At first I thought of adding the startup code to only
the normal path and leaving support for -C in the wind, but decided that
was just lazy.

Will add to commitfest-next.

Cheers,

Jeff
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
new file mode 100644
index 11c0062..3a00127
*** a/contrib/pgbench/pgbench.c
--- b/contrib/pgbench/pgbench.c
*** double		sample_rate = 0.0;
*** 142,147 ****
--- 142,148 ----
  char	   *tablespace = NULL;
  char	   *index_tablespace = NULL;
  
+ char	   *startup = NULL;
  /*
   * end of configurable parameters
   */
*** usage(void)
*** 394,400 ****
  		 "--unlogged-tables\n"
  		 "             create tables as unlogged tables\n"
  		 "\nBenchmarking options:\n"
! 		 "  -c NUM   number of concurrent database clients (default: 1)\n"
  		 "-C   establish new connection for each transaction\n"
  		 "-D VARNAME=VALUE\n"
  		 "             define variable for use by custom script\n"
--- 395,401 ----
  		 "--unlogged-tables\n"
  		 "             create tables as unlogged tables\n"
  		 "\nBenchmarking options:\n"
! 		 "-c NUM   number of concurrent database clients (default: 1)\n"
  		 "-C   establish new connection for each transaction\n"
  		 "-D VARNAME=VALUE\n"
  		 "             define variable for use by custom script\n"
*** usage(void)
*** 412,420 ****
  		 "-r   report average latency per command\n"
  		 "-s NUM   report this scale factor in output\n"
  		 "-S   perform SELECT-only transactions\n"
! 	   "-t NUM   number of transactions each client runs (default: 10)\n"
  		 "-T NUM   duration of benchmark test in seconds\n"
  		 "-v   vacuum all four standard tables before tests\n"
  		 "\nCommon options:\n"
  		 "-d print debugging output\n"
  		 "-h HOSTNAME  database server host or socket directory\n"
--- 413,423 ----
  		 "-r   report average latency per command\n"
  		 "-s NUM   report this scale factor in output\n"
  		 "-S   perform SELECT-only transactions\n"
! 		 "-t NUM   number of transactions each client runs (default: 10)\n"
  		 "-T NUM   duration of benchmark test in seconds\n"
  		 "-v   vacuum all four standard tables before tests\n"
+ 		 "--startup=COMMAND\n"
+ 		 "             Run COMMAND upon creating each benchmarking connection\n"
  		 "\nCommon options:\n"
  		 "-d print debugging output\n"
  		 "-h HOSTNAME  database server host or socket directory\n"
*** executeStatement(PGconn *con, const char
*** 520,526 ****
  	res = PQexec(con, sql);
  	if (PQresultStatus(res) != PGRES_COMMAND_OK)
  	{
! 		fprintf(stderr, "%s", PQerrorMessage(con));
  		exit(1);
  	}
  	PQclear(res);
--- 523,529 ----
  	res = PQexec(con, sql);
  	if (PQresultStatus(res) != PGRES_COMMAND_OK)
  	{
! 		fprintf(stderr, "Command failed with %s", PQerrorMessage(con));
  		exit(1);
  	}
  	PQclear(res);
*** doConnect(void)
*** 593,598 ****
--- 596,613 ----
  	return conn;
  }
  
+ /* set up a connection to the backend intended for benchmarking */
+ static PGconn *
+ doBenchMarkConnect(void)
+ {
+ 	static PGconn * conn;
+ 	conn = doConnect();
+ 	if (startup != NULL) {
+ 		executeStatement(conn, startup);
+ 	};
+ 	return conn;
+ };
+ 
  /* throw away response from backend */
  static void
  discard_response(CState *state)
*** top:
*** 1131,1137 ****
  	end;
  
  		INSTR_TIME_SET_CURRENT(start);
! 		if ((st->con = doConnect()) == NULL)
  		{
  			fprintf(stderr, "Client %d aborted in establishing connection.\n", st->id);
  			return clientDone(st, false);
--- 1146,1152 ----
  	end;
  
  		INSTR_TIME_SET_CURRENT(start);
! 		if ((st->con = doBenchMarkConnect()) == NULL)
  		{
  			fprintf(stderr, "Client %d aborted in establishing connection.\n", st->id);
  			return clientDone(st, false);
*** main(int argc, char **argv)
*** 2139,2144 ****
--- 2154,2160 ----
  		{"unlogged-tables", no_argument, 

Re: [HACKERS] pgbench --startup option

2013-02-10 Thread Peter Geoghegan
On 10 February 2013 23:27, Jeff Janes jeff.ja...@gmail.com wrote:
 While looking at some proposed patches and pondering some questions on
 performance, I realized I desperately needed ways to run benchmarks with
 different settings without needing to edit postgresql.conf and
 restart/reload the server each time.

I'd thought about hacking this capability into pgbench-tools, so that
there was a new outer-most loop that iterates through different
postgresql.conf files. Come to think of it, the need to vary settings
per test set (that is, per series of benchmarks, each of which is
plotted as a different color) generally only exists for one or two
settings, so it is probably better to pursue the approach that you
propose here.

I guess what I've outlined could still be useful for PGC_POSTMASTER
gucs, like shared_buffers.

-- 
Regards,
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BUG #7493: Postmaster messages unreadable in a Windows console

2013-02-10 Thread Tom Lane
Noah Misch n...@leadboat.com writes:
 Following some actual testing, I see that we treat postgresql.conf values as
 byte sequences; any reinterpretation as encoded text happens later.  Hence,
 contrary to my earlier suspicion, your patch does not make that situation
 worse.  The present situation is bad; among other things, current_setting() is
 a vector for injecting invalid text data.  But unconditionally validating
 postgresql.conf values in the platform encoding would not be an improvement.
 Suppose you have a UTF-8 platform encoding and KOI8R databases.  You may wish
 to put KOI8R strings in a GUC, say search_path.  That's possible today; if we
 required that postgresql.conf conform to the platform encoding and no other,
 it would become impossible.  This area warrants improvement, but doing so will
 entail careful design.

The key problem, ISTM, is that it's not at all clear what encoding to
expect the incoming data to be in.  I'm concerned about trying to fix
that by assuming it's in some platform encoding --- for one thing,
while that might be a well-defined concept on Windows, I don't believe
it is anywhere else.

If we knew that postgresql.conf was stored in, say, UTF8, then it would
probably be possible to perform encoding conversion to get string
variables into the database encoding.  Perhaps we should allow some
magic syntax to tell us the encoding of a config file?

file_encoding = 'utf8'  # must precede any non-ASCII in the file

There would still be a lot of practical problems to solve, like what to
do if we fail to convert some string into the database encoding.  But at
least the problems would be somewhat well-defined.

While we're thinking about this, it'd be nice to fix our handling (or
rather lack of handling) of encoding considerations for database names,
user names, and passwords.  I could imagine adding some sort of encoding
marker to connection request packets, which could fix the don't-know-
the-encoding problem as far as incoming data is concerned.  But how
shall we deal with storing the strings in shared catalogs, which have to
be readable from multiple databases possibly of different encodings?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] performance regression in 9.2 CTE with SRF function

2013-02-10 Thread Tom Lane
Pavel Stehule pavel.steh...@gmail.com writes:
 In a Czech discussion group, a performance regression of a CTE
 query was reported. I wrote a test with which I can show it.

I don't see anything tremendously wrong here.  The older branches are
choosing the right plan for entirely wrong reasons: they don't notice
that "select foo(a) from pl" has a set-returning function in the
targetlist and so don't adjust the estimated number of result rows
for that.  In this particular example, foo() seems to return an average
of about 11 rows per call, versus the default estimate of 1000 rows per
call, so the size of the result is overestimated and that dissuades
the planner from using a hashed subplan.  But the error could easily
have gone the other way, causing the planner to use a hashed subplan
when it shouldn't, and the consequences of that are even worse.  So
I don't think that ignoring SRFs in the targetlist is better.

If you add ROWS 10 or so to the declaration of the function, you
get a better row estimate and it goes back to the hashed subplan.
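
For illustration (a sketch only; foo()'s actual definition is not shown in
the thread):

   create or replace function foo(int) returns setof int
   language sql rows 10
   as $$ select generate_series(1, $1) $$;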

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Improving pgbench to log index creation time etc.

2013-02-10 Thread Tatsuo Ishii
Hi PostgreSQL hackers,

I often need to measure the time spent on index creation and vacuum
to analyze PostgreSQL data load performance. Index creation and vacuum
take non-trivial time for large-scale data, and they are important
information in a data loading benchmark.

So I would like to propose adding new options to be used with pgbench's
-i (initialize) mode:

--log-index-creation-duration: log duration of index creation in seconds
--log-vacuum-duration: log duration of vacuum in seconds
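
For example (a sketch; these option names are the proposal, not existing
flags):

   pgbench -i -s 100 --log-index-creation-duration --log-vacuum-duration bench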

What do you think?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: Successful post to pgsql-hackers

2013-02-10 Thread Gurjeet Singh
On Sat, Feb 9, 2013 at 5:09 PM, Euler Taveira eu...@timbira.com wrote:

 On 09-02-2013 13:45, Gurjeet Singh wrote:
  BTW, I hope I understand what selfcopy is: send a copy to yourself.
  Why would that be turned on by default?
 
 If you want to reply to yourself...


Wouldn't I be using the copy in my sent folder to do that?

I guess it is a non-issue for me since my mail provider (GMail) does not
show me the mail I sent to myself as a duplicate; it at least seems that
way.

-- 
Gurjeet Singh

http://gurjeet.singh.im/


Re: [HACKERS] Fwd: Successful post to pgsql-hackers

2013-02-10 Thread Alvaro Herrera
Magnus Hagander wrote:
 On Sun, Feb 10, 2013 at 3:25 AM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:

  I changed the defaults for new subscribers, and also the flags values
  for all existing subscribers, note.
 
 For *all* existing subscribers, or those that had not changed their
 defaults? And did you change just those flags, or for all flags?

For all existing subscribers, I changed the ackpost (to off), ackreject
(to on), ackstall (to on) flags.  Other flags were untouched.  There's
no way to distinguish "unchanged" from "set to the same as the default"
in mj2, I'm afraid.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] performance regression in 9.2 CTE with SRF function

2013-02-10 Thread Pavel Stehule
2013/2/11 Tom Lane t...@sss.pgh.pa.us:
 Pavel Stehule pavel.steh...@gmail.com writes:
  In a Czech discussion group, a performance regression of a CTE
  query was reported. I wrote a test with which I can show it.

 I don't see anything tremendously wrong here.  The older branches are
 choosing the right plan for entirely wrong reasons: they don't notice
 that "select foo(a) from pl" has a set-returning function in the
 targetlist and so don't adjust the estimated number of result rows
 for that.  In this particular example, foo() seems to return an average
 of about 11 rows per call, versus the default estimate of 1000 rows per
 call, so the size of the result is overestimated and that dissuades
 the planner from using a hashed subplan.  But the error could easily
 have gone the other way, causing the planner to use a hashed subplan
 when it shouldn't, and the consequences of that are even worse.  So
 I don't think that ignoring SRFs in the targetlist is better.

no, there is a strange estimation

 ->  Seq Scan on public.x2  (cost=0.00..345560.00 rows=500
width=4) (actual time=17.914..9330.645 rows=133 loops=1)
   Output: x2.a
   Filter: (NOT (SubPlan 2))
   Rows Removed by Filter: 867
   SubPlan 2
 ->  CTE Scan on pl pl_1  (cost=0.00..468.59
rows=89000 width=4) (actual time=0.023..8.379 rows=566 loops=1000)
   Output: foo(pl_1.a)

The CTE Scan expects rows=89000

I don't know how it is possible to get such a high number

Regards

Pavel


 If you add ROWS 10 or so to the declaration of the function, you
 get a better row estimate and it goes back to the hashed subplan.

 regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] performance regression in 9.2 CTE with SRF function

2013-02-10 Thread Pavel Stehule

 no, there is a strange estimation

  ->  Seq Scan on public.x2  (cost=0.00..345560.00 rows=500
 width=4) (actual time=17.914..9330.645 rows=133 loops=1)
   Output: x2.a
   Filter: (NOT (SubPlan 2))
   Rows Removed by Filter: 867
   SubPlan 2
  ->  CTE Scan on pl pl_1  (cost=0.00..468.59
 rows=89000 width=4) (actual time=0.023..8.379 rows=566 loops=1000)
   Output: foo(pl_1.a)

 The CTE Scan expects rows=89000

 I don't know how it is possible to get such a high number


or rather, why the estimation is unstable

first (1MB work_mem)

 ->  CTE Scan on pl pl_1  (cost=0.00..468.59
rows=89000 width=4) (actual time=0.023..8.379 rows=566 loops=1000)
   Output: foo(pl_1.a)


second (3MB work_mem)

  ->  Hash  (cost=1.78..1.78 rows=89 width=4) (actual
time=9.650..9.650 rows=89 loops=1)
 Output: pl.a
 Buckets: 1024  Batches: 1  Memory Usage: 3kB
 ->  CTE Scan on pl  (cost=0.00..1.78 rows=89 width=4) (actual
time=8.468..9.346 rows=89 loops=1)
   Output: pl.a

I expect the estimation not to depend on work_mem

Best regards

Pavel

 Regards

 Pavel


 If you add ROWS 10 or so to the declaration of the function, you
 get a better row estimate and it goes back to the hashed subplan.

 regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unlogged tables vs. GIST

2013-02-10 Thread Jeevan Chalke
Hi,

Any review comments on this ?


On Tue, Jan 29, 2013 at 6:03 PM, Jeevan Chalke 
jeevan.cha...@enterprisedb.com wrote:

 Hi Heikki,


 On Mon, Jan 28, 2013 at 2:34 PM, Heikki Linnakangas 
 hlinnakan...@vmware.com wrote:

 On 23.01.2013 17:30, Robert Haas wrote:

 On Wed, Jan 23, 2013 at 4:04 AM, Jeevan Chalke
  jeevan.cha...@enterprisedb.com wrote:

  I guess my earlier patch, which was directly incrementing the
  ControlFile->unloggedLSN counter, was the concern as it will take
  ControlFileLock several times.

 In this version of the patch I did what Robert has suggested. At start of
 the postmaster, we copy the unloggedLSN value to XLogCtl, a shared memory
 struct, and all access to unloggedLSN goes through this shared variable
 under a SpinLock. And since we want to keep this counter persistent across
 clean shutdown, we store it in ControlFile before updating it.

 With this approach, we are updating ControlFile only when we shut down the
 server; the rest of the time we are using a shared memory counter. That
 means we are not touching pg_control every other millisecond or so. Also,
 since we are not caring about crashes, XLogging this counter like the OID
 counter is not required as such.


 On a quick read-through this looks reasonable to me, but others may
 have different opinions, and I haven't reviewed in detail.

  ...

 [a couple of good points]


 In addition to those things Robert pointed out:

  /*
 + * Temporary GiST indexes are not WAL-logged, but we need LSNs to detect
 + * concurrent page splits anyway. GetXLogRecPtrForUnloggedRel() provides
 + * a fake sequence of LSNs for that purpose. Each call generates an LSN
 + * that is greater than any previous value returned by this function in
 + * the same session, using a static counter.
 + * Similarly, unlogged GiST indexes are also not WAL-logged, but we need
 + * a persistent counter across clean shutdown. Use the counter from
 + * ControlFile, which is copied into XLogCtl.unloggedLSN, to accomplish
 + * that. If the relation is UNLOGGED, return the persistent counter from
 + * XLogCtl; else return the session-wide temporary counter.
 + */
 +XLogRecPtr
 +GetXLogRecPtrForUnloggedRel(Relation rel)


 From a modularity point of view, it's not good that xlog.c needs to
 know about the Relation struct. Perhaps move the logic to check whether
 the relation is unlogged or not to a function in gistutil.c, and only have
 a small GetUnloggedLSN() function in xlog.c.


 I kept the function as is, but instead of sending Relation as a parameter,
 it now takes a boolean, isUnlogged. Added a macro for that.



 I'd suggest adding a separate spinlock to protect unloggedLSN. I'm not
 sure if there's much contention on XLogCtl->info_lck today, but
 nevertheless it'd be better to keep this separate, also from a modularity
 point of view.


 Hmm. OK.
 Added ulsn_lck for this.



  @@ -7034,6 +7078,8 @@ CreateCheckPoint(int flags)
 LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
 ControlFile->state = DB_SHUTDOWNING;
 ControlFile->time = (pg_time_t) time(NULL);
 +   /* Store unloggedLSN value as we want it persistent
 across shutdown */
 +   ControlFile->unloggedLSN = XLogCtl->unloggedLSN;
 UpdateControlFile();
 LWLockRelease(ControlFileLock);
 }


 This needs to acquire the spinlock to read unloggedLSN.


 OK. Done.



 Do we need to do anything to unloggedLSN in pg_resetxlog?


 I guess NO.

 Also, added Robert's comment in bufmgr.c
 I am still unsure about the spinlock around buf->flags as we are just
 examining it.

 Please let me know if I missed any.

 Thanks



 - Heikki




 --
 Jeevan B Chalke
 Senior Software Engineer, R&D
 EnterpriseDB Corporation
 The Enterprise PostgreSQL Company

 Phone: +91 20 30589500

 Website: www.enterprisedb.com
 EnterpriseDB Blog: http://blogs.enterprisedb.com/
 Follow us on Twitter: http://www.twitter.com/enterprisedb





-- 
Jeevan B Chalke
Senior Software Engineer, R&D
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Phone: +91 20 30589500

Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb
