Re: [HACKERS] New option for pg_basebackup, to specify a different directory for pg_xlog

2013-11-15 Thread Haribabu kommi
On 15 November 2013 17:26 Magnus Hagander wrote:
>On Fri, Nov 15, 2013 at 12:10 PM, Haribabu kommi
><haribabu.ko...@huawei.com> wrote:
>On 14 November 2013 23:59 Fujii Masao wrote:
>> On Thu, Nov 14, 2013 at 9:08 PM, Haribabu kommi
>> <haribabu.ko...@huawei.com> wrote:
>> > Please find attached the patch, for adding a new option for
>> > pg_basebackup, to specify a different directory for pg_xlog.
>> Sounds good! Here are the review comments:
>>> Don't we need to prevent users from specifying the same directory in
>>> both --pgdata and --xlogdir?
>>I feel there is no need to prevent it. Even if the user specifies the same
>>directory for both --pgdata and --xlogdir, all the transaction log files
>>will be created in the base directory instead of a pg_xlog directory.

>Given how easy it would be to prevent that, I think we should. It would be an
>easy misunderstanding to specify that when you actually want it in
><datadir>/pg_xlog. Specifying that would be redundant in the first place, but
>people can do that; it would also be very easy to do it by mistake, and you'd
>end up with something that's really bad, including a recursive symlink.

Presently, initdb also lets the user specify the same directory for both data
and xlog.
To check whether the data directory and xlog directory are the same, there is
a way on Windows (the _fullpath API) to get the absolute path from a relative
path, but I didn't find any such possibility on Linux.
If anyone has an idea of how to check this, please let me know.
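
For what it's worth, a minimal sketch of the kind of check being discussed,
assuming a POSIX system (an illustration, not the patch): realpath()
canonicalizes a path, resolving symlinks, "." and "..", after which a plain
strcmp() suffices. One caveat is that realpath() requires the paths to
already exist, which may not hold at option-validation time.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Sketch only: returns 1 if both paths resolve to the same directory. */
static int
same_directory(const char *dir1, const char *dir2)
{
	char	   *abs1 = realpath(dir1, NULL);	/* POSIX.1-2008 allows NULL */
	char	   *abs2 = realpath(dir2, NULL);
	int			same = (abs1 && abs2 && strcmp(abs1, abs2) == 0);

	free(abs1);
	free(abs2);
	return same;
}

int
main(int argc, char **argv)
{
	if (argc == 3)
		printf("same: %d\n", same_directory(argv[1], argv[2]));
	return 0;
}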

>I also think it would probably be worthwhile to support this in tar format as
>well, but I guess that's a separate patch really. There's really no reason we
>shouldn't support WAL streaming into a separate tar file. But - separate issue.

Sure. I will prepare a separate patch for the same and submit it in the next 
commit fest.

Regards,
Hari babu.


Re: [HACKERS] New option for pg_basebackup, to specify a different directory for pg_xlog

2013-11-15 Thread Haribabu kommi
On 15 November 2013 17:26 Magnus Hagander wrote:
On Fri, Nov 15, 2013 at 12:10 PM, Haribabu kommi
<haribabu.ko...@huawei.com> wrote:
On 14 November 2013 23:59 Fujii Masao wrote:
> On Thu, Nov 14, 2013 at 9:08 PM, Haribabu kommi
> <haribabu.ko...@huawei.com> wrote:
> > Please find attached the patch, for adding a new option for
> > pg_basebackup, to specify a different directory for pg_xlog.
>
> Sounds good! Here are the review comments:
>
> +printf(_("--xlogdir=XLOGDIR   location for the
> transaction log directory\n"));
>
> This message is not aligned well.
Fixed.

> -if (!streamwal || strcmp(filename +
> strlen(filename) - 8, "/pg_xlog") != 0)
> +if ((!streamwal && (strcmp(xlog_dir, "") == 0))
> +|| strcmp(filename + strlen(filename) -
> 8, "/pg_xlog") != 0)
>
> You need to update the source code comment.
Corrected the source code comment. Please check once.

> +#ifdef HAVE_SYMLINK
> +if (symlink(xlog_dir, linkloc) != 0)
> +{
> +fprintf(stderr, _("%s: could not create symbolic link
> \"%s\": %s\n"),
> +progname, linkloc, strerror(errno));
> +exit(1);
> +}
> +#else
> +fprintf(stderr, _("%s: symlinks are not supported on this
> platform "
> +  "cannot use xlogdir"));
> +exit(1);
> +#endif
> +}
>
> Is it better to call pg_free() at the end? Even if we don't do that, it
> would be almost harmless, though.
Added pg_free to free up the linkloc.

> Don't we need to prevent users from specifying the same directory in
> both --pgdata and --xlogdir?
I feel there is no need to prevent it. Even if the user specifies the same
directory for both --pgdata and --xlogdir, all the transaction log files
will be created in the base directory instead of a pg_xlog directory.


Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])

2013-11-15 Thread Amit Kapila
On Fri, Nov 15, 2013 at 10:18 PM, Peter Eisentraut  wrote:
> On 11/14/13, 2:50 AM, Amit Kapila wrote:
>> Find the rebased version attached with this mail.
>
> Doesn't build:
>
> openjade  -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D . -c 
> /usr/share/sgml/docbook/stylesheet/dsssl/modular/catalog -d stylesheet.dsl -t 
> sgml -i output-html -V html-index postgres.sgml
> openjade:reference.sgml:61:3:E: cannot find "alter_system.sgml"; tried 
> "ref/alter_system.sgml", "./alter_system.sgml", "./alter_system.sgml", 
> "/usr/local/share/sgml/alter_system.sgml", "/usr/share/sgml/alter_system.sgml"
> openjade:config.sgml:164:27:X: reference to non-existent ID "SQL-ALTERSYSTEM"
> make[3]: *** [HTML.index] Error 1
> make[3]: *** Deleting file `HTML.index'
> osx -D. -x lower -i include-xslt-index postgres.sgml >postgres.xmltmp
> osx:reference.sgml:61:3:E: cannot find "alter_system.sgml"; tried 
> "ref/alter_system.sgml", "./alter_system.sgml", 
> "/usr/local/share/sgml/alter_system.sgml", "/usr/share/sgml/alter_system.sgml"
> osx:config.sgml:164:27:X: reference to non-existent ID "SQL-ALTERSYSTEM"
> make[3]: *** [postgres.xml] Error 1
>
> New file missing in patch?

Oops, missed the new sgml file in the patch; updated patch to include it.
Many thanks for checking it.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


alter_system_v11.patch
Description: Binary data



Re: [HACKERS] pre-commit triggers

2013-11-15 Thread Andrew Dunstan


On 11/15/2013 09:07 PM, Peter Eisentraut wrote:

On Fri, 2013-11-15 at 13:01 -0500, Andrew Dunstan wrote:

Attached is a patch to provide a new event trigger that will fire on
transaction commit.

xact.c: In function ‘CommitTransaction’:
xact.c:1835:3: warning: implicit declaration of function 
‘PreCommitTriggersFire’ [-Wimplicit-function-declaration]





Oops, missed a #include. Revised patch attached.

cheers

andrew
diff --git a/doc/src/sgml/event-trigger.sgml b/doc/src/sgml/event-trigger.sgml
index ac31332..3bbf1a4 100644
--- a/doc/src/sgml/event-trigger.sgml
+++ b/doc/src/sgml/event-trigger.sgml
@@ -12,7 +12,7 @@
PostgreSQL also provides event triggers.  Unlike regular
triggers, which are attached to a single table and capture only DML events,
event triggers are global to a particular database and are capable of
-   capturing DDL events.
+   capturing DDL events or transaction commits.
   
 
   
@@ -29,8 +29,9 @@
  occurs in the database in which it is defined. Currently, the only
  supported events are
  ddl_command_start,
- ddl_command_end
- and sql_drop.
+ ddl_command_end,
+ sql_drop, and
+ transaction_commit.
  Support for additional events may be added in future releases.

 
@@ -65,6 +66,15 @@

 

+A transaction_commit trigger is called at the end of a
+transaction, just before any deferred triggers are fired, unless
+no data changes have been made by the transaction, or
+PostgreSQL is running in Single-User mode. This is so
+that you can recover from a badly specified transaction_commit
+trigger.
+   
+
+   
  Event triggers (like other functions) cannot be executed in an aborted
  transaction.  Thus, if a DDL command fails with an error, any associated
  ddl_command_end triggers will not be executed.  Conversely,
@@ -77,8 +87,13 @@

 

- For a complete list of commands supported by the event trigger mechanism,
- see .
+A transaction_commit trigger is also not called in an
+aborted transaction.
+   
+
+   
+ For a complete list of commands supported by the event trigger
+ mechanism, see .

 

@@ -101,6 +116,11 @@
  to intercept. A common use of such triggers is to restrict the range of
  DDL operations which users may perform.

+
+   
+transaction_commit triggers do not currently support
+WHEN clauses.
+   
   
 
   
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 0591f3f..e7f5095 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -30,6 +30,7 @@
 #include "catalog/namespace.h"
 #include "catalog/storage.h"
 #include "commands/async.h"
+#include "commands/event_trigger.h"
 #include "commands/tablecmds.h"
 #include "commands/trigger.h"
 #include "executor/spi.h"
@@ -1825,6 +1826,16 @@ CommitTransaction(void)
 	Assert(s->parent == NULL);
 
 	/*
+	 * First fire any pre-commit triggers, so if they in turn cause any
+	 * deferred triggers etc to fire this will be picked up below.
+	 * Only fire them, though, if we have a real transaction ID and
+	 * we're not running standalone. Not firing when standalone provides
+	 * a way to recover from setting up a bad transaction trigger.
+	 */
+	if (s->transactionId != InvalidTransactionId && IsUnderPostmaster)
+		PreCommitTriggersFire();
+
+	/*
 	 * Do pre-commit processing that involves calling user-defined code, such
 	 * as triggers.  Since closing cursors could queue trigger actions,
 	 * triggers could open cursors, etc, we have to keep looping until there's
@@ -2030,6 +2041,16 @@ PrepareTransaction(void)
 	Assert(s->parent == NULL);
 
 	/*
+	 * First fire any pre-commit triggers, so if they in turn cause any
+	 * deferred triggers etc to fire this will be picked up below.
+	 * Only fire them, though, if we have a real transaction ID and
+	 * we're not running standalone. Not firing when standalone provides
+	 * a way to recover from setting up a bad transaction trigger.
+	 */
+	if (s->transactionId != InvalidTransactionId && IsUnderPostmaster)
+		PreCommitTriggersFire();
+
+	/*
 	 * Do pre-commit processing that involves calling user-defined code, such
 	 * as triggers.  Since closing cursors could queue trigger actions,
 	 * triggers could open cursors, etc, we have to keep looping until there's
diff --git a/src/backend/commands/event_trigger.c b/src/backend/commands/event_trigger.c
index 328e2a8..f93441f 100644
--- a/src/backend/commands/event_trigger.c
+++ b/src/backend/commands/event_trigger.c
@@ -153,7 +153,8 @@ CreateEventTrigger(CreateEventTrigStmt *stmt)
 	/* Validate event name. */
 	if (strcmp(stmt->eventname, "ddl_command_start") != 0 &&
 		strcmp(stmt->eventname, "ddl_command_end") != 0 &&
-		strcmp(stmt->eventname, "sql_drop") != 0)
+		strcmp(stmt->eventname, "sql_drop") != 0 &&
+		strcmp(stmt->eventname, "transaction_commit") != 0)
 		ereport(ERROR,
 (errcode(ERRCODE_SYNTAX_ERROR),
  errmsg("unreco

Re: [HACKERS] pre-commit triggers

2013-11-15 Thread Peter Eisentraut
On Fri, 2013-11-15 at 13:01 -0500, Andrew Dunstan wrote:
> Attached is a patch to provide a new event trigger that will fire on 
> transaction commit.

xact.c: In function ‘CommitTransaction’:
xact.c:1835:3: warning: implicit declaration of function 
‘PreCommitTriggersFire’ [-Wimplicit-function-declaration]






Re: [HACKERS] -d option for pg_isready is broken

2013-11-15 Thread Fabrízio de Royes Mello
On Wed, Nov 13, 2013 at 9:37 PM, Josh Berkus  wrote:

>
> handyrep@john:~/handyrep$ pg_isready --version
> pg_isready (PostgreSQL) 9.3.1
>
> handyrep@john:~/handyrep$ pg_isready -h john -p 5432 -U postgres -d
> postgres -q
> pg_isready: missing "=" after "postgres" in connection info string
>
> handyrep@john:~/handyrep$ pg_isready --host=john --port=5432
> --user=postgres --dbname=postgres
> pg_isready: missing "=" after "postgres" in connection info string
>
> handyrep@john:~/handyrep$ pg_isready -h john -p 5432 -U postgres
> john:5432 - accepting connections
>
> so apparently the -d option:
>
> a) doesn't work, and
> b) doesn't do anything
>
> I suggest simply removing it from the utility.
>
> I'll note that the -U option doesn't appear to do anything relevant
> either, but at least it doesn't error unnecessarily:
>
> handyrep@john:~/handyrep$ pg_isready -h john -p 5432 -U no_such_user
> john:5432 - accepting connections
>
>
The attached patch fixes it.

Regards,

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
diff --git a/src/bin/scripts/pg_isready.c b/src/bin/scripts/pg_isready.c
index d27ccea..7568df5 100644
--- a/src/bin/scripts/pg_isready.c
+++ b/src/bin/scripts/pg_isready.c
@@ -130,7 +130,7 @@ main(int argc, char **argv)
 	/*
 	 * Get the host and port so we can display them in our output
 	 */
-	if (pgdbname)
+	if (pgdbname && strchr(pgdbname, '=') != NULL)
 	{
 		opts = PQconninfoParse(pgdbname, &errmsg);
 		if (opts == NULL)



Re: [HACKERS] review: autovacuum_work_mem

2013-11-15 Thread Nigel Heron
On Sun, Oct 20, 2013 at 7:21 AM, Magnus Hagander  wrote:
> On Sun, Oct 20, 2013 at 2:22 AM, Peter Geoghegan  wrote:

>> It seemed neater to me to create a new flag, so that in principle any
>> vacuum() code path can request autovacuum_work_mem, rather than having
>> lazyvacuum.c code call IsAutoVacuumWorkerProcess() for the same
>> purpose. To date, that's only been done within vacuumlazy.c for things
>> like logging.
>

>But I'd suggest just a:
>int vac_work_mem = (IsAutoVacuumWorkerProcess() && autovacuum_work_mem
>!= -1) ? autovacuum_work_mem : maintenance_work_mem;
>
>and not sending around a boolean flag through a bunch of places when
>it really means just the same thing,

I agree with Magnus here, calling IsAutoVacuumWorkerProcess() seems
cleaner than the new flag and the function parameter changes.
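For illustration, a rough sketch (not from the patch) of how Magnus's
one-liner might sit in lazy_space_alloc() in vacuumlazy.c, where the
dead-tuple array is sized from maintenance_work_mem today; autovacuum_work_mem
is the GUC the patch proposes, while IsAutoVacuumWorkerProcess() and
maintenance_work_mem already exist in core:

	/* pick the budget locally instead of threading a flag through vacuum() */
	int			vac_work_mem = (IsAutoVacuumWorkerProcess() &&
								autovacuum_work_mem != -1) ?
		autovacuum_work_mem : maintenance_work_mem;

	maxtuples = (vac_work_mem * 1024L) / sizeof(ItemPointerData);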

>
> Also, isn't this quite confusing:
> + # Note:  autovacuum only prefers autovacuum_work_mem over 
> maintenance_work_mem
> + #autovacuum_work_mem = -1 # min 1MB, or -1 to disable
>
> Where does the "only" come from? And we don't really use the term
> "prefers over" anywhere else there. And -1 doesn't actually disable
> it. I suggest following the pattern of the other parameters that work
> the same way, which would then just be:
>
> #autovacuum_work_mem = -1  # amount of memory for each autovacuum
> worker. -1 means use maintenance_work_mem
>

+1

here's my review of the v1 patch,

patch features tested:
- regular VACUUM * commands ignore autovacuum_work_mem.
- if autovacuum_work_mem = -1 then maintenance_work_mem is used by autovacuum.
- if autovacuum_work_mem is set then it is used instead of
maintenance_work_mem by autovacuum.
- the autovacuum_work_mem guc has a "sighup" context and the global
variable used in the code is correctly refreshed during a reload.
- a 1MB lower limit for autovacuum_work_mem is enforced.

build (platform is fedora 18):
- patch (context format) applies to current HEAD with offsets (please rebase).
- documentation patches included.
- "make" doesn't produce any extra warnings.
- "make check" passes all tests (no extra regression tests).

questions/comments:
- should the category of the guc be "RESOURCES_MEM" (as in the patch)
or "AUTOVACUUM"? seems to fit in both, but it's really autovacuum
specific.
- could you also add documentation to the autovacuum section of
maintenance.sgml? I think people tuning their autovacuum are likely to
look there for guidance.
- could you update the comments at the top of vacuumlazy.c to reflect
this new feature?

-nigel.




Re: [HACKERS] The number of character limitation of custom script on pgbench

2013-11-15 Thread Tom Lane
Sawada Masahiko  writes:
> I attached the patch which solves this problem, and have submitted to CF3.
> I changed how to get the SQL from custom script file.

This needed a bit of work:

- Use of strncat didn't seem particularly safe, or efficient.  I changed
it to explicitly account for space consumption in the result buffer.
- It leaked the buffer space used.  While this likely doesn't matter for
foreseeable usage, it seemed worth fixing.
- Didn't do the right thing for a file not ending with a newline.
- Didn't follow project code layout standards.

I've committed the attached revised version.

regards, tom lane

diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index fff71e5..2c96fae 100644
*** a/contrib/pgbench/pgbench.c
--- b/contrib/pgbench/pgbench.c
*** process_commands(char *buf)
*** 2016,2021 
--- 2016,2064 
  	return my_commands;
  }
  
+ /*
+  * Read a line from fd, and return it in a malloc'd buffer.
+  * Return NULL at EOF.
+  *
+  * The buffer will typically be larger than necessary, but we don't care
+  * in this program, because we'll free it as soon as we've parsed the line.
+  */
+ static char *
+ read_line_from_file(FILE *fd)
+ {
+ 	char		tmpbuf[BUFSIZ];
+ 	char	   *buf;
+ 	size_t		buflen = BUFSIZ;
+ 	size_t		used = 0;
+ 
+ 	buf = (char *) palloc(buflen);
+ 	buf[0] = '\0';
+ 
+ 	while (fgets(tmpbuf, BUFSIZ, fd) != NULL)
+ 	{
+ 		size_t		thislen = strlen(tmpbuf);
+ 
+ 		/* Append tmpbuf to whatever we had already */
+ 		memcpy(buf + used, tmpbuf, thislen + 1);
+ 		used += thislen;
+ 
+ 		/* Done if we collected a newline */
+ 		if (thislen > 0 && tmpbuf[thislen - 1] == '\n')
+ 			break;
+ 
+ 		/* Else, enlarge buf to ensure we can append next bufferload */
+ 		buflen += BUFSIZ;
+ 		buf = (char *) pg_realloc(buf, buflen);
+ 	}
+ 
+ 	if (used > 0)
+ 		return buf;
+ 
+ 	/* Reached EOF */
+ 	free(buf);
+ 	return NULL;
+ }
+ 
  static int
  process_file(char *filename)
  {
*** process_file(char *filename)
*** 2024,2030 
  	Command   **my_commands;
  	FILE	   *fd;
  	int			lineno;
! 	char		buf[BUFSIZ];
  	int			alloc_num;
  
  	if (num_files >= MAX_FILES)
--- 2067,2073 
  	Command   **my_commands;
  	FILE	   *fd;
  	int			lineno;
! 	char	   *buf;
  	int			alloc_num;
  
  	if (num_files >= MAX_FILES)
*** process_file(char *filename)
*** 2046,2056 
  
  	lineno = 0;
  
! 	while (fgets(buf, sizeof(buf), fd) != NULL)
  	{
  		Command    *command;
  
  		command = process_commands(buf);
  		if (command == NULL)
  			continue;
  
--- 2089,2102 
  
  	lineno = 0;
  
! 	while ((buf = read_line_from_file(fd)) != NULL)
  	{
  		Command    *command;
  
  		command = process_commands(buf);
+ 
+ 		free(buf);
+ 
  		if (command == NULL)
  			continue;
  



Re: [HACKERS] additional json functionality

2013-11-15 Thread Andrew Dunstan


On 11/15/2013 07:32 PM, Josh Berkus wrote:

On 11/15/2013 04:00 PM, David Johnston wrote:

Looking at this a different way: could we just implement BSON and leave json
alone?

http://bsonspec.org/

In short?  No.

For one thing, our storage format is different from theirs (better,
frankly), and as a result is not compliant with their "standard".

That's a reason why we won't use the name BSON, either, since it's a
trademark of 10gen.




What is more, it has restrictions which we do not wish to have. See for 
example its treatment of numerics.


cheers

andrew




Re: [HACKERS] additional json functionality

2013-11-15 Thread Josh Berkus
On 11/15/2013 04:00 PM, David Johnston wrote:
> Looking at this a different way: could we just implement BSON and leave json
> alone?
> 
> http://bsonspec.org/

In short?  No.

For one thing, our storage format is different from theirs (better,
frankly), and as a result is not compliant with their "standard".

That's a reason why we won't use the name BSON, either, since it's a
trademark of 10gen.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] additional json functionality

2013-11-15 Thread David Johnston
Looking at this a different way: could we just implement BSON and leave json
alone?

http://bsonspec.org/

David J.









Re: [HACKERS] additional json functionality

2013-11-15 Thread David Johnston
Josh Berkus wrote:
> On 11/15/2013 02:59 PM, Merlin Moncure wrote:
>> On Fri, Nov 15, 2013 at 4:31 PM, Hannu Krosing <hannu@...> wrote:
>> I think you may be on to something here.  This might also be a way to
>> opt in to fast(er) serialization (upthread it was noted this is
>> unimportant; I'm skeptical).  I deeply feel that two types is not the
>> right path but I'm pretty sure that this can be finessed.
>> 
>>> As far as I understand merlin is mostly ok with stored json being
>>> normalised and the problem is just with constructing "extended"
>>> json (a.k.a. "processing instructions") to be used as source for
>>> specialised parsers and renderers.
> 
> Thing is, I'm not particularly concerned about *Merlin's* specific use
> case, which there are ways around. What I am concerned about is that we
> may have users who have years of data stored in JSON text fields which
> won't survive an upgrade to binary JSON, because we will stop allowing
> certain things (ordering, duplicate keys) which are currently allowed in
> those columns.  At the very least, if we're going to have that kind of
> backwards compatibility break we'll want to call the new version 10.0.
> 
> That's why naming old JSON as "json_text" won't work; it'll be a
> hardened roadblock to upgrading.

Agreed.  I can't imagine a use-case that would warrant breaking the current
behavior of "json".  Either we live with just one, text-oriented, json type
and "finesse" whatever performance gains we can without breaking
compatibility; or we introduce additional types (I personally like adding 2
instead of one but just adding the binary one would be ok) which - barring
an overwhelming desire by -core to group-self-flagellate - means giving the
new type an as yet unused name.

From a marketing perspective having 3 types with the following properties is
an easy message to sell:

1) json - liberal interpretation w/ validation only; stored as text; output
as-is
2) json_text - strict interpretation w/ validation only; stored as text;
output as-is
3) json_binary - strict interpretation w/ validation & parsing; stored as
binary; output "normalized"

This way "json" seems less like a mistake but rather an intentional desire
to introduce a liberal type that meets data exchange needs in the short term
and now, later, a structured data storage mechanism similar to "hstore".

Even if you have json_binary I can imagine that some people would want to be
able to store the original strict json as-is.  Sure, they can use text, but
this way intent is made clear and validation is attached directly to the
type as opposed to having to be done separately.  The use-cases described
for needing a liberal "json" prove this out.  That said "json" would be an
acceptable replacement for "json_text" in many cases and separate validation
for "strict json" prior to storing into "json" isn't that heinous.

David J.









Re: [HACKERS] strncpy is not a safe version of strcpy

2013-11-15 Thread David Rowley
On Sat, Nov 16, 2013 at 4:09 AM, Alvaro Herrera wrote:

> David Rowley wrote:
> > On Fri, Nov 15, 2013 at 12:33 PM, Tomas Vondra  wrote:
>
> > > Be careful with 'Name' data type - it's not just a simple string
> buffer.
> > > AFAIK it needs to work with hashing etc. so the zeroing is actually
> needed
> > > here to make sure two values produce the same result. At least that's
> how
> > > I understand the code after a quick check - for example this is from
> the
> > > same jsonfuncs.c you mentioned:
> > >
> > > memset(fname, 0, NAMEDATALEN);
> > > strncpy(fname, NameStr(tupdesc->attrs[i]->attname), NAMEDATALEN);
> > > hashentry = hash_search(json_hash, fname, HASH_FIND, NULL);
> > >
> > > So the zeroing is on purpose, although if strncpy does that then the
> > > memset is probably superflous.
>
> This code should probably be using namecpy().  Note namecpy() doesn't
> memset() after strncpy() and has survived the test of time, which
> strongly suggests that the memset is indeed superfluous.
>
>
I went on a bit of a strncpy cleanup rampage this morning and ended up
finding quite a few places where strncpy is used wrongly.
I'm not quite sure if I have got them all in this patch, but I think I've
got the obvious ones at least.

For the hash_search in jsonfuncs.c, after thinking about it a bit more...
Can we not just pass the attname without making a copy of it? I see keyPtr
in hash_search is const void *, so it shouldn't get modified in there. I
can't quite see the reason for making the copy.
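Something like this minimal sketch is what I have in mind (hypothetical, not
from the patch), assuming the table really was created with keysize =
NAMEDATALEN:

	/*
	 * Pass the Name buffer straight in.  hash_search() only reads
	 * keysize bytes through keyPtr, and attname is a fixed-size
	 * NameData buffer, so the full key length is readable.  The one
	 * prerequisite is that attname is reliably zero-padded (NameData
	 * is supposed to be) -- presumably the reason the original code
	 * memset()s a local copy first.
	 */
	hashentry = hash_search(json_hash,
							NameStr(tupdesc->attrs[i]->attname),
							HASH_FIND, NULL);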

Attached is a patch with various cleanups where I didn't like the look of
the strncpy. I didn't go overboard with this, as I know making this sort of
small change all over the place can be a bit scary, and I thought maybe it
would get rejected on that basis.

I also cleaned up things like strncpy(dest, src, strlen(src)); which just
seems a bit weird and I'm failing to get my head around why it was done. I
replaced these with memcpy instead, but they could perhaps be a plain old
strcpy.

Regards

David Rowley



> --
> Álvaro Herrera      http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>


strncpy_cleanup_v0.1.patch
Description: Binary data



Re: [HACKERS] Improve code in tidbitmap.c

2013-11-15 Thread Tom Lane
"Etsuro Fujita"  writes:
> I'd like to do the complementary explanation of this.
> In tbm_union_page() and tbm_intersect_page() in tidbitmap.c, WORDS_PER_PAGE
> is used in the scan of a lossy chunk, instead of WORDS_PER_CHUNK, where
> these macros are defined as:

> /* number of active words for an exact page: */
> #define WORDS_PER_PAGE  ((MAX_TUPLES_PER_PAGE - 1) / BITS_PER_BITMAPWORD + 1)
> /* number of active words for a lossy chunk: */
> #define WORDS_PER_CHUNK  ((PAGES_PER_CHUNK - 1) / BITS_PER_BITMAPWORD + 1)

> Though the use of WORDS_PER_PAGE in the scan of a lossy chunk is logically
> correct since these macros implicitly satisfy that WORDS_PER_CHUNK <
> WORDS_PER_PAGE, I think WORDS_PER_CHUNK should be used in the scan of a
> lossy chunk for code readability and maintenance.  So, I submitted a patch
> working in such a way in an earlier email.

This is a bug fix, not a performance improvement (any improvement would
be welcome, but that's not the point).  There's absolutely nothing
guaranteeing that WORDS_PER_CHUNK is less than WORDS_PER_PAGE, and if
it were the other way around, the code would be outright broken.  It's
pure luck that it was merely inefficient.
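
For concreteness, a tiny standalone sketch with assumed, representative
constants (the real values in tidbitmap.c depend on block size and platform):
with these numbers, a lossy-chunk loop bounded by WORDS_PER_PAGE walks 10
words when only 8 are active -- merely inefficient here, but broken outright
if the inequality ever flipped.

#include <stdio.h>

#define MAX_TUPLES_PER_PAGE 291		/* assumed: heap tuples per 8K page */
#define PAGES_PER_CHUNK 256			/* assumed: pages per lossy chunk */
#define BITS_PER_BITMAPWORD 32

#define WORDS_PER_PAGE	((MAX_TUPLES_PER_PAGE - 1) / BITS_PER_BITMAPWORD + 1)
#define WORDS_PER_CHUNK ((PAGES_PER_CHUNK - 1) / BITS_PER_BITMAPWORD + 1)

int
main(void)
{
	/* prints WORDS_PER_PAGE = 10, WORDS_PER_CHUNK = 8 */
	printf("WORDS_PER_PAGE = %d, WORDS_PER_CHUNK = %d\n",
		   WORDS_PER_PAGE, WORDS_PER_CHUNK);
	return 0;
}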

Committed, thanks for finding it!

regards, tom lane




Re: [HACKERS] Extra functionality to createuser

2013-11-15 Thread Christopher Browne
On Fri, Nov 15, 2013 at 3:14 PM, Peter Eisentraut  wrote:
> On 11/14/13, 4:35 PM, Christopher Browne wrote:
>> On Thu, Nov 14, 2013 at 5:41 AM, Sameer Thakur wrote:
>>> So i think -g option is failing
>>
>> Right you are.
>>
>> I was missing a "g:" in the getopt_long() call.
>>
>> Attached is a revised patch that handles that.
>>
>
> src/bin/scripts/createuser.c:117: indent with spaces.
> +   case 'g':
> src/bin/scripts/createuser.c:118: indent with spaces.
> +   roles = pg_strdup(optarg);

OK, I ran pgindent on createuser.c, which leads to the Next Patch...

-- 
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"
diff --git a/doc/src/sgml/ref/createuser.sgml b/doc/src/sgml/ref/createuser.sgml
index 2f1ea2f..5fedc80 100644
--- a/doc/src/sgml/ref/createuser.sgml
+++ b/doc/src/sgml/ref/createuser.sgml
@@ -131,6 +131,16 @@ PostgreSQL documentation
  
 
  
+  -g
+  --roles
+  
+   
+Indicates roles to associate with this role.
+   
+  
+ 
+
+ 
   -i
   --inherit
   
diff --git a/src/bin/scripts/createuser.c b/src/bin/scripts/createuser.c
index d1542d9..e2e1134 100644
--- a/src/bin/scripts/createuser.c
+++ b/src/bin/scripts/createuser.c
@@ -47,6 +47,7 @@ main(int argc, char *argv[])
{"pwprompt", no_argument, NULL, 'P'},
{"encrypted", no_argument, NULL, 'E'},
{"unencrypted", no_argument, NULL, 'N'},
+   {"roles", required_argument, NULL, 'g'},
{NULL, 0, NULL, 0}
};
 
@@ -57,6 +58,7 @@ main(int argc, char *argv[])
char   *host = NULL;
char   *port = NULL;
char   *username = NULL;
+   char   *roles = NULL;
enum trivalue prompt_password = TRI_DEFAULT;
bool		echo = false;
bool		interactive = false;
@@ -83,7 +85,7 @@ main(int argc, char *argv[])
 
handle_help_version_opts(argc, argv, "createuser", help);
 
-   while ((c = getopt_long(argc, argv, "h:p:U:wWedDsSaArRiIlLc:PEN",
+   while ((c = getopt_long(argc, argv, "h:p:U:g:wWedDsSaArRiIlLc:PEN",
long_options, 
&optindex)) != -1)
{
switch (c)
@@ -112,6 +114,9 @@ main(int argc, char *argv[])
case 'D':
createdb = TRI_NO;
break;
+   case 'g':
+   roles = pg_strdup(optarg);
+   break;
case 's':
case 'a':
superuser = TRI_YES;
@@ -302,6 +307,8 @@ main(int argc, char *argv[])
appendPQExpBuffer(&sql, " NOREPLICATION");
if (conn_limit != NULL)
appendPQExpBuffer(&sql, " CONNECTION LIMIT %s", conn_limit);
+   if (roles != NULL)
+   appendPQExpBuffer(&sql, " IN ROLE %s", roles);
appendPQExpBuffer(&sql, ";\n");
 
if (echo)
@@ -334,6 +341,7 @@ help(const char *progname)
printf(_("  -D, --no-createdb role cannot create databases 
(default)\n"));
printf(_("  -e, --echoshow the commands being sent to 
the server\n"));
printf(_("  -E, --encrypted   encrypt stored password\n"));
+   printf(_("  -g, --roles   roles to associate with this new 
role\n"));
printf(_("  -i, --inherit role inherits privileges of roles 
it is a\n"
 "member of (default)\n"));
printf(_("  -I, --no-inherit  role does not inherit 
privileges\n"));



Re: [HACKERS] autovacuum_work_mem

2013-11-15 Thread Peter Geoghegan
On Mon, Oct 21, 2013 at 6:44 AM, Magnus Hagander  wrote:
> +1. If changing at all, then maybe just "autovacuum_mem"?

I would like to stick with autovacuum_work_mem - it reflects the fact
that it's a setting shadowed by maintenance_work_mem, without being
too verbose.

-- 
Peter Geoghegan




Re: [HACKERS] additional json functionality

2013-11-15 Thread Josh Berkus
On 11/15/2013 02:59 PM, Merlin Moncure wrote:
>  On Fri, Nov 15, 2013 at 4:31 PM, Hannu Krosing  wrote:
> I think you may be on to something here.  This might also be a way to
> opt in to fast(er) serialization (upthread it was noted this is
> unimportant; I'm skeptical).  I deeply feel that two types is not the
> right path but I'm pretty sure that this can be finessed.
> 
>> As far as I understand merlin is mostly ok with stored json being
>> normalised and the problem is just with constructing "extended"
>> json (a.k.a. "processing instructions") to be used as source for
>> specialised parsers and renderers.

Thing is, I'm not particularly concerned about *Merlin's* specific use
case, which there are ways around. What I am concerned about is that we
may have users who have years of data stored in JSON text fields which
won't survive an upgrade to binary JSON, because we will stop allowing
certain things (ordering, duplicate keys) which are currently allowed in
those columns.  At the very least, if we're going to have that kind of
backwards compatibilty break we'll want to call the new version 10.0.

That's why naming old JSON as "json_text" won't work; it'll be a
hardened roadblock to upgrading.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] pg_dump insert with column names speedup

2013-11-15 Thread Tom Lane
David Rowley  writes:
> The tailing white space is fixed in the attached patch.

Applied with minor cosmetic adjustments.

regards, tom lane




Re: [HACKERS] additional json functionality

2013-11-15 Thread Merlin Moncure
On Fri, Nov 15, 2013 at 4:31 PM, Hannu Krosing wrote:
>>>> I think we need to take a *very* hard look at #3 before exploring #1
>>>> or #2: Haven't thought it through yet but it may be possible to handle
>>>> this in such a way that will be mostly transparent to the end user and
>>>> may have other benefits such as a faster path for serialization.
>>> If it’s possible to preserve order and still get the advantages of binary 
>>> representation --- which are substantial (see 
>>> http://theory.so/pg/2013/10/23/testing-nested-hstore/ and 
>>> http://theory.so/pg/2013/10/25/indexing-nested-hstore/ for a couple of 
>>> examples) --- without undue maintenance overhead, then great.
>>>
>>> I am completely opposed to duplicate key preservation in JSON, though. It 
>>> has caused us a fair number of headaches at $work.
> Let's just change the current json-constructing functions' return type to
> json_text, which is exactly like text with 2 extra properties:
>
> 1) it is syntax-checked for valid json (that is it can be cast to json)
>
> and
>
> 2) if included in outer json as data, it is included directly and is not
> quoted like text
>
>
> With just these two it should be possible to have the following:
>
> a) Merlin and others can keep (ab)using json_text as this
> wonderfully versatile format for feeding json parsers and
> visualisers which accept duplicates and consider order significant
>
> b) cast this to binary json object if de-duplication and fast access to
> internals is needed
>
> I do not think we need anything else for this

I think you may be on to something here.  This might also be a way to
opt in to fast(er) serialization (upthread it was noted this is
unimportant; I'm skeptical).  I deeply feel that two types is not the
right path but I'm pretty sure that this can be finessed.

> As far as I understand merlin is mostly ok with stored json being
> normalised and the problem is just with constructing "extended"
> json (a.k.a. "processing instructions") to be used as source for
> specialised parsers and renderers.

yes, this is correct.

merlin




Re: [HACKERS] additional json functionality

2013-11-15 Thread Hannu Krosing
On 11/15/2013 09:25 PM, Merlin Moncure wrote:
> On Fri, Nov 15, 2013 at 1:51 PM, David E. Wheeler  
> wrote:
>> On Nov 15, 2013, at 6:35 AM, Merlin Moncure  wrote:
>>
>>> Here are the options on the table:
>>> 1) convert existing json type to binary flavor (notwithstanding objections)
>>> 2) maintain side by side types, one representing binary, one text.
>>> unfortunately, i think the text one must get the name 'json' due to
>>> unfortunate previous decision.
>>> 3) merge the behaviors into a single type and get the best of both
>>> worlds (as suggested upthread).
>>>
>>> I think we need to take a *very* hard look at #3 before exploring #1
>>> or #2: Haven't thought it through yet but it may be possible to handle
>>> this in such a way that will be mostly transparent to the end user and
>>> may have other benefits such as a faster path for serialization.
>> If it’s possible to preserve order and still get the advantages of binary 
>> representation --- which are substantial (see 
>> http://theory.so/pg/2013/10/23/testing-nested-hstore/ and 
>> http://theory.so/pg/2013/10/25/indexing-nested-hstore/ for a couple of 
>> examples) --- without undue maintenance overhead, then great.
>>
>> I am completely opposed to duplicate key preservation in JSON, though. It 
>> has caused us a fair number of headaches at $work.
Let's just change the current json-constructing functions' return type to
json_text, which is exactly like text with 2 extra properties:

1) it is syntax-checked for valid json (that is it can be cast to json)

and

2) if included in outer json as data, it is included directly and is not
quoted like text


With just these two it should be possible to have the following:

a) Merlin and others can keep (ab)using json_text as this
wonderfully versatile format for feeding json parsers and
visualisers which accept duplicates and consider order significant

b) cast this to binary json object if de-duplication and fast access to
internals is needed

I do not think we need anything else for this

As far as I understand merlin is mostly ok with stored json being
normalised and the problem is just with constructing "extended"
json (a.k.a. "processing instructions") to be used as source for
specialised parsers and renderers.

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ





Re: [HACKERS] Review:Patch: SSL: prefer server cipher order

2013-11-15 Thread Adrian Klaver

On 11/15/2013 11:49 AM, Marko Kreen wrote:

On Fri, Nov 15, 2013 at 11:16:25AM -0800, Adrian Klaver wrote:

The description of the GUCs show up in the documentation but I am
not seeing the GUCs themselves in postgresql.conf, so I could test
no further. It is entirely possible I am missing a step and would
appreciate enlightenment.


Sorry, I forgot to update sample config.

ssl-prefer-server-cipher-order-v2.patch
- Add GUC to sample config
- Change default value to 'true', per comments from Alvaro and Magnus.

ssl-ecdh-v2.patch
- Add GUC to sample config



Well that worked.
I made ssl connections to the server using psql and verified it 
respected the order of ssl_ciphers. I do not have a client available 
with a different view of cipher order so I cannot test that.


--
Adrian Klaver
adrian.kla...@gmail.com




Re: [HACKERS] additional json functionality

2013-11-15 Thread David E. Wheeler
On Nov 15, 2013, at 2:02 PM, Andrew Dunstan  wrote:

> Yeah, it would be a total foot gun here I think.
> 
> I've come to the conclusion that the only possible solution is to have a 
> separate type. That's a bit sad, but there it is. The upside is that this 
> will make the work Teodor has mentioned simpler. (Desperately making lemonade 
> from lemons here.)

Fine. My bikeshedding: Call the new type "jsonb". “B” for “binary.” Also, the 
old one is implicitly "jsona". Get it?

David





Re: [HACKERS] additional json functionality

2013-11-15 Thread Andrew Dunstan


On 11/15/2013 04:53 PM, Tom Lane wrote:

"k...@rice.edu"  writes:

On Fri, Nov 15, 2013 at 01:18:22PM -0800, Josh Berkus wrote:

I believe this was a danger we recognized when we added the JSON type,
including the possibility that a future binary type might need to be a
separate type due to compatibility issues.  The only sad thing is the
naming; it would be better for the new type to carry the JSON name in
the future, but there's no way to make that work that I can think of.

What about a GUC for json version? Then you could choose, and they
could both be called json.

GUCs that change user-visible semantics have historically proven to be
much less good ideas than they seem at first glance.





Yeah, it would be a total foot gun here I think.

I've come to the conclusion that the only possible solution is to have a 
separate type. That's a bit sad, but there it is. The upside is that 
this will make the work Teodor has mentioned simpler. (Desperately 
making lemonade from lemons here.)



cheers

andrew





Re: [HACKERS] additional json functionality

2013-11-15 Thread Tom Lane
"k...@rice.edu"  writes:
> On Fri, Nov 15, 2013 at 01:18:22PM -0800, Josh Berkus wrote:
>> I believe this was a danger we recognized when we added the JSON type,
>> including the possibility that a future binary type might need to be a
>> separate type due to compatibility issues.  The only sad thing is the
>> naming; it would be better for the new type to carry the JSON name in
>> the future, but there's no way to make that work that I can think of.

> What about a GUC for json version? Then you could choose, and they
> could both be called json.

GUCs that change user-visible semantics have historically proven to be
much less good ideas than they seem at first glance.

regards, tom lane




Re: [HACKERS] additional json functionality

2013-11-15 Thread k...@rice.edu
On Fri, Nov 15, 2013 at 01:18:22PM -0800, Josh Berkus wrote:
> 
> I believe this was a danger we recognized when we added the JSON type,
> including the possibility that a future binary type might need to be a
> separate type due to compatibility issues.  The only sad thing is the
> naming; it would be better for the new type to carry the JSON name in
> the future, but there's no way to make that work that I can think of.
> 
> -- 
> Josh Berkus
> PostgreSQL Experts Inc.
> http://pgexperts.com
> 

What about a GUC for json version? Then you could choose, and they
could both be called json.

Regards,
Ken




Re: [HACKERS] additional json functionality

2013-11-15 Thread David Johnston
Merlin Moncure wrote:
>> I don't want to have two types, but I think I'd probably rather have two
>> clean types than this. I can't imagine it being remotely acceptable to
>> have
>> behaviour depend in whether or not something was ever stored, which is
>> what
>> this looks like.
> 
> Well, maybe so.  My main gripe with the 'two types' solutions is that:
> 1) current type is already in core (that is, not an extension). In
> hindsight, I think this was a huge mistake.
> 2) current type has grabbed the 'json' type name and the 'json_xxx' API.
> 3) current type is getting used all over the place
> 
> 'Two types' means that (AIUI) you can't mess around with the existing
> API too much. And the new type (due out in 2016?) will be something of
> a second citizen.  The ramifications of dealing with the bifurcation
> is what makes *my* head hurt.  Every day the json stuff is getting
> more and more widely adopted.  9.4 isn't going to drop until 2014 best
> case and it won't be widely deployed in the enterprise until 2015 and
> beyond.  So you're going to have a huge code base operating on the
> 'legacy' json type.
> 
> merlin

The current type can store the exact same data as what a hash-like type
could store.  It can also store stuff a hash-like type would not be able to
store.  From my reading, the main reason for adding the new hash-like type
would be to improve the performance characteristics of using said type. So:

1) if reasonable performance can be had with the current type, the new type
would be unnecessary
2) if #1 is not possible, then the new type trades off leniency in format for
performance improvements

One implication of #2 is that existing json that wants the improved
performance will need to undergo a full-table rewrite in order to be
converted.

Both output textual representations are identical, and function overloading
and the API should be able to be kept substantially identical between the two
types.

David J







Re: [HACKERS] additional json functionality

2013-11-15 Thread Josh Berkus
On 11/15/2013 01:12 PM, David E. Wheeler wrote:
> On Nov 15, 2013, at 12:37 PM, Andrew Dunstan  wrote:
> 
>> It's making my head hurt, to be honest, and it sounds like a recipe for 
>> years and years of inconsistencies and bugs.
>>
>> I don't want to have two types, but I think I'd probably rather have two 
>> clean types than this. I can't imagine it being remotely acceptable to have 
>> behaviour depend on whether or not something was ever stored, which is what
>> this looks like.
> 
> I dislike having two types (no, three -- there is hstore, too!). But if
> there is consensus for it (and I am not at all convinced that there is at 
> this point), I can live with it. Docs would have to be pretty explicit, 
> though.

I would be happy to do a survey on how common key ordering and/or
duplicate keys are in postgresql+json.  However, I'm not clear on what
set of survey responses would decide us in either direction.  Even as a
pool of one, Merlin's case is a pretty persuasive example ... and, as he
points out, there will be applications built around 9.3's JSON which
haven't even been written yet.

I believe this was a danger we recognized when we added the JSON type,
including the possibility that a future binary type might need to be a
separate type due to compatibility issues.  The only sad thing is the
naming; it would be better for the new type to carry the JSON name in
the future, but there's no way to make that work that I can think of.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] additional json functionality

2013-11-15 Thread David E. Wheeler
On Nov 15, 2013, at 12:37 PM, Andrew Dunstan  wrote:

> It's making my head hurt, to be honest, and it sounds like a recipe for years 
> and years of inconsistencies and bugs.
> 
> I don't want to have two types, but I think I'd probably rather have two 
> clean types than this. I can't imagine it being remotely acceptable to have 
> behaviour depend on whether or not something was ever stored, which is what
> this looks like.

I dislike having two types (no, three -- there is hstore, too!). But if there
is consensus for it (and I am not at all convinced that there is at this 
point), I can live with it. Docs would have to be pretty explicit, though.

David





Re: [HACKERS] Turning recovery.conf into GUCs

2013-11-15 Thread Josh Berkus
On 11/15/2013 06:38 AM, Jaime Casanova wrote:
>> > Please check for compiler warnings in pg_basebackup:
>> >
>> > pg_basebackup.c:1109:1: warning: ‘escapeConnectionParameter’ defined but 
>> > not used [-Wunused-function]
>> > pg_basebackup.c:1168:1: warning: ‘escape_quotes’ defined but not used 
>> > [-Wunused-function]
>> >
Those are functions that are no longer used, but Josh considered they
could become useful before release.
I can put them inside #ifdef _NOT_USED_ guards, or just remove them now
and re-add them if/when we find some use for them.

Wait, which Josh?  Not me ...

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] postgres_fdw, remote triggers and schemas

2013-11-15 Thread Thom Brown
On 15 November 2013 21:03, Tom Lane  wrote:
> Thom Brown  writes:
>> Is this unintended, or is it something users should fix themselves by
>> being explicit about relation schemas in trigger functions?  Should
>> the schema search path instead pick up whatever the default would be
>> for the user being used for the connection?
>
> postgres_fdw intentionally runs the remote session with a very minimal
> search_path (I think just pg_catalog, in fact).  I would argue that
> any trigger that breaks because of that is broken anyway, since it
> would fail --- possibly with security implications --- if some ordinary
> user modified the search path.

That makes sense.  Would it be worth putting a note in the
documentation about the behaviour of the search path on the
postgres_fdw page?

-- 
Thom




Re: [HACKERS] pg_dump insert with column names speedup

2013-11-15 Thread David Rowley
On Sat, Nov 16, 2013 at 9:04 AM, Peter Eisentraut  wrote:

> src/bin/pg_dump/pg_dump.c:1604: trailing whitespace.
> +   staticStmt = createPQExpBuffer();
> src/bin/pg_dump/pg_dump.c:1612: trailing whitespace.
> +   else
> src/bin/pg_dump/pg_dump.c:1614: trailing whitespace.
> +   if (column_inserts)
>

The tailing white space is fixed in the attached patch.

Regards

David Rowley


pg_dump_colinsert_v0.3.patch
Description: Binary data



Re: [HACKERS] postgres_fdw, remote triggers and schemas

2013-11-15 Thread Tom Lane
Thom Brown  writes:
> I've observed an issue whereby a parent table with a trigger that
> redirects inserts to a child table fails to run the trigger
> successfully if written to using a foreign table:

That trigger is making unsafe assumptions about what search_path
it's run under.  If you don't want to schema-qualify the reference
to "child", consider attaching a "SET search_path" clause to the
trigger function definition.

> Is this unintended, or is it something users should fix themselves by
> being explicit about relation schemas in trigger functions?  Should
> the schema search path instead pick up whatever the default would be
> for the user being used for the connection?

postgres_fdw intentionally runs the remote session with a very minimal
search_path (I think just pg_catalog, in fact).  I would argue that
any trigger that breaks because of that is broken anyway, since it
would fail --- possibly with security implications --- if some ordinary
user modified the search path.

regards, tom lane




Re: [HACKERS] Errors on missing pg_subtrans/ files with 9.3

2013-11-15 Thread J Smith
On Fri, Nov 15, 2013 at 3:21 PM, Robert Haas  wrote:
>
> I think what would help the most is if you could arrange to obtain a
> stack backtrace at the point when the error is thrown.  Maybe put a
> long sleep call in just before the error happens, and when it gets
> stuck there, attach gdb and run bt full.
>

That could potentially be doable. Perhaps I could use something like
google-coredumper or something similar to have a core dump generated
if the error comes up? Part of the problem is that the error is so
sporadic that it's going to be tough to say when the next one will
occur. For instance, we haven't changed our load on the server, yet
the error hasn't occurred since Nov 13, 15:01. I'd also like to avoid
blocking on the server with sleep or anything like that unless
absolutely necessary, as there are other services we have in
development that are using other databases on this cluster. (I can as
a matter of last resort, of course, but if google-coredumper can do
the job I'd like to give that a shot first.)

Any hints on where I could insert something like this? Should I try
putting it into the section of elog.c dealing with ENOENT errors, or
try to find a spot closer to where the file itself is being opened? I
haven't looked at Postgres internals for a while now so I'm not quite
sure of the best location for this sort of thing.
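
Something like this is the sort of thing I have in mind, assuming the error
is raised via SlruReportIOError() in src/backend/access/transam/slru.c -- an
assumption I'd want to verify against an actual backtrace first:

	case SLRU_OPEN_FAILED:
		/* hypothetical debugging hack -- not for production builds */
		if (errno == ENOENT)
		{
			int			saved_errno = errno;

			pg_usleep(600L * 1000L * 1000L);	/* ~10 min to attach gdb */
			errno = saved_errno;	/* preserve errno for %m in the report */
		}
		/* ... the existing ereport(ERROR, ...) for the open failure ... */
		break;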

Cheers




Re: [HACKERS] additional json functionality

2013-11-15 Thread Andres Freund
On 2013-11-15 12:54:53 -0800, Josh Berkus wrote:
> On 11/15/2013 12:25 PM, Merlin Moncure wrote:
> > Kinda yes, kinda no.  Here's a rough sketch of what I'm thinking:
> > 
> > *) 'json' type internally has a binary as well a text representation.
> > The text representation is basically the current type behavior
> 
> 
> 
> That's not at all workable.  Users would be completely unable to predict
> or understand the JSON type and how it acts.  That's not just violating
> POLS; that's bashing POLS' head in with a shovel.

It's also not currently possible to implement such a behaviour inside a
type's functions. You'd need core code cooperation.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] additional json functionality

2013-11-15 Thread Merlin Moncure
On Fri, Nov 15, 2013 at 2:54 PM, Josh Berkus  wrote:
> On 11/15/2013 12:25 PM, Merlin Moncure wrote:
>> Kinda yes, kinda no.  Here's a rough sketch of what I'm thinking:
>>
>> *) 'json' type internally has a binary as well a text representation.
>> The text representation is basically the current type behavior
>
> 
>
> That's not at all workable.  Users would be completely unable to predict
> or understand the JSON type and how it acts.  That's not just violating
> POLS; that's bashing POLS' head in with a shovel.

All right: make a new type then, and leave the current one alone please.

merlin




Re: [HACKERS] additional json functionality

2013-11-15 Thread Josh Berkus
On 11/15/2013 12:25 PM, Merlin Moncure wrote:
> Kinda yes, kinda no.  Here's a rough sketch of what I'm thinking:
> 
> *) 'json' type internally has a binary as well a text representation.
> The text representation is basically the current type behavior



That's not at all workable.  Users would be completely unable to predict
or understand the JSON type and how it acts.  That's not just violating
POLS; that's bashing POLS' head in with a shovel.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] additional json functionality

2013-11-15 Thread Merlin Moncure
On Fri, Nov 15, 2013 at 2:37 PM, Andrew Dunstan  wrote:
>
> On 11/15/2013 03:25 PM, Merlin Moncure wrote:
>>
>> On Fri, Nov 15, 2013 at 1:51 PM, David E. Wheeler 
>> wrote:
>>>
>>> On Nov 15, 2013, at 6:35 AM, Merlin Moncure  wrote:
>>>
>>>> Here are the options on the table:
>>>> 1) convert existing json type to binary flavor (notwithstanding
>>>> objections)
>>>> 2) maintain side by side types, one representing binary, one text.
>>>> unfortunately, i think the text one must get the name 'json' due to
>>>> unfortunate previous decision.
>>>> 3) merge the behaviors into a single type and get the best of both
>>>> worlds (as suggested upthread).
>>>>
>>>> I think we need to take a *very* hard look at #3 before exploring #1
>>>> or #2: Haven't thought it through yet but it may be possible to handle
>>>> this in such a way that will be mostly transparent to the end user and
>>>> may have other benefits such as a faster path for serialization.
>>>
>>> If it’s possible to preserve order and still get the advantages of binary
>>> representation --- which are substantial (see
>>> http://theory.so/pg/2013/10/23/testing-nested-hstore/ and
>>> http://theory.so/pg/2013/10/25/indexing-nested-hstore/ for a couple of
>>> examples) --- without undue maintenance overhead, then great.
>>>
>>> I am completely opposed to duplicate key preservation in JSON, though. It
>>> has caused us a fair number of headaches at $work.
>>
>> Kinda yes, kinda no.  Here's a rough sketch of what I'm thinking:
>>
>> *) 'json' type internally has a binary as well as a text representation.
>> The text representation is basically the current type behavior
>> (deduplicated, unordered).  The binary representation is the hstore-ish
>> variant.  The text mode is discarded when it's deemed no longer
>> appropriate to be needed, and, once gone, can never be rebuilt as it
>> was.
>>
>> *) only the binary internal representation ever gets stored to disk
>> (or anything else).
>>
>> *) the text mode is preferred for output if it is there.  otherwise, a
>> deduplicated, reordered text representation is generated
>>
>> *) When literal text is cast to json, the binary structure is built
>> up and kept alongside the text mode.   So, if you went: 'select '{"a":
>> 1, "a": 2}'::json', you'd get the same thing back.  (This is how
>> it works now.).  but, if you went: 'insert into foo select '{"a": 1,
>>"a": 2}'::json returning *', you'd get {"a": 2} back essentially
>> (although technically that would be a kind of race).
>>
>> *) When the json is stored to table, the text representation gets
>> immediately discarded on the basis that it's no longer the true
>> representation of the data.
>>
>> *) Ditto when making any equality operation (not as sure on this point).
>>
>> *) Ditto when doing any operation that mutates the structure in any
>> way. the text representation is immutable except during serialization
>> and if it gets invalidated it gets destroyed.
>>
>> *) New API function: json_simplify(); or some such.  It reorders and
>> dedups from user's point of view (but really just kills off the text
>> representation)
>>
>> *) once the text mode is gone, you get basically the proposed 'hstore'
>> behavior.
>>
>> *) serialization functions are generally used in contexts that do not
>> store anything but get output as query data.  They create *only* the
>> text mode.  However, if the resultant json is stored anywhere, the
>> text mode is destroyed and replaced with binary variant.  This is both
>> a concession to the current behavior and an optimization of
>> 'serialization-in-query' for which I think the binary mode is pessimal
>> performance wise.  so, xxx_to_json serialization functions work
>> exactly as they do now which fixes my problem essentially.
>>
>> *) if you are unhappy with duplicates in the above, just get used to
>> calling json_simplify() on the serialized json (or deal with it on the
>> client side).
>>
>> This is all pretty glossy, but maybe there is a way forward...
>>
>
>
> It's making my head hurt, to be honest, and it sounds like a recipe for
> years and years of inconsistencies and bugs.
>
> I don't want to have two types, but I think I'd probably rather have two
> clean types than this. I can't imagine it being remotely acceptable to have
> behaviour depend on whether or not something was ever stored, which is what
> this looks like.

Well, maybe so.  My main gripe with the 'two types' solutions is that:
1) current type is already in core (that is, not an extension). In
hindsight, I think this was a huge mistake.
2) current type has grabbed the 'json' type name and the 'json_xxx' API.
3) current type is getting used all over the place

'Two types' means that (AIUI) you can't mess around with the existing
API too much. And the new type (due out in 2016?) will be something of
a second-class citizen.  The ramifications of dealing with the bifurcation
is what makes *my* head hurt.  Every day the json stuff is getting
more and more widely adopted.

[HACKERS] postgres_fdw, remote triggers and schemas

2013-11-15 Thread Thom Brown
Hi,

I've observed an issue whereby a parent table with a trigger that
redirects inserts to a child table fails to run the trigger
successfully if written to using a foreign table:

Example:

Database 1:

CREATE TABLE parent (id int, content text);

CREATE TABLE child () INHERITS (parent);

CREATE OR REPLACE FUNCTION redirect_func ()
  RETURNS trigger AS $$
BEGIN
  INSERT INTO child VALUES (NEW.*);
  RETURN NULL;
END; $$ language plpgsql;

CREATE TRIGGER parent_trig
  BEFORE INSERT ON parent
  FOR EACH ROW EXECUTE PROCEDURE redirect_func();


Database 2:

CREATE FOREIGN TABLE foreign_parent (id int, content text)
  SERVER local_pg_db
  OPTIONS (table_name 'parent');


Then...

postgres=# INSERT INTO foreign_parent VALUES (2, 'test2');
ERROR:  relation "child" does not exist
CONTEXT:  Remote SQL command: INSERT INTO public.parent(id, content)
VALUES ($1, $2)
PL/pgSQL function public.redirect_func() line 3 at SQL statement

I've run that remote SQL command in isolation on database 1 and it
completes successfully.

It appears that this is caused by the relation reference in the
trigger function not being explicit about the schema, as if I remove
"public" from the search_path, I can generate this issue on database 1
with the same statement.  The search_path only contains 'pg_catalog'
on the foreign table connection.

Is this unintended, or is it something users should fix themselves by
being explicit about relation schemas in trigger functions?  Should
the schema search path instead pick up whatever the default would be
for the user being used for the connection?
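For what it's worth, being explicit does work around it here, as does
pinning search_path on the function itself (sketch):

CREATE OR REPLACE FUNCTION redirect_func ()
  RETURNS trigger AS $$
BEGIN
  INSERT INTO public.child VALUES (NEW.*);  -- schema-qualified
  RETURN NULL;
END; $$ language plpgsql
SET search_path = public;  -- the per-function alternative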

Thom




Re: [HACKERS] additional json functionality

2013-11-15 Thread Andrew Dunstan


On 11/15/2013 03:25 PM, Merlin Moncure wrote:

On Fri, Nov 15, 2013 at 1:51 PM, David E. Wheeler  wrote:

On Nov 15, 2013, at 6:35 AM, Merlin Moncure  wrote:


Here are the options on the table:
1) convert existing json type to binary flavor (notwithstanding objections)
2) maintain side by side types, one representing binary, one text.
unfortunately, i think the text one must get the name 'json' due to
unfortunate previous decision.
3) merge the behaviors into a single type and get the best of both
worlds (as suggested upthread).

I think we need to take a *very* hard look at #3 before exploring #1
or #2: Haven't thought it through yet but it may be possible to handle
this in such a way that will be mostly transparent to the end user and
may have other benefits such as a faster path for serialization.

If it’s possible to preserve order and still get the advantages of binary 
representation --- which are substantial (see 
http://theory.so/pg/2013/10/23/testing-nested-hstore/ and 
http://theory.so/pg/2013/10/25/indexing-nested-hstore/ for a couple of 
examples) --- without undue maintenance overhead, then great.

I am completely opposed to duplicate key preservation in JSON, though. It has 
caused us a fair number of headaches at $work.

Kinda yes, kinda no.  Here's a rough sketch of what I'm thinking:

*) 'json' type internally has a binary as well as a text representation.
The text representation is basically the current type behavior
(deduplicated, unordered).  The binary representation is the hstore-ish
variant.  The text mode is discarded when it's deemed no longer
appropriate to be needed, and, once gone, can never be rebuilt as it
was.

*) only the binary internal representation ever gets stored to disk
(or anything else).

*) the text mode is preferred for output if it is there.  otherwise, a
deduplicated, reordered text representation is generated

*) When literal text is cast to json, the binary structure is built
up and kept alongside the text mode.   So, if you went: 'select '{"a":
1, "a": 2}'::json', you'd get the same thing back.  (This is how
it works now.).  but, if you went: 'insert into foo select '{"a": 1,
   "a": 2}'::json returning *', you'd get {"a": 2} back essentially
(although technically that would be a kind of race).

*) When the json is stored to table, the text representation gets
immediately discarded on the basis that it's no longer the true
representation of the data.

*) Ditto when making any equality operation (not as sure on this point).

*) Ditto when doing any operation that mutates the structure in any
way. the text representation is immutable except during serialization
and if it gets invalidated it gets destroyed.

*) New API function: json_simplify(); or some such.  It reorders and
dedups from user's point of view (but really just kills off the text
representation)

*) once the text mode is gone, you get basically the proposed 'hstore' behavior.

*) serialization functions are generally used in contexts that do not
store anything but get output as query data.  They create *only* the
text mode.  However, if the resultant json is stored anywhere, the
text mode is destroyed and replaced with binary variant.  This is both
a concession to the current behavior and an optimization of
'serialization-in-query' for which I think the binary mode is pessimal
performance wise.  so, xxx_to_json serialization functions work
exactly as they do now which fixes my problem essentially.

*) if you are unhappy with duplicates in the above, just get used to
calling json_simplify() on the serialized json (or deal with it on the
client side).

This is all pretty glossy, but maybe there is a way forward...




It's making my head hurt, to be honest, and it sounds like a recipe for 
years and years of inconsistencies and bugs.


I don't want to have two types, but I think I'd probably rather have two 
clean types than this. I can't imagine it being remotely acceptable to 
have behaviour depend on whether or not something was ever stored, which
is what this looks like.


cheers

andrew




Re: [HACKERS] additional json functionality

2013-11-15 Thread Merlin Moncure
On Fri, Nov 15, 2013 at 1:51 PM, David E. Wheeler  wrote:
> On Nov 15, 2013, at 6:35 AM, Merlin Moncure  wrote:
>
>> Here are the options on the table:
>> 1) convert existing json type to binary flavor (notwithstanding objections)
>> 2) maintain side by side types, one representing binary, one text.
>> unfortunately, i think the text one must get the name 'json' due to
>> unfortunate previous decision.
>> 3) merge the behaviors into a single type and get the best of both
>> worlds (as suggested upthread).
>>
>> I think we need to take a *very* hard look at #3 before exploring #1
>> or #2: Haven't thought it through yet but it may be possible to handle
>> this in such a way that will be mostly transparent to the end user and
>> may have other benefits such as a faster path for serialization.
>
> If it’s possible to preserve order and still get the advantages of binary 
> representation --- which are substantial (see 
> http://theory.so/pg/2013/10/23/testing-nested-hstore/ and 
> http://theory.so/pg/2013/10/25/indexing-nested-hstore/ for a couple of 
> examples) --- without undue maintenance overhead, then great.
>
> I am completely opposed to duplicate key preservation in JSON, though. It has 
> caused us a fair number of headaches at $work.

Kinda yes, kinda no.  Here's a rough sketch of what I'm thinking:

*) 'json' type internally has a binary as well as a text representation.
The text representation is basically the current type behavior
(deduplicated, unordered).  The binary representation is the hstore-ish
variant.  The text mode is discarded when it's deemed no longer
appropriate to be needed, and, once gone, can never be rebuilt as it
was.

*) only the binary internal representation ever gets stored to disk
(or anything else).

*) the text mode is preferred for output if it is there.  otherwise, a
deduplicated, reordered text representation is generated

*) When literal text is cast to json, the binary structure is built
up and kept alongside the text mode.   So, if you went: 'select '{"a":
1, "a": 2}'::json', you'd get the same thing back.  (This is how
it works now.).  but, if you went: 'insert into foo select '{"a": 1,
  "a": 2}'::json returning *', you'd get {"a": 2} back essentially
(although technically that would be a kind of race).

*) When the json is stored to table, the text representation gets
immediately discarded on the basis that it's no longer the true
representation of the data.

*) Ditto when making any equality operation (not as sure on this point).

*) Ditto when doing any operation that mutates the structure in any
way. the text representation is immutable except during serialization
and if it gets invalidated it gets destroyed.

*) New API function: json_simplify(); or some such.  It reorders and
dedups from user's point of view (but really just kills off the text
representation)

*) once the text mode is gone, you get basically the proposed 'hstore' behavior.

*) serialization functions are generally used in contexts that do not
store anything but get output as query data.  They create *only* the
text mode.  However, if the resultant json is stored anywhere, the
text mode is destroyed and replaced with binary variant.  This is both
a concession to the current behavior and an optimization of
'serialization-in-query' for which I think the binary mode is pessimal
performance wise.  so, xxx_to_json serialization functions work
exactly as they do now which fixes my problem essentially.

*) if you are unhappy with duplicates in the above, just get used to
calling json_simplify() on the serialized json (or deal with it on the
client side).

This is all pretty glossy, but maybe there is a way forward...
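To put the duplicate handling in concrete terms under this sketch
(nothing below is implemented; json_simplify() is the hypothetical
function from above):

SELECT '{"a": 1, "a": 2}'::json;
-- text mode still present: {"a": 1, "a": 2}

INSERT INTO foo SELECT '{"a": 1, "a": 2}'::json RETURNING *;
-- stored, so text mode discarded: {"a": 2}

SELECT json_simplify('{"a": 1, "a": 2}'::json);
-- text mode explicitly killed: {"a": 2}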

merlin




Re: [HACKERS] Errors on missing pg_subtrans/ files with 9.3

2013-11-15 Thread Robert Haas
On Wed, Nov 13, 2013 at 12:29 PM, J Smith  wrote:
> Looks like we got another set of errors overnight. Here's the log file
> from the errors. (Log file scrubbed slightly to remove private data,
> but still representative of the problem I believe.)
>
> Nov 13 05:34:34 dev postgres[6084]: [4-1] user=dev,db=dev ERROR:
> could not access status of transaction 6337381
> Nov 13 05:34:34 dev postgres[6084]: [4-2] user=dev,db=dev DETAIL:
> Could not open file "pg_subtrans/0060": No such file or directory.
> Nov 13 05:34:34 dev postgres[6084]: [4-3] user=dev,db=dev CONTEXT:
> SQL statement "SELECT 1 FROM ONLY "typhon"."collection_batches" x
> WHERE "id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x"
> Nov 13 05:34:34 dev postgres[6084]: [4-4] user=dev,db=dev STATEMENT:
> update listings set deactivated_at=$1 where id=$2 and lock_version=$3
> Nov 13 05:34:34 dev postgres[6076]: [4-1] user=dev,db=dev ERROR:
> could not access status of transaction 6337381
> Nov 13 05:34:34 dev postgres[6076]: [4-2] user=dev,db=dev DETAIL:
> Could not open file "pg_subtrans/0060": No such file or directory.
> Nov 13 05:34:34 dev postgres[6076]: [4-3] user=dev,db=dev CONTEXT:
> SQL statement "SELECT 1 FROM ONLY "typhon"."collection_batches" x
> WHERE "id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x"
> Nov 13 05:34:34 dev postgres[6076]: [4-4] user=dev,db=dev STATEMENT:
> update listings set deactivated_at=$1 where id=$2 and lock_version=$3
> Nov 13 05:34:34 dev postgres[6087]: [4-1] user=dev,db=dev ERROR:
> could not access status of transaction 6337381
> Nov 13 05:34:34 dev postgres[6087]: [4-2] user=dev,db=dev DETAIL:
> Could not open file "pg_subtrans/0060": No such file or directory.
> Nov 13 05:34:34 dev postgres[6087]: [4-3] user=dev,db=dev CONTEXT:
> SQL statement "SELECT 1 FROM ONLY "typhon"."collection_batches" x
> WHERE "id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x"
> Nov 13 05:34:34 dev postgres[6087]: [4-4] user=dev,db=dev STATEMENT:
> update listings set deactivated_at=$1 where id=$2 and lock_version=$3
> Nov 13 05:34:34 dev postgres[6086]: [4-1] user=dev,db=dev ERROR:
> could not access status of transaction 6337381
> Nov 13 05:34:34 dev postgres[6086]: [4-2] user=dev,db=dev DETAIL:
> Could not open file "pg_subtrans/0060": No such file or directory.
> Nov 13 05:34:34 dev postgres[6086]: [4-3] user=dev,db=dev CONTEXT:
> SQL statement "SELECT 1 FROM ONLY "typhon"."collection_batches" x
> WHERE "id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x"
> Nov 13 05:34:34 dev postgres[6086]: [4-4] user=dev,db=dev STATEMENT:
> update listings set deactivated_at=$1 where id=$2 and lock_version=$3
> Nov 13 05:34:34 dev postgres[6088]: [4-1] user=dev,db=dev ERROR:
> could not access status of transaction 6337381
> Nov 13 05:34:34 dev postgres[6088]: [4-2] user=dev,db=dev DETAIL:
> Could not open file "pg_subtrans/0060": No such file or directory.
> Nov 13 05:34:34 dev postgres[6088]: [4-3] user=dev,db=dev CONTEXT:
> SQL statement "SELECT 1 FROM ONLY "typhon"."collection_batches" x
> WHERE "id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x"
> Nov 13 05:34:34 dev postgres[6088]: [4-4] user=dev,db=dev STATEMENT:
> update listings set deactivated_at=$1 where id=$2 and lock_version=$3
> Nov 13 05:34:34 dev postgres[6085]: [4-1] user=dev,db=dev ERROR:
> could not access status of transaction 6337381
> Nov 13 05:34:34 dev postgres[6085]: [4-2] user=dev,db=dev DETAIL:
> Could not open file "pg_subtrans/0060": No such file or directory.
> Nov 13 05:34:34 dev postgres[6085]: [4-3] user=dev,db=dev CONTEXT:
> SQL statement "SELECT 1 FROM ONLY "typhon"."collection_batches" x
> WHERE "id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x"
> Nov 13 05:34:34 dev postgres[6085]: [4-4] user=dev,db=dev STATEMENT:
> update listings set deactivated_at=$1 where id=$2 and lock_version=$3
>
> Several processes all seemed to hit the problem at the same moment,
> and all of them refer to the same transaction ID. Again, the file
> pg_subtrans/0060 doesn't exist, and the only file that does exist is
> pg_subtrans/005A which appears to be a zeroed-out file 245760 bytes in
> length.
>
> Still don't have a clue as to how I can reproduce the problem. It
> seems that in all cases the error occurred during either an UPDATE to
> a table_X or an INSERT to table_Y. In all cases, the error occurred in
> a manner identical to those shown in the log above, the only
> difference being either an UPDATE on table_X or an INSERT on table_Y.
>
> Not sure what direction I should head to now. Perhaps some aggressive
> logging would help, so we can see the queries surrounding the
> problems? I could reconfigure things to capture all statements and set
> up monit or something to send an alert when the problem resurfaces,
> for instance.
>
> Cheers all.

I think what would help the most is if you could arrange to obtain a
stack backtrace at the point when the error is thrown.  Maybe put a
long sleep call in just before the error happens, and when it gets
stuck there, attach gdb and run bt.
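A minimal sketch of that approach, assuming the message is coming from
the SLRU error-reporting path in slru.c (pg_usleep takes microseconds):

    /* just before the ereport(ERROR, ...) that raises the message */
    pg_usleep(600 * 1000000L);  /* park for ten minutes so gdb can attach */

and then, from another terminal:

    $ gdb -p <backend pid>
    (gdb) bt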

Re: [HACKERS] Extra functionality to createuser

2013-11-15 Thread Peter Eisentraut
On 11/14/13, 4:35 PM, Christopher Browne wrote:
> On Thu, Nov 14, 2013 at 5:41 AM, Sameer Thakur  wrote:
>> So i think -g option is failing
>
> Right you are.
>
> I was missing a "g:" in the getopt_long() call.
>
> Attached is a revised patch that handles that.
>

src/bin/scripts/createuser.c:117: indent with spaces.
+   case 'g':
src/bin/scripts/createuser.c:118: indent with spaces.
+   roles = pg_strdup(optarg);




Re: [HACKERS] GIN improvements part2: fast scan

2013-11-15 Thread Alexander Korotkov
On Sat, Nov 16, 2013 at 12:10 AM, Peter Eisentraut  wrote:

> On 11/14/13, 12:26 PM, Alexander Korotkov wrote:
> > Revised version of patch is attached.
>
> This doesn't build:
>
> ginget.c: In function ‘scanPage’:
> ginget.c:1108:2: warning: implicit declaration of function
> ‘GinDataLeafPageGetPostingListEnd’ [-Wimplicit-function-declaration]
> ginget.c:1108:9: warning: assignment makes pointer from integer without a
> cast [enabled by default]
> ginget.c:1109:18: error: ‘GinDataLeafIndexCount’ undeclared (first use in
> this function)
> ginget.c:1109:18: note: each undeclared identifier is reported only once
> for each function it appears in
> ginget.c::3: error: unknown type name ‘GinDataLeafItemIndex’
> ginget.c::3: warning: implicit declaration of function
> ‘GinPageGetIndexes’ [-Wimplicit-function-declaration]
> ginget.c::57: error: subscripted value is neither array nor pointer
> nor vector
> ginget.c:1112:12: error: request for member ‘pageOffset’ in something not
> a structure or union
> ginget.c:1115:38: error: request for member ‘iptr’ in something not a
> structure or union
> ginget.c:1118:230: error: request for member ‘pageOffset’ in something not
> a structure or union
> ginget.c:1119:16: error: request for member ‘iptr’ in something not a
> structure or union
> ginget.c:1123:233: error: request for member ‘pageOffset’ in something not
> a structure or union
> ginget.c:1136:3: warning: implicit declaration of function
> ‘ginDataPageLeafReadItemPointer’ [-Wimplicit-function-declaration]
> ginget.c:1136:7: warning: assignment makes pointer from integer without a
> cast [enabled by default]
>

This patch applies on top of the gin packed posting lists patch; it
doesn't compile separately.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] GIN improvements part2: fast scan

2013-11-15 Thread Peter Eisentraut
On 11/14/13, 12:26 PM, Alexander Korotkov wrote:
> Revised version of patch is attached.

This doesn't build:

ginget.c: In function ‘scanPage’:
ginget.c:1108:2: warning: implicit declaration of function 
‘GinDataLeafPageGetPostingListEnd’ [-Wimplicit-function-declaration]
ginget.c:1108:9: warning: assignment makes pointer from integer without a cast 
[enabled by default]
ginget.c:1109:18: error: ‘GinDataLeafIndexCount’ undeclared (first use in this 
function)
ginget.c:1109:18: note: each undeclared identifier is reported only once for 
each function it appears in
ginget.c::3: error: unknown type name ‘GinDataLeafItemIndex’
ginget.c::3: warning: implicit declaration of function ‘GinPageGetIndexes’ 
[-Wimplicit-function-declaration]
ginget.c::57: error: subscripted value is neither array nor pointer nor 
vector
ginget.c:1112:12: error: request for member ‘pageOffset’ in something not a 
structure or union
ginget.c:1115:38: error: request for member ‘iptr’ in something not a structure 
or union
ginget.c:1118:230: error: request for member ‘pageOffset’ in something not a 
structure or union
ginget.c:1119:16: error: request for member ‘iptr’ in something not a structure 
or union
ginget.c:1123:233: error: request for member ‘pageOffset’ in something not a 
structure or union
ginget.c:1136:3: warning: implicit declaration of function 
‘ginDataPageLeafReadItemPointer’ [-Wimplicit-function-declaration]
ginget.c:1136:7: warning: assignment makes pointer from integer without a cast 
[enabled by default]





Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2013-11-15 Thread Andres Freund
Hi,

On 2013-11-15 11:40:17 +0900, Michael Paquier wrote:
> - 20131114_3_reindex_concurrently.patch, providing the core feature.
> Patch 3 needs to have patch 2 applied first. Regression tests,
> isolation tests and documentation are included with the patch.

Have you addressed my concurrency concerns from the last version?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Wait free LW_SHARED acquisition - v0.2

2013-11-15 Thread Andres Freund
On 2013-11-15 20:47:26 +0100, Andres Freund wrote:
> Hi,
> 
> On 2013-09-27 00:55:45 +0200, Andres Freund wrote:
> > So what's todo? The file header tells us:
> >  * - revive pure-spinlock implementation
> >  * - abstract away atomic ops, we really only need a few.
> >  *   - CAS
> >  *   - LOCK XADD
> >  * - convert PGPROC->lwWaitLink to ilist.h slist or even dlist.
> >  * - remove LWLockWakeup dealing with MyProc
> >  * - overhaul the mask offsets, make SHARED/EXCLUSIVE_LOCK_MASK wider, 
> > MAX_BACKENDS
> 
> So, here's the next version of this patchset:
> 1) I've added an abstracted atomic ops implementation. Needs a fair
>amount of work, also submitted as a separate CF entry. (Patch 1 & 2)
> 2) I've converted PGPROC->lwWaiting into a dlist. That makes a fair bit
>of code easier to read and reduces the size of the patchset. Also
>fixes a bug in the xlog-scalability code. (Patch 3)
> 3) Alvaro and I updated the comments in lwlock.c. (Patch 4)
> 
> I think 2) should be committable pretty soon. It's imo a pretty clear
> win in readability. 1) will need a good bit of more work.
> 
> With regard to the scalable lwlock work, what's most needed now is a
> good amount of testing.
> 
> Please note that you need to 'autoreconf' after applying the patchset. I
> don't have a compatible autoconf version on this computer causing the
> diff to be humongous if I include those changes.

Please also note that due to the current state of the atomics
implementation this likely will only work on a somewhat recent gcc and
that the performance might be slightly worse than in the previous
version because the atomic add is implemented using the CAS fallback.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Minmax indexes

2013-11-15 Thread Jeff Janes
On Fri, Nov 8, 2013 at 12:11 PM, Alvaro Herrera wrote:

> Erik Rijkers wrote:
> > On Thu, September 26, 2013 00:34, Erik Rijkers wrote:
> > > On Wed, September 25, 2013 22:34, Alvaro Herrera wrote:
> > >
> > >> [minmax-5.patch]
> > >
> > > I have the impression it's not quite working correctly.
>
> Here's a version 7 of the patch, which fixes these bugs and adds
> opclasses for a bunch more types (timestamp, timestamptz, date, time,
> timetz), courtesy of Martín Marqués.  It's also been rebased to apply
> cleanly on top of today's master branch.
>
> I have also added a selectivity function, but I'm not positive that it's
> very useful yet.
>



I tested it with the attached script, but broke out of the "for" loop after 5
iterations (when it had 300,000,005 rows inserted).

Then I did an analyze, and got the error message below:

jjanes=# analyze;
ERROR:  could not truncate file "base/16384/16388_vm" to 488 blocks: it's
only 82 blocks now

16388 is the index's relfilenode.

Here is the backtrace upon entry to the truncate that is going to fail:

#0  mdtruncate (reln=0x23c91b0, forknum=VISIBILITYMAP_FORKNUM, nblocks=488)
at md.c:858
#1  0x0048eb4a in mmRevmapTruncate (rmAccess=0x26ad878,
heapNumBlocks=1327434) at mmrevmap.c:360
#2  0x0048d37a in mmvacuumcleanup (fcinfo=) at
minmax.c:1264
#3  0x0072dcef in FunctionCall2Coll (flinfo=,
collation=, arg1=,
arg2=) at fmgr.c:1323
#4  0x0048c1e5 in index_vacuum_cleanup (info=,
stats=0x0) at indexam.c:715
#5  0x0052a7ce in do_analyze_rel (onerel=0x7f59798589e8,
vacstmt=0x23b0bd8, acquirefunc=0x5298d0 ,
relpages=1327434,
inh=0 '\000', elevel=13) at analyze.c:634
#6  0x0052b320 in analyze_rel (relid=,
vacstmt=0x23b0bd8, bstrategy=) at analyze.c:267
#7  0x0057cba7 in vacuum (vacstmt=0x23b0bd8, relid=, do_toast=1 '\001', bstrategy=,
for_wraparound=0 '\000', isTopLevel=) at
vacuum.c:249
#8  0x00663177 in standard_ProcessUtility (parsetree=0x23b0bd8,
queryString=, context=,
params=0x0,
dest=, completionTag=) at
utility.c:682
#9  0x7f598290b791 in pgss_ProcessUtility (parsetree=0x23b0bd8,
queryString=0x23b0220 "analyze \n;", context=PROCESS_UTILITY_TOPLEVEL,
params=0x0,
dest=0x23b0f18, completionTag=0x7fffd3442f30 "") at
pg_stat_statements.c:825
#10 0x0065fcf7 in PortalRunUtility (portal=0x24195e0,
utilityStmt=0x23b0bd8, isTopLevel=1 '\001', dest=0x23b0f18,
completionTag=0x7fffd3442f30 "")
at pquery.c:1187
#11 0x00660c6d in PortalRunMulti (portal=0x24195e0, isTopLevel=1
'\001', dest=0x23b0f18, altdest=0x23b0f18, completionTag=0x7fffd3442f30 "")
at pquery.c:1318
#12 0x00661323 in PortalRun (portal=0x24195e0,
count=9223372036854775807, isTopLevel=1 '\001', dest=0x23b0f18,
altdest=0x23b0f18,
completionTag=0x7fffd3442f30 "") at pquery.c:816
#13 0x0065dbb4 in exec_simple_query (query_string=0x23b0220
"analyze \n;") at postgres.c:1048
#14 0x0065f259 in PostgresMain (argc=,
argv=, dbname=0x2347be8 "jjanes", username=)
at postgres.c:3992
#15 0x0061b7d0 in BackendRun (argc=,
argv=) at postmaster.c:4085
#16 BackendStartup (argc=, argv=)
at postmaster.c:3774
#17 ServerLoop (argc=, argv=) at
postmaster.c:1585
#18 PostmasterMain (argc=, argv=)
at postmaster.c:1240
#19 0x005b5e90 in main (argc=3, argv=0x2346cd0) at main.c:196


Cheers,

Jeff


minmax_test3.sh
Description: Bourne shell script



Re: [HACKERS] pg_dump insert with column names speedup

2013-11-15 Thread Peter Eisentraut
src/bin/pg_dump/pg_dump.c:1604: trailing whitespace.
+   staticStmt = createPQExpBuffer(); 
src/bin/pg_dump/pg_dump.c:1612: trailing whitespace.
+   else 
src/bin/pg_dump/pg_dump.c:1614: trailing whitespace.
+   if (column_inserts) 




Re: [HACKERS] additional json functionality

2013-11-15 Thread David E. Wheeler
On Nov 15, 2013, at 6:35 AM, Merlin Moncure  wrote:

> Here are the options on the table:
> 1) convert existing json type to binary flavor (notwithstanding objections)
> 2) maintain side by side types, one representing binary, one text.
> unfortunately, i think the text one must get the name 'json' due to
> unfortunate previous decision.
> 3) merge the behaviors into a single type and get the best of both
> worlds (as suggested upthread).
> 
> I think we need to take a *very* hard look at #3 before exploring #1
> or #2: Haven't through it through yet but it may be possible to handle
> this in such a way that will be mostly transparent to the end user and
> may have other benefits such as a faster path for serialization.

If it’s possible to preserve order and still get the advantages of binary 
representation --- which are substantial (see 
http://theory.so/pg/2013/10/23/testing-nested-hstore/ and 
http://theory.so/pg/2013/10/25/indexing-nested-hstore/ for a couple of 
examples) --- without undue maintenance overhead, then great.

I am completely opposed to duplicate key preservation in JSON, though. It has 
caused us a fair number of headaches at $work.

Best,

David





Re: [HACKERS] Review:Patch: SSL: prefer server cipher order

2013-11-15 Thread Marko Kreen
On Fri, Nov 15, 2013 at 11:16:25AM -0800, Adrian Klaver wrote:
> The descriptions of the GUCs show up in the documentation but I am
> not seeing the GUCs themselves in postgresql.conf, so I could test
> no further. It is entirely possible I am missing a step and would
> appreciate enlightenment.

Sorry, I forgot to update sample config.

ssl-prefer-server-cipher-order-v2.patch
- Add GUC to sample config
- Change default value to 'true', per comments from Alvaro and Magnus.

ssl-ecdh-v2.patch
- Add GUC to sample config

-- 
marko

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 77a9303..56bfa01 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -883,6 +883,18 @@ include 'filename'
   
  
 
+ 
+  ssl_prefer_server_ciphers (bool)
+  
+   ssl_prefer_server_ciphers configuration parameter
+  
+  
+   
+Specifies whether to prefer client or server ciphersuite.
+   
+  
+ 
+
  
   password_encryption (boolean)
   
diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index 7f01a78..2094674 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -112,6 +112,9 @@ static bool ssl_loaded_verify_locations = false;
 /* GUC variable controlling SSL cipher list */
 char	   *SSLCipherSuites = NULL;
 
+/* GUC variable: if false, prefer client ciphers */
+bool	   SSLPreferServerCiphers;
+
 /* ------------------------------------------------------------ */
 /*		 Hardcoded values		*/
 /* ------------------------------------------------------------ */
@@ -845,6 +848,10 @@ initialize_SSL(void)
 	if (SSL_CTX_set_cipher_list(SSL_context, SSLCipherSuites) != 1)
 		elog(FATAL, "could not set the cipher list (no valid ciphers available)");
 
+	/* Let server choose order */
+	if (SSLPreferServerCiphers)
+		SSL_CTX_set_options(SSL_context, SSL_OP_CIPHER_SERVER_PREFERENCE);
+
 	/*
 	 * Load CA store, so we can verify client certificates if needed.
 	 */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 54d8078..0ec5ddf 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -127,6 +127,7 @@ extern char *temp_tablespaces;
 extern bool ignore_checksum_failure;
 extern bool synchronize_seqscans;
 extern char *SSLCipherSuites;
+extern bool SSLPreferServerCiphers;
 
 #ifdef TRACE_SORT
 extern bool trace_sort;
@@ -801,6 +802,15 @@ static struct config_bool ConfigureNamesBool[] =
 		check_ssl, NULL, NULL
 	},
 	{
+		{"ssl_prefer_server_ciphers", PGC_POSTMASTER, CONN_AUTH_SECURITY,
+			gettext_noop("Give priority to server ciphersuite order."),
+			NULL
+		},
+		&SSLPreferServerCiphers,
+		true,
+		NULL, NULL, NULL
+	},
+	{
 		{"fsync", PGC_SIGHUP, WAL_SETTINGS,
 			gettext_noop("Forces synchronization of updates to disk."),
 			gettext_noop("The server will use the fsync() system call in several places to make "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 34a2d05..a190e5f 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -86,6 +86,7 @@
 #ssl_key_file = 'server.key'		# (change requires restart)
 #ssl_ca_file = ''			# (change requires restart)
 #ssl_crl_file = ''			# (change requires restart)
+#ssl_prefer_server_ciphers = on		# (change requires restart)
 #password_encryption = on
 #db_user_namespace = off
 
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 56bfa01..3785052 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -895,6 +895,19 @@ include 'filename'
   
  
 
+ 
+  ssl_ecdh_curve (string)
+  
+   ssl_ecdh_curve configuration parameter
+  
+  
+   
+Specifies the name of the EC curve which will be used in ECDH key exchanges.
+Default is prime256v1.
+   
+  
+ 
+
  
   password_encryption (boolean)
   
diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index 2094674..8d688f2 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -69,6 +69,9 @@
 #if SSLEAY_VERSION_NUMBER >= 0x0907000L
 #include <openssl/conf.h>
 #endif
+#if (OPENSSL_VERSION_NUMBER >= 0x0090800fL) && !defined(OPENSSL_NO_ECDH)
+#include <openssl/ec.h>
+#endif
 #endif   /* USE_SSL */
 
 #include "libpq/libpq.h"
@@ -112,6 +115,9 @@ static bool ssl_loaded_verify_locations = false;
 /* GUC variable controlling SSL cipher list */
 char	   *SSLCipherSuites = NULL;
 
+/* GUC variable for default ECHD curve. */
+char	   *SSLECDHCurve;
+
 /* GUC variable: if false, prefer client ciphers */
 bool	   SSLPreferServerCiphers;
 
@@ -765,6 +771,29 @@ info_cb(const SSL *ssl, int type, int args)
 	}
 }
 
+#if (OPENSSL_VERSION_NUMBER >= 0x0090800fL) && !defined(OPENSSL_NO_ECDH)
+static void
+initialize_ecdh(void)
+{
+	EC_KEY *ecdh;
+	int nid;
+
+	nid = OBJ_sn2nid(SSLECDHCurve);
+	if (!nid)
+		elog(FATAL, "ECDH: curve n

Re: [HACKERS] GIN improvements part2: fast scan

2013-11-15 Thread Alexander Korotkov
On Fri, Nov 15, 2013 at 11:39 PM, Rod Taylor  wrote:

>
>
>
> On Fri, Nov 15, 2013 at 2:26 PM, Alexander Korotkov 
> wrote:
>
>> On Fri, Nov 15, 2013 at 11:18 PM, Rod Taylor wrote:
>>
>>> 2%.
>>>
>>> It's essentially sentence fragments from 1 to 5 words in length. I
>>> wasn't expecting it to be much smaller.
>>>
>>> 10 recent value selections:
>>>
>>>  white vinegar reduce color running
>>>  vinegar cure uti
>>>  cane vinegar acidity depends parameter
>>>  how remedy fir clogged shower
>>>  use vinegar sensitive skin
>>>  home remedies removing rust heating
>>>  does non raw apple cider
>>>  home remedies help maintain healthy
>>>  can vinegar mess up your
>>>  apple cide vineger ph balance
>>>
>>
>> I didn't get why it's not significantly smaller. Is it possible to share
>> a dump?
>>
>
> Sorry, I reported that incorrectly. It's not something I was actually
> looking for and didn't pay much attention to at the time.
>
> The patched index is 58% of the 9.4 master size. 212 MB instead of 365 MB.
>

Good. That meets my expectations :)
You mention that both master and patched versions were compiled with debug.
Was cassert enabled?

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] GIN improvements part2: fast scan

2013-11-15 Thread Rod Taylor
On Fri, Nov 15, 2013 at 2:26 PM, Alexander Korotkov wrote:

> On Fri, Nov 15, 2013 at 11:18 PM, Rod Taylor  wrote:
>
>> 2%.
>>
>> It's essentially sentence fragments from 1 to 5 words in length. I wasn't
>> expecting it to be much smaller.
>>
>> 10 recent value selections:
>>
>>  white vinegar reduce color running
>>  vinegar cure uti
>>  cane vinegar acidity depends parameter
>>  how remedy fir clogged shower
>>  use vinegar sensitive skin
>>  home remedies removing rust heating
>>  does non raw apple cider
>>  home remedies help maintain healthy
>>  can vinegar mess up your
>>  apple cide vineger ph balance
>>
>
> I didn't get why it's not significantly smaller. Is it possible to share
> a dump?
>

Sorry, I reported that incorrectly. It's not something I was actually
looking for and didn't pay much attention to at the time.

The patched index is 58% of the 9.4 master size. 212 MB instead of 365 MB.


Re: [HACKERS] GIN improvements part2: fast scan

2013-11-15 Thread Alexander Korotkov
On Fri, Nov 15, 2013 at 11:18 PM, Rod Taylor  wrote:

> 2%.
>
> It's essentially sentence fragments from 1 to 5 words in length. I wasn't
> expecting it to be much smaller.
>
> 10 recent value selections:
>
>  white vinegar reduce color running
>  vinegar cure uti
>  cane vinegar acidity depends parameter
>  how remedy fir clogged shower
>  use vinegar sensitive skin
>  home remedies removing rust heating
>  does non raw apple cider
>  home remedies help maintain healthy
>  can vinegar mess up your
>  apple cide vineger ph balance
>

I didn't get why it's not significantly smaller. Is it possible to share
a dump?

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] Cannot allocate memory

2013-11-15 Thread Kevin Grittner
"Heng Zhi Feng (zh...@hsr.ch)"  wrote:

> Virtual Machine – Ubuntu 13.10
> 1.92GB Memory
> 2 Parallel Processors

> work_mem = 11MB

> shared_buffers = 448MB
> max_connections = 80

> 2013-11-15 11:02:35 CET LOG:  could not fork autovacuum worker process: 
> Cannot allocate memory
> 2013-11-15 11:02:36 CET LOG:  could not send data to client: Broken pipe
> 2013-11-15 11:02:36 CET LOG:  unexpected EOF on client connection

Before you start PostgreSQL, what does `free -m` show?

On such a tiny machine, some of the usual advice needs to be
modified a bit.  Sure, people say to start with shared_buffers at
25% of machine RAM, but if the (virtual) machine has so little RAM
that the OS is already taking a significant percentage, I would say
to go with 25% of what is free (excluding OS cache).  Likewise, the
advice I usually give to start with work_mem at 25% of machine RAM
divided by max_connections should be based on *available* RAM.  So
4MB to 5MB is probably going to be more appropriate than 11MB.  You
will probably need to reduce temp_buffers to 2MB or less -- right
now 1/3 of your machine RAM could be tied up in space reserved for
caching temporary table data, not released until connections close.
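
Concretely, with your numbers that works out to something like the
following (a rough sketch, assuming around 1.4GB is actually free once
the OS has taken its share):

    shared_buffers = 350MB      # ~25% of available RAM
    work_mem = 4MB              # ~25% of available RAM / max_connections
    temp_buffers = 2MB          # cap the per-connection reservation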

Since this VM is tight on resources and only has two cores, you
might want to use pgbouncer, configured in transaction mode with a
pool limited to something like 5 connections, so that you can
increase work_mem and avoid over-taxing the resources you have.

http://wiki.postgresql.org/wiki/Number_Of_Database_Connections
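
In pgbouncer.ini terms that comes out to something like this (a sketch,
showing only the settings relevant here):

    [pgbouncer]
    pool_mode = transaction
    default_pool_size = 5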

-- 
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] GIN improvements part2: fast scan

2013-11-15 Thread Rod Taylor
2%.

It's essentially sentence fragments from 1 to 5 words in length. I wasn't
expecting it to be much smaller.

10 recent value selections:

 white vinegar reduce color running
 vinegar cure uti
 cane vinegar acidity depends parameter
 how remedy fir clogged shower
 use vinegar sensitive skin
 home remedies removing rust heating
 does non raw apple cider
 home remedies help maintain healthy
 can vinegar mess up your
 apple cide vineger ph balance

regards,

Rod



On Fri, Nov 15, 2013 at 12:51 PM, Alexander Korotkov
wrote:

> On Fri, Nov 15, 2013 at 6:57 PM, Rod Taylor  wrote:
>
>> I tried again this morning using gin-packed-postinglists-16.patch and
>> gin-fast-scan.6.patch. No crashes.
>>
>> It is about a 0.1% random sample of production data (10,000,000 records)
>> with the below structure. Pg was compiled with debug enabled in both cases.
>>
>>   Table "public.kp"
>>  Column |  Type   | Modifiers
>> +-+---
>>  id | bigint  | not null
>>  string | text| not null
>>  score1 | integer |
>>  score2 | integer |
>>  score3 | integer |
>>  score4 | integer |
>> Indexes:
>> "kp_pkey" PRIMARY KEY, btree (id)
>> "kp_string_key" UNIQUE CONSTRAINT, btree (string)
>> "textsearch_gin_idx" gin (to_tsvector('simple'::regconfig, string))
>> WHERE score1 IS NOT NULL
>>
>>
>>
>> This is a query tested. All data is in Pg buffer cache for these timings.
>> Words like "the" and "and" are very common (~9% of entries, each) and a
>> word like "hotel" is much less common (~0.2% of entries).
>>
>>   SELECT id,string
>> FROM kp
>>WHERE score1 IS NOT NULL
>>  AND to_tsvector('simple', string) @@ to_tsquery('simple', ?)
>>  -- ? is substituted with the query strings
>> ORDER BY score1 DESC, score2 ASC
>> LIMIT 1000;
>>
>>  Limit  (cost=56.04..56.04 rows=1 width=37) (actual time=250.010..250.032
>> rows=142 loops=1)
>>->  Sort  (cost=56.04..56.04 rows=1 width=37) (actual
>> time=250.008..250.017 rows=142 loops=1)
>>  Sort Key: score1, score2
>>  Sort Method: quicksort  Memory: 36kB
>>  ->  Bitmap Heap Scan on kp  (cost=52.01..56.03 rows=1 width=37)
>> (actual time=249.711..249.945 rows=142 loops=1)
>>Recheck Cond: ((to_tsvector('simple'::regconfig, string)
>> @@ '''hotel'' & ''and'' & ''the'''::tsquery) AND (score1 IS NOT NULL))
>>->  Bitmap Index Scan on textsearch_gin_idx
>> (cost=0.00..52.01 rows=1 width=0) (actual time=249.681..249.681 rows=142
>> loops=1)
>>  Index Cond: (to_tsvector('simple'::regconfig,
>> string) @@ '''hotel'' & ''and'' & ''the'''::tsquery)
>>  Total runtime: 250.096 ms
>>
>>
>>
>> Times are from \timing on.
>>
>> MASTER
>> ===
>> the:   888.436 ms   926.609 ms   885.502 ms
>> and:   944.052 ms   937.732 ms   920.050 ms
>> hotel:  53.992 ms57.039 ms65.581 ms
>> and & the & hotel: 260.308 ms   248.275 ms   248.098 ms
>>
>> These numbers roughly match what we get with Pg 9.2. The time savings
>> between 'the' and 'and & the & hotel' is mostly heap lookups for the score
>> and the final sort.
>>
>>
>>
>> The size of the index on disk is about 2% smaller in the patched version.
>>
>> PATCHED
>> ===
>> the:  1055.169 ms 1081.976 ms  1083.021 ms
>> and:   912.173 ms  949.364 ms   965.261 ms
>> hotel:  62.591 ms   64.341 ms62.923 ms
>> and & the & hotel: 268.577 ms  259.293 ms   257.408 ms
>> hotel & and & the: 253.574 ms  258.071 ms  250.280 ms
>>
>> I was hoping that the 'and & the & hotel' case would improve with this
>> patch to be closer to the 'hotel' search, as I thought that was the kind of
>> thing it targeted. Unfortunately, it did not. I actually applied the
>> patches, compiled, initdb/load data, and ran it again thinking I made a
>> mistake.
>>
>> Reordering the terms 'hotel & and & the' doesn't change the result.
>>
>
> Oh, in this patch the new consistent method isn't implemented for the
> tsvector opclass, only for arrays. Will be fixed soon.
> BTW, was the index 2% smaller or 2 times smaller? If it's 2% smaller then I
> need to know more about your dataset :)
>
> --
> With best regards,
> Alexander Korotkov.
>
>


[HACKERS] Review:Patch: SSL: prefer server cipher order

2013-11-15 Thread Adrian Klaver
First review of the above patch as listed in current CommitFest as well 
as subsequent ECDH patch in the thread below:


http://www.postgresql.org/message-id/1383782378-7342-1-git-send-email-mark...@gmail.com

Platform OpenSuse 12.2

Both patches applied cleanly.

Configured:

./configure --with-python --with-openssl 
--prefix=/home/aklaver/pgsqlTest --with-pgport=5462 --enable-cassert



make and make check ran without error.

The descriptions of the GUCs show up in the documentation but I am not 
seeing the GUCs themselves in postgresql.conf, so I could test no 
further. It is entirely possible I am missing a step and would 
appreciate enlightenment.



The general premise seems sound, allowing the DBA control over the type 
of cipher used.


Thanks,
--
Adrian Klaver
adrian.kla...@gmail.com




Re: [HACKERS] Sequence Access Method WIP

2013-11-15 Thread Simon Riggs
On 15 November 2013 15:48, Peter Eisentraut  wrote:
> On 11/14/13, 3:10 PM, Simon Riggs wrote:
>> On 16 January 2013 00:40, Simon Riggs  wrote:
>>
>>> SeqAm allows you to specify a plugin that alters the behaviour for
>>> sequence allocation and resetting, aimed specifically at clustering
>>> systems.
>>>
>>> New command options on end of statement allow syntax
>>>   CREATE SEQUENCE foo_seq
>>>   USING globalseq
>>
>> Production version of this, ready for commit to PG 9.4
>
> This patch doesn't apply anymore.

Yes, a patch conflict was just committed, will repost.

> Also, you set this to "returned with feedback" in the CF app.  Please
> verify whether that was intentional.

Not sure that was me, if so, corrected.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] REINDEX CONCURRENTLY 2.0

2013-11-15 Thread Peter Eisentraut
On 11/14/13, 9:40 PM, Michael Paquier wrote:
> Hi all,
> 
> Please find attached updated patches for the support of REINDEX
> CONCURRENTLY, renamed 2.0 for the occasion:
> - 20131114_1_index_drop_comments.patch, patch that updates some
> comments in index_drop. This updates only a couple of comments in
> index_drop but has not been committed yet. It should be IMO...
> - 20131114_2_WaitForOldsnapshots_refactor.patch, a refactoring patch
> providing a single API that can be used to wait for old snapshots
> - 20131114_3_reindex_concurrently.patch, providing the core feature.
> Patch 3 needs to have patch 2 applied first. Regression tests,
> isolation tests and documentation are included with the patch.

The third patch needs to be rebased, because of a conflict in index.c.





Re: [HACKERS] Sequence Access Method WIP

2013-11-15 Thread Peter Eisentraut
On 11/14/13, 3:10 PM, Simon Riggs wrote:
> On 16 January 2013 00:40, Simon Riggs  wrote:
> 
>> SeqAm allows you to specify a plugin that alters the behaviour for
>> sequence allocation and resetting, aimed specifically at clustering
>> systems.
>>
>> New command options on end of statement allow syntax
>>   CREATE SEQUENCE foo_seq
>>   USING globalseq
> 
> Production version of this, ready for commit to PG 9.4

This patch doesn't apply anymore.

Also, you set this to "returned with feedback" in the CF app.  Please
verify whether that was intentional.





Re: [HACKERS] Sequence Access Method WIP

2013-11-15 Thread Simon Riggs
On 15 November 2013 15:08, Heikki Linnakangas  wrote:

> I wonder if the level of abstraction is right.

That is the big question and not something to shy away from.

What is presented is not the first thought, by a long way. Andres'
contribution to the patch is mainly around this point, so the seq am
is designed with the needs of the main use case in mind.

I'm open to suggested changes but I would say that practical usage
beats changes suggested for purity.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] GIN improvements part 1: additional information

2013-11-15 Thread Alexander Korotkov
On Fri, Nov 15, 2013 at 9:56 PM, Peter Eisentraut  wrote:

> On 11/15/13, 12:24 PM, Alexander Korotkov wrote:
> > On Fri, Nov 15, 2013 at 8:58 PM, Peter Eisentraut wrote:
> >
> > On 11/14/13, 6:00 AM, Alexander Korotkov wrote:
> > > Sorry, I posted buggy version of patch. Attached version is
> correct.
> >
> > This patch crashes the hstore the pg_trgm regression tests.
> >
> >
> > What exact version did you try 14 or 16? If it was 16, could you please
> > post a stacktrace, because it doesn't crash for me.
>
> The one that's the latest in the commitfest:
> http://www.postgresql.org/message-id/capphfdveq-jhe_2z9pvw22bp6h_o8xoaxcbjagf87gs4p4j...@mail.gmail.com
>
> If you have a newer one, please add it there.
>

Ok, I actualized information in commitfest app.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] Sequence Access Method WIP

2013-11-15 Thread Andres Freund
On 2013-11-15 20:08:30 +0200, Heikki Linnakangas wrote:
> It's pretty hard to review this without seeing the "other"
> implementation you're envisioning to use this API. But I'll try:

We've written a distributed sequence implementation against it, so it
works at least for that use case.

While certainly not release worthy, it still might be interesting to
look at.
http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=blob;f=contrib/bdr/bdr_seq.c;h=c9480c8021882f888e35080f0e7a7494af37ae27;hb=bdr

bdr_sequencer_fill_sequence pre-allocates ranges of values,
bdr_sequence_alloc returns them upon nextval().

> I wonder if the level of abstraction is right. The patch assumes that the
> on-disk storage of all sequences is the same, so the access method can't
> change that.  But then it leaves the details of actually updating the on-disk
> block, WAL-logging and all, as the responsibility of the access method.
> Surely that's going to look identical in all the seqams, if they all use the
> same on-disk format. That also means that the sequence access methods can't
> be implemented as plugins, as plugins can't do WAL-logging.

Well, it exposes log_sequence_tuple() - together with the added "am
private" column of pg_sequence - which allows quite a few different
things. I think unless we really implement pluggable RMGRs - which I
don't really see coming - we cannot be radically different.

> The comment in seqam.c says that there's a private column reserved for
> access method-specific data, called am_data, but that seems to be the only
> mention of am_data in the patch.

It's amdata, not am_data :/. Directly at the end of pg_sequence.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] trivial one-off memory leak in guc-file.l ParseConfigFile

2013-11-15 Thread Heikki Linnakangas

On 22.09.2013 22:40, didier wrote:

Hi

fix a small memory leak in guc-file.l ParseConfigFile

AbsoluteConfigLocation() returns a strdup'd string but it's never freed or
referenced outside ParseConfigFile

Courtesy Valgrind and Noah Misch MEMPOOL work.


I spotted and fixed this some time ago while fixing another leak, see 
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=138184adc5f7c60c184972e4d23f8cdb32aed77d. 
I didn't realize you had already reported it back then.


So, I've marked this as committed in the commitfest app. But thanks for 
the report anyway.


- Heikki




Re: [HACKERS] Sequence Access Method WIP

2013-11-15 Thread Heikki Linnakangas

On 14.11.2013 22:10, Simon Riggs wrote:

On 16 January 2013 00:40, Simon Riggs  wrote:


SeqAm allows you to specify a plugin that alters the behaviour for
sequence allocation and resetting, aimed specifically at clustering
systems.

New command options on end of statement allow syntax
   CREATE SEQUENCE foo_seq
   USING globalseq


Production version of this, ready for commit to PG 9.4

Includes test extension which allows sequences without gaps - "gapless".

Test using seq_test.psql after creating extension.

No dependencies on other patches.


It's pretty hard to review this without seeing the "other" 
implementation you're envisioning to use this API. But I'll try:


I wonder if the level of abstraction is right. The patch assumes that 
the on-disk storage of all sequences is the same, so the access method 
can't change that. But then it leaves the details of actually updating 
the on-disk block, WAL-logging and all, as the responsibility of the 
access method. Surely that's going to look identical in all the seqams, 
if they all use the same on-disk format. That also means that the 
sequence access methods can't be implemented as plugins, as plugins 
can't do WAL-logging.


The comment in seqam.c says that there's a private column reserved for 
access method-specific data, called am_data, but that seems to be the 
only mention of am_data in the patch.


- Heikki




[HACKERS] pre-commit triggers

2013-11-15 Thread Andrew Dunstan


Attached is a patch to provide a new event trigger that will fire on 
transaction commit. I have tried to make certain that it fires at a 
sufficiently early stage in the commit process that some of the evils 
mentioned in previous discussions on this topic aren't relevant.


The triggers don't fire if there is no real XID, so only actual data 
changes should cause the trigger to fire. They also don't fire in single 
user mode, so that if you do something stupid like create a trigger that 
unconditionally raises an error you have a way to recover.


This is intended to be somewhat similar to the same feature in the 
Firebird database, and the initial demand came from a client migrating 
from that system to Postgres.


cheers

andrew
diff --git a/doc/src/sgml/event-trigger.sgml b/doc/src/sgml/event-trigger.sgml
index ac31332..3bbf1a4 100644
--- a/doc/src/sgml/event-trigger.sgml
+++ b/doc/src/sgml/event-trigger.sgml
@@ -12,7 +12,7 @@
PostgreSQL also provides event triggers.  Unlike regular
triggers, which are attached to a single table and capture only DML events,
event triggers are global to a particular database and are capable of
-   capturing DDL events.
+   capturing DDL events or transaction commits.
   
 
   
@@ -29,8 +29,9 @@
  occurs in the database in which it is defined. Currently, the only
  supported events are
  ddl_command_start,
- ddl_command_end
- and sql_drop.
+ ddl_command_end,
+ sql_drop, and
+ transaction_commit.
  Support for additional events may be added in future releases.

 
@@ -65,6 +66,15 @@

 

+A transaction_commit trigger is called at the end of a
+transaction, just before any deferred triggers are fired, unless
+no data changes have been made by the transaction, or
+PostgreSQL is running in Single-User mode. This is so
+that you can recover from a badly specified transaction_commit
+trigger.
+   
+
+   
  Event triggers (like other functions) cannot be executed in an aborted
  transaction.  Thus, if a DDL command fails with an error, any associated
  ddl_command_end triggers will not be executed.  Conversely,
@@ -77,8 +87,13 @@

 

- For a complete list of commands supported by the event trigger mechanism,
- see .
+A transaction_commit trigger is also not called in an
+aborted transaction.
+   
+
+   
+ For a complete list of commands supported by the event trigger
+ mechanism, see .

 

@@ -101,6 +116,11 @@
  to intercept. A common use of such triggers is to restrict the range of
  DDL operations which users may perform.

+
+   
+transaction_commit triggers do not currently support
+WHEN clauses.
+   
   
 
   
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 0591f3f..74fc04c 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -1825,6 +1825,16 @@ CommitTransaction(void)
 	Assert(s->parent == NULL);
 
 	/*
+	 * First fire any pre-commit triggers, so if they in turn cause any
+	 * deferred triggers etc to fire this will be picked up below.
+	 * Only fire them, though, if we have a real transaction ID and
+	 * we're not running standalone. Not firing when standalone provides
+	 * a way to recover from setting up a bad transaction trigger.
+	 */
+	if (s->transactionId != InvalidTransactionId && IsUnderPostmaster)
+		PreCommitTriggersFire();
+
+	/*
 	 * Do pre-commit processing that involves calling user-defined code, such
 	 * as triggers.  Since closing cursors could queue trigger actions,
 	 * triggers could open cursors, etc, we have to keep looping until there's
@@ -2030,6 +2040,16 @@ PrepareTransaction(void)
 	Assert(s->parent == NULL);
 
 	/*
+	 * First fire any pre-commit triggers, so if they in turn cause any
+	 * deferred triggers etc to fire this will be picked up below.
+	 * Only fire them, though, if we have a real transaction ID and
+	 * we're not running standalone. Not firing when standalone provides
+	 * a way to recover from setting up a bad transaction trigger.
+	 */
+	if (s->transactionId != InvalidTransactionId && IsUnderPostmaster)
+		PreCommitTriggersFire();
+
+	/*
 	 * Do pre-commit processing that involves calling user-defined code, such
 	 * as triggers.  Since closing cursors could queue trigger actions,
 	 * triggers could open cursors, etc, we have to keep looping until there's
diff --git a/src/backend/commands/event_trigger.c b/src/backend/commands/event_trigger.c
index 328e2a8..f93441f 100644
--- a/src/backend/commands/event_trigger.c
+++ b/src/backend/commands/event_trigger.c
@@ -153,7 +153,8 @@ CreateEventTrigger(CreateEventTrigStmt *stmt)
 	/* Validate event name. */
 	if (strcmp(stmt->eventname, "ddl_command_start") != 0 &&
 		strcmp(stmt->eventname, "ddl_command_end") != 0 &&
-		strcmp(stmt->eventname, "sql_drop") != 0)
+		strcmp(stmt->eventname, "sql_drop") != 0 &&
+		strcmp(stmt->eventname, "transaction_commit") != 0)

Re: [HACKERS] GIN improvements part 1: additional information

2013-11-15 Thread Peter Eisentraut
On 11/15/13, 12:24 PM, Alexander Korotkov wrote:
> On Fri, Nov 15, 2013 at 8:58 PM, Peter Eisentraut wrote:
> 
> On 11/14/13, 6:00 AM, Alexander Korotkov wrote:
> > Sorry, I posted buggy version of patch. Attached version is correct.
> 
> This patch crashes the hstore and pg_trgm regression tests.
> 
> 
> What exact version did you try, 14 or 16? If it was 16, could you please
> post a stacktrace, because it doesn't crash for me.

The one that's the latest in the commitfest: 
http://www.postgresql.org/message-id/capphfdveq-jhe_2z9pvw22bp6h_o8xoaxcbjagf87gs4p4j...@mail.gmail.com

If you have a newer one, please add it there.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN improvements part2: fast scan

2013-11-15 Thread Alexander Korotkov
On Fri, Nov 15, 2013 at 6:57 PM, Rod Taylor  wrote:

> I tried again this morning using gin-packed-postinglists-16.patch and
> gin-fast-scan.6.patch. No crashes.
>
> It is about a 0.1% random sample of production data (10,000,000 records)
> with the below structure. Pg was compiled with debug enabled in both cases.
>
>   Table "public.kp"
>  Column |  Type   | Modifiers
> +-+---
>  id | bigint  | not null
>  string | text| not null
>  score1 | integer |
>  score2 | integer |
>  score3 | integer |
>  score4 | integer |
> Indexes:
> "kp_pkey" PRIMARY KEY, btree (id)
> "kp_string_key" UNIQUE CONSTRAINT, btree (string)
> "textsearch_gin_idx" gin (to_tsvector('simple'::regconfig, string))
> WHERE score1 IS NOT NULL
>
>
>
> This is a query tested. All data is in Pg buffer cache for these timings.
> Words like "the" and "and" are very common (~9% of entries, each) and a
> word like "hotel" is much less common (~0.2% of entries).
>
>   SELECT id,string
> FROM kp
>WHERE score1 IS NOT NULL
>  AND to_tsvector('simple', string) @@ to_tsquery('simple', ?)
>  -- ? is substituted with the query strings
> ORDER BY score1 DESC, score2 ASC
> LIMIT 1000;
>
>  Limit  (cost=56.04..56.04 rows=1 width=37) (actual time=250.010..250.032
> rows=142 loops=1)
>->  Sort  (cost=56.04..56.04 rows=1 width=37) (actual
> time=250.008..250.017 rows=142 loops=1)
>  Sort Key: score1, score2
>  Sort Method: quicksort  Memory: 36kB
>  ->  Bitmap Heap Scan on kp  (cost=52.01..56.03 rows=1 width=37)
> (actual time=249.711..249.945 rows=142 loops=1)
>Recheck Cond: ((to_tsvector('simple'::regconfig, string) @@
> '''hotel'' & ''and'' & ''the'''::tsquery) AND (score1 IS NOT NULL))
>->  Bitmap Index Scan on textsearch_gin_idx
> (cost=0.00..52.01 rows=1 width=0) (actual time=249.681..249.681 rows=142
> loops=1)
>  Index Cond: (to_tsvector('simple'::regconfig, string)
> @@ '''hotel'' & ''and'' & ''the'''::tsquery)
>  Total runtime: 250.096 ms
>
>
>
> Times are from \timing on.
>
> MASTER
> ===
> the:   888.436 ms   926.609 ms   885.502 ms
> and:   944.052 ms   937.732 ms   920.050 ms
> hotel:  53.992 ms57.039 ms65.581 ms
> and & the & hotel: 260.308 ms   248.275 ms   248.098 ms
>
> These numbers roughly match what we get with Pg 9.2. The time savings
> between 'the' and 'and & the & hotel' is mostly heap lookups for the score
> and the final sort.
>
>
>
> The size of the index on disk is about 2% smaller in the patched version.
>
> PATCHED
> ===
> the:  1055.169 ms 1081.976 ms  1083.021 ms
> and:   912.173 ms  949.364 ms   965.261 ms
> hotel:  62.591 ms   64.341 ms62.923 ms
> and & the & hotel: 268.577 ms  259.293 ms   257.408 ms
> hotel & and & the: 253.574 ms  258.071 ms  250.280 ms
>
> I was hoping that the 'and & the & hotel' case would improve with this
> patch to be closer to the 'hotel' search, as I thought that was the kind of
> thing it targeted. Unfortunately, it did not. I actually applied the
> patches, compiled, initdb/load data, and ran it again thinking I made a
> mistake.
>
> Reordering the terms 'hotel & and & the' doesn't change the result.
>

Oh, in this patch the new consistent method isn't implemented for the
tsvector opclass, only for arrays. Will be fixed soon.
BTW, was the index 2% smaller or 2 times smaller? If it's 2% smaller then I
need to know more about your dataset :)

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] writable FDWs / update targets confusion

2013-11-15 Thread Tom Lane
Albe Laurenz  writes:
> Tom, could you show us a rope if there is one?

What is it you actually need to fetch?

IIRC, the idea was that most FDWs would do the equivalent of fetching the
primary-key columns to use in an update.  If that's what you need, then
AddForeignUpdateTargets should identify those columns and generate Vars
for them.  postgres_fdw is probably not a good model since it's using
ctid (a non-portable thing) and piggybacking on the existence of a tuple
header field to put that in.

If you're dealing with some sort of hidden tuple identity column that
works like CTID but doesn't fit in six bytes, there may not be any good
solution in the current state of the FDW support.  As I mentioned, we'd
batted around the idea of letting FDWs define a system column with some
other datatype besides TID, but we'd not figured out all the nitty
gritty details in time for 9.3.
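
As a concrete sketch of the AddForeignUpdateTargets approach above, a
callback that exposes one key column as a resjunk target could look roughly
like this (treating the table's first attribute as the key is an assumption
made purely for the example):

    #include "postgres.h"
    #include "foreign/fdwapi.h"
    #include "nodes/makefuncs.h"
    #include "utils/rel.h"

    static void
    myAddForeignUpdateTargets(Query *parsetree,
                              RangeTblEntry *target_rte,
                              Relation target_relation)
    {
        Form_pg_attribute attr = RelationGetDescr(target_relation)->attrs[0];
        Var        *var;
        TargetEntry *tle;

        /* Build a Var referencing the key column of the target rel. */
        var = makeVar(parsetree->resultRelation,
                      attr->attnum,
                      attr->atttypid,
                      attr->atttypmod,
                      attr->attcollation,
                      0);

        /* Register it as a resjunk entry so it survives planning. */
        tle = makeTargetEntry((Expr *) var,
                              list_length(parsetree->targetList) + 1,
                              pstrdup(NameStr(attr->attname)),
                              true);
        parsetree->targetList = lappend(parsetree->targetList, tle);
    }

ExecForeignUpdate/ExecForeignDelete can then fetch the value back out of
planSlot with ExecGetJunkAttribute(), as Tomas is already doing.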

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transaction-lifespan memory leak with plpgsql DO blocks

2013-11-15 Thread Yeb Havinga

On 2013-11-14 22:23, Tom Lane wrote:


So after some experimentation I came up with version 2, attached.


Thanks for looking into this! I currently do not have access to a setup 
to try the patch. A colleague of mine will look into this next week.


Thanks again,
Yeb Havinga


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN improvements part 1: additional information

2013-11-15 Thread Alexander Korotkov
On Fri, Nov 15, 2013 at 8:58 PM, Peter Eisentraut  wrote:

> On 11/14/13, 6:00 AM, Alexander Korotkov wrote:
> > Sorry, I posted buggy version of patch. Attached version is correct.
>
> This patch crashes the hstore and pg_trgm regression tests.
>

What exact version did you try, 14 or 16? If it was 16, could you please
post a stacktrace, because it doesn't crash for me.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] Optimize kernel readahead using buffer access strategy

2013-11-15 Thread Peter Eisentraut
On 11/14/13, 7:09 AM, KONDO Mitsumasa wrote:
> I created a patch that improves disk reads and OS file caching. It can
> optimize the kernel readahead parameter using the buffer access strategy and
> posix_fadvise() in various disk-read situations.

Various compiler warnings:

tablecmds.c: In function ‘copy_relation_data’:
tablecmds.c:9120:3: warning: passing argument 5 of ‘smgrread’ makes pointer 
from integer without a cast [enabled by default]
In file included from tablecmds.c:79:0:
../../../src/include/storage/smgr.h:94:13: note: expected ‘char *’ but argument 
is of type ‘int’

bufmgr.c: In function ‘ReadBuffer_common’:
bufmgr.c:455:4: warning: passing argument 5 of ‘smgrread’ from incompatible 
pointer type [enabled by default]
In file included from ../../../../src/include/storage/buf_internals.h:22:0,
 from bufmgr.c:45:
../../../../src/include/storage/smgr.h:94:13: note: expected ‘char *’ but 
argument is of type ‘BufferAccessStrategy’

md.c: In function ‘mdread’:
md.c:680:2: warning: passing argument 2 of ‘BufferHintIOAdvise’ makes integer 
from pointer without a cast [enabled by default]
In file included from md.c:34:0:
../../../../src/include/storage/fd.h:79:12: note: expected ‘off_t’ but argument 
is of type ‘char *’



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN improvements part 1: additional information

2013-11-15 Thread Peter Eisentraut
On 11/14/13, 6:00 AM, Alexander Korotkov wrote:
> Sorry, I posted buggy version of patch. Attached version is correct.

This patch crashes the hstore and pg_trgm regression tests.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2013-11-15 Thread Peter Eisentraut
On 11/14/13, 3:17 AM, Peter Geoghegan wrote:
> Attached patch allows pg_stat_statements to store arbitrarily long
> query texts, using an external, transparently managed lookaside file.

Compiler warnings with fortify settings:

gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith 
-Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute 
-Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g 
-fpic -I. -I. -I../../src/include -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE 
-I/usr/include/libxml2   -c -o pg_stat_statements.o pg_stat_statements.c -MMD 
-MP -MF .deps/pg_stat_statements.Po
pg_stat_statements.c: In function ‘entry_reset’:
pg_stat_statements.c:1827:11: warning: ignoring return value of ‘ftruncate’, 
declared with attribute warn_unused_result [-Wunused-result]
pg_stat_statements.c: In function ‘garbage_collect_query_texts’:
pg_stat_statements.c:1779:11: warning: ignoring return value of ‘ftruncate’, 
declared with attribute warn_unused_result [-Wunused-result]
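
For what it's worth, the usual way to silence warn_unused_result is to test
the return value rather than casting it to void (glibc deliberately ignores
a plain (void) cast). A hedged sketch, with hypothetical names:

    /* check ftruncate()'s result explicitly */
    if (ftruncate(fd, (off_t) 0) != 0)
        ereport(LOG,
                (errcode_for_file_access(),
                 errmsg("could not truncate file \"%s\": %m",
                        filename)));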



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])

2013-11-15 Thread Peter Eisentraut
On 11/14/13, 2:50 AM, Amit Kapila wrote:
> Find the rebased version attached with this mail.

Doesn't build:

openjade  -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D . -c 
/usr/share/sgml/docbook/stylesheet/dsssl/modular/catalog -d stylesheet.dsl -t 
sgml -i output-html -V html-index postgres.sgml
openjade:reference.sgml:61:3:E: cannot find "alter_system.sgml"; tried 
"ref/alter_system.sgml", "./alter_system.sgml", "./alter_system.sgml", 
"/usr/local/share/sgml/alter_system.sgml", "/usr/share/sgml/alter_system.sgml"
openjade:config.sgml:164:27:X: reference to non-existent ID "SQL-ALTERSYSTEM"
make[3]: *** [HTML.index] Error 1
make[3]: *** Deleting file `HTML.index'
osx -D. -x lower -i include-xslt-index postgres.sgml >postgres.xmltmp
osx:reference.sgml:61:3:E: cannot find "alter_system.sgml"; tried 
"ref/alter_system.sgml", "./alter_system.sgml", 
"/usr/local/share/sgml/alter_system.sgml", "/usr/share/sgml/alter_system.sgml"
osx:config.sgml:164:27:X: reference to non-existent ID "SQL-ALTERSYSTEM"
make[3]: *** [postgres.xml] Error 1

New file missing in patch?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minmax indexes (timings)

2013-11-15 Thread Kevin Grittner
Erik Rijkers  wrote:

> Perhaps someone finds these timings useful.

> '--enable-cassert'

Assertions can really distort the timings, and not always equally
for all code paths.  Any chance of re-running those tests without
that?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minmax indexes (timings)

2013-11-15 Thread Andres Freund
On 2013-11-15 17:11:46 +0100, Erik Rijkers wrote:
> I've been messing with minmax indexes some more so here are some results of 
> that.
> 
> Perhaps someone finds these timings useful.
> 
> 
> Centos 5.7, 32 GB memory, 2 quadcores.
> 
> '--prefix=/var/data1/pg_stuff/pg_installations/pgsql.minmax' 
> '--with-pgport=6444' '--enable-depend' '--enable-cassert'
> '--enable-debug' '--with-perl' '--with-openssl' '--with-libxml' 
> '--enable-dtrace'

Just some general advice: doing timings with --enable-cassert isn't that
meaningful - it can often distort results significantly.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] writable FDWs / update targets confusion

2013-11-15 Thread Albe Laurenz
Tomas Vondra wrote:
> I'm working on adding write support to one of my FDWs. Adding INSERT went
> pretty fine, but when adding DELETE/UPDATE I got really confused about how
> the update targets are supposed to work.
> 
> My understanding of how it's supposed to work is this:
> 
>  (1) AddForeignUpdateTargets adds columns that serve as ID of the record
>  (e.g. postgres_fdw adds 'ctid')
> 
>  (2) planning the inner foreign scan handles the new column appropriately
>  (e.g. scans system columns, as in case of 'ctid' etc.)
> 
>  (3) IterateForeignScan will see the column in the tuple descriptor, will
>  set it just like any other column, etc.
> 
>  (4) ExecForeignDelete will fetch the new column and do something with it
> 
> However no matter what I do, I can't get the steps (3) and (4) working this
> way.

I have no idea either.

> And looking at postgres_fdw it seems to me it does not really set the ctid
> into the tuple as a column, but just does this:
> 
> if (ctid)
> tuple->t_self = *ctid;
> 
> which I can't really do because I need to use INT8 and not TID. But even
> if I do this,

What exactly did you do?
Did you try
   tuple->t_self = myInt8;

That would write 8 bytes into a 6-byte variable, thus scribbling
past the end, right?
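
(For reference, ItemPointerData is three uint16 fields -- block hi/lo and
offset -- so sizeof(ItemPointerData) is 6; an illustrative sketch of the
overflow being described, not code from the FDW:)

    int64       id = 42;

    /* copies 8 bytes into a 6-byte struct: scribbles past the end */
    memcpy(&tuple->t_self, &id, sizeof(id));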

> Interestingly, if I do this in ExecForeignDelete
> 
> static TupleTableSlot *
> myExecForeignDelete(EState *estate,
>   ResultRelInfo *resultRelInfo,
>   TupleTableSlot *slot,
>   TupleTableSlot *planSlot)
> {
> 
> bool isNull;
> MyModifyState state = (MyModifyState)resultRelInfo->ri_FdwState;
> int64 ctid;
> 
> Datum datum = ExecGetJunkAttribute(planSlot,
>state->ctidAttno, &isNull);
> 
> ctid = DatumGetInt64(datum);
> 
> elog(WARNING, "ID = %ld", ctid);
> 
> if (isNull)
> elog(ERROR, "ctid is NULL");
> 
> /* FIXME not yet implemented */
> return NULL;
> }
> 
> I do get (isNull=FALSE) but the ctid evaluates to some random number, e.g.
> 
> WARNING:  ID = 44384788 (44384788)
> WARNING:  ID = 44392980 (44392980)
> 
> and so on.

Maybe that's the effect of writing past the end of the variable.

> So what did I get wrong? Is it possible to use arbitrary hidden column as
> "junk" columns (documentation seems to suggest that)? What is the right
> way to do that / whad did I get wrong?

I would like to know an answer to this as well.

I don't think that assigning to tuple->t_self will work, and I
know too little about the executor to know if there's any way
to fit a ctid that is *not* an ItemPointerData into a TupleTableSlot
so that it will show up as a resjunk TargetEntry in ExecForeignUpdate.

Tom, could you show us a rope if there is one?

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSL renegotiation

2013-11-15 Thread Andres Freund
On 2013-11-15 10:58:19 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > On 2013-11-15 10:43:23 -0500, Tom Lane wrote:
> >> Another reason I'm not in a hurry is that the problem we're trying
> >> to solve doesn't seem to be causing real-world trouble.  So by
> >> "awhile", I'm thinking "let's let it get through 9.4 beta testing".
> 
> > Well, there have been a bunch of customer complaints about it, afair
> > that's what made Alvaro look into it in the first place. So it's not a
> > victimless bug.
> 
> OK, then maybe end-of-beta is too long.  But how much testing will it get
> during development?  I know I never use SSL on development installs.
> How many hackers do?

I guess few. And even fewer will actually have connections that live
long enough to experience renegotiations :/.

I wonder how hard it'd be to rig the buildfarm code to generate ssl
certificates and use them during installcheck. If we'd additionally set
a low renegotiation limit...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Add transforms feature

2013-11-15 Thread Dimitri Fontaine
Hi,

Peter Eisentraut  writes:
> Rebased patch.  No changes except that merge conflicts were resolved,
> and I had to add some Data::Dumper tweaks to the regression tests so
> that the results came out in a consistent order on different versions of
> Perl.

I just spent some time reading that patch, here's my first round of
review:

  - Documentation style seems to me to be different from the "man page"
or "reference docs" style that we use elsewhere, and instead
derives the general case from examples. It reads strangely.

  - The internal datatype argument and return type discussion for
function argument looks misplaced, but I don't have a better
proposition for that.

  - The code looks good, I didn't spot any problem when reading it.
Given your Jenkins installation I didn't try (yet at least) to use
the patch and compile and run it locally.

An interesting note is the addition of two new system caches, one
for the PL languages and another one for the transform functions. I
agree with the need, just wanted to raise awareness about that in
case there's something to be said on that front.

  - Do we need an ALTER TRANSFORM command?

Usually we have at least an Owner for the new objects and a command
to change the owner. Then should we be able to change the
function(s) used in a transform?

  - Should transform live in a schema?

At first sight, no reason why, but see next point about a use case
that we might be able to solve doing that.

  - The SQL standard has something different named the same thing,
targeting client-side types apparently. Is there any reason why we
would want to stay away from using the same name for something
really different in PostgreSQL?

On the higher level design, the big question here is about selective
behavior. As soon as you CREATE TRANSFORM FOR hstore LANGUAGE plperl
then any plperl function will now receive its hstore arguments as a
proper perl hash rather than a string.

Any pre-existing plperl function with hstore arguments or return type
then needs to be upgraded to handle the new types nicely, and some of
those might not be under the direct control of the DBA running the
CREATE TRANSFORM command, when using some plperl extensions for example.

A mechanism allowing for the transform to only be used in some functions
but not others might be useful. The simplest such mechanism I can think
of is modeled against the PL/Java classpath facility as specified in the
SQL standard: you attach a classpath per schema.

We could design transforms to only "activate" in the schema they are
created in, thus allowing plperl functions to co-exist where some will
receive a proper hash for hstores and others will continue to get plain
text.

Should using the schema to that end be frowned upon, then we need a way
to register each plperl function against using or not using the
transform facility, defaulting to not using anything. Maybe something
like the following:

  CREATE FUNCTION foo(hash hstore, x ltree)
 RETURNS hstore
 LANGUAGE plperl
 USING TRANSFORM FOR hstore, ltree
  AS $$ … $$;

Worst case, which I really don't think we need, would be addressing that
per-argument:

  CREATE FUNCTION foo (hash hstore WITH TRANSFORM, kv hstore) …

I certainly hope we don't need that, and sure can't imagine use cases
for that level of complexity at the time of writing this review.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSL renegotiation

2013-11-15 Thread Tom Lane
Andres Freund  writes:
> On 2013-11-15 10:43:23 -0500, Tom Lane wrote:
>> Another reason I'm not in a hurry is that the problem we're trying
>> to solve doesn't seem to be causing real-world trouble.  So by
>> "awhile", I'm thinking "let's let it get through 9.4 beta testing".

> Well, there have been a bunch of customer complaints about it, afair
> that's what made Alvaro look into it in the first place. So it's not a
> victimless bug.

OK, then maybe end-of-beta is too long.  But how much testing will it get
during development?  I know I never use SSL on development installs.
How many hackers do?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strncpy is not a safe version of strcpy

2013-11-15 Thread Kevin Grittner
Alvaro Herrera  wrote:
> Kevin Grittner escribió:

>> That argument would be more persuasive if I could find any current
>> usage of the namecpy() function anywhere in the source code.
>
> Well, its cousin namestrcpy is used in a lot of places.  That one uses a
> regular C string as source; namecpy uses a Name as source, so they are
> slightly different but the coding is pretty much the same.

Fair enough.

> There is a difference in using the macro StrNCpy instead of the strncpy
> library function directly.  ISTM this makes sense because Name is known
> to be zero-terminated at NAMEDATALEN, which a random C string is not.

Is the capital T in the second #undef in this pg_locale.c code intended?:

#ifdef WIN32
/*
 * This Windows file defines StrNCpy. We don't need it here, so we undefine
 * it to keep the compiler quiet, and undefine it again after the file is
 * included, so we don't accidentally use theirs.
 */
#undef StrNCpy
#include 
#ifdef StrNCpy
#undef STrNCpy
#endif
#endif

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSL renegotiation

2013-11-15 Thread Tom Lane
Alvaro Herrera  writes:
> So I committed this patch without backpatching anything. ...
> ... should we wait longer for the new renegotiation code to
> be more battle-tested?

+1 to waiting awhile.  I think if we don't see any problems in
HEAD, then back-patching as-is would be the best solution.
The other alternatives are essentially acknowledging that you're
back-patching something you're afraid isn't production ready.
Let's not go there.

Another reason I'm not in a hurry is that the problem we're trying
to solve doesn't seem to be causing real-world trouble.  So by
"awhile", I'm thinking "let's let it get through 9.4 beta testing".

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSL renegotiation

2013-11-15 Thread Andres Freund
On 2013-11-15 10:43:23 -0500, Tom Lane wrote:
> +1 to waiting awhile.  I think if we don't see any problems in
> HEAD, then back-patching as-is would be the best solution.
> The other alternatives are essentially acknowledging that you're
> back-patching something you're afraid isn't production ready.
> Let's not go there.

Agreed. Both on just backpatching it unchanged and waiting for the fix
to prove itself a bit.

> Another reason I'm not in a hurry is that the problem we're trying
> to solve doesn't seem to be causing real-world trouble.  So by
> "awhile", I'm thinking "let's let it get through 9.4 beta testing".

Well, there have been a bunch of customer complaints about it, afair
that's what made Alvaro look into it in the first place. So it's not a
victimless bug.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strncpy is not a safe version of strcpy

2013-11-15 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Andres Freund  writes:
> > I didn't argue against s/strncpy/strlcpy/. That's clearly a sensible
> > fix.
> > I am arguing about introducing additional code and error messages about
> > it, that need to be translated. And starting doing so in isolationtester
> > of all places.
> 
> I agree with Andres on this.  Commit
> 7cb964acb794078ef033cbf2e3a0e7670c8992a9 is the very definition of
> overkill, and I don't want to see us starting to plaster the source
> code with things like this.  Converting strncpy to strlcpy seems
> appropriate --- and sufficient.

Personally, I'd like to see better handling like this -- but done in a way
which minimizes impact on code and translators.  A function like
namecpy() (which I agree with Kevin about -- curious that it's not used)
which handled the check, errmsg and exit seems reasonable to me, for the
"userland" binaries (and perhaps the postmaster when doing command-line
checking of, eg, -D) that need it.

Still, I'm not offering to go do it, so take my feelings on it with that
in mind. :)

Thanks,

Stephen




Re: [HACKERS] strncpy is not a safe version of strcpy

2013-11-15 Thread Alvaro Herrera
Kevin Grittner escribió:
> Alvaro Herrera  wrote:
> 
> > This code should probably be using namecpy().  Note namecpy()
> > doesn't memset() after strncpy() and has survived the test of
> > time, which strongly suggests that the memset is indeed
> > superfluous.
> 
> That argument would be more persuasive if I could find any current
> usage of the namecpy() function anywhere in the source code.

Well, its cousin namestrcpy is used in a lot of places.  That one uses a
regular C string as source; namecpy uses a Name as source, so they are
slightly different but the coding is pretty much the same.

There is a difference in using the macro StrNCpy instead of the strncpy
library function directly.  ISTM this makes sense because Name is known
to be zero-terminated at NAMEDATALEN, which a random C string is not.
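
For readers without the tree at hand, StrNCpy() is (roughly) defined in
c.h as strncpy() plus forced termination, which is what makes a Name safe
to treat as a NUL-terminated string:

    #define StrNCpy(dst, src, len) \
        do \
        { \
            char   *_dst = (dst); \
            Size    _len = (len); \
            if (_len > 0) \
            { \
                strncpy(_dst, (src), _len); \
                _dst[_len - 1] = '\0'; \
            } \
        } while (0)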

-- 
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strncpy is not a safe version of strcpy

2013-11-15 Thread Kevin Grittner
Alvaro Herrera  wrote:

> This code should probably be using namecpy().  Note namecpy()
> doesn't memset() after strncpy() and has survived the test of
> time, which strongly suggests that the memset is indeed
> superfluous.

That argument would be more persuasive if I could find any current
usage of the namecpy() function anywhere in the source code.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN improvements part2: fast scan

2013-11-15 Thread Rod Taylor
I tried again this morning using gin-packed-postinglists-16.patch and
gin-fast-scan.6.patch. No crashes during index building.

Pg was compiled with debug enabled in both cases. The data is a ~0.1%
random sample of production data (10,000,000 records for the test) with the
below structure.

  Table "public.kp"
 Column |  Type   | Modifiers
+-+---
 id | bigint  | not null
 string | text| not null
 score1 | integer |
 score2 | integer |
 score3 | integer |
 score4 | integer |
Indexes:
"kp_pkey" PRIMARY KEY, btree (id)
"kp_string_key" UNIQUE CONSTRAINT, btree (string)
"textsearch_gin_idx" gin (to_tsvector('simple'::regconfig, string))
WHERE score1 IS NOT NULL



All data is in Pg buffer cache for these timings. Words like "the" and
"and" are very common (~9% of entries, each) and a word like "hotel" is
much less common (~0.2% of entries). Below is the query tested:

  SELECT id,string
FROM kp
   WHERE score1 IS NOT NULL
 AND to_tsvector('simple', string) @@ to_tsquery('simple', ?)
 -- ? is substituted with the query strings
ORDER BY score1 DESC, score2 ASC
LIMIT 1000;

 Limit  (cost=56.04..56.04 rows=1 width=37) (actual time=250.010..250.032
rows=142 loops=1)
   ->  Sort  (cost=56.04..56.04 rows=1 width=37) (actual
time=250.008..250.017 rows=142 loops=1)
 Sort Key: score1, score2
 Sort Method: quicksort  Memory: 36kB
 ->  Bitmap Heap Scan on kp  (cost=52.01..56.03 rows=1 width=37)
(actual time=249.711..249.945 rows=142 loops=1)
   Recheck Cond: ((to_tsvector('simple'::regconfig, string) @@
'''hotel'' & ''and'' & ''the'''::tsquery) AND (score1 IS NOT NULL))
   ->  Bitmap Index Scan on textsearch_gin_idx
(cost=0.00..52.01 rows=1 width=0) (actual time=249.681..249.681 rows=142
loops=1)
 Index Cond: (to_tsvector('simple'::regconfig, string)
@@ '''hotel'' & ''and'' & ''the'''::tsquery)
 Total runtime: 250.096 ms



Times are from \timing on.

MASTER
===
the:   888.436 ms   926.609 ms   885.502 ms
and:   944.052 ms   937.732 ms   920.050 ms
hotel:  53.992 ms57.039 ms65.581 ms
and & the & hotel: 260.308 ms   248.275 ms   248.098 ms

These numbers roughly match what we get with Pg 9.2. The time savings
between 'the' and 'and & the & hotel' is mostly heap lookups for the score
and the final sort.



The size of the index on disk is about 2% smaller in the patched version.

PATCHED
===
the:  1055.169 ms 1081.976 ms  1083.021 ms
and:   912.173 ms  949.364 ms   965.261 ms
hotel:  62.591 ms   64.341 ms62.923 ms
and & the & hotel: 268.577 ms  259.293 ms   257.408 ms
hotel & and & the: 253.574 ms  258.071 ms  250.280 ms

I was hoping that the 'and & the & hotel' case would improve with this
patch to be closer to the 'hotel' search, as I thought that was the kind of
thing it targeted. Unfortunately, it did not. I actually applied the
patches, compiled, initdb/load data, and ran it again thinking I made a
mistake.

Reordering the terms 'hotel & and & the' doesn't change the result.



On Fri, Nov 15, 2013 at 1:51 AM, Alexander Korotkov wrote:

> On Fri, Nov 15, 2013 at 3:25 AM, Rod Taylor wrote:
>
>> I checked out master and put together a test case using a small
>> percentage of production data for a known problem we have with Pg 9.2 and
>> text search scans.
>>
>> A small percentage in this case means 10 million records randomly
>> selected; has a few billion records.
>>
>>
>> Tests ran for master successfully and I recorded timings.
>>
>>
>>
>> Applied the patch included here to master along with
>> gin-packed-postinglists-14.patch.
>> Run make clean; ./configure; make; make install.
>> make check (All 141 tests passed.)
>>
>> initdb, import dump
>>
>>
>> The GIN index fails to build with a segfault.
>>
>
> Thanks for testing. See fixed version in thread about packed posting lists.
>
> --
> With best regards,
> Alexander Korotkov.
>


Re: [HACKERS] strncpy is not a safe version of strcpy

2013-11-15 Thread Tom Lane
Andres Freund  writes:
> I didn't argue against s/strncpy/strlcpy/. That's clearly a sensible
> fix.
> I am arguing about introducing additional code and error messages about
> it, that need to be translated. And starting to do so in isolationtester
> of all places.

I agree with Andres on this.  Commit
7cb964acb794078ef033cbf2e3a0e7670c8992a9 is the very definition of
overkill, and I don't want to see us starting to plaster the source
code with things like this.  Converting strncpy to strlcpy seems
appropriate --- and sufficient.
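
For anyone skimming the thread, the hazard being fixed is that strncpy()
does not terminate the destination when the source fills it, while
strlcpy() always does (BSD libc provides it natively; PostgreSQL carries a
fallback in src/port). An illustrative snippet, not code from any patch:

    #include <string.h>

    void
    copy_demo(const char *src)
    {
        char    buf[8];

        /* If src has >= 8 characters, buf ends up with NO trailing NUL. */
        strncpy(buf, src, sizeof(buf));

        /* Always terminated; return value makes truncation detectable. */
        if (strlcpy(buf, src, sizeof(buf)) >= sizeof(buf))
            ;                   /* source was truncated */
    }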

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Database disconnection and switch inside a single bgworker

2013-11-15 Thread Tom Lane
Michael Paquier  writes:
> Currently, bgworkers offer the possibility to connect to a given
> database using BackgroundWorkerInitializeConnection in bgworker.h, but
> there is actually no way to disconnect from a given database inside
> the same bgworker process.

That's isomorphic to having a backend switch to a different database,
which occasionally gets requested, but there is no real likelihood
that we'll ever implement.  The problem is, how can you be sure you
have flushed all the database-specific state that's been built up?
The relcache and catcaches are only the tip of the iceberg; we've
got caches all over the place.  And once you had flushed that data,
you'd have to recreate it --- but the code for doing so is intimately
intertwined with connection startup tasks that you'd most likely not
want to repeat.

And, once you'd done all that work, what would you have?  A database
switch methodology that would save a fork(), but not much else.
The time to warm up the caches wouldn't be any better than in a
fresh process.

The cost/benefit ratio for making this work just doesn't look very
promising.  That's why autovacuum is built the way it is.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strncpy is not a safe version of strcpy

2013-11-15 Thread Alvaro Herrera
David Rowley escribió:
> On Fri, Nov 15, 2013 at 12:33 PM, Tomas Vondra  wrote:

> > Be careful with 'Name' data type - it's not just a simple string buffer.
> > AFAIK it needs to work with hashing etc. so the zeroing is actually needed
> > here to make sure two values produce the same result. At least that's how
> > I understand the code after a quick check - for example this is from the
> > same jsonfuncs.c you mentioned:
> >
> > memset(fname, 0, NAMEDATALEN);
> > strncpy(fname, NameStr(tupdesc->attrs[i]->attname), NAMEDATALEN);
> > hashentry = hash_search(json_hash, fname, HASH_FIND, NULL);
> >
> > So the zeroing is on purpose, although if strncpy does that then the
> > memset is probably superfluous.

This code should probably be using namecpy().  Note namecpy() doesn't
memset() after strncpy() and has survived the test of time, which
strongly suggests that the memset is indeed superfluous.
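
Indeed, the C standard requires strncpy() to pad the destination with NUL
bytes out to n when the source is shorter than n, so in the jsonfuncs.c
fragment quoted above the memset adds nothing:

    char    fname[NAMEDATALEN];

    /*
     * strncpy() zero-fills the rest of fname up to NAMEDATALEN, so
     * hash_search() sees identical bytes with or without a preceding
     * memset().  (Were the source >= NAMEDATALEN characters, though,
     * no terminator would be written at all -- which a Name cannot be.)
     */
    strncpy(fname, NameStr(tupdesc->attrs[i]->attname), NAMEDATALEN);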

-- 
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning recovery.conf into GUCs

2013-11-15 Thread Michael Paquier
On Fri, Nov 15, 2013 at 11:38 PM, Jaime Casanova  wrote:
> those are functions that are no longer used but Josh considered they
> could become useful before release.
> i can put them inside #ifdef _NOT_USED_ decorations or just remove
> them now and, if/when we find some use for them, re-add them.
Why not simply remove them? They will be kept in the git history either way.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strncpy is not a safe version of strcpy

2013-11-15 Thread Andres Freund
On 2013-11-15 10:04:12 -0500, Stephen Frost wrote:
> * Andres Freund (and...@2ndquadrant.com) wrote:
> > Sure, there can be longer paths, but postgres doesn't support them. In a
> > *myriad* of places. It's just not worth spending code on it.
> >
> > Just about any of the places that use MAXPGPATH are "vulnerable" or
> > produce confusing error messages if it's exceeded. And there are about
> > zero complaints about it.
> 
> Confusing error messages are one thing, segfaulting is another.

I didn't argue against s/strncpy/strlcpy/. That's clearly a sensible
fix.
I am arguing about introducing additional code and error messages about
it, that need to be translated. And starting to do so in isolationtester
of all places.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strncpy is not a safe version of strcpy

2013-11-15 Thread Stephen Frost
* Andres Freund (and...@2ndquadrant.com) wrote:
> Sure, there can be longer paths, but postgres doesn't support them. In a
> *myriad* of places. It's just not worth spending code on it.
>
> Just about any of the places that use MAXPGPATH are "vulnerable" or
> produce confusing error messages if it's exceeded. And there are about
> zero complaints about it.

Confusing error messages are one thing, segfaulting is another.

Thanks,

Stephen




Re: [HACKERS] strncpy is not a safe version of strcpy

2013-11-15 Thread Andres Freund
On 2013-11-15 09:53:24 -0500, Stephen Frost wrote:
> * Andres Freund (and...@2ndquadrant.com) wrote:
> > FWIW, argv0 is pretty much guaranteed to be shorter than MAXPGPATH since
> > MAXPGPATH is the longest a path can be, and argv[0] is either the 
> > executable's
> > name (if executed via PATH) or the path to the executable.
> 
> Err, it's the longest that *we* think the path can be..  That's not the
> same as actually being the longest that a path can be, which depends on
> the filesystem and OS...  It's not hard to get past our 1024 limit:

Sure, there can be longer paths, but postgres doesn't support them. In a
*myriad* of places. It's just not worth spending code on it.

Just about any of the places that use MAXPGPATH are "vulnerable" or
produce confusing error messages if it's exceeded. And there are about
zero complaints about it.

> > Now, you could probably write a program to execve() a binary with argv[0]
> > being longer, but in that case you can also just put garbage in there.
> 
> We shouldn't blow up in that case either, really.

Good luck.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol

2013-11-15 Thread Merlin Moncure
On Fri, Nov 15, 2013 at 6:06 AM, Dimitri Fontaine wrote:
>> But: I very, very much agree with the other concerns around this. This
>> should be a patch to fix single user mode, not one to make postgres into
>> a single process database. It's not, and trying to make it by using
>> single user mode for it will start to hinder development of normal
>> postgres because we suddenly need to be concerned about performance and
>> features in situations where we previously weren't.
>
> +1
>
> Maybe what needs to happen to this patch is a way to restrict its usage
> to --single. I'm thinking that postgres --single could maybe be made to
> fork the server process underneath the psql controller client process
> transparently.

I personally would prefer not to do that.  My main non-administrative
interest in this mode is doing things like benchmarking protocol
overhead.  I'm OK with not supporting (and optimizing) for single user
code paths but I don't like the idea of building walls that serve no
purpose other than to make it difficult for other people mess around.
Just document strenuously that this mode is not intended for
application bundling and move on...

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ERROR during end-of-xact/FATAL

2013-11-15 Thread Robert Haas
On Wed, Nov 13, 2013 at 11:04 AM, Noah Misch  wrote:
>> So, in short, ERROR + ERROR*10 = PANIC, but FATAL + ERROR*10 = FATAL.
>> That's bizarre.
>
> Quite so.
>
>> Given that that's where we are, promoting an ERROR during FATAL
>> processing to PANIC doesn't seem like it's losing much; we're
>> essentially already doing that in the (probably more likely) case of a
>> persistent ERROR during ERROR processing.  But since PANIC sucks, I'd
>> rather go the other direction: let's make an ERROR during ERROR
>> processing promote to FATAL.  And then let's do what you write above:
>> make sure that there's a separate on-shmem-exit callback for each
>> critical shared memory resource and that we call of those during FATAL
>> processing.
>
> Many of the factors that can cause AbortTransaction() to fail can also cause
> CommitTransaction() to fail, and those would still PANIC if the transaction
> had an xid.  How practical might it be to also escape from an error during
> CommitTransaction() with a FATAL instead of PANIC?  There's more to fix up in
> that case (sinval, NOTIFY), but it may be within reach.  If such a technique
> can only reasonably fix abort, though, I have doubts it buys us enough.

The critical stuff that's got to happen after
RecordTransactionCommit() appears to be ProcArrayEndTransaction() and
AtEOXact_Inval(). Unfortunately, the latter is well after the point
when we're supposed to only be doing "non-critical resource cleanup",
notwithstanding which it appears to be critical.

So here's a sketch.  Hoist the preparatory logic in
RecordTransactionCommit() - smgrGetPendingDeletes,
xactGetCommittedChildren, and xactGetCommittedInvalidationMessages up
into the caller and do it before setting TRANS_COMMIT.  If any of that
stuff fails, we'll land in AbortTransaction() which must cope.  As
soon as we exit the commit critical section, set a flag somewhere
(where?) indicating that we have written our commit record; when that
flag is set, (a) promote any ERROR after that point through the end of
commit cleanup to FATAL and (b) if we enter AbortTransaction(), don't
try to RecordTransactionAbort().

I can't see that the notification stuff requires fixup in this case;
AFAICS, it is just adjusting backend-local state, and it's OK to
disregard any problems there during a FATAL exit.  Do you see
something to the contrary?  But invalidation messages are a problem:
if we commit and exit without sending our queued-up invalidation
messages, Bad Things Will Happen.  Perhaps we could arrange things so
that in that case only, we just PANIC.   That would allow most write
transactions to get by with FATAL, promoting to PANIC only in the case
of transactions that have modified system catalogs and only until the
invalidations have actually been sent.  Avoiding the PANIC in that
case seems to require some additional wizardry which is not entirely
clear to me at this time.

I think we'll have to approach the various problems in this area
stepwise, or we'll never make any progress.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning recovery.conf into GUCs

2013-11-15 Thread Peter Eisentraut
On 11/13/13, 12:17 AM, Jaime Casanova wrote:
> I have rebased Michael Paquier's patch and did a few changes:
> 
> * changed standby.enabled filename to recovery.trigger
> * make archive_command a SIGHUP parameter again
> * make restore_command a SIGHUP parameter
> * rename restore_command variable to XLogRestoreCommand to match
> XLogArchiveCommand

Please check for compiler warnings in pg_basebackup:

pg_basebackup.c:1109:1: warning: ‘escapeConnectionParameter’ defined but not 
used [-Wunused-function]
pg_basebackup.c:1168:1: warning: ‘escape_quotes’ defined but not used 
[-Wunused-function]



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strncpy is not a safe version of strcpy

2013-11-15 Thread Stephen Frost
* Andres Freund (and...@2ndquadrant.com) wrote:
> FWIW, argv0 is pretty much guaranteed to be shorter than MAXPGPATH since
> MAXPGPATH is the longest a path can be, and argv[0] is either the executable's
> name (if executed via PATH) or the path to the executable.

Err, it's the longest that *we* think the path can be..  That's not the
same as actually being the longest that a path can be, which depends on
the filesystem and OS...  It's not hard to get past our 1024 limit:

sfrost@beorn:/really/long/path> echo $PWD | wc -c
1409

> Now, you could probably write a program to execve() a binary with argv[0]
> being longer, but in that case you can also just put garbage in there.

We shouldn't blow up in that case either, really.

Thanks,

Stephen




Re: [HACKERS] additional json functionality

2013-11-15 Thread Merlin Moncure
On Thu, Nov 14, 2013 at 1:54 PM, Hannu Krosing  wrote:
> On 11/14/2013 08:17 PM, Merlin Moncure wrote:
>> On Thu, Nov 14, 2013 at 11:34 AM, David E. Wheeler
>>  wrote:
>>> On Nov 14, 2013, at 7:07 AM, Merlin Moncure  wrote:
>>>
 This is exactly what needs to be done, full stop (how about: hstore).
 It really comes down to this: changing the serialization behaviors
 that have been in production for 2 releases (three if you count the
 extension) is bad enough, but making impossible some legal json
 constructions which are currently possible is an unacceptable
 compatibility break.  It's going to break applications I've currently
 put into production with no clear workaround.  This is quite frankly
 not ok and I'm calling foul.  The RFC may claim that these
 constructions are dubious but that's irrelevant.  It's up to the
 parser to decide that and when serializing you are not in control of
 the parser.
>>> The current JSON type preserves key order and duplicates. But is it 
>>> documented that this is a feature, or something to be guaranteed?
>> It doesn't, but the row_to_json function has a very clear mechanism of
>> action.  And, 'not being documented' is not the standard for latitude
>> to make arbitrary changes to existing function behaviors.
> the whole hash*() function family was changed based on the "not documented"
> premise, so we do have a precedent.
>>
>>> In my experience, no JSON parser guarantees key order or duplication.
>> I found one in about two seconds.  http://docs.python.org/2/library/json.html
>>
>> "object_pairs_hook, if specified will be called with the result of
>> every JSON object decoded with an ordered list of pairs. The return
>> value of object_pairs_hook will be used instead of the dict. This
>> feature can be used to implement custom decoders that rely on the
>> order that the key and value pairs are decoded (for example,
>> collections.OrderedDict() will remember the order of insertion). If
>> object_hook is also defined, the object_pairs_hook takes priority."
>>
>> That makes the rest of your argument moot.  Plus, I quite clearly am
>> dealing with parsers that do.
> I am sure you could also devise a json encoding scheme
> where white space is significant ;)
>
> The question is, how much of it should json *type* support.
>
> As discussed in the other thread, most of your requirements
> would be met by having json/row/row set-to-text serializer
> functions which output json-formatted "text".

No, that would not work, putting aside the fact that it would require
rewriting heaps of code.  What I do now inside the json wrapping
routines is create things like

{
  "x": [
{dynamic object},
{dynamic object},
...
  ],
  "y": ...,
  ...
}

The only way to do it is to build 'dynamic object' into json in
advance of the outer xxx_to_json call.  The 'dynamic object' is
created out of a json builder that takes a paired array -- basically a
variant of Andrew's 'json_build' upthread.  If the 'json serializer'
outputted text, the 'outer' to_json call would then re-escape the
object.  I can't use hstore for that purpose precisely because of the
transformations it does on the object.

Stepping back, I'm using json serialization as a kind of 'supercharged
crosstab'.  To any client that can parse json, json serialization
completely displaces crosstabbing -- it's superior in every way.  I
am, if you will, kind of leading research efforts in the area and I can
tell you with absolute certainty that breaking this behavior is a
mistake.

Forcing hstore-ish output mechanisms removes the ability to handle
certain important edge cases that work just fine today. If that
ability was taken away, it would be a very bitter pill for me to
swallow and would have certain ramifications for me professionally; I
went out on a pretty big limb and pushed pg/json aggressively (over
strenuous objection) in an analytics product which is now in the final
stages of beta testing.  I would hate to see the conclusion of the
case study be "Ultimately we had to migrate the code back to Hibernate
due to compatibility issues".

Here are the options on the table:
1) convert existing json type to binary flavor (notwithstanding objections)
2) maintain side by side types, one representing binary, one text.
unfortunately, I think the text one must get the name 'json' due to an
unfortunate previous decision.
3) merge the behaviors into a single type and get the best of both
worlds (as suggested upthread).

I think we need to take a *very* hard look at #3 before exploring #1
or #2: I haven't thought it through yet, but it may be possible to handle
this in such a way that will be mostly transparent to the end user and
may have other benefits such as a faster path for serialization.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Logging WAL when updating hintbit

2013-11-15 Thread Peter Eisentraut
On 11/14/13, 1:02 AM, Sawada Masahiko wrote: 
> I attached a patch that adds a new wal_level, 'all'.

Compiler warning:

pg_controldata.c: In function ‘wal_level_str’:
pg_controldata.c:72:2: warning: enumeration value ‘WAL_LEVEL_ALL’ not handled 
in switch [-Wswitch]
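
Presumably the fix is just to teach pg_controldata's switch about the new
value; a sketch against the 9.3-era function, with the enum spelling taken
from the warning:

    static const char *
    wal_level_str(WalLevel wal_level)
    {
        switch (wal_level)
        {
            case WAL_LEVEL_MINIMAL:
                return "minimal";
            case WAL_LEVEL_ARCHIVE:
                return "archive";
            case WAL_LEVEL_HOT_STANDBY:
                return "hot_standby";
            case WAL_LEVEL_ALL:     /* new value added by the patch */
                return "all";
        }
        return _("unrecognized wal_level");
    }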



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

