Re: [HACKERS] BUG: *FF WALs under 9.2 (WAS: .ready files appearing on slaves)
On Fri, Oct 17, 2014 at 10:37 PM, Michael Paquier michael.paqu...@gmail.com wrote:
On Fri, Oct 17, 2014 at 9:23 PM, Fujii Masao masao.fu...@gmail.com wrote:
In this case, the patch seems to make the restartpoint recycle even WAL files which have .ready files and will have to be archived later. Thought?

The real problem currently is that it is possible to have a segment file not marked as .done during recovery when the stream connection is abruptly cut while this segment is being switched, marking it as .ready in archive_status and simply leaving this segment in pg_xlog, because it will neither be recycled nor removed. I have not been able to look much at this code these days, so I am not sure how invasive it would be in back-branches, but perhaps we should try to improve the code such that when a segment file is switched and the connection is cut, we guarantee that this file is completed and marked as .done.

I have spent more time on that, with a bit more groundwork... First, the problem can be reproduced most of the time by running this simple command:

psql -c 'select pg_switch_xlog()'; pg_ctl restart -m immediate

This will enforce a segment file switch and restart the master in crash recovery. This has the effect of immediately cutting the WAL stream on the slave, manifesting as a FATAL in libpqrcv_receive where rawlen == 0. For example, let's imagine that the stream fails when switching from 00010003 to the next segment; then the last XLogRecPtr in XLogWalRcvProcessMsg for dataStart is for example 0/310, and walrcv->latestWalEnd is 0/400. When the stream restarts it will begin once again from 0/400, ignoring that 00010003 should be marked as .done, ultimately marking it in .ready state when old segment files are recycled or removed. There is nothing that can really be done to enforce the creation of a .done file before the FATAL of libpqrcv_receive, because we cannot predict the stream failure.
Now, we can do better than what we have now by looking at the WAL start position used when starting streaming in the WAL receiver, and enforcing .done if the start position is the last one of the previous segment. Hence, in the case of start position 0/400 found previously, the file that will be enforced to .done is 00010003. I have written the attached patch that implements this idea and fixes the problem. Now let's see if you guys see any flaws in this simple logic, which uses a sniper rifle instead of a bazooka as in the previous patches sent.

Regards,
--
Michael

From ce7e1eec97dbe7b1648b4f56a1f9825eeba7ebed Mon Sep 17 00:00:00 2001
From: Michael Paquier mpaqu...@vmware.com
Date: Mon, 20 Oct 2014 14:40:37 +0900
Subject: [PATCH] Mark segment file .done for last WAL position at stream start

When the stream connection between a standby and its root node is unstable, WAL stream errors make the standby node fail with FATAL errors in the WAL receiver, forcing it to restart in crash-recovery mode. Now, when the crash occurs exactly when a switch to a new segment file is done, it is possible that the WAL receiver restarts from a position located on the next segment file, while the previous file is left as is, ultimately marked in .ready state once old WAL files are recycled or removed. Note that this file should have been marked as .done by the WAL receiver itself. With this patch, the WAL receiver checks for the presence of the previous segment file of the start position if this WAL position is the last one of the previous segment file, and enforces it to .done, preventing .ready files from accumulating on standbys.
This failure can be easily reproduced with the following command:

psql -c 'select pg_switch_xlog()'; pg_ctl restart -m immediate
---
 src/backend/access/transam/xlogarchive.c | 26 ++
 src/backend/replication/walreceiver.c    | 12 
 src/include/access/xlog_internal.h       |  1 +
 3 files changed, 39 insertions(+)

diff --git a/src/backend/access/transam/xlogarchive.c b/src/backend/access/transam/xlogarchive.c
index 047efa2..25a153f 100644
--- a/src/backend/access/transam/xlogarchive.c
+++ b/src/backend/access/transam/xlogarchive.c
@@ -553,6 +553,32 @@ XLogArchiveNotifySeg(XLogSegNo segno)
 	XLogArchiveNotify(xlog);
 }
 
+
+/*
+ * XLogArchiveForceDoneIfPresent
+ *
+ * Wrapper of XLogArchiveForceDone, used conditionally based on the presence
+ * of given XLOG segment file.
+ */
+void
+XLogArchiveForceDoneIfPresent(TimeLineID tli, XLogSegNo segno)
+{
+	struct stat stat_buf;
+	char		xlogfname[MAXFNAMELEN];
+	char		xlogfpath[MAXPGPATH];
+
+	XLogFilePath(xlogfpath, tli, segno);
+
+	/* File is missing, nothing to do */
+	if (stat(xlogfpath, &stat_buf) != 0)
+		return;
+
+	/* All is fine, process... */
+	XLogFileName(xlogfname, tli, segno);
+	XLogArchiveForceDone(xlogfname);
+}
+
+
 /*
  * XLogArchiveForceDone
  *
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index
Re: [HACKERS] Typos in comments
On Mon, Oct 20, 2014 at 6:54 PM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote:
I ran into a typo, 'a a', in a comment in snapmgr.c, and found by grep that there are four other places with the same typo. And in one of those places, I found yet another typo. Please find attached a patch.

Just while on this topic, I had a quick look at the results from the regex \b(\w+)\s+\1\b which finds duplicate words. After sifting through the false positives I found a few more. Patch attached.

Regards

David Rowley

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 384c9b6..385d18b 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -574,7 +574,7 @@ StartReplication(StartReplicationCmd *cmd)
 	 * to find the requested WAL segment in pg_xlog.
 	 *
 	 * XXX: we could be more strict here and only allow a startpoint
-	 * that's older than the switchpoint, if it it's still in the same
+	 * that's older than the switchpoint, if it's still in the same
 	 * WAL segment.
 	 */
 	if (!XLogRecPtrIsInvalid(switchpoint) &&
diff --git a/src/bin/pg_basebackup/receivelog.c b/src/bin/pg_basebackup/receivelog.c
index 89b22f2..c6c90fb 100644
--- a/src/bin/pg_basebackup/receivelog.c
+++ b/src/bin/pg_basebackup/receivelog.c
@@ -753,7 +753,7 @@ ReadEndOfStreamingResult(PGresult *res, XLogRecPtr *startpos, uint32 *timeline)
 * initiating streaming with the START_STREAMING command.
 *
 * If the COPY ends (not necessarily successfully) due a message from the
- * server, returns a PGresult and sets sets *stoppos to the last byte written.
+ * server, returns a PGresult and sets *stoppos to the last byte written.
 * On any other sort of error, returns NULL.
 */
static PGresult *
diff --git a/src/interfaces/ecpg/pgtypeslib/timestamp.c b/src/interfaces/ecpg/pgtypeslib/timestamp.c
index b0f9bf1..cf1fed2 100644
--- a/src/interfaces/ecpg/pgtypeslib/timestamp.c
+++ b/src/interfaces/ecpg/pgtypeslib/timestamp.c
@@ -419,7 +419,7 @@ dttofmtasc_replace(timestamp * ts, date dDate, int dow, struct tm * tm,
 				replace_val.str_val = months[tm->tm_mon];
 				replace_type = PGTYPES_TYPE_STRING_CONSTANT;
 				break;
-				/* the full name name of the month */
+				/* the full name of the month */
 				/* XXX should be locale aware */
 			case 'B':
 				replace_val.str_val = pgtypes_date_months[tm->tm_mon];

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
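The duplicate-word regex quoted above can be run directly from the command line; a quick sketch (GNU grep's -P flag for Perl-compatible backreferences is assumed):

```shell
# \1 backreferences the first captured word, so this flags doubled words
# such as "it it" or "sets sets"; -n prints the line number.
printf 'if it it is still\nall good here\n' | grep -nP '\b(\w+)\s+\1\b'
```

In a real source tree one would run grep -rnP over src/ and then sift the false positives, as David describes.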
[HACKERS] agent_init concurrent running question
Hello, I am reading the pool manager code and have some questions about poolmgr.c, function agent_init():

if (agent->pool)
    agent_release_connections(agent, false);
..
#ifdef XCP
/* find database */
agent->pool = find_database_pool(database, user_name);
/* create if not found */
if (agent->pool == NULL)
    agent->pool = create_database_pool(database, user_name);

With concurrent execution of this function for the same dbname and username, create_database_pool(database, user_name) may be run twice, so there may be more than one DatabasePool in the databasePools list. But find_database_pool(database, user_name) will always return the first match, so the others may never be used — is there a memory leak?

Also, why does this function call agent_release_connections and then use find_database_pool to get a new pool? With concurrent execution, is it possible for one agent to release connections that another agent is still using?

thanks!

Jov
blog: http://amutu.com/blog
Re: [HACKERS] alter user/role CURRENT_USER
I went through the patch and here is my review:

.) patch applied on master branch with the patch -p1 command (git apply failed)
.) regression make check runs fine
.) test case coverage is missing in the patch
.) overall the coding/patch looks good

Few comments:

1) Any particular reason for not also making SESSION_USER/USER usable for this command?

2) I think RoleId_or_curruser can be used for the following rule:

/* ALTER TABLE <name> OWNER TO RoleId */
| OWNER TO RoleId

3) In the documentation patch, CURRENT_USER was added but CURRENT_ROLE is missing.

On Fri, Oct 10, 2014 at 1:57 PM, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote:
Hello, on the way to considering ALTER ROLE SET .., I found that ALTER ROLE/USER cannot take CURRENT_USER as the role name. In addition to that, the documents of ALTER ROLE / USER are inconsistent with each other, in spite of ALTER USER being said to be an alias for ALTER ROLE. Plus, ALTER USER cannot take ALL as user name although ALTER ROLE can. This patch does the following things:

- ALTER USER/ROLE now takes CURRENT_USER as user name.
- Rewrite the synopsis of the documents for ALTER USER and ALTER ROLE so that they have exactly the same syntax.
- Modify the syntax of ALTER USER so as to be an alias of ALTER ROLE.
- Added CURRENT_USER/CURRENT_ROLE syntax to both.
- Added ALL syntax as user name to ALTER USER.
- Added IN DATABASE syntax to ALTER USER.

x Integrating the ALTER ROLE/USER syntax could not be done because of a shift/reduce error in bison.
x This patch contains no additional regression tests. Are they needed?

SESSION_USER/USER could also be made usable for this command, but this patch doesn't do so (yet).

regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center

--
Rushabh Lathia
[HACKERS] Refactor status detection for XLOG segment file in xlogarchive.c (Was: *FF WALs under 9.2)
Hi all,

Unless I am missing something, is there any reason why the sequence used in XLogArchiveCheckDone and XLogArchiveIsBusy to check if an XLOG segment has already been archived is duplicated? I guess that doing a little bit of refactoring here would make sense for simplicity; patch is attached.

Regards,
--
Michael

From 9b96e5825f6f6d2cbbc5ddf9c45c6609e6c01fb7 Mon Sep 17 00:00:00 2001
From: Michael Paquier mpaqu...@vmware.com
Date: Mon, 20 Oct 2014 16:33:26 +0900
Subject: [PATCH] Refactor archive status analysis into a single function

The same code block was used by XLogArchiveIsBusy and XLogArchiveCheckDone to check the archive status of a given segment...
---
 src/backend/access/transam/xlogarchive.c | 50 +---
 src/include/access/xlog_internal.h       |  1 +
 2 files changed, 27 insertions(+), 24 deletions(-)

diff --git a/src/backend/access/transam/xlogarchive.c b/src/backend/access/transam/xlogarchive.c
index 047efa2..047e8f8 100644
--- a/src/backend/access/transam/xlogarchive.c
+++ b/src/backend/access/transam/xlogarchive.c
@@ -607,12 +607,7 @@ XLogArchiveForceDone(const char *xlog)
 }
 
 /*
- * XLogArchiveCheckDone
- *
- * This is called when we are ready to delete or recycle an old XLOG segment
- * file or backup history file.  If it is okay to delete it then return true.
- * If it is not time to delete it, make sure a .ready file exists, and return
- * false.
+ * XLogArchiveIsDone
 *
 * If <XLOG>.done exists, then return true; else if <XLOG>.ready exists,
 * then return false; else create <XLOG>.ready and return false.
@@ -621,15 +616,11 @@ XLogArchiveForceDone(const char *xlog)
 * create <XLOG>.ready fails, we'll retry during subsequent checkpoints.
 */
bool
-XLogArchiveCheckDone(const char *xlog)
+XLogArchiveIsDone(const char *xlog)
{
	char		archiveStatusPath[MAXPGPATH];
	struct stat stat_buf;
 
-	/* Always deletable if archiving is off */
-	if (!XLogArchivingActive())
-		return true;
-
	/* First check for .done --- this means archiver is done with it */
	StatusFilePath(archiveStatusPath, xlog, ".done");
	if (stat(archiveStatusPath, &stat_buf) == 0)
@@ -645,6 +636,28 @@
	if (stat(archiveStatusPath, &stat_buf) == 0)
		return true;
 
+	return false;
+}
+
+/*
+ * XLogArchiveCheckDone
+ *
+ * This is called when we are ready to delete or recycle an old XLOG segment
+ * file or backup history file.  If it is okay to delete it then return true.
+ * If it is not time to delete it, make sure a .ready file exists, and return
+ * false.
+ */
+bool
+XLogArchiveCheckDone(const char *xlog)
+{
+	/* Always deletable if archiving is off */
+	if (!XLogArchivingActive())
+		return true;
+
+	/* Check if segment is marked as .done */
+	if (XLogArchiveIsDone(xlog))
+		return true;
+
	/* Retry creation of the .ready file */
	XLogArchiveNotify(xlog);
	return false;
@@ -666,19 +679,8 @@ XLogArchiveIsBusy(const char *xlog)
	char		archiveStatusPath[MAXPGPATH];
	struct stat stat_buf;
 
-	/* First check for .done --- this means archiver is done with it */
-	StatusFilePath(archiveStatusPath, xlog, ".done");
-	if (stat(archiveStatusPath, &stat_buf) == 0)
-		return false;
-
-	/* check for .ready --- this means archiver is still busy with it */
-	StatusFilePath(archiveStatusPath, xlog, ".ready");
-	if (stat(archiveStatusPath, &stat_buf) == 0)
-		return true;
-
-	/* Race condition --- maybe archiver just finished, so recheck */
-	StatusFilePath(archiveStatusPath, xlog, ".done");
-	if (stat(archiveStatusPath, &stat_buf) == 0)
+	/* Check that segment is not yet marked as .done */
+	if (XLogArchiveIsDone(xlog))
		return false;
 
	/*
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index 27b9899..6d1a66c 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -287,6 +287,7 @@ extern void KeepFileRestoredFromArchive(char *path, char *xlogfname);
 extern void XLogArchiveNotify(const char *xlog);
 extern void XLogArchiveNotifySeg(XLogSegNo segno);
 extern void XLogArchiveForceDone(const char *xlog);
+extern bool XLogArchiveIsDone(const char *xlog);
 extern bool XLogArchiveCheckDone(const char *xlog);
 extern bool XLogArchiveIsBusy(const char *xlog);
 extern void XLogArchiveCleanup(const char *xlog);
--
2.1.2
Re: [HACKERS] agent_init concurrent running question
On Mon, Oct 20, 2014 at 4:29 PM, Jov am...@amutu.com wrote: in poolmgr.c function agent_init(): This code is part of Postgres-XC (or Postgres-XL) and not PostgreSQL itself, you should send your questions there: https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers Regards, -- Michael
Re: [HACKERS] BUG: *FF WALs under 9.2 (WAS: .ready files appearing on slaves)
On Fri, Oct 17, 2014 at 10:12 PM, Fujii Masao masao.fu...@gmail.com wrote:
On Fri, Oct 17, 2014 at 9:23 PM, Fujii Masao masao.fu...@gmail.com wrote:
On Thu, Oct 9, 2014 at 3:26 PM, Michael Paquier michael.paqu...@gmail.com wrote:
On Wed, Oct 8, 2014 at 10:00 PM, Michael Paquier michael.paqu...@gmail.com wrote:
The additional process at promotion sounds like a good idea, I'll try to get a patch done tomorrow. This would result as well in removing the XLogArchiveForceDone stuff.

Either way, now that I have been able to reproduce the problem manually, things can be clearly solved. Please find attached two patches aimed at fixing this issue and improving the situation:
- 0001 prevents the appearance of those phantom WAL segment files by ensuring that when a node is in recovery it will remove them whatever their status in archive_status. This patch is the real fix, and should be applied down to 9.2.
- 0002 is a patch implementing Heikki's idea of enforcing all the segment files present in pg_xlog to have their status set to .done, marking them for removal. When looking at the code, I finally concluded that Fujii-san's point, about marking as .done in all cases segment files that have been fully streamed, actually makes more sense and need not be backpatched. This patch would not be mandatory for back-patching, but it makes the process more robust IMO.

Thanks for the patches! While reviewing the patch, I found another bug related to WAL files in recovery mode. The problem is that exitArchiveRecovery() always creates a .ready file for the last WAL file of the old timeline, even when it's restored from the archive and has a .done file. So this causes the already-archived WAL file to be archived again. Attached patch fixes this bug.

That's a good catch! The patch looks good. I think that you should change xlogpath to use MAXFNAMELEN instead of MAXPGPATH in exitArchiveRecovery.
This is only for correctness, so that's a master-only remark, because this variable is used only to calculate a segment file name, and not a path. Renaming the variable from xlogpath to xlogname would make sense as well. Regards, -- Michael
Re: [HACKERS] Performance degradation 9.3 (vs 9.2) for FreeBSD
Hello,

How did this testing turn out?

Palle

On 3 Jul 2014, at 12:15, Tatsuo Ishii is...@postgresql.org wrote:

Hi,

Attached you can find a short (compile-tested only) patch implementing a 'shared_memory_type' GUC, akin to 'dynamic_shared_memory_type'. Will only apply to 9.4, not 9.3, but it should be easy to convert for it.

Greetings,
Andres Freund

I have rebased Andres's patch against the 9.3-STABLE tree. Please take a look at the attached patches and let me know if you find anything strange. I am going to test the patch on a huge HP machine: DL980G7/64 cores/2TB mem. With the patch I would be able to report back if using SysV shared mem fixes the 9.3 performance problem.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index f746c81..e82054a 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -72,6 +72,7 @@ static void IpcMemoryDetach(int status, Datum shmaddr);
 static void IpcMemoryDelete(int status, Datum shmId);
 static PGShmemHeader *PGSharedMemoryAttach(IpcMemoryKey key, IpcMemoryId *shmid);
+static void *CreateAnonymousSegment(Size *size);
 
 /*
@@ -389,49 +390,19 @@ PGSharedMemoryCreate(Size size, bool makePrivate, int port)
 	 * developer use, this shouldn't be a big problem.
 	 */
 #ifndef EXEC_BACKEND
+	if (shared_memory_type == SHMEM_TYPE_MMAP)
 	{
-		long		pagesize = sysconf(_SC_PAGE_SIZE);
-
-		/*
-		 * Ensure request size is a multiple of pagesize.
-		 *
-		 * pagesize will, for practical purposes, always be a power of two.
-		 * But just in case it isn't, we do it this way instead of using
-		 * TYPEALIGN().
-		 */
-		if (pagesize > 0 && size % pagesize != 0)
-			size += pagesize - (size % pagesize);
-
-		/*
-		 * We assume that no one will attempt to run PostgreSQL 9.3 or later
-		 * on systems that are ancient enough that anonymous shared memory is
-		 * not supported, such as pre-2.4 versions of Linux.  If that turns
-		 * out to be false, we might need to add a run-time test here and do
-		 * this only if the running kernel supports it.
-		 */
-		AnonymousShmem = mmap(NULL, size, PROT_READ | PROT_WRITE, PG_MMAP_FLAGS,
-							  -1, 0);
-		if (AnonymousShmem == MAP_FAILED)
-		{
-			int			saved_errno = errno;
-
-			ereport(FATAL,
-					(errmsg("could not map anonymous shared memory: %m"),
-					 (saved_errno == ENOMEM) ?
-					 errhint("This error usually means that PostgreSQL's request "
-							 "for a shared memory segment exceeded available memory "
-							 "or swap space. To reduce the request size (currently "
-							 "%lu bytes), reduce PostgreSQL's shared memory usage, "
-							 "perhaps by reducing shared_buffers or "
-							 "max_connections.",
-							 (unsigned long) size) : 0));
-		}
+		AnonymousShmem = CreateAnonymousSegment(&size);
 		AnonymousShmemSize = size;
 
-		/* Now we need only allocate a minimal-sized SysV shmem block. */
 		sysvsize = sizeof(PGShmemHeader);
 	}
+	else
 #endif
+	{
+		Assert(shared_memory_type == SHMEM_TYPE_SYSV);
+		sysvsize = size;
+	}
 
 	/* Make sure PGSharedMemoryAttach doesn't fail without need */
 	UsedShmemSegAddr = NULL;
@@ -631,3 +602,47 @@ PGSharedMemoryAttach(IpcMemoryKey key, IpcMemoryId *shmid)
 
 	return hdr;
 }
+
+/*
+ * Creates an anonymous mmap()ed shared memory segment.
+ *
+ * Pass the requested size in *size.  This function will modify *size to the
+ * actual size of the allocation, if it ends up allocating a segment that is
+ * larger than requested.
+ */
+#ifndef EXEC_BACKEND
+static void *
+CreateAnonymousSegment(Size *size)
+{
+	Size		allocsize = *size;
+	void	   *ptr = MAP_FAILED;
+	int			mmap_errno = 0;
+
+	/*
+	 * use the original size, not the rounded up value, when falling back
+	 * to non-huge pages.
+	 */
+	allocsize = *size;
+	ptr = mmap(NULL, allocsize, PROT_READ | PROT_WRITE,
+			   PG_MMAP_FLAGS, -1, 0);
+	mmap_errno = errno;
+
+	if (ptr ==
Re: [HACKERS] Proposal : REINDEX SCHEMA
On Mon, Oct 20, 2014 at 9:41 AM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote:
On Sun, Oct 19, 2014 at 10:37 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote:
On Sun, Oct 19, 2014 at 1:02 PM, Sawada Masahiko sawada.m...@gmail.com wrote:
On Fri, Oct 17, 2014 at 4:32 AM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote:
On Wed, Oct 15, 2014 at 11:41 AM, Sawada Masahiko sawada.m...@gmail.com wrote:
On Mon, Oct 13, 2014 at 11:16 PM, Robert Haas robertmh...@gmail.com wrote:
On Sun, Oct 12, 2014 at 1:27 PM, Stephen Frost sfr...@snowman.net wrote:
* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
Sawada Masahiko wrote:

The attached WIP patch adds new syntax REINDEX SCHEMA, which does reindexing of all tables in a specified schema. There is syntax for reindexing a specified index, per table and per database, but we cannot do reindexing per schema for now.

It seems doubtful that there really is much use for this feature, but if there is, I think a better syntax precedent is the new ALTER TABLE ALL IN TABLESPACE thingy, rather than your proposed REINDEX SCHEMA. Something like REINDEX TABLE ALL IN SCHEMA perhaps.

Yeah, I tend to agree that we should be looking at the 'ALL IN TABLESPACE' and 'ALL IN SCHEMA' type of commands to keep things consistent. This might be an alternative for the vacuum / analyze / reindex database commands also..

Urgh. I don't have a problem with that syntax in general, but it clashes pretty awfully with what we're already doing for REINDEX otherwise.

Attached patches are the latest version. Ok. I changed the syntax to REINDEX ALL IN SCHEMA, but I felt a little discomfort, as Robert mentioned.

I understood, but the real problem will come in the near future when the features are pushed... :-) They are separate features, but maybe we can join these features in one future commit... it's just an idea. Anyway, you can apply these patches in numerical order, and use the REINDEX ALL IN SCHEMA feature and the -S/--schema option in reindexdb.
000_reindex_all_in_schema_v2.patch: contains the REINDEX ALL IN SCHEMA feature

1) Compiles without warnings
2) IMHO you can add more test cases for better code coverage:
   * reindex a schema that doesn't exist
   * try to run REINDEX ALL IN SCHEMA inside a transaction block
3) Isn't this enough?
   bool do_database = (kind == OBJECT_DATABASE);
   ... instead of ...
   + bool do_database = (kind == OBJECT_DATABASE) ? true : false;
4) IMHO you can add another Assert to check valid kinds, like:
   Assert(kind == OBJECT_DATABASE || kind == OBJECT_SCHEMA);
5) I think this is more legible:
   /* Get OID of object for result */
   if (do_database)
       objectOid = MyDatabaseId;
   else
       objectOid = get_namespace_oid(objectName, false);
   ... instead of ...
   + /* Get OID of object for result */
   + objectOid = (do_database) ? MyDatabaseId : get_namespace_oid(objectName, false);

001_Add_schema_option_to_reindexdb_v1.patch: contains the reindexdb -S/--schema support
The code itself is good for me, but IMHO you can add test cases to src/bin/scripts/t/090_reindexdb.pl

Thank you for reviewing.

You're welcome! I agree with 2) - 5). :-)

The attached patch is the latest version, modified as above.

All is fine for me now... everything works as expected with no compiler warnings. There is just one little fix to do in src/bin/scripts/t/090_reindexdb.pl:

-use Test::More tests => 7;
+use Test::More tests => 8;

Because you added a new test case to the test suite, you need to increase the test count at the beginning of the file. Patch attached. Now the regression tests run without errors.

Thank you for reviewing and revising! I also ran it successfully. It looks good to me :)

Regards,

---
Sawada Masahiko
Re: [HACKERS] Proposal : REINDEX SCHEMA
On Mon, Oct 20, 2014 at 11:24 AM, Sawada Masahiko sawada.m...@gmail.com wrote:
[...]
Thank you for reviewing and revising!

You're welcome!

I also ran it successfully. It looks good to me :)

Changed status to Ready for Committer.
Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog: http://fabriziomello.github.io Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello Github: http://github.com/fabriziomello
Re: [HACKERS] Typos in comments
On Mon, Oct 20, 2014 at 3:04 AM, David Rowley dgrowle...@gmail.com wrote:
On Mon, Oct 20, 2014 at 6:54 PM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote:
I ran into a typo, 'a a', in a comment in snapmgr.c, and found by grep that there are four other places with the same typo. And in one of those places, I found yet another typo. Please find attached a patch.

Just while on this topic, I had a quick look at the results from the regex \b(\w+)\s+\1\b which finds duplicate words. After sifting through the false positives I found a few more. Patch attached.

Thanks. I have committed both patches.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
[HACKERS] Would you help to review our modifications
Dear All,

I've got a question for you. If I try to add an implicit cast from integer to boolean, PostgreSQL reports that the cast already exists:

1. CREATE CAST (integer AS bool) WITH INOUT AS IMPLICIT;
ERROR: cast from type integer to type boolean already exists

If I try to drop the cast, I am told that the cast from integer to boolean may not be dropped:

2. DROP CAST (integer AS bool);
ERROR: cannot drop cast from integer to boolean because it is required by the database system

Since the cast already exists, why does executing the following fail?

3. insert into testbool(p1,p2) values(20, 0::integer);
ERROR: column "p2" is of type boolean but expression is of type integer

So how should this kind of situation be handled if I want an implicit conversion?

Best Regards,

rohtodeveloper
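For what it's worth, the built-in integer-to-boolean cast is registered as explicit-only, which is why statement 3 fails while a spelled-out cast works. A possible workaround, assuming the testbool(p1, p2) table from the question:

```sql
-- The cast exists but is not applied implicitly in assignment contexts,
-- so spell it out:
INSERT INTO testbool (p1, p2) VALUES (20, 0::integer::boolean);
INSERT INTO testbool (p1, p2) VALUES (20, CAST(1 AS boolean));
```

Making the cast truly implicit would mean changing the castcontext of the built-in pg_cast entry, which is generally discouraged because it can silently change the resolution of existing queries.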
Re: [HACKERS] Wrong filename in comment
Marko Tiikkaja ma...@joh.to writes:
Commit 32984d8fc3dbb90a3fafb69fece0134f1ea790f9 forgot to change the filename in the comment in contrib/pgcrypto/pgcrypto--1.2.sql. Trivial patch attached.

Pushed, thanks.

regards, tom lane
Re: [HACKERS] narwhal and PGDLLIMPORT
Noah Misch n...@leadboat.com writes:

I reproduced narwhal's problem using its toolchain on another 32-bit Windows Server 2003 system. The crash happens at the SHGetFolderPath() call in pqGetHomeDirectory(). A program can acquire that function via shfolder.dll or via shell32.dll; we've used the former method since commit 889f038, for better compatibility[1] with Windows NT 4.0. On this system, shfolder.dll's version loads and unloads shell32.dll. In PostgreSQL built using this older compiler, shfolder.dll:SHGetFolderPath() unloads libpq in addition to unloading shell32! That started with commit 846e91e.

Thanks for doing the detective work on this!

I don't expect to understand the mechanism behind it, but I recommend we switch back to linking libpq with shell32.dll. The MSVC build already does that in all supported branches, and it feels right for the MinGW build to follow suit in 9.4+. Windows versions that lack the symbol in shell32.dll are now ancient history.

This is basically reverting 889f038, right? It looks to me like we made that change only to support NT4, which was obsolete even in 2005, so no objection from me. Magnus might have a different idea though.

I happened to try the same contrib/dblink test suite on PostgreSQL built with modern MinGW-w64 (i686-4.9.1-release-win32-dwarf-rt_v3-rev1). That, too, gave a crash-like symptom starting with commit 846e91e.

Ick.

Passing -static-libgcc to the link restores the libgcc situation as it stood before commit 846e91e. The main beneficiary of shared libgcc is C++/Java exception handling, so PostgreSQL doesn't care. No doubt there's some deeper bug in libgcc or in PostgreSQL; loading a module that links with shared libgcc should not disrupt exit(). I'm content with this workaround.

Works for me too, until such time as somebody feels like digging deeper.
regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] narwhal and PGDLLIMPORT
On Mon, Oct 20, 2014 at 11:06:54AM -0400, Tom Lane wrote: Noah Misch n...@leadboat.com writes: I don't expect to understand the mechanism behind it, but I recommend we switch back to linking libpq with shell32.dll. The MSVC build already does that in all supported branches, and it feels right for the MinGW build to follow suit in 9.4+. Windows versions that lack the symbol in shell32.dll are now ancient history. This is basically reverting 889f038, right? Mostly so. Keeping the SHGetSpecialFolderPath() -> SHGetFolderPath() part, but reverting the shell32.dll -> shfolder.dll part. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] alter user/role CURRENT_USER
Kyotaro, Food for thought. Couldn't you reduce the following block:

+ if (strcmp(stmt->role, current_user) == 0)
+ {
+     roleid = GetUserId();
+     tuple = SearchSysCache1(AUTHOID, ObjectIdGetDatum(roleid));
+     if (!HeapTupleIsValid(tuple))
+         ereport(ERROR,
+                 (errcode(ERRCODE_UNDEFINED_OBJECT),
+                  errmsg("roleid %d does not exist", roleid)));
+ }
+ else
+ {
+     tuple = SearchSysCache1(AUTHNAME, PointerGetDatum(stmt->role));
+     if (!HeapTupleIsValid(tuple))
+         ereport(ERROR,
+                 (errcode(ERRCODE_UNDEFINED_OBJECT),
+                  errmsg("role \"%s\" does not exist", stmt->role)));
+ }

To:

if (strcmp(stmt->role, current_user) == 0)
    roleid = GetUserId();
else
    roleid = get_role_oid(stmt->role, false);

tuple = SearchSysCache1(AUTHOID, ObjectIdGetDatum(roleid));
if (!HeapTupleIsValid(tuple))
    ereport(ERROR,
            (errcode(ERRCODE_UNDEFINED_OBJECT),
             errmsg("roleid %d does not exist", roleid)));

I think this makes it a bit cleaner. It also reuses existing code, as 'get_role_oid()' already does a valid role name check and will raise the proper error. -Adam On Mon, Oct 20, 2014 at 3:40 AM, Rushabh Lathia rushabh.lat...@gmail.com wrote: I went through the patch and here is my review: .) The patch applied on the master branch with the patch -p1 command (git apply failed). .) regression make check runs fine. .) Testcase coverage is missing in the patch. .) Overall the coding/patch looks good. Few comments: 1) Any particular reason for not also making SESSION_USER/USER usable for this command? 2) I think RoleId_or_curruser can be used for the following rule: /* ALTER TABLE name OWNER TO RoleId */ | OWNER TO RoleId 3) In the documentation patch, CURRENT_USER was added but CURRENT_ROLE is missing. On Fri, Oct 10, 2014 at 1:57 PM, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote: Hello, on the way to considering ALTER ROLE SET ..., I found that ALTER ROLE/USER cannot take CURRENT_USER as the role name.
In addition to that, the documents for ALTER ROLE / ALTER USER are inconsistent with each other, even though ALTER USER is said to be an alias for ALTER ROLE. Plus, ALTER USER cannot take ALL as a user name although ALTER ROLE can. This patch does the following things:

- ALTER USER/ROLE now takes CURRENT_USER as a user name.
- Rewrite the synopsis of the documents for ALTER USER and ALTER ROLE so that they have exactly the same syntax.
- Modify the syntax of ALTER USER so as to be an alias of ALTER ROLE.
- Added CURRENT_USER/CURRENT_ROLE syntax to both.
- Added ALL syntax as a user name to ALTER USER.
- Added IN DATABASE syntax to ALTER USER.

x Integrating the ALTER ROLE/USER syntax could not be done because of a bison shift/reduce error.
x This patch contains no additional regression tests. Are they needed?

SESSION_USER/USER could also be made usable for this command, but this patch doesn't do so (yet). regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Rushabh Lathia -- Adam Brightwell - adam.brightw...@crunchydatasolutions.com Database Engineer - www.crunchydatasolutions.com
Re: [HACKERS] Proposal: Log inability to lock pages during vacuum
On Mon, Oct 20, 2014 at 2:57 AM, Jim Nasby jim.na...@bluetreble.com wrote: Currently, a non-freeze vacuum will punt on any page it can't get a cleanup lock on, with no retry. Presumably this should be a rare occurrence, but I think it's bad that we just assume that and won't warn the user if something bad is going on. My thought is that if we skip any pages elog(LOG) how many we skipped. If we skip more than 1% of the pages we visited (not relpages) then elog(WARNING) instead. Is there some specific failure you've run into where a page was stuck in a pinned state and never got vacuumed? I would like to see a more systematic way of going about this. What LSN or timestamp is associated with the oldest unvacuumed page? How many times have we tried to visit it? What do those numbers look like overall -- i.e. what's the median number of times it takes to vacuum a page and what does the distribution look like of the unvacuumed ages? With that data it should be possible to determine if the behaviour is actually working well and where to draw the line to determine outliers that might represent bugs. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
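The LOG/WARNING escalation Jim proposes is simple to state precisely. As a sketch (pure illustration: the function name is invented here, and only the 1%-of-visited-pages threshold comes from the proposal, nothing from an actual patch):

```python
def skip_log_level(skipped_pages, visited_pages):
    """Sketch of the proposed policy: elog(LOG) when any pages were
    skipped because the cleanup lock could not be taken, escalating
    to elog(WARNING) when more than 1% of the visited pages were skipped."""
    if skipped_pages == 0:
        return None  # nothing to report
    if skipped_pages > 0.01 * visited_pages:
        return "WARNING"
    return "LOG"
```

Note the denominator is pages actually visited by this vacuum, not relpages, exactly as proposed above.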
Re: [HACKERS] Trailing comma support in SELECT statements
On Oct 18, 2014, at 7:06 PM, Jim Nasby jim.na...@bluetreble.com wrote: Yes. The only case I can think of where we wouldn't want this is COPY. BTW, this should also apply to delimiters other than commas; for example, some geometry types use ; as a delimiter between points. I don’t think it should apply to the internals of types, necessarily. JSON, for example, always dies on a trailing comma, so should probably stay that way. Well, maybe allow it on JSONB input, but not JSON. Though we perhaps don’t want their behaviors to diverge. D
Re: [HACKERS] Would you help to review our modifications
rohtodeveloper wrote: So how to deal with this kind of situation if I want an implicit conversion? As of the out-of-support 8.3 release many of the implicit casts previously defined have been changed to explicit casts. It is a catalog change - obviously, since you can still define implicit casts - so if you absolutely must have the pre-existing cast be implicit you can modify the catalog directly. You may wish to describe why you think this is the solution you need - with implicit casting there are generally more downsides than upsides. Dave -- View this message in context: http://postgresql.1045698.n5.nabble.com/Would-you-help-to-review-our-modifications-tp5823666p5823679.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Trailing comma support in SELECT statements
On 10/20/2014 11:59 AM, David E. Wheeler wrote: On Oct 18, 2014, at 7:06 PM, Jim Nasby jim.na...@bluetreble.com wrote: Yes. The only case I can think of where we wouldn't want this is COPY. BTW, this should also apply to delimiters other than commas; for example, some geometry types use ; as a delimiter between points. I don’t think it should apply to the internals of types, necessarily. JSON, for example, always dies on an trailing comma, so should probably stay that way. Well, maybe allow it on JSONB input, but not JSON. Though we perhaps don’t want their behaviors to diverge. The JSON spec is quite clear on this. Leading and trailing commas are not allowed. I would fight tooth and nail not to allow it for json (and by implication jsonb, since they use literally the same parser - in fact we do that precisely so their input grammars can't diverge). cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
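For reference, any RFC-conformant JSON parser does reject trailing commas; Python's json module, used here purely as a convenient strict parser, illustrates the behavior the spec mandates:

```python
import json

def is_valid_json(text):
    """Return True if text parses as spec-conformant JSON."""
    try:
        json.loads(text)
        return True
    except ValueError:  # json.JSONDecodeError is a ValueError subclass
        return False

print(is_valid_json('[1, 2, 3]'))   # well-formed array
print(is_valid_json('[1, 2, 3,]'))  # trailing comma in array: rejected
print(is_valid_json('{"a": 1,}'))   # trailing comma in object: rejected
```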
Re: [HACKERS] Superuser connect during smart shutdown
On Sun, Oct 19, 2014 at 12:27 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: But TBH I suspect 95% of the problems here would vanish if smart shutdown weren't the default ... But for your repeated objections, we would have changed the default to fast years ago. AFAICT everyone else is in favor of that. I've certainly objected to it in the past, but I don't believe I was the only one objecting. What's your feeling now? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pg_basebackup fails with long tablespace paths
My Salesforce colleague Thomas Fanghaenel observed that the TAP tests for pg_basebackup fail when run in a sufficiently deeply-nested directory tree. The cause appears to be that we rely on standard tar format to represent the symlink for a tablespace, and POSIX tar format has a hard-wired restriction of 99 bytes in a symlink's expansion. What do we want to do about this? I think a minimum expectation would be for pg_basebackup to notice and complain when it's trying to create an unworkably long symlink entry, but it would be far better if we found a way to cope instead. One thing we could possibly do without reinventing tar is to avoid using absolute path names if a PGDATA-relative one would do. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
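The restriction is easy to demonstrate without PostgreSQL at all: a plain POSIX ustar header has a fixed-width linkname field (100 bytes in CPython's tarfile implementation; the 99 quoted above presumably leaves room for a terminating NUL). This sketch, with a hypothetical tablespace entry name, shows the encoder refusing an over-long symlink target:

```python
import tarfile

def symlink_fits_ustar(target):
    """Check whether a symlink target fits the fixed-width linkname
    field of a plain POSIX ustar header (no GNU/pax extensions)."""
    info = tarfile.TarInfo(name="pg_tblspc/16385")  # hypothetical entry name
    info.type = tarfile.SYMTYPE
    info.linkname = target
    try:
        info.tobuf(format=tarfile.USTAR_FORMAT)
        return True
    except ValueError:  # raised as "linkname is too long"
        return False
```

GNU and pax tar formats have no such limit, which is why switching formats (or, as suggested above, emitting PGDATA-relative targets) sidesteps the problem.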
[HACKERS] wal-size limited to 16MB - Performance issue for subsequent backup
Hi. One of our production issues is that the system generates lots of wal-files, lots being like 151952 files over the last 24h, which is about 2.4TB worth of WAL files. I wouldn't say that isn't an issue by itself, but the system does indeed work fine. We do subsequently gzip the files to limit actual disk usage; this shrinks the files to roughly 30-50% of their original size. That being said, along comes the backup, scheduled once a day, and tries to read off these wal-files, which to the backup looks like an awful lot of small files. Our backup utilizes a single thread to read off those files and levels off at reading 30-40MB/s from a 21-drive RAID50 of rotating drives, which is quite bad. That causes a daily incremental run to take on the order of 24h. Differential runs, picking up larger deltas, and full runs are even worse. One could of course question a lot of the other things here, but currently the only limiting factor is actually the backup being able to keep up transporting the wal-log away from the system, due to the small wal size. A short test like:

find . -type f -ctime -1 | tail -n 50 | xargs cat | pipebench > /dev/null

confirms the backup speed to be roughly the same as seen by the backup software. Another test from the same volume doing:

find . -type f -ctime -1 | tail -n 50 | xargs cat > largefile

and then, after waiting for the fs to no longer cache the file,

cat largefile | pipebench > /dev/null

confirms that the disk subsystem can do way better (150-200MB/s) on larger files. Thread here around the same topic: http://postgresql.1045698.n5.nabble.com/How-do-you-change-the-size-of-the-WAL-files-td3425516.html But not a warm welcoming workaround. Suggestions are welcome. An archive-command/restore-command pair that could combine/split wal-segments might be the easiest workaround, but how about crash-safeness? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
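On the crash-safeness question: the usual trick for a compressing or combining archive_command is to write to a temporary name, fsync, then rename into place, so a crash can never leave a truncated file under the final name. A hypothetical helper along those lines (the function and paths are illustrative, not from any PostgreSQL tool; an archive_command must only report success once the file is durably archived):

```python
import gzip
import os
import shutil

def archive_wal_segment(src_path, archive_dir):
    """Gzip one WAL segment into archive_dir crash-safely:
    write to a .tmp name, fsync the result, then atomically rename."""
    os.makedirs(archive_dir, exist_ok=True)
    final = os.path.join(archive_dir, os.path.basename(src_path) + ".gz")
    tmp = final + ".tmp"
    with open(src_path, "rb") as fin, gzip.open(tmp, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    with open(tmp, "rb") as f:
        os.fsync(f.fileno())        # data durable before it gets the final name
    os.rename(tmp, final)           # atomic on POSIX filesystems
    return final
```

A crash mid-way leaves only a .tmp file, which a restartable archiver can simply discard and redo; fsyncing the archive directory as well would make the rename itself durable.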
Re: [HACKERS] wal-size limited to 16MB - Performance issue for subsequent backup
jes...@krogh.cc writes: Thread here around the same topic: http://postgresql.1045698.n5.nabble.com/How-do-you-change-the-size-of-the-WAL-files-td3425516.html But not a warm welcoming workaround. configure --with-wal-segsize=something ? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Superuser connect during smart shutdown
Robert Haas robertmh...@gmail.com writes: On Sun, Oct 19, 2014 at 12:27 PM, Tom Lane t...@sss.pgh.pa.us wrote: I've certainly objected to it in the past, but I don't believe I was the only one objecting. What's your feeling now? I'm prepared to yield on the point. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] wal-size limited to 16MB - Performance issue for subsequent backup
On Mon, Oct 20, 2014 at 09:03:59PM +0200, jes...@krogh.cc wrote: Hi. One of our production issues is that the system generates lots of wal-files, lots is like 151952 files over the last 24h, which is about 2.4TB worth of WAL files. I wouldn't say that isn't an issue by itself, but the system does indeed work fine. We do subsequently gzip the files to limit actual disk-usage, this makes the files roughly 30-50% in size. ... Suggestions are welcome. An archive-command/restore command that could combine/split wal-segments might be the easiest workaround, but how about crash-safeness? Hi, Have you considered using something like tar/star in the archive command to pack them into much larger tar archives? Regards, Ken -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] wal-size limited to 16MB - Performance issue for subsequent backup
jes...@krogh.cc writes: Thread here around the same topic: http://postgresql.1045698.n5.nabble.com/How-do-you-change-the-size-of-the-WAL-files-td3425516.html But not a warm welcoming workaround. configure --with-wal-segsize=something ? Yes, but there are good reasons not to go down that route: 1) It looks like I'm going to be the only one; beware of the dragons. 2) It apparently requires a re-initdb, thus a dump/restore of 4.5TB of data. The latter can absolutely be done by scheduling downtime, but the former would compromise any level of good practice around our DB-operations. Jesper -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] wal-size limited to 16MB - Performance issue for subsequent backup
Hi, On 2014-10-20 21:03:59 +0200, jes...@krogh.cc wrote: One of our production issues is that the system generates lots of wal-files, lots is like 151952 files over the last 24h, which is about 2.4TB worth of WAL files. I wouldn't say that isn't an issue by itself, but the system does indeed work fine. We do subsequently gzip the files to limit actual disk-usage, this makes the files roughly 30-50% in size. Have you analyzed what the source of that volume is? Which version of postgres are you using? What's your checkpoint_timeout/segments settings? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] wal-size limited to 16MB - Performance issue for subsequent backup
jes...@krogh.cc writes: configure --with-wal-segsize=something ? Yes, but there are good reasons not to go down that route. One is that: 1) It looks like I'am going to be the only one, beware of the dragons. 2) It requires apparently a re-initdb, thus dump/restore of 4.5TB of data I think a clean shutdown and pg_resetxlog would be sufficient. I agree you'd need to do some testing though ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] wal-size limited to 16MB - Performance issue for subsequent backup
On 2014-10-20 21:03:59 +0200, jes...@krogh.cc wrote: One of our production issues is that the system generates lots of wal-files, lots is like 151952 files over the last 24h, which is about 2.4TB worth of WAL files. I wouldn't say that isn't an issue by itself, but the system does indeed work fine. We do subsequently gzip the files to limit actual disk-usage, this makes the files roughly 30-50% in size. Have you analyzed what the source of that volume is? Which version of postgres are you using? What's your checkpoint_timeout/segments settings? Suggestions are surely welcome. I do suspect the majority is from 30 concurrent processes updating a 506GB GIN index, but it would be nice to confirm that. There is also a message-queue in the DB with a fairly high turnaround. Currently PG 9.2, moving to 9.3 hopefully before end-of-year; checkpoint_timeout = 30min, checkpoint_segments = 4096. According to the logs, checkpoints are roughly 15 minutes apart. Jesper -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] wal-size limited to 16MB - Performance issue for subsequent backup
On 2014-10-20 21:41:26 +0200, jes...@krogh.cc wrote: On 2014-10-20 21:03:59 +0200, jes...@krogh.cc wrote: One of our production issues is that the system generates lots of wal-files, lots is like 151952 files over the last 24h, which is about 2.4TB worth of WAL files. I wouldn't say that isn't an issue by itself, but the system does indeed work fine. We do subsequently gzip the files to limit actual disk-usage, this makes the files roughly 30-50% in size. I'm a bit doubtful that 16MB vs., say, 64MB files really changes anything substantial for you. If it indeed is a problem, it's simple enough to join the files temporarily? Have you analyzed what the source of that volume is? Which version of postgres are you using? What's your checkpoint_timeout/segments settings? Suggestions are surely welcome. Once you're on 9.3 I'd suggest using pg_xlogdump --stats on it. There's a backport of the facility for 9.3 (looking somewhat different than what is now in 9.5) at http://archives.postgresql.org/message-id/CABRT9RAzGowqLFcEE8aF6VdPoFEy%2BP9gmu7ktGRzw0dgRwVr9Q%40mail.gmail.com That'd tell you a fair bit more. It's noticeably harder to backport to 9.3. I do suspect the majority is from 30 concurrent processes updating an 506GB GIN index, but it would be nice to confirm that. There is also a message-queue in the DB with a fairly high turnaround. A 506GB GIN index? Uh, interesting :). What's it used for? Trigrams? I'd suspect that the message queue isn't the primary culprit, but it's hard to say for sure. Currently PG 9.2 moving to 9.3 hopefully before end-of-year, checkpoint_timeout = 30min, checkpoint_segments = 4096. Generally a high checkpoint_timeout can significantly reduce the WAL volume because of fewer full page writes. I've seen cases where spacing checkpoint further apart by a factor of two reduced the overall WAL volume by more than two. According to logs checkpoints are roughly 15 minutes apart. Can you show log_checkpoints output? 
Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] wal-size limited to 16MB - Performance issue for subsequent backup
On 2014-10-20 21:41:26 +0200, jes...@krogh.cc wrote: On 2014-10-20 21:03:59 +0200, jes...@krogh.cc wrote: One of our production issues is that the system generates lots of wal-files, lots is like 151952 files over the last 24h, which is about 2.4TB worth of WAL files. I wouldn't say that isn't an issue by itself, but the system does indeed work fine. We do subsequently gzip the files to limit actual disk-usage, this makes the files roughly 30-50% in size. I'm a bit doubtful that 16MB vs., say, 64MB files really changes anything substantial for you. If it indeed is a problem, it's simple enough to join the files temporarily? I am trying to get my head around a good way to do that. 64MB probably isn't a silver bullet, but it would definitely benefit the backup in terms of single-thread access to data on rotating drives. Have you analyzed what the source of that volume is? Which version of postgres are you using? What's your checkpoint_timeout/segments settings? Suggestions are surely welcome. Once you're on 9.3 I'd suggest using pg_xlogdump --stats on it. There's a backport of the facility for 9.3 (looking somewhat different than what is now in 9.5) at http://archives.postgresql.org/message-id/CABRT9RAzGowqLFcEE8aF6VdPoFEy%2BP9gmu7ktGRzw0dgRwVr9Q%40mail.gmail.com That'd tell you a fair bit more. It's noticeably harder to backport to 9.3. I'll bookmark that one. I do suspect the majority is from 30 concurrent processes updating a 506GB GIN index, but it would be nice to confirm that. There is also a message-queue in the DB with a fairly high turnaround. A 506GB GIN index? Uh, interesting :). What's it used for? Trigrams? It is for full-text search, but it is being updated entirely regularly, ~100M records. A dump/restore cycle typically reduces the size to 30-40% of its current size. Currently PG 9.2 moving to 9.3 hopefully before end-of-year, checkpoint_timeout = 30min, checkpoint_segments = 4096.
Generally a high checkpoint_timeout can significantly reduce the WAL volume because of fewer full-page writes. I've seen cases where spacing checkpoints further apart by a factor of two reduced the overall WAL volume by more than two. I'll work with that; I was just uncomfortable bumping checkpoint_segments up much higher. Any field experience in that corner? According to the logs, checkpoints are roughly 15 minutes apart. Can you show log_checkpoints output?

2014-10-20 18:10:22 CEST LOG: checkpoint starting: time
2014-10-20 18:15:44 CEST LOG: checkpoint complete: wrote 76851 buffers (7.3%); 0 transaction log file(s) added, 0 removed, 3238 recycled; write=295.834 s, sync=23.903 s, total=322.011 s; sync files=2115, longest=0.278 s, average=0.011 s
2014-10-20 18:40:22 CEST LOG: checkpoint starting: time
2014-10-20 18:44:30 CEST LOG: checkpoint complete: wrote 60550 buffers (5.8%); 0 transaction log file(s) added, 0 removed, 3460 recycled; write=224.678 s, sync=21.795 s, total=248.340 s; sync files=2090, longest=0.963 s, average=0.010 s
2014-10-20 19:10:22 CEST LOG: checkpoint starting: time
2014-10-20 19:14:11 CEST LOG: checkpoint complete: wrote 42720 buffers (4.1%); 0 transaction log file(s) added, 0 removed, 3598 recycled; write=206.259 s, sync=21.185 s, total=229.254 s; sync files=2065, longest=0.945 s, average=0.010 s
2014-10-20 19:40:22 CEST LOG: checkpoint starting: time
2014-10-20 19:43:31 CEST LOG: checkpoint complete: wrote 32897 buffers (3.1%); 0 transaction log file(s) added, 0 removed, 3626 recycled; write=161.801 s, sync=26.936 s, total=189.635 s; sync files=2115, longest=0.458 s, average=0.012 s
2014-10-20 20:10:22 CEST LOG: checkpoint starting: time
2014-10-20 20:14:04 CEST LOG: checkpoint complete: wrote 37557 buffers (3.6%); 0 transaction log file(s) added, 0 removed, 3285 recycled; write=205.011 s, sync=16.550 s, total=222.442 s; sync files=2113, longest=0.935 s, average=0.007 s
2014-10-20 20:40:22 CEST LOG: checkpoint starting: time
2014-10-20 20:45:18 CEST LOG: checkpoint complete: wrote 58012 buffers (5.5%); 0 transaction log file(s) added, 0 removed, 3678 recycled; write=252.750 s, sync=39.178 s, total=295.107 s; sync files=2075, longest=0.990 s, average=0.018 s
2014-10-20 21:10:22 CEST LOG: checkpoint starting: time
2014-10-20 21:13:31 CEST LOG: checkpoint complete: wrote 40530 buffers (3.9%); 0 transaction log file(s) added, 0 removed, 3652 recycled; write=167.925 s, sync=19.719 s, total=189.057 s; sync files=2077, longest=0.470 s, average=0.009 s
2014-10-20 21:40:22 CEST LOG: checkpoint starting: time
2014-10-20 21:44:20 CEST LOG: checkpoint complete: wrote 45158 buffers (4.3%); 0 transaction log file(s) added, 0 removed, 3449 recycled; write=202.986 s, sync=32.564 s, total=237.441 s; sync files=2100, longest=0.445 s, average=0.015 s

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
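For what it's worth, the recycled counts in those checkpoint-complete lines already give a rough per-checkpoint WAL volume, since each recycled segment is one 16MB file reused for new WAL. A throwaway parser (the regex is shaped only for the log format shown above):

```python
import re

RECYCLED_RE = re.compile(r"(\d+) recycled")

def recycled_wal_mb(line, segment_mb=16):
    """Rough estimate of MB of WAL consumed between checkpoints,
    taken from the recycled-segment count of a 'checkpoint complete' line."""
    m = RECYCLED_RE.search(line)
    return int(m.group(1)) * segment_mb if m else None

line = ("checkpoint complete: wrote 76851 buffers (7.3%); 0 transaction "
        "log file(s) added, 0 removed, 3238 recycled; write=295.834 s")
print(recycled_wal_mb(line))  # 3238 * 16 = 51808 MB per ~30-minute cycle
```

Roughly 50GB per half hour is consistent with the ~2.4TB/day reported at the top of the thread.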
Re: [HACKERS] Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables
On 2014-10-19 20:43:29 -0500, Jim Nasby wrote: On 10/19/14, 11:41 AM, Andres Freund wrote: On 2014-10-18 21:36:48 -0500, Jim Nasby wrote: The weird part is that if it's not doing a freeze it will just punt on a page if it can't get the cleanup lock. I don't think that's particularly weird. Otherwise vacuum can get stuck behind a single very hot page - leading to much, much more bloat. I have to believe that could seriously screw up autovacuum scheduling. Why? I'm worried there could be some pathological cases where we'd skip a large number of pages, perhaps if a vacuum scan and a seqscan ended up running alongside each other. I've seen little evidence of that. The reverse, a stuck autovacuum, is imo much more likely. For this to be an actual problem you'd need to encounter many pages that are not locked, but are pinned. That state doesn't exist for very long. Perhaps this is just paranoia, but we have no idea how bad things might be, because we don't have any logging for how many pages we skipped because we couldn't lock them. But so what? If we skip individual pages it won't be too bad - and very likely waiting very long is going to be more painful. The page won't be marked 'all visible' so the next vacuum will come around to it again. And it'll also get cleaned up by opportunistic hot pruning. Also, if this really is that big a deal for heap pages, how come we don't get screwed by it on Btree index pages, where we mandate that we acquire a cleanup lock? Because we never hold pins for btree pages for very long. Whereas we do that for heap pages. If you e.g. run a cursor forward you can hold a pin for essentially unbounded time. Now that we have forks, I'm wondering if it would be best to come up with a per-page system that could be used to determine when a table needs background work to be done. The visibility map could serve a lot of this purpose, but I'm not sure if it would work for getting hint bits set in the background. It would.
By definition, all tuples that are 'all visible' need to be fully hint bitted. I think it would also be a win if we had a way to advance relfrozenxid and relminmxid. Perhaps something that simply remembered the last XID that touched each page... Not sure what you're getting at here? That ultimately, our current method for determining when and what to vacuum is rather crude, and likely results in wasted effort during scans as well as not firing autovac often enough. Keep in mind that autovac started as a user-space utility and the best it could possibly do was to keep a table of stats counters. I agree that we should trigger autovacuum more often. It's *intentionally* not triggered *at all* for insert-only workloads (if you discount anti-wraparound vacuums). I think it's time to change that. For that we'd need to make vacuums that don't delete any tuples cheaper. We already rescan only the changed parts of the heaps - but we always scan indexes fully... The visibility map obviously helps cut down on extra work during a scan, but it only goes so far in that regard. Aha. Instead of relying on the crude methods, if we reliably tracked certain txids on a per-block basis in a fork, we could cheaply scan the fork and make an extremely informed decision on how much a vacuum would gain us, and exactly what blocks it should hit. Let me use freezing as an example. If we had a reliable list of the lowest txid for each block of a relation, that would allow us to do a freeze scan by hitting only blocks with a minimum txid within our freeze range. The same could be done for multixacts. It'd also become a prime contention point because you'd need to constantly update it. In contrast to a simple 'is frozen' bit (akin to is_visible) which only changes infrequently, and only in one direction. If we stored 3 txids for each block in a fork, we could fit information for ~680 heap blocks in each fork block.
So in a database with 680G of heap data, we could fully determine every *block* (not table) we needed to vacuum by scanning just 1GB of data. That would allow for far better autovacuum scheduling than what we do today. It's not that simple. Wraparounds and locking complicate it significantly. I think the big missing piece, unless something like Heikki's xid-lsn-ranges thing gets finished, is a freeze map. The problem with a simple freeze map is: when do you actually set the bit? There's precisely one place where you can set it for normal operation: during vacuum's scan. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
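Jim's sizing arithmetic checks out: with 4-byte transaction IDs, three txids per heap block is 12 bytes of fork data per 8kB heap block, so one fork block covers ~682 heap blocks, and the fork for 680GB of heap comes to roughly 1GB. A sketch of the arithmetic (constants assume the default 8kB BLCKSZ; the function is illustrative, not from any patch):

```python
BLCKSZ = 8192   # default PostgreSQL block size, in bytes
XID_SIZE = 4    # bytes per transaction ID

def fork_size_bytes(heap_bytes, txids_per_block=3):
    """Size of a hypothetical per-block txid fork for a heap of the
    given size, tracking txids_per_block XIDs per heap block."""
    heap_blocks = heap_bytes // BLCKSZ
    entries_per_fork_block = BLCKSZ // (txids_per_block * XID_SIZE)  # ~682
    fork_blocks = -(-heap_blocks // entries_per_fork_block)  # ceiling division
    return fork_blocks * BLCKSZ

gb = 1024 ** 3
print(fork_size_bytes(680 * gb) / gb)  # just under 1 GB of fork data for 680 GB of heap
```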
Re: [HACKERS] narwhal and PGDLLIMPORT
On 2014-10-20 01:03:31 -0400, Noah Misch wrote: On Wed, Oct 15, 2014 at 12:53:03AM -0400, Noah Misch wrote: On Tue, Oct 14, 2014 at 07:07:17PM -0400, Tom Lane wrote: Dave Page dp...@pgadmin.org writes: On Tue, Oct 14, 2014 at 11:38 PM, Tom Lane t...@sss.pgh.pa.us wrote: I think we're hoping that somebody will step up and investigate how narwhal's problem might be fixed. I have planned to look at reproducing narwhal's problem once the dust settles on orangutan, but I wouldn't mind if narwhal went away instead. No argument here. I would kind of like to have more than zero understanding of *why* it's failing, just in case there's more to it than "oh, probably a bug in this old toolchain". But finding that out might well take significant time, and in the end not tell us anything very useful. Agreed on all those points. I reproduced narwhal's problem using its toolchain on another 32-bit Windows Server 2003 system. The crash happens at the SHGetFolderPath() call in pqGetHomeDirectory(). A program can acquire that function via shfolder.dll or via shell32.dll; we've used the former method since commit 889f038, for better compatibility[1] with Windows NT 4.0. On this system, shfolder.dll's version loads and unloads shell32.dll. In PostgreSQL built using this older compiler, shfolder.dll:SHGetFolderPath() unloads libpq in addition to unloading shell32! That started with commit 846e91e. I don't expect to understand the mechanism behind it, but I recommend we switch back to linking libpq with shell32.dll. The MSVC build already does that in all supported branches, and it feels right for the MinGW build to follow suit in 9.4+. Windows versions that lack the symbol in shell32.dll are now ancient history. Ick. Nice detective work on an ugly situation. I happened to try the same contrib/dblink test suite on PostgreSQL built with modern MinGW-w64 (i686-4.9.1-release-win32-dwarf-rt_v3-rev1). That, too, gave a crash-like symptom starting with commit 846e91e.
Specifically, a backend that LOADed any module linked to libpq (libpqwalreceiver, dblink, postgres_fdw) would suffer this after calling exit(0):

===
3056 2014-10-20 00:40:15.163 GMT LOG: disconnection: session time: 0:00:00.515 user=cyg_server database=template1 host=127.0.0.1 port=3936
This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information.
This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information.
9300 2014-10-20 00:40:15.163 GMT LOG: server process (PID 3056) exited with exit code 3
===

The mechanism turned out to be disjoint from the mechanism behind the ancient-compiler crash. Based on the functions called from exit(), my best guess is that exit() encountered recursion and used something like an abort() to escape. Hm. (I can send the gdb transcript if anyone is curious to see the gory details.) That would be interesting. The proximate cause was commit 846e91e allowing modules to use shared libgcc. A 32-bit libpq acquires 64-bit integer division from libgcc. Passing -static-libgcc to the link restores the libgcc situation as it stood before commit 846e91e. The main beneficiary of shared libgcc is C++/Java exception handling, so PostgreSQL doesn't care. No doubt there's some deeper bug in libgcc or in PostgreSQL; loading a module that links with shared libgcc should not disrupt exit(). I'm content with this workaround. I'm unconvinced by this reasoning. Popular postgres extensions like postgis do use C++. It's imo not hard to imagine situations where switching to a statically linked libgcc could cause problems.
Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: Add regression tests for autocommit-off mode for psql and fix some omissions
Jim Nasby wrote: On 10/7/14, 2:11 AM, Feike Steenbergen wrote: On 7 October 2014 01:41, Jim Nasby <Jim.Nasby@> wrote: The options I see... 1) If there's a definitive way to tell from backend source code what commands disallow transactions then we can just use that information to generate the list of commands psql shouldn't do that with. 2) Always run the regression test with auto-commit turned off. 3) Run the regression in both modes (presumably only on the build farm due to how long it would take). 1) I don't know about a definitive way. I used grep to find all statements calling PreventTransactionChain. Perhaps it wouldn't be too horrific to create some perl code that would figure out what all of those commands are, and we could then use that to generate the appropriate list for psql. 2) - I expect most people use autocommit-on; so only running it in autocommit-off would not test the majority of users. - autocommit-off also obliges you to explicitly rollback transactions after errors occur; this would probably mean a rewrite of some tests? Well, that is at least doable, but probably rather ugly. It would probably be less ugly if our test framework had a way to test for errors (ala pgTap). Where I was going with this is a full-on brute-force test: execute every possible command with autocommit turned off. We don't need to check that each command does what it's supposed to do, only that it can execute. Of course, the huge problem with that is knowing how to actually successfully run each command. :( Theoretically the tests could be structured in such a way that there's a subset of tests that just see if the command even executes, but creating that is obviously a lot of work and with our current test framework probably a real pain to maintain.
From the comments here the effort needed to prevent this particular oversight seems excessive compared to the error it is trying to prevent - an error that is fairly easily remedied in a minor release and which has an easy work around. That said can we just do: 1) I don't know about a definitive way. I used grep to find all statements calling PreventTransactionChain. and save the results to an .out file with a comment somewhere that if there is any change to the content of this file the corresponding command should be manually tested in psql with autocommit=on. This seems to be what you are saying but the psql check does not have to be automated... David J.
Re: [HACKERS] pg_basebackup fails with long tablespace paths
On 10/20/14 2:59 PM, Tom Lane wrote: What do we want to do about this? I think a minimum expectation would be for pg_basebackup to notice and complain when it's trying to create an unworkably long symlink entry, but it would be far better if we found a way to cope instead. Isn't it the backend that should error out before sending truncated file names? src/port/tar.c: /* Name 100 */ sprintf(&h[0], "%.99s", filename); And then do we need to prevent the creation of tablespaces that can't be backed up? One thing we could possibly do without reinventing tar is to avoid using absolute path names if a PGDATA-relative one would do. Maybe we could hack up the tar format to store the symlink target as the file body, like cpio does. Of course then we'd lose the property of this actually being tar.
[HACKERS] Patch: Add launchd Support
Hackers, In Mac OS X 10.10 “Yosemite,” Apple removed SystemStarter, upon which our OS X start script has relied since 2007. So here is a patch that adds support for its replacement, launchd. It includes 7 day log rotation like the old script did. The install script still prefers the SystemStarter approach for older versions of the OS, for the sake of easier backward compatibility. We could change that if we wanted, since launchd has been part of the OS for around a decade. launchd.patch Description: Binary data smime.p7s Description: S/MIME cryptographic signature
Re: [HACKERS] pg_dump/pg_restore seem broken on hamerkop
Buildfarm member hamerkop has been failing in the pg_upgrade regression test for the last several days. The problem looks like this: command: C:/buildfarm/build_root/HEAD/pgsql.build/contrib/pg_upgrade/tmp_check/install/bin/pg_restore --port 50432 --username Administrator --exit-on-error --verbose --dbname postgres pg_upgrade_dump_12145.custom >> pg_upgrade_dump_12145.log 2>&1 pg_restore: connecting to database for restore pg_restore: [archiver (db)] Error while INITIALIZING: pg_restore: [archiver (db)] could not execute query: ERROR: invalid byte sequence for encoding UTF8: 0x93 I can't help noticing that this started immediately after commit 0eea804 pg_dump: Reduce use of global variables. No idea why the issue is only showing up on this one animal. I guess one of the possibilities is there's garbage in memory which is related to the restore process. If so, coverity may be able to find something. The last coverity analysis was Oct 12. The commit 0eea804 was made on Oct 14. So let's see what new coverity scan reports... Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
Re: [HACKERS] Patch: Add launchd Support
David E. Wheeler da...@justatheory.com writes: In Mac OS X 10.10 “Yosemite,” Apple removed SystemStarter, upon which our OS X start script has relied since 2007. So here is a patch that adds support for its replacement, launchd. It includes 7 day log rotation like the old script did. The install script still prefers the SystemStarter approach for older versions of the OS, for the sake of easier backward compatibility. We could change that if we wanted, since launchd has been part of the OS for around a decade. (1) I'd vote for just removing the SystemStarter stuff: it complicates understanding what's happening, to no very good end. We can easily check that the launchd way works back to whatever we think our oldest supported OS X release is. (10.4.x according to the buildfarm, at least; and I think SystemStarter was deprecated even then ...) (2) AFAICS, this .plist file doesn't do anything about launchd's habit of not waiting for the network to come up. See my comments in today's thread in -general: http://www.postgresql.org/message-id/1239.1413823...@sss.pgh.pa.us (3) I don't think you want Disabled = true. (4) I'm suspicious of all the -c arguments in the .plist file. In general I'm not a fan of specifying GUCs on the postmaster command line; that makes it impossible to override their values via normal methods like postgresql.conf or ALTER SYSTEM. (5) According to the launchd.plist man page, there are options for redirecting stdout and stderr to someplace useful. It might be worth exercising those ... regards, tom lane
[HACKERS] Questions on domain on composite / casts ignoring domains
I'm trying to create what amounts to a new type. This would be rather easy if I could perform a CHECK on a composite type, which I could do if I could create a domain on top of a composite. Is there any reason in particular that hasn't been done? As an alternative, I tried accomplishing this with a straight domain. That would work, except for this: WARNING: cast will be ignored because the source data type is a domain Why do we ignore casts from domains to other data types? I'm guessing because it's simply not what domains were meant for? -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] Patch: Add launchd Support
On 10/20/14, 5:59 PM, David E. Wheeler wrote: In Mac OS X 10.10 “Yosemite,” Apple removed SystemStarter, upon which our OS X start script has relied since 2007. So here is a patch that adds support for its replacement, launchd. It includes 7 day log rotation like the old script did. The install script still prefers the SystemStarter approach for older versions of the OS, for the sake of easier backward compatibility. We could change that if we wanted, since launchd has been part of the OS for around a decade. You're enabling POSTGRESQL in /etc/hostconfig before any of the files are copied over... what happens if we puke before the files get copied? Would it be better to enable after the scripts are in place? BTW, Mavericks has a comment that /etc/hostconfig is going away, but google isn't telling me what's replacing it... -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: Add launchd Support
On Oct 20, 2014, at 4:36 PM, Tom Lane t...@sss.pgh.pa.us wrote: (1) I'd vote for just removing the SystemStarter stuff: it complicates understanding what's happening, to no very good end. We can easily check that the launchd way works back to whatever we think our oldest supported OS X release is. (10.4.x according to the buildfarm, at least; and I think SystemStarter was deprecated even then ...) Okay. Might have to use OnDemand instead of KeepAlive on 10.4. The former was deprecated in 10.5, but I’m not sure when the latter was added. (2) AFAICS, this .plist file doesn't do anything about launchd's habit of not waiting for the network to come up. See my comments in today's thread in -general: http://www.postgresql.org/message-id/1239.1413823...@sss.pgh.pa.us Ha! How funny you posted a call for a patch today. I didn’t see that, just needed to get it working today myself. Anyway, I knew there was a reason I didn’t bother with this years ago: launchd does not support dependencies. From the launchd.plist(5) man page: DEPENDENCIES Unlike many bootstrapping daemons, launchd has no explicit dependency model. Interdependencies are expected to be solved through the use of IPC. It is therefore in the best interest of a job developer who expects dependents to define all of the sockets in the configuration file. This has the added benefit of making it possible to start the job based on demand instead of immediately. launchd will continue to place as many restrictions on jobs that do not conform to this model as possible. This is another reason not to use KeepAlive, I guess. OnDemand is supposed to fire up a job only when it’s needed. No idea what that means. We might be able to put something in LaunchEvents that gets it to fire when the network launches, but documentation is hella thin (and may only be supported on Yosemite, where there are a bunch of poorly-documented launchd changes). (3) I don't think you want Disabled = true. It’s the default.
When you run `launchctl load -w` it overrides it to false in its database. I’m fine to have it be less opaque, though. (4) I'm suspicious of all the -c arguments in the .plist file. In general I'm not a fan of specifying GUCs on the postmaster command line; that makes it impossible to override their values via normal methods like postgresql.conf or ALTER SYSTEM. Yeah, I am okay with removing those; they weren’t in the SystemStarter script. That was the only way to replicate the log rotation stuff, but probably best not to do that in the start script, anyway. (5) According to the launchd.plist man page, there are options for redirecting stdout and stderr to someplace useful. It might be worth exercising those ... Suggestions? Best, David smime.p7s Description: S/MIME cryptographic signature
Re: [HACKERS] Patch: Add launchd Support
On Oct 20, 2014, at 4:58 PM, Jim Nasby jim.na...@bluetreble.com wrote: You're enabling POSTGRESQL in /etc/hostconfig before any of the files are copied over... what happens if we puke before the files get copied? Would it be better to enable after the scripts are in place? That code was there; I just indented it in an if/then block. BTW, Mavericks has a comment that /etc/hostconfig is going away, but google isn't telling me what's replacing it... launchd. Best, David smime.p7s Description: S/MIME cryptographic signature
Re: [HACKERS] Re: Add regression tests for autocommit-off mode for psql and fix some omissions
On 10/20/14, 3:49 PM, David G Johnston wrote: Well, that is at least doable, but probably rather ugly. It would probably be less ugly if our test framework had a way to test for errors (ala pgTap). Where I was going with this is a full-on brute-force test: execute every possible command with autocommit turned off. We don't need to check that each command does what it's supposed to do, only that it can execute. Of course, the huge problem with that is knowing how to actually successfully run each command.:( Theoretically the tests could be structured in such a way that there's a subset of tests that just see if the command even executes, but creating that is obviously a lot of work and with our current test framework probably a real pain to maintain. From the comments here the effort needed to prevent this particular oversight seems excessive compared to the error it is trying to prevent - an error that is fairly easily remedied in a minor release and which has an easy work around. That said can we just do: 1) I don't know about a definitive way. I used grep to find all statements calling PreventTransactionChain. and save the results to an .out file with a comment somewhere that if there is any change to the content of this file the corresponding command should be manually tested in psql with autocommit=on. This seems to be what you are saying but the psql check does not have to be automated... Are you thinking we'd commit the expected output of the perl script and have the regression suite call that script to verify it? That seems like a good way to fix this. The only better option I can come up with is if the perl script generated an actual test that we know would fail if a new command showed up. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] get_actual_variable_range vs idx_scan/idx_tup_fetch
On 10/18/14, 8:58 AM, Bruce Momjian wrote: On Fri, Oct 17, 2014 at 11:03:04PM -0400, Tom Lane wrote: Bruce Momjian br...@momjian.us writes: On Fri, Oct 17, 2014 at 06:15:37PM -0400, Tom Lane wrote: Those stats were perfectly valid: what the planner is looking for is accurate minimum and maximum values for the index's leading column, and that's what it got. You're correct that a narrower index could have given the same results with a smaller disk footprint, but the planner got the results it needed from the index you provided for it to work with. Uh, why is the optimizer looking at the index on a,b,c and not just the stats on column a, for example? I am missing something here. Because it needs up-to-date min/max values in order to avoid being seriously misled about selectivities of values near the endpoints. See commit 40608e7f949fb7e4025c0ddd5be01939adc79eec. Oh, I had forgotten we did that. It is confusing that there is no way via EXPLAIN to see the access, making the method of consulting pg_stat_* and using EXPLAIN unreliable. Should we document this somewhere? I think we should. The common (mis)conception is that pg_stats shows *user-driven* access, not access because of stuff the system is doing. This is actually a huge problem for anyone who's trying to figure out how useful indexes are; they see usage and think they have queries that are using the index when in reality they don't. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] Patch: Add launchd Support
David E. Wheeler da...@justatheory.com writes: On Oct 20, 2014, at 4:36 PM, Tom Lane t...@sss.pgh.pa.us wrote: (1) I'd vote for just removing the SystemStarter stuff: it complicates understanding what's happening, to no very good end. We can easily check that the launchd way works back to whatever we think our oldest supported OS X release is. (10.4.x according to the buildfarm, at least; and I think SystemStarter was deprecated even then ...) Okay. Might have to use OnDemand instead of KeepAlive on 10.4. The former was deprecated in 10.5, but I’m not sure when the latter was added. [ looks ... ] Yeah, there's no mention of KeepAlive in 10.4's launchd.plist man page. It does have a convenient example saying that OnDemand = false does what we want: The following XML Property List simply keeps exampled running continuously:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.example.exampled</string>
    <key>ProgramArguments</key>
    <array>
        <string>exampled</string>
    </array>
    <key>OnDemand</key>
    <false/>
</dict>
</plist>

(5) According to the launchd.plist man page, there are options for redirecting stdout and stderr to someplace useful. It might be worth exercising those ... Suggestions? I'd just drop them into files in the data directory; we're still going to recommend that people use the logging_collector, so this is just a stopgap to collect startup errors. regards, tom lane
Re: [HACKERS] Proposal: Log inability to lock pages during vacuum
On 10/20/14, 10:29 AM, Greg Stark wrote: On Mon, Oct 20, 2014 at 2:57 AM, Jim Nasby jim.na...@bluetreble.com wrote: Currently, a non-freeze vacuum will punt on any page it can't get a cleanup lock on, with no retry. Presumably this should be a rare occurrence, but I think it's bad that we just assume that and won't warn the user if something bad is going on. My thought is that if we skip any pages elog(LOG) how many we skipped. If we skip more than 1% of the pages we visited (not relpages) then elog(WARNING) instead. Is there some specific failure you've run into where a page was stuck in a pinned state and never got vacuumed? Not that I know of... but how would I actually know? Having that info available is the point of my proposal. :) I would like to see a more systematic way of going about this. What LSN or timestamp is associated with the oldest unvacuumed page? How many times have we tried to visit it? What do those numbers look like overall -- i.e. what's the median number of times it takes to vacuum a page and what does the distribution look like of the unvacuumed ages? With that data it should be possible to determine if the behaviour is actually working well and where to draw the line to determine outliers that might represent bugs. I agree we could use better data about/for vacuum (see http://www.postgresql.org/message-id/544468c1.6050...@bluetreble.com). In the meantime, I think it's worth adding this logging. If in fact this basically never happens (the current assumption), it doesn't hurt anything. If it turns out our assumption is wrong, then we'll actually be able to find that out. :) -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: Log inability to lock pages during vacuum
On 2014-10-20 19:18:31 -0500, Jim Nasby wrote: In the meantime, I think it's worth adding this logging. If in fact this basically never happens (the current assumption), it doesn't hurt anything. If it turns out our assumption is wrong, then we'll actually be able to find that out. :) It does happen, and not infrequently. Just not enough pages to normally cause significant bloat. The most likely place where it happens is very small tables that all connections hit with a high frequency. Starting to issue high volume log spew for a nonexistent problem isn't helping. If you're super convinced this is urgent then add it as a *single* datapoint inside the existing messages. But I think there's loads of stuff in vacuum logging that are more important than this. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Patch: Add launchd Support
On Oct 20, 2014, at 5:17 PM, Tom Lane t...@sss.pgh.pa.us wrote: [ looks ... ] Yeah, there's no mention of KeepAlive in 10.4's launchd.plist man page. It does have a convenient example saying that OnDemand = false does what we want: Yeah, let’s see if we can cover both. I'd just drop them into files in the data directory; we're still going to recommend that people use the logging_collector, so this is just a stopgap to collect startup errors. How about this?

<plist version="1.0">
<dict>
    <key>Disabled</key>
    <false/>
    <key>Label</key>
    <string>org.postgresql.postgresql</string>
    <key>UserName</key>
    <string>postgres</string>
    <key>GroupName</key>
    <string>postgres</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/local/pgsql/bin/postgres</string>
        <string>-D</string>
        <string>/usr/local/pgsql/data</string>
    </array>
    <key>StandardOutPath</key>
    <string>/usr/local/pgsql/data/launchd.log</string>
    <key>StandardErrorPath</key>
    <string>/usr/local/pgsql/data/launchd.log</string>
    <key>OnDemand</key> <!-- OS X 10.4 -->
    <false/>
    <key>KeepAlive</key> <!-- OS X 10.5+ -->
    <true/>
</dict>
</plist>

No fix for the networking issue, of course. Best, David smime.p7s Description: S/MIME cryptographic signature
Re: [HACKERS] Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables
On 10/20/14, 3:11 PM, Andres Freund wrote: On 2014-10-19 20:43:29 -0500, Jim Nasby wrote: On 10/19/14, 11:41 AM, Andres Freund wrote: On 2014-10-18 21:36:48 -0500, Jim Nasby wrote: The weird part is that if it's not doing a freeze it will just punt on a page if it can't get the cleanup lock. I don't think that's particularly wierd. Otherwise vacuum can get stuck behind a single very hot page - leading to much, much more bloat. I have to believe that could seriously screw up autovacuum scheduling. Why? I'm worried there could be some pathological cases where we'd skip a large number of pages, perhaps if a vacuum scan and a seqscan ended up running alongside each other. I've seen little evidence of that. The reverse, a stuck autovacuum, is imo much more likely. For this to be an actual problem you'd need to encounter many pages that are not locked, but are pinned. That state doesn't exist for very long. How would you actually get evidence of this... we don't log it. :) (See my proposal at http://www.postgresql.org/message-id/54446c10.2080...@bluetreble.com) Perhaps this is just paranoia, but we have no idea how bad things might be, because we don't have any logging for how many pages we skipped because we couldn't lock them. But so what? If we skip individual pages it won't be too bad - and very likely waiting very long is going to be more painful. The page won't be marked 'all visible' so the next vacuum will come around to it again. And it'll also get cleaned up by opportunistic hot pruning. Probably true. Hopefully we can start logging it and then we'll know for sure. That ultimately, our current method for determining when and what to vacuum is rather crude, and likely results in wasted effort during scans as well as not firing autovac often enough. Keep in mind that autovac started as a user-space utility and the best it could possibly do was to keep a table of stats counters. I agree that we should trigger autovacuum more often. 
It's *intentionally* not triggered *at all* for insert only workloads (if you discount anti wraparound vacuums). I think it's time to change that. For that we'd need to make vacuums that don't delete any tuples cheaper. We already rescan only the changed parts of the heaps - but we always scan indexes fully... Or maybe vacuum isn't the right way to handle some of these scenarios. It's become the catch-all for all of this stuff, but maybe that doesn't make sense anymore. Certainly when it comes to dealing with inserts there's no reason we *have* to do anything other than set hint bits and possibly freeze xmin. Instead of relying on the crude methods, if we reliably tracked certain txids on a per-block basis in a fork, we could cheaply scan the fork and make an extremely informed decision on how much a vacuum would gain us, and exactly what blocks it should hit. Let me use freezing as an example. If we had a reliable list of the lowest txid for each block of a relation that would allow us to do a freeze scan by hitting only blocks with minimum txid within our freeze range. The same could be done for multixacts. It'd also become a prime contention point because you'd need to constantly update it. In contrast to a simple 'is frozen' bit (akin to is_visible) which only changes infrequently, and only in one direction. Actually, the contention on freeze would very possibly be minimal, because it probably doesn't change very often. Even if it did, it's OK if the value isn't 100% accurate, so long as the recorded XID is guaranteed older than what's actually on the page. If we stored 3 txids for each block in a fork, we could fit information for ~680 heap blocks in each fork block. So in a database with 680G of heap data, we could fully determine every *block* (not table) we needed to vacuum by scanning just 1GB of data. That would allow for far better autovacuum scheduling than what we do today. It's not that simple. Wraparounds and locking complicate it significantly. 
I realize what I'm talking about isn't trivial (though, I'm confused by your comment about wraparound since presumably TransactionIdPrecedes() and its ilk solve that problem...) My ultimate point here is that we're using what are (today) very crude methods to control what gets vacuumed when, and I think that now that we have resource forks we could do *much* better without a tremendous amount of work. But to make a big advancement here we'll need to take a step back and rethink some things (like vacuum being the only way to handle these problems). Let me put some thought into this. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables
On 10/20/2014 05:39 PM, Jim Nasby wrote: Or maybe vacuum isn't the right way to handle some of these scenarios. It's become the catch-all for all of this stuff, but maybe that doesn't make sense anymore. Certainly when it comes to dealing with inserts there's no reason we *have* to do anything other than set hint bits and possibly freeze xmin. +1 -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Proposal: Log inability to lock pages during vacuum
On 10/20/14, 7:31 PM, Andres Freund wrote: On 2014-10-20 19:18:31 -0500, Jim Nasby wrote: In the meantime, I think it's worth adding this logging. If in fact this basically never happens (the current assumption), it doesn't hurt anything. If it turns out our assumption is wrong, then we'll actually be able to find that out. :) It does happen, and not infrequently. Just not enough pages to normally cause significant bloat. The most likely place where it happens is very small tables that all connections hit with a high frequency. Starting to issue high volume log spew for a nonexistent problem isn't helping. How'd you determine that? Is there some way to measure this? I'm not doubting you; I just don't want to work on a problem that's already solved. If you're super convinced this is urgent then add it as a *single* datapoint inside the existing messages. But I think there's loads of stuff in vacuum logging that are more important than this. See my original proposal; at its most intrusive this would issue one warning per (auto)vacuum if it was over a certain threshold. I would think that a DBA would really like to know when this happens, but if we think that's too much spew we can limit it to normal vacuum logging. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables
On 2014-10-20 17:43:26 -0700, Josh Berkus wrote: On 10/20/2014 05:39 PM, Jim Nasby wrote: Or maybe vacuum isn't the right way to handle some of these scenarios. It's become the catch-all for all of this stuff, but maybe that doesn't make sense anymore. Certainly when it comes to dealing with inserts there's no reason we *have* to do anything other than set hint bits and possibly freeze xmin. +1 A page read is a page read. What's the point of having another process do it? Vacuum doesn't dirty pages if they don't have to be dirtied. Especially stuff like freezing cannot really be dealt with outside of vacuum unless you make already complex stuff more complex for a marginal benefit. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Proposal: Log inability to lock pages during vacuum
On 2014-10-20 19:43:38 -0500, Jim Nasby wrote: On 10/20/14, 7:31 PM, Andres Freund wrote: On 2014-10-20 19:18:31 -0500, Jim Nasby wrote: In the meantime, I think it's worth adding this logging. If in fact this basically never happens (the current assumption), it doesn't hurt anything. If it turns out our assumption is wrong, then we'll actually be able to find that out. :) It does happen, and not infrequently. Just not enough pages to normally cause significant bloat. The most likely place where it happens is very small tables that all connections hit with a high frequency. Starting to issue high volume log spew for a nonexistent problem isn't helping. How'd you determine that? Is there some way to measure this? You'd see individual pages with too-old dead rows in them. If you're super convinced this is urgent then add it as a *single* datapoint inside the existing messages. But I think there's loads of stuff in vacuum logging that are more important than this. See my original proposal; at its most intrusive this would issue one warning per (auto)vacuum if it was over a certain threshold. Which would vastly increase the log output for setups with small tables and a nonzero log_autovacuum_min_duration. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Patch: Add launchd Support
On Oct 20, 2014, at 5:03 PM, David E. Wheeler da...@justatheory.com wrote: This is another reason not to use KeepAlive, I guess. OnDemand is supposed to fire up a job only when it’s needed. No idea what that means. I think the idea of OnDemand is for launchd items to act a bit like inetd does: launchd creates the listening socket (or mach port or file-change notification) on the port specified in the plist, and only starts the process when someone tries to connect to it. This might not be a terrible thing to support, but it would require more changes throughout postgres (postmaster would have to check in with launchd at start time to get the listen socket; ports and socket paths would no longer be specifiable in postgres’ config, etc) and I’m skeptical that it’d be worth the work without a more concrete motivation. Apple has published their changes to Postgres (since they ship it in recent versions of OSX) here, fwiw, including the launchd plist they use: http://www.opensource.apple.com/source/PostgreSQL/ One thing I noticed is that Apple also used the label “org.postgres.postgres” for their launchd job. I don’t know if that will collide in some way with a second job with the same label. Launchctl load/unload takes a pathname, not a job label, so I don’t think it’d be a problem unless you actually do want to run both copies of postgres at the same time. MacPorts also has a launchd job for their postgresql port, which invokes daemondo, which invokes a wrapper script, which invokes postgres. I’m not sure why they did it that way. 2) AFAICS, this .plist file doesn't do anything about launchd's habit of not waiting for the network to come up. 
Have you experimented with this setting?: <key>KeepAlive</key> <dict><key>NetworkState</key><true/></dict> The launchd.plist man page claims that if you set that key in the sub-dictionary: If true, the job will be kept alive as long as the network is up, where up is defined as at least one non-loopback interface being up and having IPv4 or IPv6 addresses assigned to them. If false, the job will be kept alive in the inverse condition. On the other hand, it’s not unreasonable to have postgres running on a machine with only a loopback interface, depending on the use. We might be able to put something in LaunchEvents that gets it to fire when the network launches, but documentation is hella thin (and may only be supported on Yosemite, where there are a bunch of poorly-documented launchd changes). If one were desperate enough... it’s possible to dig through the launchd sources to make up for the gaps in the documentation (also on opensource.apple.com; there used to be a community-ish site for it at macosforge.org as well). It’s rough going, though, IIRC. (3) I don't think you want Disabled = true. It’s the default. When you run `launchctl load -w` it overrides it to false in its database. I’m fine to have it be less opaque, though. Yes, AFAICT it’s conventional to specify Disabled=true in a launchd plist and use launchctl to enable the item. BTW, Mavericks has a comment that /etc/hostconfig is going away, but google isn't telling me what's replacing it… I think that’s been “going away” for a decade now.
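For reference, a minimal sketch of how that KeepAlive sub-dictionary sits inside a full launchd plist. The Label and surrounding keys here are illustrative, not taken from the proposed patch:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>org.postgresql.postgres</string>
    <!-- Keep the job alive only while at least one non-loopback
         interface is up, per the launchd.plist man page. -->
    <key>KeepAlive</key>
    <dict>
        <key>NetworkState</key>
        <true/>
    </dict>
</dict>
</plist>
```

As noted above, this would still not help a postgres instance that is intentionally listening only on loopback, since launchd's definition of "network up" excludes loopback interfaces.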
Re: [HACKERS] Trailing comma support in SELECT statements
On 10/20/14, 11:16 AM, Andrew Dunstan wrote: On 10/20/2014 11:59 AM, David E. Wheeler wrote: On Oct 18, 2014, at 7:06 PM, Jim Nasby jim.na...@bluetreble.com wrote: Yes. The only case I can think of where we wouldn't want this is COPY. BTW, this should also apply to delimiters other than commas; for example, some geometry types use ; as a delimiter between points. I don’t think it should apply to the internals of types, necessarily. JSON, for example, always dies on a trailing comma, so should probably stay that way. Well, maybe allow it on JSONB input, but not JSON. Though we perhaps don’t want their behaviors to diverge. The JSON spec is quite clear on this. Leading and trailing commas are not allowed. I would fight tooth and nail not to allow it for json (and by implication jsonb, since they use literally the same parser - in fact we do that precisely so their input grammars can't diverge). +1. Data types that implement specs should follow the spec. I was more concerned about things like polygon, but the real point (ha!) is that we need to think about the data types too. (I will say I don't think things that mandate an exact number of elements (like point, box, etc) should support extra delimiters). -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] Questions on domain on composite / casts ignoring domains
Jim Nasby-5 wrote I'm trying to create what amounts to a new type. This would be rather easy if I could perform a CHECK on a composite type, which I could do if I could create a domain on top of a composite. Is there any reason in particular that hasn't been done? As an alternative, I tried accomplishing this with a straight domain. That would work, except for this: WARNING: cast will be ignored because the source data type is a domain Why do we ignore casts from domains to other data types? I'm guessing because it's simply not what domains were meant for? A domain is a base type with a constraint. When you cast you already know the existing value is valid and the system simply uses the cast available for the base type instead. i.e., You cannot have a domain with a different cast rule than the base type over which it is defined. Likely the lack of capability is simply a matter of complexity in the face of somewhat uncommon usage and limited resources. David J.
Re: [HACKERS] Inconsistencies in documentation of row-level locking
On 10/10/14, 8:31 AM, Michael Paquier wrote: Hi all, Currently all the row-level lock modes are described in the page for SELECT query: http://www.postgresql.org/docs/devel/static/explicit-locking.html#LOCKING-ROWS However, after browsing the documentation, I noticed in the page describing all the explicit locks of the system that there is a portion dedicated to row-level locks, and this section is not mentioning at all FOR KEY SHARE and FOR NO KEY UPDATE. It seems that this is something rather misleading for the user: http://www.postgresql.org/docs/devel/static/explicit-locking.html#LOCKING-ROWS Attached is a patch that refactors the whole section and improves the documentation: - Addition of a table showing the conflicts between each lock - Moved description of each row-level lock mode to the explicit locking page - Addition of a link in SELECT portion to redirect the user to the explicit locking page Did this get committed? Should probably add it to the commitfest if not...
Re: [HACKERS] narwhal and PGDLLIMPORT
On Mon, Oct 20, 2014 at 10:24:47PM +0200, Andres Freund wrote: On 2014-10-20 01:03:31 -0400, Noah Misch wrote: On Wed, Oct 15, 2014 at 12:53:03AM -0400, Noah Misch wrote: I happened to try the same contrib/dblink test suite on PostgreSQL built with modern MinGW-w64 (i686-4.9.1-release-win32-dwarf-rt_v3-rev1). That, too, gave a crash-like symptom starting with commit 846e91e. Specifically, a backend that LOADed any module linked to libpq (libpqwalreceiver, dblink, postgres_fdw) would suffer this after calling exit(0): === 3056 2014-10-20 00:40:15.163 GMT LOG: disconnection: session time: 0:00:00.515 user=cyg_server database=template1 host=127.0.0.1 port=3936 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information. This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information. 9300 2014-10-20 00:40:15.163 GMT LOG: server process (PID 3056) exited with exit code 3 === The mechanism turned out to be disjoint from the mechanism behind the ancient-compiler crash. Based on the functions called from exit(), my best guess is that exit() encountered recursion and used something like an abort() to escape. Hm. (I can send the gdb transcript if anyone is curious to see the gory details.) That would be interesting. Attached. (rep 100 s calls a macro equivalent to issuing s 100 times.) The proximate cause was commit 846e91e allowing modules to use shared libgcc. A 32-bit libpq acquires 64-bit integer division from libgcc. Passing -static-libgcc to the link restores the libgcc situation as it stood before commit 846e91e. The main beneficiary of shared libgcc is C++/Java exception handling, so PostgreSQL doesn't care. No doubt there's some deeper bug in libgcc or in PostgreSQL; loading a module that links with shared libgcc should not disrupt exit(). I'm content with this workaround. 
I'm unconvinced by this reasoning. Popular postgres extensions like postgis do use C++. It's imo not hard to imagine situations where switching to a statically linked libgcc could cause problems. True; I was wrong to say that PostgreSQL doesn't care. MinGW builds of every released PostgreSQL version use static libgcc. That changed as an unintended consequence of a patch designed to remove our reliance on dlltool. Given the lack of complaints about our historic use of static libgcc, I think it's fair to revert to 9.3's use thereof. Supporting shared libgcc would be better still, should someone make that effort. Breakpoint 1, 0x77bcaf46 in msvcrt!exit () from C:\WINDOWS\system32\msvcrt.dll #0 0x77bcaf46 in msvcrt!exit () from C:\WINDOWS\system32\msvcrt.dll #1 0x0065637b in proc_exit (code=code@entry=0) at ipc.c:143 #2 0x006798bf in PostgresMain (argc=1, argv=argv@entry=0x225830, dbname=0x224498 template1, username=0x224460 cyg_server) at postgres.c:4232 #3 0x0061b26e in BackendRun (port=0x205f520) at postmaster.c:4113 #4 SubPostmasterMain (argc=argc@entry=3, argv=argv@entry=0x222e88) at postmaster.c:4617 #5 0x007af2a1 in main (argc=3, argv=0x222e88) at main.c:196 (gdb) s Single stepping until exit from function msvcrt!exit, which has no line number information. 0x77bcae7a in msvcrt!_initterm () from C:\WINDOWS\system32\msvcrt.dll (gdb) Single stepping until exit from function msvcrt!_initterm, which has no line number information. 0x77bc84c4 in strerror () from C:\WINDOWS\system32\msvcrt.dll (gdb) Single stepping until exit from function strerror, which has no line number information. 0x77bcae86 in msvcrt!_initterm () from C:\WINDOWS\system32\msvcrt.dll (gdb) Single stepping until exit from function msvcrt!_initterm, which has no line number information. atexit_callback () at ipc.c:283 warning: Source file is more recent than executable. 
283 proc_exit_prepare(-1); (gdb) proc_exit_prepare (code=-1) at ipc.c:153 153 { (gdb) fin Run till exit from #0 proc_exit_prepare (code=-1) at ipc.c:153 0x77bcaed6 in msvcrt!_initterm () from C:\WINDOWS\system32\msvcrt.dll (gdb) s Single stepping until exit from function msvcrt!_initterm, which has no line number information. Breakpoint 1, 0x77bcaf61 in msvcrt!_exit () from C:\WINDOWS\system32\msvcrt.dll (gdb) rep 100 s Single stepping until exit from function msvcrt!_exit, which has no line number information. 0x77bcae7a in msvcrt!_initterm () from C:\WINDOWS\system32\msvcrt.dll Single stepping until exit from function msvcrt!_initterm, which has no line number information. 0x77bc84c4 in strerror () from C:\WINDOWS\system32\msvcrt.dll Single stepping until exit from function strerror, which has no line number information. 0x77bcae86 in msvcrt!_initterm () from C:\WINDOWS\system32\msvcrt.dll Single stepping until exit from function msvcrt!_initterm, which has no line
Re: [HACKERS] pg_basebackup fails with long tablespace paths
On Tue, Oct 21, 2014 at 12:29 AM, Tom Lane t...@sss.pgh.pa.us wrote: My Salesforce colleague Thomas Fanghaenel observed that the TAP tests for pg_basebackup fail when run in a sufficiently deeply-nested directory tree. The cause appears to be that we rely on standard tar format to represent the symlink for a tablespace, and POSIX tar format has a hard-wired restriction of 99 bytes in a symlink's expansion. What do we want to do about this? I think a minimum expectation would be for pg_basebackup to notice and complain when it's trying to create an unworkably long symlink entry, but it would be far better if we found a way to cope instead. One way to cope with such a situation could be that during backup we create a backup file which contains a listing of the symlinks, and then have archive recovery recreate them. Basically this is the solution (patch) I have proposed for Windows [1]. [1] - https://commitfest.postgresql.org/action/patch_view?id=1512 With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] [TODO] Track number of files ready to be archived in pg_stat_archiver
Julien, The following is an initial review: * Applies cleanly to master (f330a6d). * Regression tests updated and pass, including 'check-world'. * Documentation updated and builds successfully. * Might want to consider replacing the following magic number with a constant or perhaps a calculated value. + int basenamelen = (int) strlen(rlde->d_name) - 6; * Wouldn't it be easier, or perhaps more reliable, to use strrchr() with the following instead? + strcmp(rlde->d_name + basenamelen, ".ready") == 0) char *extension = strrchr(rlde->d_name, '.'); ... strcmp(extension, ".ready") == 0) I think this approach might also help to resolve the magic number above. For example: char *extension = strrchr(rlde->d_name, '.'); int basenamelen = (int) strlen(rlde->d_name) - strlen(extension); -Adam -- Adam Brightwell - adam.brightw...@crunchydatasolutions.com Database Engineer - www.crunchydatasolutions.com