Re: [HACKERS] Use of rsync for data directory copying
Bruce,

* Bruce Momjian (br...@momjian.us) wrote:
> On Sat, Jul 14, 2012 at 09:17:22PM -0400, Stephen Frost wrote:
> > So, can you explain which case you're specifically worried about?
>
> OK. The basic problem is that I previously was not clear about how
> reliant our use of rsync (without --checksum) was on the presence of WAL
> replay.

We should only be relying on WAL replay for hot backups which used
pg_start/pg_stop_backup.

> Here is an example from our documentation that doesn't have WAL replay:
>
> http://www.postgresql.org/docs/9.2/static/backup-file.html
>
> 	Another option is to use rsync to perform a file system backup. This is
> 	done by first running rsync while the database server is running, then
> 	shutting down the database server just long enough to do a second rsync.
> 	The second rsync will be much quicker than the first, because it has
> 	relatively little data to transfer, and the end result will be
> 	consistent because the server was down. This method allows a file system
> 	backup to be performed with minimal downtime.

To be honest, this looks like a recommendation that might have been made
all the way back to before we had hot backups. Technically speaking, it
should work fine to use the above method where the start/stop backup is
only done for the second rsync, if there's a reason to implement such a
system (perhaps the WALs grow too fast or too numerous for a full backup
with rsync between the start_backup and stop_backup?).

> Now, if a write happens in both the first and second half of a second,
> and only the first write is seen by the first rsync, I don't think the
> second rsync will see the write, and hence the backup will be
> inconsistent.

To be more specific, rsync relies on the combination of mtime and size
to tell if the file has been changed or not. In contrast, cp --update
looks like it might only depend on mtime (from reading the cp man page
on a Debian system).

It seems like there could be an issue where PG is writing to a file, an
rsync comes along and copies the file, and then PG writes to that same
file again, after the rsync is done, but within the same second. If the
file size isn't changed by that write, a later rsync might feel that it
isn't necessary to check if the file contents changed. I have to say
that I don't believe I've ever seen that happen though.

	Thanks,

		Stephen
Re: [HACKERS] Use of rsync for data directory copying
On Sat, Jul 14, 2012 at 09:17:22PM -0400, Stephen Frost wrote:
> Bruce,
>
> * Bruce Momjian (br...@momjian.us) wrote:
> > If two writes happens in the middle of a file in the same second, it
> > seems one might be missed. Yes, I suppose the WAL does fix that during
> > replay, though if both servers were shut down cleanly, WAL would not be
> > replayed.
> >
> > If you using it for a hot backup, and WAL would clean that up.
>
> Right... If it's hot backup, then WAL will fix it; if it's done after a
> clean shut-down, nothing should be writing to those files (much less
> multiple writes in the same second), so checksum shouldn't be
> necessary...
>
> If you're doing rsync w/o doing pg_start_backup/pg_stop_backup, that's
> not likely to work even *with* --checksum..
>
> So, can you explain which case you're specifically worried about?

OK. The basic problem is that I previously was not clear about how
reliant our use of rsync (without --checksum) was on the presence of WAL
replay. Here is an example from our documentation that doesn't have WAL
replay:

	http://www.postgresql.org/docs/9.2/static/backup-file.html

	Another option is to use rsync to perform a file system backup. This is
	done by first running rsync while the database server is running, then
	shutting down the database server just long enough to do a second rsync.
	The second rsync will be much quicker than the first, because it has
	relatively little data to transfer, and the end result will be
	consistent because the server was down. This method allows a file system
	backup to be performed with minimal downtime.

Now, if a write happens in both the first and second half of a second,
and only the first write is seen by the first rsync, I don't think the
second rsync will see the write, and hence the backup will be
inconsistent.

--
  Bruce Momjian			http://momjian.us
  EnterpriseDB			http://enterprisedb.com

  + It's impossible for everything to be true.
+

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Schema version management
On Sat, Jul 14, 2012 at 12:34 PM, Joel Jacobson wrote:
> On Sat, Jul 14, 2012 at 11:25 AM, Peter Eisentraut wrote:
>> Well, of course everyone uses directories in moderation. But you might
>> want to take a look at the gcc source code. You'll love it. ;-)

[505][joel.Joel-Jacobsons-iMac-2: /Users/joel/gcc]$ find . -type d | wc -l
   41895
[506][joel.Joel-Jacobsons-iMac-2: /Users/joel/gcc]$ find . -type f | wc -l
  167183
[507][joel.Joel-Jacobsons-iMac-2: /Users/joel/gcc]$

Not that bad actually, only 4 files per directory on average.
Re: [HACKERS] Use of rsync for data directory copying
Bruce,

* Bruce Momjian (br...@momjian.us) wrote:
> If two writes happens in the middle of a file in the same second, it
> seems one might be missed. Yes, I suppose the WAL does fix that during
> replay, though if both servers were shut down cleanly, WAL would not be
> replayed.
>
> If you using it for a hot backup, and WAL would clean that up.

Right... If it's hot backup, then WAL will fix it; if it's done after a
clean shut-down, nothing should be writing to those files (much less
multiple writes in the same second), so checksum shouldn't be
necessary...

If you're doing rsync w/o doing pg_start_backup/pg_stop_backup, that's
not likely to work even *with* --checksum..

So, can you explain which case you're specifically worried about?

	Thanks,

		Stephen
Re: [HACKERS] [PERFORM] DELETE vs TRUNCATE explanation
On Thu, Jul 12, 2012 at 9:55 PM, Jeff Janes wrote:
> I've moved this thread from performance to hackers.
>
> The topic was poor performance when truncating lots of small tables
> repeatedly on test environments with fsync=off.
>
> On Thu, Jul 12, 2012 at 6:00 PM, Jeff Janes wrote:
>
>> I think the problem is in the Fsync Absorption queue. Every truncate
>> adds a FORGET_RELATION_FSYNC to the queue, and processing each one of
>> those leads to sequential scanning the checkpointer's pending ops hash
>> table, which is quite large. It is almost entirely full of other
>> requests which have already been canceled, but it still has to dig
>> through them all. So this is essentially an N^2 operation.

...

>> I'm not sure why we don't just delete the entry instead of marking it
>> as cancelled. It looks like the only problem is that you can't delete
>> an entry other than the one just returned by hash_seq_search. Which
>> would be fine, as that is the entry that we would want to delete;
>> except that mdsync might have a different hash_seq_search open, and so
>> it wouldn't be safe to delete.

The attached patch addresses this problem by deleting the entry when it
is safe to do so, and flagging it as canceled otherwise.

I thought of using has_seq_scans to determine when it is safe, but
dynahash.c does not make that function public, and I was afraid it might
be too slow, anyway. So instead I used a static variable, plus the
knowledge that the only time there are two scans on the table is when
mdsync starts one and then calls RememberFsyncRequest indirectly. There
is one other place that does a seq scan, but there is no way for control
to pass from that loop to reach RememberFsyncRequest.

I've added code to disclaim the scan if mdsync errors out. I don't think
that this should be a problem because at that point the scan object is
never going to be used again, so if its internal state gets screwed up
it shouldn't matter.
However, I wonder if it should also call hash_seq_term, otherwise the
pending ops table will be permanently prevented from expanding (this is
a pre-existing condition, not to do with my patch). Since I don't know
what can make mdsync error out without being catastrophic, I don't know
how to test this out.

One concern is that if the ops table ever does become bloated, it can
never recover while under load. The bloated table will cause mdsync to
take a long time to run, and as long as mdsync is in the call stack the
antibloat feature is defeated--so we have crossed a tipping point and
cannot get back. I don't see that occurring in the current use case,
however. With my current benchmark, the anti-bloat is effective enough
that mdsync never takes very long to execute, so a virtuous circle
exists.

As an aside, the comments in dynahash.c seem to suggest that one can
always delete the entry returned by hash_seq_search, regardless of the
existence of other sequential searches. I'm pretty sure that this is not
true. Also, shouldn't this contract about when one is allowed to delete
entries be in the hsearch.h file, rather than the dynahash.c file?

Also, I still wonder if it is worth memorizing fsyncs (under fsync=off)
that may or may not ever take place. Is there any guarantee that we can
make by doing so, that couldn't be made otherwise?

Cheers,

Jeff

FsyncRequest_delete_v1.patch
Description: Binary data
Re: [HACKERS] Synchronous Standalone Master Redoux
So, here's the core issue with degraded mode. I'm not mentioning this to
block any patch anyone has, but rather out of a desire to see someone
address this core problem with some clever idea I've not thought of.

The problem in a nutshell is: indeterminacy. Assume someone implements
degraded mode. Then:

1. Master has one synchronous standby, Standby1, and two asynchronous,
Standby2 and Standby3.

2. Standby1 develops a NIC problem and is in and out of contact with
Master. As a result, it's flipping in and out of synchronous / degraded
mode.

3. Master fails catastrophically due to a RAID card meltdown. All data
lost.

At this point, the DBA is in kind of a pickle, because he doesn't know:

(a) Was Standby1 in synchronous or degraded mode when Master died? The
only log for that was on Master, which is now gone.

(b) Is Standby1 actually the most caught up standby, and thus the
appropriate new master for Standby2 and Standby3, or is it behind?

With the current functionality of Synchronous Replication, you don't
have either piece of indeterminacy, because some external management
process (hopefully located on another server) needs to disable
synchronous replication when Standby1 develops its problem. That is, if
the master is accepting synchronous transactions at all, you know that
Standby1 is up-to-date, and no data is lost.

While you can answer (b) by checking all servers, (a) is particularly
pernicious, because unless you have the application log all "operating
in degraded mode" messages, there is no way to ever determine the truth.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: [HACKERS] elog/ereport noreturn decoration
Peter Eisentraut writes:
> A small sidetrack here. I've managed to set up the Solaris Studio
> compiler on Linux and tried this out. It turns out these "statement not
> reached" warnings have nothing to do with knowledge about library
> functions such as abort() or exit() not returning. The warnings come
> mostly from loops that never end (except by returning from the function)
> and some other more silly cases where the supposed fallback return
> statement is clearly unnecessary. I think these should be fixed,
> because the code is wrong and could mask real errors if someone ever
> wanted to rearrange those loops or something.

> Patch attached. I tried this out with old and new versions of gcc,
> clang, and the Solaris compiler, and everyone was happy about. I didn't
> touch the regex code. And there's some archeological knowledge about
> Perl in there.

Hm. I seem to recall that at least some of these lines were themselves
put in to suppress compiler warnings. So we may just be moving the
warnings from one set of non-mainstream compilers to another set.

Still, we might as well try it and see if there's a net reduction. (In
some places we might need to tweak the code a bit harder than this, to
make it clearer that the unreachable spot is unreachable.)

			regards, tom lane
Re: [HACKERS] elog/ereport noreturn decoration
On Sat, 2012-06-30 at 10:52 -0400, Tom Lane wrote:
> Peter Eisentraut writes:
> > But my point was, there aren't any unused code warnings. None of the
> > commonly used compilers issue any. I'd be interested to know if there
> > is any recent C compiler supported by PostgreSQL that issues some kind
> > of unused code warning under any circumstances, and see an example of
> > that. Then we can determine whether there is an issue here.
>
> Well, the Solaris Studio compiler produces "warning: statement not
> reached" messages, as seen for example on buildfarm member castoroides.
> I don't have one available to experiment with, so I'm not sure whether
> it knows that abort() doesn't return; but I think it rather foolish to
> assume that such a combination doesn't exist in the wild.

A small sidetrack here. I've managed to set up the Solaris Studio
compiler on Linux and tried this out. It turns out these "statement not
reached" warnings have nothing to do with knowledge about library
functions such as abort() or exit() not returning. The warnings come
mostly from loops that never end (except by returning from the function)
and some other more silly cases where the supposed fallback return
statement is clearly unnecessary. I think these should be fixed, because
the code is wrong and could mask real errors if someone ever wanted to
rearrange those loops or something.

Patch attached. I tried this out with old and new versions of gcc,
clang, and the Solaris compiler, and everyone was happy about it. I
didn't touch the regex code. And there's some archeological knowledge
about Perl in there.

The Solaris compiler does not, by the way, complain about the elog patch
I had proposed.
diff --git a/contrib/hstore/hstore_io.c b/contrib/hstore/hstore_io.c index aadf050..7bdac3d 100644 --- a/contrib/hstore/hstore_io.c +++ b/contrib/hstore/hstore_io.c @@ -163,8 +163,6 @@ state->ptr++; } - - return false; } #define WKEY 0 diff --git a/contrib/intarray/_int_bool.c b/contrib/intarray/_int_bool.c index dfb113a..d0572af 100644 --- a/contrib/intarray/_int_bool.c +++ b/contrib/intarray/_int_bool.c @@ -136,7 +136,6 @@ } (state->buf)++; } - return END; } /* @@ -301,7 +300,6 @@ else return execute(curitem - 1, checkval, calcnot, chkcond); } - return false; } /* @@ -404,7 +402,6 @@ else return false; } - return false; } bool diff --git a/contrib/intarray/_int_gist.c b/contrib/intarray/_int_gist.c index e429c8b..60de393 100644 --- a/contrib/intarray/_int_gist.c +++ b/contrib/intarray/_int_gist.c @@ -217,8 +217,6 @@ } else PG_RETURN_POINTER(entry); - - PG_RETURN_POINTER(entry); } Datum diff --git a/contrib/ltree/ltxtquery_io.c b/contrib/ltree/ltxtquery_io.c index c2e532c..583ff2a 100644 --- a/contrib/ltree/ltxtquery_io.c +++ b/contrib/ltree/ltxtquery_io.c @@ -139,7 +139,6 @@ state->buf += charlen; } - return END; } /* diff --git a/contrib/ltree/ltxtquery_op.c b/contrib/ltree/ltxtquery_op.c index bedbe24..64f9d21 100644 --- a/contrib/ltree/ltxtquery_op.c +++ b/contrib/ltree/ltxtquery_op.c @@ -40,7 +40,6 @@ else return ltree_execute(curitem + 1, checkval, calcnot, chkcond); } - return false; } typedef struct diff --git a/src/backend/access/gin/ginbtree.c b/src/backend/access/gin/ginbtree.c index 82ac53e..3efdedd 100644 --- a/src/backend/access/gin/ginbtree.c +++ b/src/backend/access/gin/ginbtree.c @@ -146,9 +146,6 @@ stack->predictNumber = 1; } } - - /* keep compiler happy */ - return NULL; } void diff --git a/src/backend/access/gin/ginget.c b/src/backend/access/gin/ginget.c index 022bd27..5702259 100644 --- a/src/backend/access/gin/ginget.c +++ b/src/backend/access/gin/ginget.c @@ -354,8 +354,6 @@ */ stack->off++; } - - return true; } /* diff --git 
a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c index c790ad6..2253e7c 100644 --- a/src/backend/access/gist/gistget.c +++ b/src/backend/access/gist/gistget.c @@ -535,8 +535,6 @@ } while (so->nPageData == 0); } } - - PG_RETURN_BOOL(false); /* keep compiler quiet */ } /* diff --git a/src/backend/executor/nodeGroup.c b/src/backend/executor/nodeGroup.c index 80e282b..a8a1fe6 100644 --- a/src/backend/executor/nodeGroup.c +++ b/src/backend/executor/nodeGroup.c @@ -184,9 +184,6 @@ else InstrCountFiltered1(node, 1); } - - /* NOTREACHED */ - return NULL; } /* - diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c index e0ab599..0d66dab 100644 --- a/src/backend/libpq/be-secure.c +++ b/src/backend/libpq/be-secure.c @@ -201,9 +201,9 @@ { #ifdef USE_SSL return ssl_loaded_verify_locations; -#endif - +#else return false; +#endif } /* diff --git a/src/backend/storage/buffer/freelist.c b/src/backend/storage/buffer/freelist.c index c927747..d96b7a7 100644 --- a/src/backend/storage/buffer/freelist.c +++ b/src/backend/storage/buffer/freelist.c @@ -233,9 +233,6 @@ static void AddBufferToRing(BufferAc
[HACKERS] Getting rid of pre-assignment of index names in CREATE TABLE LIKE
In bug #6734 we have a complaint about a longstanding misfeature of CREATE TABLE LIKE. Ordinarily, this command doesn't select names for copied indexes, but leaves that to be done at runtime by DefineIndex. But if it's copying comments, and an index of the source table has a comment, it's forced to preassign a name to the new index so that it can build a CommentStmt that can apply the comment after the index is made. Apart from being something of a modularity violation, this isn't very safe because of the danger of name collision with earlier indexes for the new table. And that's exactly what's happening in bug #6734. I suggested that we could dodge the problem by allowing IndexStmt to carry a comment to be attached to the new index, and thereby avoid needing an explicit COMMENT command. Attached is a patch that fixes it that way. While I was at it, it seemed like DefineIndex's parameter list had grown well past any sane bound, so I refactored it to pass the IndexStmt struct as-is rather than passing all the fields individually. With or without that choice, though, this approach means a change in DefineIndex's API, as well as the contents of struct IndexStmt. That means it's probably unsafe to back-patch, since it seems plausible that there might be third-party code out there that creates indexes and would use these interfaces. I would like to sneak this fix into 9.2, though. Does anyone think it's already too late to be touching these APIs for 9.2? regards, tom lane diff --git a/src/backend/bootstrap/bootparse.y b/src/backend/bootstrap/bootparse.y index 18f0add852e7832739e3877811e385abcb540fab..ec634f1660e0be883b451abbb380d6dc30e69b93 100644 *** a/src/backend/bootstrap/bootparse.y --- b/src/backend/bootstrap/bootparse.y *** Boot_InsertStmt: *** 279,296 Boot_DeclareIndexStmt: XDECLARE INDEX boot_ident oidspec ON boot_ident USING boot_ident LPAREN boot_index_params RPAREN { do_start(); ! DefineIndex(makeRangeVar(NULL, $6, -1), ! $3, $4, ! InvalidOid, ! $8, ! NULL, ! 
$10, ! NULL, NIL, NIL, ! false, false, false, false, false, ! false, false, true, false, false); do_end(); } ; --- 279,312 Boot_DeclareIndexStmt: XDECLARE INDEX boot_ident oidspec ON boot_ident USING boot_ident LPAREN boot_index_params RPAREN { + IndexStmt *stmt = makeNode(IndexStmt); + do_start(); ! stmt->idxname = $3; ! stmt->relation = makeRangeVar(NULL, $6, -1); ! stmt->accessMethod = $8; ! stmt->tableSpace = NULL; ! stmt->indexParams = $10; ! stmt->options = NIL; ! stmt->whereClause = NULL; ! stmt->excludeOpNames = NIL; ! stmt->idxcomment = NULL; ! stmt->indexOid = InvalidOid; ! stmt->oldNode = InvalidOid; ! stmt->unique = false; ! stmt->primary = false; ! stmt->isconstraint = false; ! stmt->deferrable = false; ! stmt->initdeferred = false; ! stmt->concurrent = false; ! ! DefineIndex(stmt, $4, ! false, ! false, ! true, /* skip_build */ ! false); do_end(); } ; *** Boot_DeclareIndexStmt: *** 298,315 Boot_DeclareUniqueIndexStmt: XDECLARE UNIQUE INDEX boot_ident oidspec ON boot_ident USING boot_ident LPAREN boot_index_params RPAREN { do_start(); ! DefineIndex(makeRangeVar(NULL, $7, -1), ! $4, $5, ! InvalidOid, ! $9, ! NULL, ! $11, ! NULL, NIL, NIL, ! true, false, false, false, false, ! false, false, true, false, false); do_end(); } ; --- 314,347 Boot_DeclareUniqueIndexStmt: XDECLARE UNIQUE INDEX boot_ident oidspec ON boot_ident USING boot_ident LPAREN boot_index_params RPAREN { + IndexStmt *stmt = makeNode(IndexStmt); + do_start(); ! stmt->idxname = $4; ! stmt->relation = makeRangeVar(NULL, $7, -1); ! stmt->accessMethod = $9; ! stmt->tableSpace = NULL; ! stmt->indexParams = $11; ! stmt->options = NIL; ! stmt->whereClause = NULL; ! stmt->excludeOpNames = NIL; ! stmt->idxcomment = NULL; ! stmt->indexOid = InvalidOid; ! stmt->oldNode = InvalidOid; ! stmt->unique = true; ! stmt->primary = false; ! stmt->isconstraint = false; ! stmt->deferrable = false; ! stmt->initdeferred = false; ! stmt->concurrent = false; ! ! DefineIndex(stmt, $5, ! false, ! false, ! 
true, /* skip_build */ ! false); do_end(); } ; diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c index e5d1d8b699ad6d37cd3dd9e19b1d76ba97c01eaa..f315ab60154d6c6aa3c96f05babb8e79945f70b8 100644 *** a/src/backend/commands/indexcmds.c --- b/src/backend/commands/indexcmds.c *** *** 24,29 -
[HACKERS] Re: Reporting WAL file containing checkpoint's REDO record in pg_controldata's result
On 26.06.2012 18:06, Fujii Masao wrote:
> On Wed, Mar 28, 2012 at 10:08 AM, Fujii Masao wrote:
>> On Wed, Mar 28, 2012 at 12:30 AM, Alvaro Herrera wrote:
>>> Excerpts from Fujii Masao's message of mar mar 27 06:40:34 -0300 2012:
>>>> Anyway, should I add this patch into the next CF? Or is anyone
>>>> planning to commit the patch for 9.2?
>>>
>>> I think the correct thing to do here is add to next CF, and if some
>>> committer has enough interest in getting it quickly it can be grabbed
>>> from there and committed into 9.2.
>>
>> Yep! I've added it to next CF.
>
> Attached is the revised version of the patch.

Looks good to me, committed.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] sign mismatch in walreceiver.c
On 12.07.2012 23:57, Peter Eisentraut wrote:
> This looks suspicious
>
> 	static TimeLineID recvFileTLI = -1;
>
> because TimeLineID is uint32. The Solaris compiler complains about the
> sign mismatch. Maybe 0 would be a better initial value?

Agreed, fixed, thanks.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] Support for XLogRecPtr in expand_fmt_string?
On Jul 13, 2012, at 2:34 PM, Peter Eisentraut wrote:
> I would rather get rid of this %X/%X notation. I know we have all grown
> to like it, but it's always been a workaround. We're now making the
> move to simplify this whole business by saying, the WAL location is an
> unsigned 64-bit number -- which everyone can understand -- but then why
> is it printed in some funny format?

We should take care that whatever format we pick can be easily matched
to a WAL file name. So a 64-bit number printed as 16 hex digits would
perhaps be OK, but a 64-bit number printed in base 10 would be a large
usability regression.

Personally, I'm not convinced we should change anything at all. It's not
that easy to visually parse a string of many digits; a little
punctuation in the middle is not a bad thing.

...Robert
Re: [HACKERS] Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
On 13/07/12 13:38, Jan Urbański wrote: On 12/07/12 11:08, Heikki Linnakangas wrote: On 07.07.2012 00:12, Jan Urbański wrote: So you're in favour of doing unicode -> bytes by encoding with UTF-8 and then using the server's encoding functions? Sounds reasonable to me. The extra conversion between UTF-8 and UCS-2 should be quite fast, and it would be good to be consistent in the way we do conversions in both directions. I'll implement that than (sorry for not following up on that eariler). Here's a patch that always encodes Python unicode objects using UTF-8 and then uses Postgres's internal functions to produce bytes in the server encoding. Cheers, Jan diff --git a/src/pl/plpython/plpy_elog.c b/src/pl/plpython/plpy_elog.c new file mode 100644 index c375ac0..c2b3cb8 *** a/src/pl/plpython/plpy_elog.c --- b/src/pl/plpython/plpy_elog.c *** static char *get_source_line(const char *** 28,33 --- 28,41 /* + * Guard agains re-entrant calls to PLy_traceback, which can happen if + * traceback formatting functions raise Python errors. + */ + #define TRACEBACK_RECURSION_LIMIT 2 + static int recursion_depth = 0; + + + /* * Emit a PG error or notice, together with any available info about * the current Python error, previously set by PLy_exception_set(). * This should be used to propagate Python errors into PG. If fmt is *** PLy_traceback(char **xmsg, char **tbmsg, *** 147,166 StringInfoData xstr; StringInfoData tbstr; /* * get the current exception */ PyErr_Fetch(&e, &v, &tb); /* ! * oops, no exception, return */ ! if (e == NULL) { *xmsg = NULL; *tbmsg = NULL; *tb_depth = 0; return; } --- 155,177 StringInfoData xstr; StringInfoData tbstr; + recursion_depth++; + /* * get the current exception */ PyErr_Fetch(&e, &v, &tb); /* ! * oops, no exception or recursion depth exceeded, return */ ! 
if (e == NULL || recursion_depth > TRACEBACK_RECURSION_LIMIT) { *xmsg = NULL; *tbmsg = NULL; *tb_depth = 0; + recursion_depth--; return; } *** PLy_traceback(char **xmsg, char **tbmsg, *** 326,331 --- 337,344 (*tb_depth)++; } + recursion_depth--; + /* Return the traceback. */ *tbmsg = tbstr.data; diff --git a/src/pl/plpython/plpy_util.c b/src/pl/plpython/plpy_util.c new file mode 100644 index 4aabafc..fe43312 *** a/src/pl/plpython/plpy_util.c --- b/src/pl/plpython/plpy_util.c *** PLy_free(void *ptr) *** 61,126 PyObject * PLyUnicode_Bytes(PyObject *unicode) { ! PyObject *rv; ! const char *serverenc; ! /* ! * Map PostgreSQL encoding to a Python encoding name. ! */ ! switch (GetDatabaseEncoding()) { ! case PG_SQL_ASCII: ! /* ! * Mapping SQL_ASCII to Python's 'ascii' is a bit bogus. Python's ! * 'ascii' means true 7-bit only ASCII, while PostgreSQL's ! * SQL_ASCII means that anything is allowed, and the system doesn't ! * try to interpret the bytes in any way. But not sure what else ! * to do, and we haven't heard any complaints... ! */ ! serverenc = "ascii"; ! break; ! case PG_WIN1250: ! serverenc = "cp1250"; ! break; ! case PG_WIN1251: ! serverenc = "cp1251"; ! break; ! case PG_WIN1252: ! serverenc = "cp1252"; ! break; ! case PG_WIN1253: ! serverenc = "cp1253"; ! break; ! case PG_WIN1254: ! serverenc = "cp1254"; ! break; ! case PG_WIN1255: ! serverenc = "cp1255"; ! break; ! case PG_WIN1256: ! serverenc = "cp1256"; ! break; ! case PG_WIN1257: ! serverenc = "cp1257"; ! break; ! case PG_WIN1258: ! serverenc = "cp1258"; ! break; ! case PG_WIN866: ! serverenc = "cp866"; ! break; ! case PG_WIN874: ! serverenc = "cp874"; ! break; ! default: ! /* Other encodings have the same name in Python. */ ! serverenc = GetDatabaseEncodingName(); ! break; } ! rv = PyUnicode_AsEncodedString(unicode, serverenc, "strict"); ! if (rv == NULL) ! 
PLy_elog(ERROR, "could not convert Python Unicode object to PostgreSQL server encoding"); return rv; } --- 61,96 PyObject * PLyUnicode_Bytes(PyObject *unicode) { ! PyObject *bytes, *rv; ! char *utf8string, *encoded; ! /* first encode the Python unicode object with UTF-8 */ ! bytes = PyUnicode_AsUTF8String(unicode); ! if (bytes == NULL) ! PLy_elog(ERROR, "could not convert Python Unicode object to bytes"); ! ! utf8string = PyBytes_AsString(bytes); ! if (utf8string == NULL) { ! Py_DECREF(bytes); ! PLy_elog(ERROR, "could not extract bytes from encoded string"); ! } ! ! /* then convert to server encoding */ ! encoded = (char *) pg_do_encoding_conversion((unsigned char *) utf8string, ! strlen(utf8string), ! PG_UTF8, ! GetDatabaseEncoding()); ! ! /* finally, build a bytes object in the server encoding */ ! rv = PyBytes_Fr
Re: [HACKERS] Synchronous Standalone Master Redoux
On Sat, Jul 14, 2012 at 12:42 AM, Amit kapila wrote:
>> From: Jose Ildefonso Camargo Tolosa [ildefonso.cama...@gmail.com]
>> Sent: Saturday, July 14, 2012 9:36 AM
>> On Fri, Jul 13, 2012 at 11:12 PM, Amit kapila wrote:
>> From: pgsql-hackers-ow...@postgresql.org
>> [pgsql-hackers-ow...@postgresql.org] on behalf of Jose Ildefonso Camargo
>> Tolosa [ildefonso.cama...@gmail.com]
>> Sent: Saturday, July 14, 2012 6:08 AM
>> On Fri, Jul 13, 2012 at 10:22 AM, Bruce Momjian wrote:
>>> On Fri, Jul 13, 2012 at 09:12:56AM +0200, Hampus Wessman wrote:
>>> So how about this for a Postgres TODO:
>>>
>>> 	Add configuration variable to allow Postgres to disable synchronous
>>> 	replication after a specified timeout, and add variable to alert
>>> 	administrators of the change.
>>
>> I agree we need a TODO for this, but... I think timeout-only is not the
>> best choice, there should be a maximum timeout (as a last resource: the
>> maximum time we are willing to wait for standby, this have to have the
>> option of "forever"), but certainly PostgreSQL have to detect the
>> *complete* disconnection of the standby (or all standbys on the
>> synchronous_standby_names), if it detects that no standbys are eligible
>> for sync standby AND the option to do fallback to async is enabled = it
>> will go into standalone mode (as if synchronous_standby_names were
>> empty), otherwise (if option is disabled) it will just continue to wait
>> for ever (the "last resource" timeout is ignored if the fallback option
>> is disabled)
>>
>> I would call this "soft_synchronous_standby", and
>> "soft_synchronous_standby_timeout" (in seconds, 0=forever, a sane value
>> would be ~5 seconds) or something like that (I'm quite bad at picking
>> names :( ).
>
> After it has gone to standalone mode, if the standby came back will it
> be able to return back to sync mode with it.
>> That's the idea, yes, after the standby comes back, the master would
>> act as if the sync standby connected for the first time: first going
>> through the "catchup" mode, and "once the lag between standby and
>> primary reaches zero "(...)" we move to real-time streaming state"
>> (from 9.1 docs), at that point: normal sync behavior is restored.
>
> Idea wise, it looks okay, but are you sure that in the current
> code/design, it can handle the way you are suggesting.
> I am not sure it can work because it might be the case that due to
> network instability, the master has gone in standalone mode and now
> after standy is able to communicate back, it might be expecting to get
> more data rather than go in cacthup mode.
> I believe some person who is expert of this code area can comment here
> to make it more concrete.

Well, I'd need to dive into the code, but as far as I know, it is the
master who decides to be in "catchup" mode, and the standby just takes
care of sending feedback to the master.

Also, it has to handle the situation, because currently, if the master
goes away because it crashed, or because of network issues, the standby
doesn't really know why, and will reconnect to the master and do
whatever it needs to do to get in sync with it again (be it: trying to
reconnect several times while the master is restarting, or just
reconnecting to a waiting master and requesting pending WAL segments).
There has to be code in place to handle those issues, because it is
already working.

I'm trying to get a solution that is as non-intrusive as possible, with
as little added code as possible, so that performance doesn't suffer, by
reusing current logic and actions, with small alterations.

> With Regards,
> Amit Kapila.

--
Ildefonso Camargo
Command Prompt, Inc.
- http://www.commandprompt.com/ PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, Postgres-XC @cmdpromptinc - 509-416-6579 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Allow breaking out of hung connection attempts
On 13.07.2012 14:35, Ryan Kelly wrote:
> On Mon, Jul 09, 2012 at 05:35:15PM +0900, Shigeru HANADA wrote:
>> On Mon, Jun 25, 2012 at 9:00 PM, Ryan Kelly wrote:
>> - Copying only the "select()" part of pqSocketPoll seems not enough;
>>   should we use poll(2) if it is supported?
>
> I did not think the additional complexity was worth it in this case.
> Unless you see some reason to use poll(2) that I do not.

> I checked where select() is used in PG, and noticed that poll is used
> in only a few places. Agreed to use only select() here.

poll() potentially performs better, but that hardly matters here, so
select() is ok.

However, on some platforms, signals don't interrupt select(). On such
platforms, hitting CTRL-C will still not interrupt the connection
attempt. There is also a race condition: if you hit CTRL-C just after
checking that the global variable is not set, but before entering
select(), the select() will again not be interrupted (until the user
gets frustrated and hits CTRL-C again). I think we need to sleep one
second at a time and check the global variable on every iteration, until
the timeout is reached. That's what we do in non-critical sleep loops
like this in the backend.

We've actually been trying to get rid of such polling in the backend
code lately, to reduce the number of unnecessary interrupts that consume
power on an otherwise idle system, but I think that technique would
still be OK for psql's connection code.

> Once the issues above are fixed, IMO this patch can be marked as
> "Ready for committer".

> I have also added additional documentation reflecting Heikki's
> suggestion that PQconnectTimeout be recommended for use by applications
> using the async API. Attached is v6 of the patch.

Thanks! With this patch, any connect_timeout parameter given in the
original connection info string is copied to the \connect command.
That sounds reasonable at first glance, but I don't think it's quite
right.

First of all, if you give a new connect_timeout string directly in the
\connect command, the old connect_timeout still overrides the new one:

$ ./psql postgres://localhost/postgres?connect_timeout=3
psql (9.3devel)
Type "help" for help.

postgres=# \connect postgres://192.168.1.123/postgres?connect_timeout=1000
timeout expired
Previous connection kept

Secondly, the docs say that the new connection only inherits a few
explicitly listed options from the old connection: dbname, username,
host and port. If you add connect_timeout to that list, it at least
requires a documentation update. But I don't think it should be
inherited in the first place. Or if it should, what about other options,
like application_name? Surely they should then be inherited too.

All in all, I think we should just leave out the changes to \connect and
not inherit connect_timeout from the old connection. If we want to
change the behavior of all options, that's a separate discussion and a
separate patch.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] Schema version management
On Sat, Jul 14, 2012 at 11:25 AM, Peter Eisentraut wrote:
> Well, of course everyone uses directories in moderation. But you might
> want to take a look at the gcc source code. You'll love it. ;-)

Yes, but GCC was also created by someone who picks stuff from his bare
foot and eats it. ;-)
Re: [HACKERS] pg_prewarm
> c) isn't necessarily safe in production (I've crashed Linux with
> Fincore in the recent past).

fincore is a different piece of software. If you hit an issue with
pgfincore, please file a bug report, so that I can fix it and everyone
can benefit.

--
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation
Re: [HACKERS] Schema version management
On Sat, 2012-07-14 at 10:41 +0200, Joel Jacobson wrote:
> On Fri, Jul 13, 2012 at 9:41 PM, Peter Eisentraut wrote:
>> Personally, I hate this proposed nested directory structure. I would
>> like to have all objects in one directory.
>>
>> But there is a lot of "personally" in this thread, of course.
>
> Why do you hate it?
>
> It's a bit like saying,
> - I hate database normalization, better to keep all rows in one single
>   table.
> or even,
> - I hate directories.

To a certain extent, yes, I hate (excessive use of) directories.

> I have thousands of objects, it would be a total mess to keep them all
> in a single directory.

Thousands of objects could be a problem, in terms of how the typical
file system tools scale. But hundreds or a few thousand are not
necessarily. It's easy to browse, filter, and sort using common tools,
for example.

> Using a normalized directory structure makes sense for the SCM
> use-case,

If there is a theory of "normalization" for hierarchical databases, I
don't know it, but I would like to learn about it.

> I haven't seen any projects where all the files are kept in one
> directory.

Well, of course everyone uses directories in moderation. But you might
want to take a look at the gcc source code. You'll love it. ;-)
Re: [HACKERS] Schema version management
On Fri, Jul 13, 2012 at 9:41 PM, Peter Eisentraut wrote:
> Personally, I hate this proposed nested directory structure. I would
> like to have all objects in one directory.
>
> But there is a lot of "personally" in this thread, of course.

Why do you hate it?

It's a bit like saying,
- I hate database normalization, better to keep all rows in one single
  table.
or even,
- I hate directories.

I have thousands of objects; it would be a total mess to keep them all
in a single directory. Using a normalized directory structure makes
sense for the SCM use-case. I haven't seen any projects where all the
files are kept in one directory.
Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework
On 2012-07-14 00:36, Tom Lane wrote:
> Alvaro Herrera writes:
>> For what it's worth, I would appreciate it if you would post the lock
>> timeout patch for the upcoming commitfest.
>
> +1. I think it's reasonable to get the infrastructure patch in now, but
> we are running out of time in this commitfest (and there's still a lot
> of patches that haven't been reviewed at all).
>
> regards, tom lane

OK, I will do that. Thanks for the review.

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/