Re: [BUGS] Completely broken replica after PANIC: WAL contains references to invalid pages
On 11 June 2013 04:36, Sergey Konoplev gray...@gmail.com wrote: Hi, On Thu, May 9, 2013 at 7:28 PM, Sergey Konoplev gray...@gmail.com wrote: On Tue, Apr 2, 2013 at 11:26 AM, Andres Freund and...@2ndquadrant.com wrote: The attached patch fixes this, although I don't like the way knowledge of the point up to which StartupSUBTRANS zeroes pages is handled. One month has passed since the patched version was installed in our production environment and I can confirm that everything works perfectly. Thank you very much for your prompt help, Andres. Are there any plans to commit this patch, and which version is it going to be committed to? Thank you. I'll be committing this soon, since we're likely coming up to the next point release soon. Thanks for the reminder. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [HACKERS] [BUGS] COPY .... (FORMAT binary) syntax doesn't work
On 27 May 2013 15:31, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: On 26 May 2013 17:10, Tom Lane t...@sss.pgh.pa.us wrote: More readable would be to invent an intermediate nonterminal falling between ColId and ColLabel, whose expansion would be IDENT | unreserved_keyword | col_name_keyword | type_func_name_keyword, and then replace ColId_or_Sconst with whatever-we-call-that_or_Sconst. Any thoughts about a name for that new nonterminal? Do you think complicating the parser in that way is worth the trouble for this case? Could that slow down parsing? It makes the grammar tables a bit larger (1% or so IIRC). There would be some distributed penalty for that, but probably not much. Of course there's always the slippery-slope argument about that. We don't actually have to fix it; clearly not too many people are bothered, since no complaints in 3 years. Documenting 'binary' seems better. Well, my thought is there are other cases. For instance: regression=# create role binary; ERROR: syntax error at or near "binary" LINE 1: create role binary; ^ regression=# create user cross; ERROR: syntax error at or near "cross" LINE 1: create user cross; ^ If we don't have to treat type_func_name_keywords as reserved in these situations, shouldn't we avoid doing so? Seems a reasonable argument, so +1. Sorry for the delayed reply. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
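For readers outside the grammar internals, the distinction being discussed can be sketched in a few lines. This is an illustrative model only, with tiny made-up keyword samples rather than the real lists from kwlist.h: the proposed nonterminal would accept everything ColId does plus the type_func_name_keywords such as BINARY and CROSS, while fully reserved words stay rejected.

```python
# Illustrative sketch (not PostgreSQL source): how a nonterminal between
# ColId and ColLabel would widen what can appear as a name.
# The keyword sets below are hypothetical samples, not the real lists.

unreserved_keywords = {"abort", "format"}
col_name_keywords = {"between", "coalesce"}
type_func_name_keywords = {"binary", "cross"}   # reserved for ColId today
reserved_keywords = {"select", "table"}         # fully reserved

def accepted_as_colid(word):
    # ColId today: plain identifiers, unreserved and col_name keywords
    return word not in type_func_name_keywords and word not in reserved_keywords

def accepted_as_nonreservedword(word):
    # Proposed nonterminal: additionally admits type_func_name keywords
    return word not in reserved_keywords

print(accepted_as_colid("binary"))            # False -> today's syntax error
print(accepted_as_nonreservedword("binary"))  # True  -> CREATE ROLE binary would parse
print(accepted_as_nonreservedword("select"))  # False -> still reserved
```

The point of the thread is exactly this asymmetry: "binary" and "cross" fail in places where nothing forces them to be reserved.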
Re: [BUGS] BUG #8192: On very large tables the concurrent update with vacuum lag the hot_standby replica
On 30 May 2013 17:43, feder...@brandwatch.com wrote: It seems that on very large tables, a concurrent update with vacuum (or autovacuum), when the slave is in hot standby mode, generates long read loops on a single WAL segment during the recovery process. Not all aspects of hot standby are optimally tuned, as yet. Some room for improvement exists. In some cases, conscious choices were made to keep processing on the master fast at the expense of speed on the standby. There doesn't seem to be anything which is in itself an error here. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] COPY .... (FORMAT binary) syntax doesn't work
On 26 May 2013 16:35, Heikki Linnakangas hlinnakan...@vmware.com wrote: My attempts to fix that look pretty ugly, so I'm not even going to post them. I can stop the error on binary by causing errors on csv and text, obviously not a fix. Any grammar-based fix looks like it would restrict the list of formats, which breaks the original intention of the syntax change. This seems to work: This was almost exactly the fix I described above that only fixes that specific case and then breaks others.

--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -2528,3 +2528,7 @@ copy_generic_opt_elem:
 			{ $$ = makeDefElem($1, $2); }
+			| ColLabel BINARY
+			{
+				$$ = makeDefElem($1, (Node *) makeString("binary"));
+			}

So, no, that doesn't work. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] [BUGS] COPY .... (FORMAT binary) syntax doesn't work
On 26 May 2013 17:10, Tom Lane t...@sss.pgh.pa.us wrote: Heikki Linnakangas hlinnakan...@vmware.com writes: More readable would be to invent an intermediate nonterminal falling between ColId and ColLabel, whose expansion would be IDENT | unreserved_keyword | col_name_keyword | type_func_name_keyword, and then replace ColId_or_Sconst with whatever-we-call-that_or_Sconst. Any thoughts about a name for that new nonterminal? Do you think complicating the parser in that way is worth the trouble for this case? Could that slow down parsing? We don't actually have to fix it; clearly not too many people are bothered, since no complaints in 3 years. Documenting 'binary' seems better. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
[BUGS] COPY .... (FORMAT binary) syntax doesn't work
This works fine... COPY pgbench_accounts TO '/tmp/acc' BINARY; This new format does not COPY pgbench_accounts FROM '/tmp/acc' (FORMAT BINARY); ERROR: syntax error at or near "BINARY" at character 47 which looks like I've mistyped something. Until you realise that this statement gives a completely different error message. COPY pgbench_accounts FROM '/tmp/acc' (FORMAT anyname); ERROR: COPY format "anyname" not recognized and we also note that there are no examples in the docs, nor regression tests to cover this situation. So I conclude that this hasn't ever worked since it was introduced in 9.0. The cause is that there is an overlap between the old and the new COPY syntax, relating to the word BINARY. It's the grammar generating the error, not post-parse analysis. My attempts to fix that look pretty ugly, so I'm not even going to post them. I can stop the error on binary by causing errors on csv and text, obviously not a fix. Any grammar-based fix looks like it would restrict the list of formats, which breaks the original intention of the syntax change. I'm advised that COPY pgbench_accounts FROM '/tmp/acc' (FORMAT 'BINARY'); works fine. Though that is not documented and I doubt anyone much uses that. My conclusion is that we should *not* fix this bug, but just alter the manual slightly to explain what the correct usage is (use quotes around 'binary'). Reason for that suggestion is that nobody ever reported this bug, so either few people use binary mode or they use the old syntax. Of course, that is not a normal suggestion, so feel free to walk up and down my spine with boots on for suggesting it. Thoughts? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] Completely broken replica after PANIC: WAL contains references to invalid pages
On 30 March 2013 17:21, Andres Freund and...@2ndquadrant.com wrote: So if the xid is later than latestObservedXid we extend subtrans one by one. So far so good. But we initialize it in ProcArrayApplyRecoveryInfo() when consistency is initially reached: latestObservedXid = running->nextXid; TransactionIdRetreat(latestObservedXid); Before that, subtrans has initially been started up with: if (wasShutdown) oldestActiveXID = PrescanPreparedTransactions(xids, nxids); else oldestActiveXID = checkPoint.oldestActiveXid; ... StartupSUBTRANS(oldestActiveXID); That means it's only initialized up to checkPoint.oldestActiveXid. As it can take some time till we reach consistency, it seems rather plausible that there will now be a gap in initialized pages, from checkPoint.oldestActiveXid to running->nextXid, if there are pages in between. That was an old bug. StartupSUBTRANS() now explicitly fills that gap. Are you saying it does that incorrectly? How? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
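A toy model of the gap Andres describes may help. Everything here is invented for illustration (the page size, the xid values, and the simplified "initialize one page at each end" behaviour); it only shows why pages between the checkpoint's oldestActiveXid and the running-xacts record's nextXid could be left unzeroed:

```python
# Toy model (not real slru.c/subtrans.c code) of the initialization gap:
# startup zeroes the page containing oldestActiveXid, while later extension
# begins from latestObservedXid (set from running->nextXid), leaving any
# pages between the two untouched.

XIDS_PER_PAGE = 32  # hypothetical page size for the illustration

def page_of(xid):
    return xid // XIDS_PER_PAGE

oldest_active_xid = 100   # from the checkpoint record
running_next_xid = 1000   # from the running-xacts record, replayed much later

initialized = {page_of(oldest_active_xid)}   # StartupSUBTRANS start point
extended = {page_of(running_next_xid)}       # one-by-one extension begins here

needed = set(range(page_of(oldest_active_xid), page_of(running_next_xid) + 1))
gap = needed - initialized - extended
print(sorted(gap))  # non-empty: pages in between were never zeroed
```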
Re: [BUGS] postgres 9.2.2 point conversion from polygon doesn't always give accurate center
On 1 February 2013 17:54, Colin Dunklau colin.dunk...@gmail.com wrote: For the below two queries, I expect to get a result of (0.5, 0.5). cdunklau=# select point( polygon '((0,0),(0,1),(1,1),(0,1))'); point ------------- (0.25,0.75) (1 row) I think you simply mistyped the coordinates... sriggs=# select point( polygon '((0,0),(0,1),(1,1),(1,0))'); point ----------- (0.5,0.5) (1 row) Your last point is a duplicate of the 2nd point, so you have a 4-pointed triangle and hence a strange centre. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
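The surprising output is consistent with point(polygon) effectively averaging the vertices (a sketch of the behaviour, not a claim about the exact geo_ops.c implementation), so the duplicated vertex drags the "centre" toward itself:

```python
# Sketch: the reported (0.25,0.75) is exactly the vertex average of the
# mistyped polygon, and the fixed polygon averages to the expected (0.5,0.5).
def vertex_average(points):
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

broken = [(0, 0), (0, 1), (1, 1), (0, 1)]   # (0,1) appears twice; (1,0) is missing
fixed = [(0, 0), (0, 1), (1, 1), (1, 0)]

print(vertex_average(broken))  # (0.25, 0.75) -- matches the surprising output
print(vertex_average(fixed))   # (0.5, 0.5)   -- the expected centre
```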
Re: [BUGS] BUG #7803: Replication Problem(no master is there)
On 15 January 2013 05:12, Tomonari Katsumata katsumata.tomon...@po.ntts.co.jp wrote: We added a REPLICATION privilege onto user accounts to control access. Perhaps we should add a CASCADE privilege as well, so that we can control whether we can connect to a master and/or a standby. Syntax would be ALTER USER foo [MASTER | CASCADE] REPLICATION REPLICATION allows both master and cascaded replication (same as now) MASTER REPLICATION allows master only CASCADE REPLICATION allows cascaded replication only NOREPLICATION allows neither option Is someone working on it already? If not, may I try to implement it? Please do. It looks fairly short. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #7803: Replication Problem(no master is there)
On 11 January 2013 08:40, Heikki Linnakangas hlinnakan...@vmware.com wrote: This makes me wonder if there should be a GUC to forbid cascading replication, though. If you don't want to do cascading replication (which is quite rare, I'd say), you could just disable it to avoid a situation like this. Connection from a standby is disabled by default. Don't enable it... -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #7803: Replication Problem(no master is there)
On 11 January 2013 12:18, Tomonari Katsumata katsumata.tomon...@po.ntts.co.jp wrote: I'm not sure, but what about adding a parameter (cascade_mode) to recovery.conf? The parameter represents a will to connect to a standby server when starting as a standby. If the parameter is set to on, connect to a server forcibly like PostgreSQL 9.2, and if the parameter is set to off, connecting to another standby server is refused like PostgreSQL 9.1. We added a REPLICATION privilege onto user accounts to control access. Perhaps we should add a CASCADE privilege as well, so that we can control whether we can connect to a master and/or a standby. Syntax would be ALTER USER foo [MASTER | CASCADE] REPLICATION REPLICATION allows both master and cascaded replication (same as now) MASTER REPLICATION allows master only CASCADE REPLICATION allows cascaded replication only NOREPLICATION allows neither option -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #7801: Streaming failover checkpoints much slower than master, pg_xlog space problems during db load
On 8 January 2013 19:24, bri...@openroadtech.com wrote: Simply stated, pg_xlog grows out of control on a streaming-replication backup server with a high volume of writes on the master server. This occurs only with checkpoint_completion_target > 0 and very large (eg. 8GB) shared_buffers. pg_xlog on the master stays a fixed size (1.2G for me). All of this appears to be working as designed. It will issue a restartpoint every checkpoint_timeout seconds on the standby. checkpoint_segments is ignored on the standby. Detail: I have two new/fast servers using streaming replication, with 9.2.2 from the PostgreSQL yum repository. The servers are connected via a 10G network, so the failover receives data as fast as it is created. It appears that to trigger this problem you need large shared_buffers (mine is set to 8GB) and checkpoint_completion_target > 0 (.8 or .9 will do). I have checkpoint_timeout=5min (though setting this to 30s *does not* alleviate the problem). The failover machine doesn't seem to be concerned with checkpoint_timeout or checkpoint_segments. When doing large writes (like pg_restore) the failover machine's pg_xlog does not clear files. The reason appears to be that it is not checkpointing (verified by log_checkpoints=on and examining the log). If I run 'checkpoint;' manually on the failover in psql (hot_standby=on), it immediately cleans up pg_xlog (after a pause to actually perform the checkpoint). To generate the data, you can just run pgbench -i -s1000 on the master server, then watch pg_xlog grow on the failover. This produces ~10GB of data, then creates indices. If I run du on the failover after the data is loaded, I see: # du -s base pg_xlog | sort -n 10928276 pg_xlog 13144536 base I have a 10GB pg_xlog. The load completes in less than 5 minutes, which may be relevant here, since eventually the failover will perform a checkpoint. On the primary server, pg_xlog caps at 1.2GB.
If checkpoint_completion_target=0, or shared_buffers=256MB, pg_xlog grows to around 1GB and stays there. This guess may be off target, but it appears that the database only checkpoints approximately every checkpoint_completion_target * checkpoint_timeout seconds. So, for a .9 completion target and the default 5min timeout, the failover will go for 4+ minutes without executing a single checkpoint, regardless of how much data arrives (verified with log_checkpoints=on). All the while the master server is spitting out 'checkpoints are occurring too frequently' messages, which I assume is to be expected during a reload. This means that the failover machine needs 2x the database size to not crash during a database reload. We run VMs with limited disk allocation where this is not the case. We periodically need to dump/restore, and I'm concerned this simple operation will crash our failover machines. I can avoid the problem by setting checkpoint_completion_target=0, which I will do for now. Here are settings that differ from the default initdb install: shared_buffers = 8GB work_mem = 64MB maintenance_work_mem = 256MB max_stack_depth = 8MB wal_level = hot_standby synchronous_commit = off wal_buffers = 64MB checkpoint_segments = 10 checkpoint_completion_target = 0 max_wal_senders = 2 hot_standby = on effective_cache_size = 8GB log_checkpoints = on -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
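A rough back-of-envelope model of why the standby's pg_xlog balloons here, using invented numbers and a deliberately simplified formula (real restartpoint scheduling is more subtle): since the standby recycles WAL only when a restartpoint completes, and restartpoints are paced by checkpoint_timeout stretched by the completion target rather than by checkpoint_segments, the WAL retained is roughly the write rate times that interval.

```python
# Back-of-envelope sketch (illustrative numbers, not measured; the interval
# formula is a simplification): WAL the standby must retain grows with the
# restartpoint interval, not with checkpoint_segments.
def standby_pg_xlog_estimate_mb(wal_mb_per_s, checkpoint_timeout_s, completion_target):
    # A restartpoint stretched by the completion target delays WAL recycling
    # by roughly this long, so roughly this much WAL accumulates in pg_xlog.
    interval_s = checkpoint_timeout_s * (1 + completion_target)
    return wal_mb_per_s * interval_s

# ~18 MB/s of WAL (a fast bulk load), 5 min timeout, completion target 0.9
print(standby_pg_xlog_estimate_mb(18, 300, 0.9) / 1024)  # ~10 (GB), same order as the report
```

With completion target 0 the same model gives roughly 5 GB retained, which is why dropping the target to 0 tames the growth in the report above.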
Re: [BUGS] BUG #7752: FATAL btree error on PITR
On 12 December 2012 01:55, m.sakre...@gmail.com wrote: The following bug has been logged on the website: Bug reference: 7752 Logged by: Maciek Sakrejda Email address: m.sakre...@gmail.com PostgreSQL version: 9.1.6 Operating system: Ubuntu 10.04 LTS 64-bit Description: Ran into the following error in trying to perform a PITR: FATAL: btree level 66135134 not found in index 436254 This happened when the PITR was almost complete (judging by on-disk database size). I guess this may be related to one of the index corruption issues fixed in 9.1.7, but if that's the case, would it perhaps make sense to complete the PITR without the corrupt index(es) and deal with the index issue separately? Clearly, just a warning is dangerously likely to be skipped, but maybe a mechanism like backup_label, that prevents normal startup before the issue is resolved? Yes, I've thought about allowing recovery to skip corrupt indexes, but that feature hasn't been implemented yet. We'd need to track such things as recovery proceeds and then mark them as invalid when recovery completes. Patches welcome. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] PITR potentially broken in 9.2
On 5 December 2012 00:35, Tom Lane t...@sss.pgh.pa.us wrote: I wrote: So apparently this is something we broke since Nov 18. Don't know what yet --- any thoughts? Further experimentation shows that reverting commit ffc3172e4e3caee0327a7e4126b5e7a3c8a1c8cf makes it work. So there's something wrong/incomplete about that fix. This is a bit urgent since we now have to consider whether to withdraw 9.2.2 and issue a hasty 9.2.3. Do we have a regression here since 9.2.1, and if so how bad is it? I'll look at this now. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] PITR potentially broken in 9.2
On 5 December 2012 02:27, Tom Lane t...@sss.pgh.pa.us wrote: Andres Freund and...@2ndquadrant.com writes: But the key is, the database was not actually consistent at that point, and so opening hot standby was a dangerous thing to do. The bug that allowed the database to open early (the original topic of this email chain) was masking this secondary issue. Could you check whether the attached patch fixes the behaviour? Yeah, I had just come to the same conclusion: the bug is not with Heikki's fix, but with the pause logic. The comment says that it shouldn't pause unless users can connect to un-pause, but the actual implementation of that test is several bricks shy of a load. OK, I've had a look at this now. Andres correctly identified the bug above, which is that the backup end is noted by the XLOG_BACKUP_END record. recoveryStopsHere() was not changed when that record type was added, so it ignores the situation where we are waiting for the end of backup and stops anyway. So I concur with Jeff that there is a bug, and think that Andres has provided the clue to a fix. I'll patch that now. Aboriginal bug extends back to 9.0. Pause has got nothing at all to do with this, even if there are other problems there. Your patch is better, but it's still missing a bet: what we need to be sure of is not merely that we *could* have told the postmaster to start hot standby, but that we *actually have done so*. Otherwise, it's flow-of-control-dependent whether it works or not; we could fail if the main loop hasn't gotten to CheckRecoveryConsistency since the conditions became true. Looking at the code in CheckRecoveryConsistency, testing LocalHotStandbyActive seems appropriate. I also thought it was pretty dangerous that this absolutely critical test was not placed in recoveryPausesHere() itself, rather than relying on the call sites to remember to do it. So the upshot is that I propose a patch more like the attached. I've reworked pause logic along the lines you suggest.
Attached here for further discussion. This is not a regression because the pause logic has been broken this same way since 9.1. So I no longer think that we need a rewrap. I think we do need a rewrap, since the bug is not in the pause logic. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services recovery_pause_refactor.v1.patch
Re: [BUGS] PITR potentially broken in 9.2
On 5 December 2012 13:34, Simon Riggs si...@2ndquadrant.com wrote: Aboriginal bug extends back to 9.0. I don't see any bug in 9.0 and 9.1, just 9.2+ -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] PITR potentially broken in 9.2
On 5 December 2012 14:33, Andres Freund and...@2ndquadrant.com wrote: Independent of this patch, I am slightly confused about the whole stop logic. Isn't the idea that you can stop/start/stop/start/... recovery? Because if !recoveryApply we break out of the whole recovery loop and are done with things. You can, but by shutting down the server, updating the recovery target, then restarting the server. That's not a beautiful design; it's just waiting for someone to make recovery targets changeable, which is best done when recovery.conf parameters are in postgresql.conf -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] PITR potentially broken in 9.2
On 5 December 2012 16:40, Tom Lane t...@sss.pgh.pa.us wrote: The real question here probably needs to be "what is the point of recoveryPauseAtTarget in the first place?". I find it hard to envision what's the point of pausing unless the user has an opportunity to make a decision about whether to continue applying WAL. As Simon mentioned, we seem to be lacking some infrastructure that would let the user adjust the recovery_target parameters before resuming WAL processing. But, assuming for the moment that our workaround for that is "shut down the server, adjust recovery.conf, and restart", is the pause placed in a useful spot for that? BTW, could we make this more convenient by letting recoveryPausesHere() reread recovery.conf? Also, shouldn't the code re-evaluate recoveryStopsHere() after that? The recovery target and the consistency point are in some ways in conflict. If the recovery target is before the consistency point there is no point in stopping there, whether or not we pause. What we should do is say "recovery target reached, yet recovery not yet consistent, continuing". So ISTM that we should make recoveryStopsHere() return false while we are inconsistent. Problems solved. We can re-read parameters after we wake from a pause. Presumably only in HEAD, or do you mean in 9.2/9.1 also? My earlier patch to rearrange pause logic did a few things, nothing very dramatic * Put the logic for whether we're consistent inside recoveryPausesHere() as requested * Rearrange logic so setting EndPtr is the very last thing we do before redo for the current record * Fix the bug that we don't pause in the correct place if recoveryApply = true The comment on that new pause location was slightly wrong -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
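Simon's proposal can be sketched in a few lines. The function and parameter names below are hypothetical, not the real xlog.c code: while recovery is not yet consistent, a recovery target cannot safely be honoured, so the stop check reports "keep going" and logs instead of stopping.

```python
# Sketch of the proposed behaviour (hypothetical names, not real xlog.c):
# a reached target is only honoured once recovery is consistent.
def recovery_stops_here(target_reached, reached_consistency, log=print):
    if target_reached and not reached_consistency:
        log("recovery target reached, yet recovery not yet consistent, continuing")
        return False
    return target_reached

print(recovery_stops_here(True, False))  # logs the message, then prints False
print(recovery_stops_here(True, True))   # True: safe to stop at the target
```

Note that later in the thread this design is questioned: Tom and Andres prefer erroring out over silently continuing past the user's target.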
Re: [BUGS] PITR potentially broken in 9.2
On 5 December 2012 17:17, Simon Riggs si...@2ndquadrant.com wrote: The recovery target and the consistency point are in some ways in conflict. If the recovery target is before the consistency point there is no point in stopping there, whether or not we pause. What we should do is say "recovery target reached, yet recovery not yet consistent, continuing". So ISTM that we should make recoveryStopsHere() return false while we are inconsistent. Problems solved. Patch attached. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services recovery_dont_stop_when_inconsistent.v1.patch
Re: [BUGS] PITR potentially broken in 9.2
On 5 December 2012 18:48, Tom Lane t...@sss.pgh.pa.us wrote: I wrote: Andres Freund and...@2ndquadrant.com writes: On 2012-12-05 17:24:42 +, Simon Riggs wrote: So ISTM that we should make recoveryStopsHere() return false while we are inconsistent. Problems solved. I prefer the previous (fixed) behaviour where we error out if we reach a recovery target before we are consistent: I agree. Silently ignoring the user's specification is not good. (I'm not totally sure about ignoring the pause spec, either, but there is no good reason to change the established behavior for the recovery target spec.) On further thought, it seems like recovery_pause_at_target is rather misdesigned anyway, and taking recovery target parameters from recovery.conf is an obsolete API that was designed in a world before hot standby. What I suggest people really want, if they're trying to interactively determine how far to roll forward, is this: (1) A recovery.conf parameter that specifies pause when hot standby opens up (that is, as soon as we have consistency). Agreed. (2) A SQL command/function that releases the pause mode *and* specifies a new target stop point (ie, an interactive way of setting the recovery target parameters). The startup process then rolls forward to that target and pauses again. That was my original patch, IIRC, with different functions for each target type, plus an advance N records function for use in debugging/automated testing. Having a single function would be possible, taking a single TEXT parameter that we parse just like the recovery.conf, taking semi-colons as newlines. (3) A SQL command/function that releases the pause mode and specifies coming up normally, ie not following the archived WAL any further (I imagine this would force a timeline switch). The existing pause now function could still fit into this framework; but it seems to me to have mighty limited usefulness, considering the speed of WAL replay versus human reaction time. 
I think that was in there also. Can't remember why we didn't go for the full API last time. I'll have another go, in HEAD. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] PITR potentially broken in 9.2
On 5 December 2012 21:15, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: On 5 December 2012 18:48, Tom Lane t...@sss.pgh.pa.us wrote: On further thought, it seems like recovery_pause_at_target is rather misdesigned anyway, and taking recovery target parameters from recovery.conf is an obsolete API that was designed in a world before hot standby. What I suggest people really want, if they're trying to interactively determine how far to roll forward, is this: ... Can't remember why we didn't go for the full API last time. I'll have another go, in HEAD. That's fine, but the immediate question is what are we doing to fix the back branches. I think everyone is clear that we should be testing LocalHotStandbyActive rather than precursor conditions to see if a pause is allowed, but are we going to do anything more than that? No The only other thing I really wanted to do is not have the in-loop pause occur after we've taken actions that are effectively part of the WAL apply step. I think ideally it should happen just before or after the recoveryStopsHere stanza. Last night I worried about adding an extra spinlock acquire to make that work, but today I wonder if we couldn't get away with just a naked if (xlogctl->recoveryPause) recoveryPausesHere(); The argument for this is that although we might fetch a slightly stale value of the shared variable, it can't be very stale --- certainly no older than the spinlock acquisition near the bottom of the previous iteration of the loop. And this is a highly asynchronous feature anyhow, so fuzziness of plus or minus one WAL record in when the pause gets honored is not going to be detectable. Hence an extra spinlock acquisition is not worth the cost. Yep, that's fine. Are you doing this or do you want me to? Don't mind either way.
-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] PITR potentially broken in 9.2
On 5 December 2012 22:23, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Wed, Dec 5, 2012 at 4:15 PM, Tom Lane t...@sss.pgh.pa.us wrote: The argument for this is that although we might fetch a slightly stale value of the shared variable, it can't be very stale --- certainly no older than the spinlock acquisition near the bottom of the previous iteration of the loop. And this is a highly asynchronous feature anyhow, so fuzziness of plus or minus one WAL record in when the pause gets honored is not going to be detectable. Hence an extra spinlock acquisition is not worth the cost. I wonder whether the cost of an extra spinlock acquire/release cycle is really noticeable here. That can get expensive in a hurry when lots of processes are contending for the spinlock ... but I think that shouldn't be the case here; most of the accesses will be coming from the startup process. Of course atomic operations are much more expensive than ordinary CPU instructions even under the best of circumstances, but is that really material here? I'm just wondering whether this is premature optimization that's going to potentially bite us later in the form of subtle, hard-to-reproduce bugs. I have the same doubt about whether it's really significant. However, I think it's much more likely that we'd find out it is significant than that we'd find out anybody could detect the lack of a memory barrier there. Agreed. And any future logic to stop at a specific point will be exactly precise because the decision and action will be taken in the same process. Patch looks good. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #7710: Xid epoch is not updated properly during checkpoint
On 1 December 2012 23:10, Andres Freund and...@anarazel.de wrote: So barring objections, I'm going to remove LogStandbySnapshot's behavior of returning the updated nextXid. I don't see any reason why it would be bad to remove this. I think the current behaviour could actually even delay getting to an active state slightly in the presence of prepared transactions, because it's used to initialize the KnownAssignedXid machinery in xlog_redo. If the prepared xacts are suboverflowed it's a *good* thing to have an old ->nextXid. That particular coding was pretty weird. Please change. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #7710: Xid epoch is not updated properly during checkpoint
On 1 December 2012 22:56, Tom Lane t...@sss.pgh.pa.us wrote: tar...@gmail.com writes: [ txid_current can show a bogus value near XID wraparound ] This happens only if wal_level=hot_standby. I believe what is happening here is (1) CreateCheckPoint sets up checkPoint.nextXid and checkPoint.nextXidEpoch, near xlog.c line 7070 in HEAD. At this point, nextXid is still a bit less than the wrap point. (2) After performing the checkpoint, at line 7113, CreateCheckPoint calls LogStandbySnapshot() which helpfully updates checkPoint.nextXid to the latest value. Which by now has wrapped around. But it doesn't fix checkPoint.nextXidEpoch, so the checkpoint that gets written out has effectively lost the epoch bump that should have happened. While we could add some more logic to try to correct the epoch value in this scenario, I think it's a much better idea to just stop having LogStandbySnapshot update the nextXid. That seems to me to be useless complication. I also quite dislike the fact that we're effectively redefining the checkpoint nextXid from being taken before the main body of the checkpoint to being taken afterwards, but *only* in XLogStandbyInfoActive mode. If that inconsistency isn't already causing bugs (besides this one) today, it'll probably cause them in the future. I agree that the coding looks weird and agree it shouldn't be there. The meaning of the checkpoint values should not differ because wal_level has changed. So barring objections, I'm going to remove LogStandbySnapshot's behavior of returning the updated nextXid. Removing it may cause other bugs, but if so, those other bugs need to be solved in the right way, not by having a too-far-forwards nextxid on the checkpoint record. Having said that, I can't see any bugs that would be caused by this. 
-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
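To see why overwriting checkPoint.nextXid without touching nextXidEpoch loses the wraparound, consider a minimal model of the 32-bit XID counter plus its epoch. This Python sketch is illustrative only; the function names are invented, not PostgreSQL's:

```python
XID_MODULUS = 2 ** 32  # XIDs are 32-bit counters that wrap to low values


def bump_epoch_if_wrapped(epoch, old_next_xid, new_next_xid):
    """Advance the epoch when nextXid is replaced by a value that has
    wrapped past 2^32 (visible as an apparent decrease of the counter)."""
    if new_next_xid < old_next_xid:
        epoch += 1
    return epoch


def full_xid(epoch, xid):
    """Combine epoch and 32-bit XID into one monotonic 64-bit value."""
    return epoch * XID_MODULUS + xid
```

In the bug above, LogStandbySnapshot effectively stores the post-wraparound nextXid together with the pre-wraparound epoch, so the combined value goes backwards instead of forwards.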
Re: [BUGS] BUG #7710: Xid epoch is not updated properly during checkpoint
On 2 December 2012 12:51, Simon Riggs si...@2ndquadrant.com wrote: On 1 December 2012 22:56, Tom Lane t...@sss.pgh.pa.us wrote: tar...@gmail.com writes: [ txid_current can show a bogus value near XID wraparound ] This happens only if wal_level=hot_standby. I believe what is happening here is (1) CreateCheckPoint sets up checkPoint.nextXid and checkPoint.nextXidEpoch, near xlog.c line 7070 in HEAD. At this point, nextXid is still a bit less than the wrap point. (2) After performing the checkpoint, at line 7113, CreateCheckPoint calls LogStandbySnapshot() which helpfully updates checkPoint.nextXid to the latest value. Which by now has wrapped around. But it doesn't fix checkPoint.nextXidEpoch, so the checkpoint that gets written out has effectively lost the epoch bump that should have happened. While we could add some more logic to try to correct the epoch value in this scenario, I think it's a much better idea to just stop having LogStandbySnapshot update the nextXid. That seems to me to be useless complication. I also quite dislike the fact that we're effectively redefining the checkpoint nextXid from being taken before the main body of the checkpoint to being taken afterwards, but *only* in XLogStandbyInfoActive mode. If that inconsistency isn't already causing bugs (besides this one) today, it'll probably cause them in the future. I agree that the coding looks weird and agree it shouldn't be there. The meaning of the checkpoint values should not differ because wal_level has changed. So barring objections, I'm going to remove LogStandbySnapshot's behavior of returning the updated nextXid. Removing it may cause other bugs, but if so, those other bugs need to be solved in the right way, not by having a too-far-forwards nextxid on the checkpoint record. Having said that, I can't see any bugs that would be caused by this. Andres' patch is invasive and is not to my liking in back branches.
Tom's recommended change goes further still and is not one I'm comfortable risking, even though I agree with the root cause/change. I've applied an absolutely minimal fix on this, which introduces no other changes that could cause unforeseen consequences. Others may wish to go further, overriding my patches, as they choose. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #7710: Xid epoch is not updated properly during checkpoint
On 2 December 2012 15:25, Tom Lane t...@sss.pgh.pa.us wrote: I can point right now to one misbehavior this causes: if you run a point-in-time recovery with a stop point somewhere in the middle of the checkpoint, you should end up with a nextXid corresponding to the stop point. This hack in LogStandbySnapshot causes you to end up with a much later nextXid, if you were running hot-standby. True, though that does not cause any problem. Others may wish to go further, overriding my patches, as they choose. Okay, I will take the responsibility for changing this, but it needs to change. OK. At least we have the minimal coding to fall back on if need be. This coding was ill-considered from the word go. Agreed, but then I don't have a clear reason why it is that way and yet I'm sure I did it for some reason. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #7710: Xid epoch is not updated properly during checkpoint
On 2 December 2012 16:44, Tom Lane t...@sss.pgh.pa.us wrote: Simon Riggs si...@2ndquadrant.com writes: On 2 December 2012 15:25, Tom Lane t...@sss.pgh.pa.us wrote: This coding was ill-considered from the word go. Agreed, but then I don't have a clear reason why it is that way and yet I'm sure I did it for some reason. I think you just did it because it looked easy, and you didn't think very hard about the consequences. I don't typically think such thoughts. Your ability to see more deeply into certain topics is a credit to you, not an implication of laziness on my part. I'm sure I believed it a necessary or useful thing to do. As far as the concern about new bugs is concerned: if we start replay from this checkpoint, we will certainly not consider that the DB is consistent until we reach the checkpoint's physical position. And by that point we'll have replayed the XLOG_RUNNING_XACTS record emitted by LogStandbySnapshot, so our idea of the nextXid should be fully up to date anyway. The same goes for checkpoints encountered later in the replay run --- they'd just be duplicative of the preceding XLOG_RUNNING_XACTS record. There is no reason to put the same XID into the checkpoint record. Exactly, the end result is identical, but the intermediate states may differ. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #7572: virtualxid lock held by bgwriter on promoted slaves
On 27 September 2012 22:29, Simon Riggs si...@2ndquadrant.com wrote: On 26 September 2012 22:33, daniele.varra...@gmail.com wrote: The following bug has been logged on the website: Bug reference: 7572 Logged by: Daniele Varrazzo Email address: daniele.varra...@gmail.com PostgreSQL version: 9.1.4 Operating system: Linux Description: Hello, when a slave is promoted, the pgwriter keeps holding a lock with virtualxid 1/1 and virtualtransaction -1/0. Such a lock stops pg_reorg from running (as reported here: http://pgfoundry.org/tracker/index.php?func=detail&aid=1011203&group_id=1000411&atid=1376), and I've verified the same condition on 9.1.4. Is it possible to free that lock on slave promotion? Is it safe to ignore that lock for pg_reorg's sake? The program is probably waiting for all the transactions to finish before swapping the table in the system catalog, but I'm not sure about that yet. This one is mine. I don't think it's important that we hold that lock, but will check. This was an interesting bug to track down and certainly had me stumped for some time. Nothing at all to do with the WALwriter. It turns out that the Startup process never ended its VirtualTransaction, which caused 2 separate bugs in 9.0/9.1 and 9.2/HEAD. The bugs in 9.2+ were only visible because of lack of initialisation of the fp fields, also fixed. Thanks for the bug report. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #7627: Bad query plan when using LIMIT
On 30 October 2012 05:27, Tom Lane t...@sss.pgh.pa.us wrote: edw...@clericare.com writes: The following two queries differ only by adding LIMIT 1, but the one with the limit gets radically worse performance. I've done VACUUM FULL, VACUUM ANALYZE, and REINDEX DATABASE and there are no modifications since. EXPLAIN ANALYZE SELECT * FROM commits WHERE id IN (SELECT id FROM commits ORDER BY tree_high LIMIT 605 ) AND tree_other IS NULL ORDER BY tree_high DESC; EXPLAIN ANALYZE SELECT * FROM commits WHERE id IN (SELECT id FROM commits ORDER BY tree_high LIMIT 605 ) AND tree_other IS NULL ORDER BY tree_high DESC LIMIT 1; I think what's happening there is that the planner supposes that an indexscan in tree_high DESC order will find rows matching the IN condition uniformly distributed in the scan order --- but, because of the construction of the IN clause, they're actually going to be pessimally located at the very end of that scan order. So it ends up forming all of the nestloop result, when it had expected to have to compute only 1/595 of it. We've discussed dialing down the planner's optimism about limit plans to not assume perfect independence of filter conditions, but I don't think anyone would advocate for having it assume the worst possible case, which is what you've got here unfortunately. Very bad plans with LIMIT are frequent. This is bad for us because adding LIMIT usually makes, and is supposed to make, queries faster, not slower. We need to do something. LIMIT optimistically assumes the very best case, and that best case gets better with the number of rows filtered away. If we are scanning N rows with LIMIT M, we must assume that the action relates in some way to N, rather than assuming the LIMIT is more and more effective as N/M increases.
-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
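The planner's optimism described above can be made concrete. Under the uniform-distribution assumption, a scan that must return M rows out of K qualifying rows in a table of N expects to read about N*M/K rows; if the qualifying rows are in fact clustered at the end of the scan order, it must read nearly all N. A hypothetical Python sketch of that estimate (not the actual costing code):

```python
def expected_rows_scanned(total_rows, qualifying_rows, limit):
    """Planner-style estimate: if qualifying rows are spread uniformly
    through the scan order, we expect to read roughly
    total_rows * limit / qualifying_rows rows before the LIMIT is
    satisfied, capped at the whole table."""
    if qualifying_rows <= 0:
        return total_rows
    return min(total_rows, total_rows * limit / qualifying_rows)
```

With the 1/595 figure from Tom's analysis, the planner expects to read a tiny fraction of the input, while the pathological placement in this report forces it to form almost all of the nestloop result.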
Re: [BUGS] Pg_stat_replication shows sync standby with flush location behind primary in 9.1.5
On 4 October 2012 05:32, Mark Kirkwood mark.kirkw...@catalyst.net.nz wrote: I am seeing the situation where the reported flush location for the sync standby (standby1 below) is *behind* the reported current xlog location of the primary. This is Postgres 9.1.5, and I was under the impression that transactions initiated on the master do not commit until the corresponding wal is flushed on the sync standby. Now the standby is definitely working in sync mode, because stopping it halts all write transactions on the primary (sync_standby_names contains only standby1). So is the reported lag in flush location merely an artifact of timing in the query, or is there something else going on? [1] The writing of new WAL is independent of the wait that occurs on commit, so it is entirely possible, even desirable, that the observed effect occurs. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #7572: virtualxid lock held by bgwriter on promoted slaves
On 26 September 2012 22:33, daniele.varra...@gmail.com wrote: The following bug has been logged on the website: Bug reference: 7572 Logged by: Daniele Varrazzo Email address: daniele.varra...@gmail.com PostgreSQL version: 9.1.4 Operating system: Linux Description: Hello, when a slave is promoted, the pgwriter keeps holding a lock with virtualxid 1/1 and virtualtransaction -1/0. Such a lock stops pg_reorg from running (as reported here: http://pgfoundry.org/tracker/index.php?func=detail&aid=1011203&group_id=1000411&atid=1376), and I've verified the same condition on 9.1.4. Is it possible to free that lock on slave promotion? Is it safe to ignore that lock for pg_reorg's sake? The program is probably waiting for all the transactions to finish before swapping the table in the system catalog, but I'm not sure about that yet. This one is mine. I don't think it's important that we hold that lock, but will check. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #6239: Looking for a technical contact point for PostgreSQL compatibility issue on Windows8
On 4 October 2011 02:42, Seiko Ishida v-sei...@microsoft.com wrote: The following bug has been logged online: Bug reference: 6239 Logged by: Seiko Ishida Email address: v-sei...@microsoft.com PostgreSQL version: 8.2.4 Operating system: Windows 8 Description: Looking for a technical contact point for PostgreSQL compatibility issue on Windows8 Details: Hello, I am a Program Manager with the Ecosystem Engineering team at Microsoft. I am looking for a technical contact point to notify of compatibility issues with PostgreSQL. Could you please connect me to the appropriate individual for this? Regards, Seiko Ishida Microsoft ISV Readiness, EcoSystem Engineering Team Program Manager Ref : 341057 As a general check, I'd like to raise the topic of Windows 8 and Windows Server 2012 support. PostgreSQL 9.2 will be released in the next few weeks/months, since we are now at release candidate stage. Can I confirm that both Microsoft and the PostgreSQL community think Windows 8/Server 2012 is fully supported? I can't see build farm members running either of those releases, and there have been no further comments on this issue here. Is somebody doing private testing on behalf of the community? Should we be mentioning Windows 8 and Server 2012 support in our release notes? It seems like it could be a feature, or at very least it will be a popular question. OTOH, we don't mention "tuned for the Linux 3 kernel", but then perhaps we should be doing that as well... -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #6239: Looking for a technical contact point for PostgreSQL compatibility issue on Windows8
On 1 September 2012 08:10, Gavin Flower gavinflo...@archidevsys.co.nz wrote: Possibly Microsoft could donate boxen to enable PostgreSQL to be tested on their O/S's? Microsoft doesn't need to donate boxes; they can set up their own build farm members and contribute directly. Postgres runs a distributed build farm that anybody can join, no fees, no restrictions. http://www.pgbuildfarm.org/cgi-bin/register-form.pl Community assistance would be available to help. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] primary and hot standby database don' work now
On 22 July 2012 07:00, leo xu leoxu8...@gmail.com wrote: hello everyone: my pg version is 9.1.2 running on Red Hat 5.4 x64 servers. One day I found that no database indexes worked; after REINDEX DATABASE they worked again, but some tables with primary keys contained the same record twice. primary database error: 2012-07-19 21:18:36.133 CST,,,476,,4fde7dd3.1dc,22,,2012-06-18 09:01:07 CST,1/0,0,WARNING,01000,specified item offset is too large,xlog redo insert: rel 17673/16386/16780; tid 409/151 There's something badly wrong with your WAL records. I'd suggest you move to a later version of 9.1 and retry. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #6736: Hot-standby replica crashed after failover:
On 13 July 2012 14:38, maxim.bo...@gmail.com wrote: 2012-07-13 17:06:19.732 MSK 79724 @ from [vxid:1/0 txid:0] []PANIC: WAL contains references to invalid pages The problem is not connected with Hot Standby. The issue relates to index code that contains problems, leaving invalid pages. These are not reported until the end of recovery. That has now been changed so they are reported earlier, making it clearer that this situation exists. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #6661: out-of-order XID insertion in KnownAssignedXids
On 7 June 2012 17:49, Andres Freund and...@2ndquadrant.com wrote: A patch implementing that is attached. Unfortunately not really tested yet, because it's kinda hard to hit that code-path. Valentine, can you test that patch? Patch looks good, so as soon as we confirm it we can apply it. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #6661: out-of-order XID insertion in KnownAssignedXids
On 8 June 2012 12:40, Andres Freund and...@2ndquadrant.com wrote: Hi All, On Friday, June 08, 2012 12:42:00 PM Andres Freund wrote: On Friday, June 08, 2012 12:25:25 PM Valentine Gogichashvili wrote: unfortunately I did not manage to keep the snapshot of the database that had this issue. Ok, after some playing I could reproduce the issue: master: S1: CREATE FUNCTION insertone() RETURNS void LANGUAGE plpgsql AS $$BEGIN INSERT INTO data(data) VALUES (random()); EXCEPTION WHEN division_by_zero THEN RAISE NOTICE 'huh'; END;$$; S1: BEGIN; S1: SELECT insertone() FROM generate_series(1, 1000); --subxid overflow S2: BEGIN; S2: SELECT insertone(); --allocate xid standby: pg_basebackup LOG:  0: consistent recovery state reached at 0/5C0001C0 LOG:  0: recovery snapshot waiting for non-overflowed snapshot or until oldest active xid on standby is at least 4752 (now 3750) master: S1: commit; S3: checkpoint; standby: before patch: crash in assert/elog after:  LOG:  0: recovery snapshots are now enabled So I can confirm the above patch fixes at least that bug. Andres Committed, with minor changes in wording of comments and error messages. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #6629: Creating a gist index fails with too many LWLocks taken
On 11 May 2012 11:07, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: I wonder if we should reserve a few of the lwlock slots for critical sections, to make this less likely to happen. Not only in this case, but in general. We haven't seen this problem often, but it would be quite trivial to reserve a few slots. Why reserve them solely for critical sections? What is the downside of having 100 slots for general use? ISTM we should have 250 slots and log a warning if we ever hit 50 or more. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #6629: Creating a gist index fails with too many LWLocks taken
On 11 May 2012 15:14, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 11.05.2012 16:56, Simon Riggs wrote: On 11 May 2012 11:07, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: I wonder if we should reserve a few of the lwlock slots for critical sections, to make this less likely to happen. Not only in this case, but in general. We haven't seen this problem often, but it would be quite trivial to reserve a few slots. Why reserve them solely for critical sections? Because if you run out of lwlocks in a critical section, you get a PANIC. Yes, but why reserve them solely for critical sections? If you have an escape hatch, you use it, always. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #6629: Creating a gist index fails with too many LWLocks taken
On 11 May 2012 17:48, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 11.05.2012 18:18, Simon Riggs wrote: On 11 May 2012 15:14, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 11.05.2012 16:56, Simon Riggs wrote: On 11 May 2012 11:07, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: I wonder if we should reserve a few of the lwlock slots for critical sections, to make this less likely to happen. Not only in this case, but in general. We haven't seen this problem often, but it would be quite trivial to reserve a few slots. Why reserve them solely for critical sections? Because if you run out of lwlocks in a critical section, you get a PANIC. Yes, but why reserve them solely for critical sections? If you have an escape hatch you use it, always Well, no, because a PANIC is much worse than an ERROR. Think of this as a spare oxygen tank while diving, rather than an escape hatch. A spare tank can save your life if you run out of oxygen while ascending, but if you run out of oxygen on the way down, you don't continue going down with just the spare tank. Imagine that you have a process that does something like this: for (i = 0; i < 99; i++) LWLockAcquire(foolock[i]); START_CRIT_SECTION(); XLogInsert(...); END_CRIT_SECTION(); What happens at the moment is that XLogInsert hits the limit when it tries to acquire WALInsertLock, so you get a PANIC. If we reserved, say, 5 locks for critical sections, so that you could hold 95 locks while outside a critical section, and 5 more within one, you would get an error earlier, outside the critical section, while acquiring the foolocks. Or if the number of foolocks acquired was less than 95, you would not get an error at all. That avoids the PANIC. You can argue for just raising the limit, but that just moves the problem around. It's still possible to hit the limit within a critical section, and PANIC.
Likewise, if we lower the limit, that helps us find the problems earlier in the development cycle, but doesn't change the fact that if you run too close to the edge, you run out of locks within a critical section and PANIC. Of course, nothing stops you from writing (bad) code that acquires a lot of locks within a critical section, in which case you're screwed anyway. OK, I agree. User code can create and hold lwlocks as it chooses, so we need to protect the server as a whole from bad user code. The implementation I would prefer would be to put the check in START_CRIT_SECTION(), so we actually fail before we enter the section. That way we don't need extra locks, which is good since any additional amount we pick can still be exceeded by user code. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
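The check being proposed (fail on entering the critical section while slots remain, so you get an ERROR there instead of a PANIC inside it) might look roughly like this. A Python toy model with invented names and limits; the real constants and error handling live in PostgreSQL's C code:

```python
MAX_SIMUL_LWLOCKS = 100    # illustrative cap, not the real constant
CRIT_SECTION_HEADROOM = 5  # lock slots kept in reserve for critical sections


class LockTracker:
    """Toy model of per-process held-lock accounting."""

    def __init__(self):
        self.held = 0

    def acquire(self):
        # The existing check: refuse outright once the hard cap is hit.
        if self.held >= MAX_SIMUL_LWLOCKS:
            raise RuntimeError("too many LWLocks taken")
        self.held += 1

    def start_crit_section(self):
        # The proposed check: fail *before* entering the critical
        # section if too few slots remain, turning a would-be PANIC
        # inside the section into an ERROR outside it.
        if self.held > MAX_SIMUL_LWLOCKS - CRIT_SECTION_HEADROOM:
            raise RuntimeError("too many LWLocks held to enter a critical section")
```

With 90 locks held entry succeeds; with 96 held, entry is refused while it is still safe to error out.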
Re: [BUGS] in hot standby database execute select * from pg_class indicate error
On 10 May 2012 06:41, leo xu leoxu8...@gmail.com wrote: hello: when I execute any operation in the hot standby database, it reports that index pg_class_relname_nsp_index contains an unexpected zero page at block 0 and tells me to reindex it. But the hot standby server only allows reads, not writes. When I connect to the primary server, I can execute any operation. The standby connects to the primary using streaming replication, synchronously. My database server is 9.1.2; the machine is Linux 5.4 64-bit. I know that I can recreate the standby database and it will work, but is there any other way? Run REINDEX on the primary server. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #6619: Misleading output from slave when host is not running
On Fri, Apr 27, 2012 at 8:47 AM, petteri.r...@aalto.fi wrote: LOG:  entering standby mode WARNING:  WAL was generated with wal_level=minimal, data may be missing HINT:  This happens if you temporarily set wal_level=minimal without taking a new base backup. FATAL:  hot standby is not possible because wal_level was not set to hot_standby on the master server HINT:  Either set wal_level to hot_standby on the master, or turn off hot_standby here. LOG:  startup process (PID 28761) exited with exit code 1 LOG:  aborting startup due to startup process failure The error message above on the FATAL line is wrong (or at least misleading). The real problem should be that it can't connect to the master. The wal_level on the master is hot_standby (captured after I started it): The HINT that we should simply set something on the master is a little misleading with respect to timing. However, if the master and the standby aren't even connected and you know that, how did you expect there to be a causal link between the setting on the master and the state of the standby? What do you suggest the messages say? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #6425: Bus error in slot_deform_tuple
On Fri, Feb 3, 2012 at 6:45 AM, Tom Lane t...@sss.pgh.pa.us wrote: I wrote: I have not gotten very far with the coredump, except to observe that gdb says the Assert ought to have passed: ... This suggests very strongly that indeed the buffer was changing under us. I probably ought to let the test case run overnight before concluding anything, but at this point it's run for two-plus hours with no errors after applying this patch:

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index cce87a3..b128bfd 100644
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
*** RestoreBkpBlocks(XLogRecPtr lsn, XLogRec
*** 3716,3724 ****
          }
          else
          {
-             /* must zero-fill the hole */
-             MemSet((char *) page, 0, BLCKSZ);
              memcpy((char *) page, blk, bkpb.hole_offset);
              memcpy((char *) page + (bkpb.hole_offset + bkpb.hole_length),
                     blk + bkpb.hole_offset,
                     BLCKSZ - (bkpb.hole_offset + bkpb.hole_length));
--- 3716,3724 ----
          }
          else
          {
              memcpy((char *) page, blk, bkpb.hole_offset);
+             /* must zero-fill the hole */
+             MemSet((char *) page + bkpb.hole_offset, 0, bkpb.hole_length);
              memcpy((char *) page + (bkpb.hole_offset + bkpb.hole_length),
                     blk + bkpb.hole_offset,
                     BLCKSZ - (bkpb.hole_offset + bkpb.hole_length));

The existing code makes the page state transiently invalid (all zeroes) for no particularly good reason, and consumes useless cycles to do so, so this would be a good change in any case.  The reason it is relevant to our current problem is that even though RestoreBkpBlocks faithfully takes exclusive lock on the buffer, *that is not enough to guarantee that no one else is touching that buffer*.  Another backend that has already located a visible tuple on a page is entitled to keep accessing that tuple with only a buffer pin.
So the existing code transiently wipes the data from underneath the other backend's pin. It's clear how this explains the symptoms. Yes, that looks like the murder weapon. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
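The essence of the fix is the order of operations when reconstructing a page from a backup block whose all-zero "hole" was compressed away: copy the bytes before the hole, zero only the hole itself, then copy the bytes after it, so the page never passes through an all-zeroes state under a concurrent reader's pin. A byte-level Python model of that reconstruction (illustrative, not the C code):

```python
def restore_backup_block(page, blk, hole_offset, hole_length):
    """Rebuild `page` in place from backup-block image `blk`, which
    stores the page with `hole_length` bytes elided at `hole_offset`.
    Follows the fixed ordering: prefix, then zero the hole, then suffix,
    so the whole page is never wiped first."""
    blcksz = len(page)
    # bytes before the hole
    page[:hole_offset] = blk[:hole_offset]
    # zero-fill only the hole
    page[hole_offset:hole_offset + hole_length] = bytes(hole_length)
    # bytes after the hole
    page[hole_offset + hole_length:] = blk[hole_offset:blcksz - hole_length]
```

For example, an 8-byte toy page restored from a 6-byte image with a 2-byte hole at offset 3 ends up with zeroes only in the hole.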
Re: [BUGS] pg_archivecleanup outputs errors messages which are not error messages per se
On Mon, Jan 9, 2012 at 9:55 AM, Benjamin Henrion b...@udev.org wrote: I am using pg_archivecleanup in a shell script that sends STDERR to a log file, and I see the following: pg_archivecleanup: keep WAL file /db/current/wal/0009001700DD and later pg_archivecleanup: removing file /db/current/wal/0009001700DC pg_archivecleanup: removing file /db/current/wal/0009001700DB Should those messages be considered errors? I don't think so. This is a feature, not a bug, since it allows the messages to appear in the PG log when the utility is used as an archive_cleanup_command. If you'd like to submit a patch to allow a -s (silent) mode or similar, we'd be happy to receive it. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] Incorrect comment in heapam.c
On Tue, Dec 20, 2011 at 5:50 PM, Peter Geoghegan pe...@2ndquadrant.com wrote: Line 834 of heapam.c has the following comment: /*  * This is formatted so oddly so that the correspondence to the macro  * definition in access/heapam.h is maintained.  */ In fact, that macro is defined in access/htup.h...should it be? IMHO the comment is wrong; the code is in the right place. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #6307: intarray extention gin index does not work with Hot standby
On Fri, Nov 25, 2011 at 6:33 PM, Tom Lane t...@sss.pgh.pa.us wrote: Maxim Boguk maxim.bo...@gmail.com writes: I know GIST on intarray[] does not have that problem. Very likely the problem is limited to intarray[] GIN indexes only (but I am going to test some other not-well-known GIN indexes tomorrow). Broken FTS indexes on Hot Standby would have been known about years before. You might think that, but you'd be wrong :-(. Yes, that did sound ominous. ginRedoUpdateMetapage is failing to restore the contents of the pending-list correctly, which means this is broken for all types of GIN indexes. Will fix. Great detective work as ever, Tom, much appreciated. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #6307: intarray extention gin index does not work with Hot standby
On Thu, Nov 24, 2011 at 11:12 PM, Maksym Boguk maxim.bo...@gmail.com wrote: postgres=# SELECT * from test where sections '{2000}';  id | sections +-- (0 rows) Ooops. Can you see if this is just intarray or if there are other failing cases? It would be good to get more info on this before I start investigating. Thanks -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #6269: Anomaly detection
On Tue, Oct 25, 2011 at 10:51 AM, Paul Stapersma paul.staper...@gmail.com wrote: PostgreSQL version: 8.3.3 For a project at my University, we compared PostgreSQL with MySQL's InnoDB. In this research, we found several cases in which anomalies were detected at isolation levels that are guaranteed not to have these anomalies. It looks to me as though you've made an honest attempt at this comparison. Thank you for noting points such as PostgreSQL's overall performance being better than InnoDB's at all isolation levels. To allow this type of study to be widely accepted, you need to declare who you are and where you are from, and to allow your results to be confirmed. Accountability and verifiability are essential aspects of good science. I would also like you to state your funding source, so that it is clear whether or not marketing funds inspire or reward such studies. Full disclosure is appreciated. Using InnoDB 5.0 and PostgreSQL 8.3 isn't very useful, and I'm surprised you haven't used the latest production releases of both products, especially when we have groundbreaking features in exactly this area in the latest release. Perhaps I misunderstand - you don't actually say which release of InnoDB you used, though you have URLs linking to both the 5.0 and 5.1 versions. As others have observed, some other points need more careful examination, and it would be good to provide exact details of your tests to allow any such results to be reproduced. If problems do exist, we will be very interested. There are many people here who would be happy to help if you find problems, so please don't be put off by these responses to your work. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #6264: Superuser does not have inherent Replication permission
On Tue, Oct 25, 2011 at 1:39 PM, Magnus Hagander mag...@hagander.net wrote: On Mon, Oct 24, 2011 at 16:37, Keith Fiske ke...@omniti.com wrote: On Sat, Oct 22, 2011 at 11:49 AM, Tom Lane t...@sss.pgh.pa.us wrote: Keith Fiske ke...@omniti.com writes: If you create a user as a NONsuperuser, then later ALTER them to be one, they will NOT have the replication permission and cannot be used as a replication user until you explicitly grant that permission. That doesn't sound to me like a bug.  These flags are independent, we just provide a certain default at role creation time. That is not what the documentation, as written, would lead people to believe. I'd be more than happy to help with clarifying the documentation myself if needed. Just let me know how. This part I agree with - it makes sense for ALTER to set both flags when it enables superuser. +1 Changing the behaviour to match the docs makes the most sense to me. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
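The report can be restated as a short session (role name invented for illustration):

```sql
CREATE ROLE standby_conn LOGIN;       -- ordinary role: no REPLICATION attribute
ALTER ROLE standby_conn SUPERUSER;    -- per the report, REPLICATION is NOT added here
ALTER ROLE standby_conn REPLICATION;  -- the explicit grant that is currently required
```

By contrast, per the discussion above, a role created as a superuser in the first place does get the replication flag by default, which is the inconsistency under debate.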
Re: [BUGS] char(0)
On Mon, Oct 17, 2011 at 8:31 AM, Susanne Ebrecht susa...@2ndquadrant.com wrote: PostgreSQL doesn't support CHAR(0). What does the SQL Standard say? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #6200: standby bad memory allocations on SELECT
On Thu, Sep 8, 2011 at 11:33 PM, Daniel Farina dan...@heroku.com wrote:  ERROR: invalid memory alloc request size 18446744073709551613 At least once, a hot standby was promoted to a primary and the errors seem to discontinue, but then reappear on a newly-provisioned standby. So the failing query scans a btree index on a hot standby. I don't fully accept it as an HS bug, but let's assume that it is and analyse what could cause it. The MO is certain user queries, only observed in HS. So certain queries might be related to the way we use indexes, or not. There is a single, small difference between how a btree index operates in HS and in normal operation, which relates to whether we kill tuples in the index. That's simple code, and there are no obvious bugs there, nor even anything that specifically allocates memory. So the only bug that springs to mind is something related to how we navigate hot chains with/without killed tuples, i.e. the bug is not actually HS related, but is only observed under conditions typical in HS. HS touches almost nothing else in user space, apart from snapshots. So there could be a bug there also, maybe in CopySnapshot(). -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #6136: Perfomance dies when using PK on 64-bit field
On Sun, Jul 31, 2011 at 2:47 PM, Robert robert.ayrapet...@gmail.com wrote: I've found strange behavior in my pg installation (tested both 8.4 and 9.0 - they behave the same) on the FreeBSD platform. In short - when a table has a PK on a bigint field - COPY to that table from a file becomes slower and slower as the table grows. When the table reaches ~5GB - a COPY of 100k records may take up to 20 mins. I've experimented with all params in the configs, moved indexes to a separate hdd etc - nothing made any improvement. However, once I drop the 64-bit PK - a COPY of 100k records passes in seconds. Interesting thing - the same table has other indexes, including composite ones, but none of them include bigint fields, which is why I concluded that the bug is connected with indexes on bigint fields only. In terms of IO the picture is the following: after the copy starts, gstat shows 100% load on the index partition (as I mentioned above - I've tried a separate hdd to keep the index tablespace), a large queue (over 2k elements), and constant slow writes at a speed of ~2MB/s. The hdd becomes completely unresponsive; even ls on an empty folder hangs for a minute or so. To avoid thoughts like your hdd is slow, you haven't tuned postgresql.conf etc - all slowness disappears with the drop of the bigint PK, while other indexes on the same table remain alive. And yes - I've tried drop PK / recreate PK, vacuum full analyze and all other things - nothing helped, only the drop helps. Sounds weird. Looking at this now. Could be that it's storing the wrong kind of plan on the RI trigger for the PK. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #6136: Perfomance dies when using PK on 64-bit field
On Mon, Aug 1, 2011 at 8:25 AM, simon si...@2ndquadrant.com wrote: Could be that it's storing the wrong kind of plan on the RI trigger for the PK. No, not that. Will look elsewhere. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #6094: Streaming replication does not catch up when writing enough data
On Thu, Jul 7, 2011 at 1:05 PM, David Hartveld david.hartv...@mendix.com wrote: The following bug has been logged online: Bug reference:    6094 Logged by:      David Hartveld Email address:    david.hartv...@mendix.com PostgreSQL version: 9.1-beta2 Operating system:  Debian GNU/Linux 6.0.2 Squeeze Description:     Streaming replication does not catch up when writing enough data Details: After creation of two new clusters, and setting them up as master and slave (in async mode, according to the current 9.1 docs), the execution of a large SQL script (creating a db, tables, sequences, etc., filling them with data through COPY) runs properly on the master, but does not stream to the slave, i.e. the slave does not catch up. In the master log, the following line is printed many times: Your output indicates that there is a problem in your replication setup, and this is why the slave does not catch up. This is not a performance issue. It is either a bug in replication, or a user configuration issue. Since few things have changed in 9.1 in this area, at the moment the balance of probability is user error. If you can provide a more isolated bug report we may be able to investigate. This is being discussed in a thread on the General list and there is no reason to post twice. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #6041: Unlogged table was created bad in slave node
On Wed, Jun 1, 2011 at 7:28 PM, Robert Haas robertmh...@gmail.com wrote: On Thu, May 26, 2011 at 12:34 PM, Tom Lane t...@sss.pgh.pa.us wrote: Alvaro Herrera alvhe...@commandprompt.com writes: Excerpts from Euler Taveira de Oliveira's message of jue may 26 12:00:05 -0400 2011: I think we should emit the real cause in those cases, if possible (not too much overhead). The message would be Unlogged table content is not available in standby server. I guess what it should do is create an empty file in the slave. Probably it should, because won't the table malfunction after the slave is promoted to master, if there's no file at all there?  Or will the process of coming live create an empty file even if there was none? Coming live creates an empty file. But Euler is pointing out a different issue, which is usability.  If the slave just acts like the table is present but empty, we are likely to get bug reports about that too.  An error telling you you aren't allowed to access such a table on slaves would be more user-friendly, if we can do it without too much pain. I looked into this a bit.  A few observations: (1) This problem is actually not confined to unlogged tables; temporary tables have the same issue.  For example, if you create a temporary table on the master and then, on the slave, do SELECT * FROM  pg_temp_3.hi_mom (or whatever the name of the temp schema where the temp table is) you get the same error.  In fact I suspect if you took a base backup that included the temporary relation and matched the backend ID you could even manage to read out the old contents (modulo any fun and exciting XID wraparound issues).  But the problem is of course more noticeable for unlogged tables since they're not hidden away in a special funny schema. Seems like you're trying to fix the problem directly, which as you say, has problems. At some point we resolve from a word mentioned in the FROM clause to a relfilenode. 
Surely somewhere in there we can notice it's unlogged before we end up down in the guts of smgr? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] BUG #6042: unlogged table with Streaming Replication
On Fri, May 27, 2011 at 7:15 AM, Tomonari Katsumata katsumata.tomon...@po.ntts.co.jp wrote: I had thought that if there is no standby, the primary would behave as if stand-alone... sorry, this was my misunderstanding. It is a common misunderstanding. The programmed behaviour leads to the most trusted level of robustness. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS] Warm Standby startup process unconditionally hangs
a report about this before now. (You replied there had been one). ProcSendSignal() doesn't do anything at all when called with the Startup proc's pid, because BackendPidGetProc() returns NULL. The root cause appears to be the separation of the two parts of the Hot Standby patch that occurred late in 8.4 cycle. Looks like the protection intended to allow the bgwriter to coexist with the startup process was never applied. Similar issue exists in 9.0 also when HS not enabled. (Fixed) The patches are slightly different because the infrastructure isn't all there in 8.4. Please can you test these before I commit to 8.4. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services publish_startup_info_84.v1.patch Description: Binary data
Re: [BUGS] BUG #5938: PostgreSQL Installer outputs log file with superuser password in clear text
On Tue, Mar 22, 2011 at 4:09 PM, Dave Page dp...@pgadmin.org wrote: On Tue, Mar 22, 2011 at 3:45 PM, Dave Page dp...@pgadmin.org wrote: On Tue, Mar 22, 2011 at 5:10 AM, Craig Sacco craig.sa...@gmail.com wrote: The following bug has been logged online: Bug reference:    5938 Logged by:      Craig Sacco Email address:    craig.sa...@gmail.com PostgreSQL version: 9.0.3 Operating system:  Microsoft Windows (all variants, 32 and 64 bit) Description:     PostgreSQL Installer outputs log file with superuser password in clear text Details: The PostgreSQL installer outputs a log file to the temporary directory with the superuser password in clear text. We are deploying PostgreSQL as part of a commercial product and would like to ensure that the password is not available to ordinary users. This has been fixed for the next releases. For the sake of the archives, it should also be noted that the file is in a secure directory, much as a .pgpass file would be, so this is generally only an issue for the situation described above, and not when a user installs a copy himself. I accept it's not a worst-case problem, but we should rate the problem A-D as with other security issues. All cases should get a rating so we know what we're dealing with. The problem is that the password is disclosed in a surprising way. .pgpass files are explicitly put there by a user, so they know what they've done. Putting a password in cleartext somewhere is an issue if people don't know about it. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [BUGS][PATCH] crash in 9.1's psql:describeOneTableDetails
On Sat, 2011-02-12 at 15:39 +0100, Andres Freund wrote: On Thursday, February 10, 2011 10:50:38 AM Andres Freund wrote: Hi, Sorry, I unfortunately had the wrong window focused when hitting enter... # \d category column number 2 is out of range 0..1 results in: Program received signal SIGSEGV, Segmentation fault. Ok, found the bug, simple oversight in invalid foreign key patch. Andres Thanks, will apply. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
[BUGS] [Fwd: DBD::Pg on HP-UX 11.31 64bit]
FW Forwarded Message From: H.Merijn Brand h.m.br...@xs4all.nl To: DBI Developers Mailing List dbi-...@perl.org, Simon Riggs si...@2ndquadrant.com Subject: DBD::Pg on HP-UX 11.31 64bit Date: Tue, 14 Dec 2010 13:35:49 +0100 I have postgres running on most of my HP-UX varieties, ranging from HP-UX 10.20/32bit through 11.31/64bit. It works fine everywhere, except on HP-UX 11.31-ipf in 64bit mode. Note that this is the Itanium architecture. postgres' own test suite passes, but all connects fail with DBD::Pg. The compiled psql works fine. Same for 8.4.5 and 9.0.1. I'm working with 8.4.5 now to make it as stable as possible to compare to my other architectures. Everything is a success when I use the same perl built in 32bit mode. I've run truss (HP-UX's version of trace) and narrowed the failing part to 141 lines by being as explicit as possible (e.g. setting PGHOST to an IP address to prevent calls to the resolver). Or to 56 lines for the version run with gcc-compiled perl 5.12. I now need guidance or help to nail down where things actually cause these failures. Here are two traces. 
The first for perl-5.10.1 compiled with HP's C-ANSI-C, the second for perl-5.12.2 compiled with gcc-4.2.4 This is perl, v5.10.1 (*) built for IA64.ARCHREV_0-LP64 open (/pro/pgsql/lib/libpq.so.5, O_RDONLY|0x800, 0) = 3 fstat (3, 0x9fffdd70) = 0 read (3, 7f454c46020201010100 | ELF .., 64) = 64 pread (3, 7f454c46020201010100 | ELF .., 1024, 0) = 1024 stat (/usr/lib/hpux64/dpd, 0x9fffd3d0) = 0 open (/usr/lib/hpux64/dpd/libpq.so.5.bpd, O_RDONLY|0x800, 0) ERR#2 ENOENT getuid () = 203 (203) getgid () = 200 (200) mmap (NULL, 428816, PROT_READ|PROT_EXEC, MAP_SHARED|MAP_SHLIB, 3, 0) = 0xc9d28000 mmap (NULL, 9860, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_SHLIB, 3, 458752) = 0x9fffbf764000 close (3) = 0 getuid () = 203 (203) getgid () = 200 (200) mmap (NULL, 8192, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)= 0x9fffbf76c000 getuid () = 203 (203) getgid () = 200 (200) open (/pro/local/lib/libcrypto.so, O_RDONLY|0x800, 0) ERR#2 ENOENT open (/usr/local/ssl/lib/libcrypto.so, O_RDONLY|0x800, 0) = 3 fstat (3, 0x9fffdd20) = 0 read (3, 7f454c46020201010100 | ELF .., 64) = 64 pread (3, 7f454c46020201010100 | ELF .., 1024, 0) = 1024 stat (/usr/lib/hpux64/dpd, 0x9fffd380) = 0 open (/usr/lib/hpux64/dpd/libcrypto.so.bpd, O_RDONLY|0x800, 0) ERR#2 ENOENT getuid () = 203 (203) getgid () = 200 (200) mmap (NULL, 3530960, PROT_READ|PROT_EXEC, MAP_SHARED|MAP_SHLIB, 3, 0) = 0xc6fe8000 mmap (NULL, 11552, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_SHLIB, -1, 0) = 0x9fffbf76 mmap (0x9fffbf731000, 191296, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_SHLIB, 3, 3538944) = 0x9fffbf731000 close (3) = 0 getuid () = 203 (203) getgid () = 200 (200) mmap (NULL, 32768, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x9fffbf728000 getuid () = 203 (203) getgid () = 200 (200) open (/pro/local/lib/libssl.so, O_RDONLY|0x800, 0) ERR#2 ENOENT open (/usr/local/ssl/lib/libssl.so, O_RDONLY|0x800, 0) = 3 fstat (3, 0x9fffdd20
Re: [BUGS] BUG #5727: Indexes broken in streaming replication
On Tue, 2010-10-26 at 14:08 -0400, Tom Lane wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: * Operations on hash indexes are not presently WAL-logged, so replay will not update these indexes. Hash indexes will not be used for query plans during recovery. The initial patch indeed had a special-case in the planner to ignore hash indexes during hot standby, but it was left out because the lack of WAL-logging is a general problem with hash indexes, not a hot standby issue. Yeah, and also the index would still be broken after the slave exits hot standby and becomes live; so that hack didn't cure the problem anyway. OK, that's a good argument. -- Simon Riggs http://www.2ndQuadrant.com/books/ PostgreSQL Development, 24x7 Support, Training and Services
Re: [BUGS] Re: BUG #5602: Recovering from Hot-Standby file backup leads to the currupted indexes
On Thu, 2010-08-12 at 01:31 -0400, Tom Lane wrote: So the DBA is just flying blind as to whether the slave is trustworthy yet. I can't prove that that's what burnt the original complainant, but it fits the symptoms. The safest approach is to:

1. run pg_start_backup() on master, remember LSN
2. copy backup_label from master to standby
3. wait for starting LSN to be applied on standby
4. run backup on standby
5. run pg_stop_backup() on master

That ensures we don't run without full page writes during backup and that we have an explicit consistency point to work from. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services
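The five steps above might look like this as commands. An illustrative sketch only, with invented hostnames and paths, using the 9.0-era function names:

```
# 1. on the master: start the backup and note the returned LSN
psql -h master -c "SELECT pg_start_backup('standby_base_backup')"

# 2. copy backup_label from the master's data directory to the standby's
scp master:/pgdata/backup_label standby:/pgdata/

# 3. on the standby: poll until replay has passed the LSN from step 1
psql -h standby -c "SELECT pg_last_xlog_replay_location()"

# 4. take the filesystem backup from the standby (tar, rsync, etc.)

# 5. on the master: finish the backup
psql -h master -c "SELECT pg_stop_backup()"
```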
Re: [BUGS] Re: BUG #5602: Recovering from Hot-Standby file backup leads to the currupted indexes
On Thu, 2010-08-05 at 11:28 -0700, valgog wrote: This process seems almost entirely unrelated to the documented way of doing it; I'm not surprised that you end up with some files not in sync. Please see pg_start_backup and friends. It was done as documented in http://www.postgresql.org/docs/9.0/static/backup-incremental-updated.html The procedure used does differ from that documented. However, IMHO the procedure *documented* is *not* safe and could lead to corrupt indexes in the way described, since the last recovered point might be mid-way between two halves of an index split record, which will never be corrected during HS. What I find surprising is that the technique the OP describes should be safe, assuming step 5 waits for the correct point of consistency before attempting to run queries. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services
Re: [BUGS] BUG #5566: High levels of savepoint nesting trigger stack overflow in AssignTransactionId
On Wed, 2010-07-21 at 21:20 +0200, Andres Freund wrote: If you want, I can write a patch for that as well; it seems trivial enough. Updated patch attached. hm. I don't want to push - just to ask: is any committer looking either at the patch or the bug? Patch looks OK, though IMHO I wouldn't be looking to backpatch this. I'll look at this in more detail next week, unless others wish to. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services
Re: [BUGS] PD_ALL_VISIBLE flag error on 9.0 alpha 4
On Sat, 2010-03-13 at 11:29 -0800, Josh Berkus wrote: That's better, I was worried you'd gone all complimentary on me. *grin* Never fear that! Was that setting originally part of your design for HS? If so, why did you back off from it? We all agreed it's a kluge, that's why. It's also my 3rd choice of solution, behind fine-grained lock conflicts (1st), which would avoid many issues, and master/standby in lock step (2nd). Having said that, I'm much in favour of providing a range of options and then letting users tell us what works for them. -- Simon Riggs www.2ndQuadrant.com
Re: [BUGS] PD_ALL_VISIBLE flag error on 9.0 alpha 4
On Wed, 2010-03-10 at 17:12 -0800, Josh Berkus wrote: On 3/10/10 3:26 PM, Simon Riggs wrote: OK, that's enough to not remove it. I was aware of more negative thoughts and conscious of my own feelings about it being a kluge. Well, it *is* a kludge, but it may be the best one for people who want to use HS/SR to support web applications. So I think we should work on making it less kludgy. That's better, I was worried you'd gone all complimentary on me. Ultimately we're going to need publish-XID-to-master, but that's not realistic for 9.0. Still some open items that I'm working on, but not that. -- Simon Riggs www.2ndQuadrant.com
Re: [BUGS] PD_ALL_VISIBLE flag error on 9.0 alpha 4
On Tue, 2010-03-09 at 22:02 -0800, Josh Berkus wrote: 1. Set up 9.0a4 doing SR replication with a 2nd 9.0a4 2. Ran pgbench for a while. 3. Aborted pgbench with Ctrl-C 4. Changed vacuum_defer_cleanup_age in postgresql.conf and reloaded 5. Ran pgbench again, and got: Sidney-Stratton:pg90 josh$ pgbench -c 2 -T 300 bench starting vacuum...WARNING: PD_ALL_VISIBLE flag was incorrectly set in relation pgbench_branches page 0 WARNING: PD_ALL_VISIBLE flag was incorrectly set in relation pgbench_branches page 1 WARNING: PD_ALL_VISIBLE flag was incorrectly set in relation pgbench_tellers page 0 WARNING: PD_ALL_VISIBLE flag was incorrectly set in relation pgbench_tellers page 1 Understandable. Time to remove vacuum_defer_cleanup_age, I think. -- Simon Riggs www.2ndQuadrant.com
Re: [BUGS] PD_ALL_VISIBLE flag error on 9.0 alpha 4
On Wed, 2010-03-10 at 23:08 +0200, Heikki Linnakangas wrote: Time to remove vacuum_defer_cleanup_age, I think. Umm, so what's the bug? Whether you call it a bug or just an annoyance is debatable, but the source of it is clear. Given the lack of effectiveness, I propose removing it. Would you agree or disagree with the suggested removal? -- Simon Riggs www.2ndQuadrant.com
Re: [BUGS] PD_ALL_VISIBLE flag error on 9.0 alpha 4
On Wed, 2010-03-10 at 17:55 -0500, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: Time to remove vacuum_defer_cleanup_age, I think. Umm, so what's the bug? Whether you call it a bug or just an annoyance is debatable, but the source of it is clear. Maybe to you, but the rest of us would like to know. If vacuum_defer_cleanup_age is set higher, this causes the xmin to go backwards, leading to the "PD_ALL_VISIBLE flag was incorrectly set" warning. Having this false xmin move backwards doesn't endanger the standby, since the xids arrive and are checked normally. If they stop arriving, that is fine. Having the false xmin go backwards is not a serious issue on the primary, because the actual xmin does not go backwards. No observer loses information as a result of this; it is only about whether cleanup records are generated later than normal, or not. Given the lack of effectiveness, I propose removing it. I read Josh's recent report at http://archives.postgresql.org/message-id/4b973c3f.9070...@agliodbs.com to say that it's quite effective. I think you're being way too hasty to decide that it can just be dropped. OK, that's enough to not remove it. I was aware of more negative thoughts and conscious of my own feelings about it being a kluge. -- Simon Riggs www.2ndQuadrant.com
Re: [BUGS] BUG #5357: PGAdmin: SQL Query Editor does not (always) open files
On Tue, 2010-03-02 at 08:32 -0700, Joshua Tolley wrote: On Tue, Mar 02, 2010 at 10:08:55AM +, Julien wrote: Description: PGAdmin: SQL Query Editor does not (always) open files A better place for pgadmin problems is the pgadmin-support list: http://www.pgadmin.org/support/list.php I think we need to work on a distribution mechanism for such requests, rather than trouble bug reporters with our bureaucracy. pgAdmin has been bundled with Postgres for some time now. -- Simon Riggs www.2ndQuadrant.com
[BUGS] Assertion failure on reload of GUC_SUPERUSER_ONLY parms
TRAP: BadState("!(((bool) ((CurrentUserId) != ((Oid) 0))))", File: "miscinit.c", Line: 295) Gabriele Bartolini originally reported this to me as a bug in Hot Standby, though I have now been able to reproduce it on CVS head. We were just unlucky enough to hit it while doing thorough HS testing. backtrace #0 0x2b7beafd2b45 in raise () from /lib64/libc.so.6 #1 0x2b7beafd40e0 in abort () from /lib64/libc.so.6 #2 0x006c016d in ExceptionalCondition ( conditionName=<value optimized out>, errorType=<value optimized out>, fileName=<value optimized out>, lineNumber=<value optimized out>) at assert.c:57 #3 0x006ccf4d in GetUserId () at miscinit.c:295 #4 0x006db1b9 in superuser () at superuser.c:48 #5 0x006d5a2b in GetConfigOption (name=0xb2b6e8 "log_filename") at guc.c:5226 #6 0x006d8a2c in ProcessConfigFile (context=PGC_SIGHUP) at guc-file.l:319 #7 0x005e49e2 in SIGHUP_handler ( postgres_signal_arg=<value optimized out>) at postmaster.c:2062 #8 signal handler called #9 0x2b7beb060bc3 in select () from /lib64/libc.so.6 #10 0x005e1bf7 in ServerLoop () at postmaster.c:1360 #11 0x005e2b14 in PostmasterMain (argc=3, argv=0xb246a0) at postmaster.c:1065 It turns out that when you modify a GUC_SUPERUSER_ONLY parameter, such as log_filename, in postgresql.conf and then reload, you will get the assertion failure. It looks to me like the correct fix would be to use GetUserIdAndContext() instead, though I would suggest inventing GetUserIdIfAny(), which would skip the assertion test, for use in superuser(). -- Simon Riggs www.2ndQuadrant.com
Re: [BUGS] GIN needs tonic
On Tue, 2009-09-15 at 09:41 +0300, Heikki Linnakangas wrote: Simon Riggs wrote: In recovery of GIN operations using CVS HEAD I consistently see TRAP: FailedAssertion("!(((bool) ((spcNode) != ((Oid) 0))))", File: "tablespace.c", Line: 116) Looking at the code, the new list page WAL record is a GIN record type, and at line 115 in gin/ginfast.c I see that the value of the node is unset. That means XLOG_GIN_INSERT_LISTPAGE always has spcNode == 0 and so triggers the assertion failure. Yeah, so it seems. The fix should be as simple as: I can't speak for anything more than that it stops it from failing with an assertion. That is very good, since it allows me to bypass it and continue with my own testing. This means that the WAL replay of that record type has never been tested correctly :-(. This must have been added after mid-Feb this year. I notice there are a few places where functionality is tested against temp tables, which may mask other non-recoverable issues in this and other rmgrs. We should make it standard practice to use only non-temp tables to cover functionality other than specific temp table commands. -- Simon Riggs www.2ndQuadrant.com
Re: [BUGS] GIN needs tonic
On Tue, 2009-09-15 at 14:31 -0400, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: On Tue, 2009-09-15 at 09:41 +0300, Heikki Linnakangas wrote: This means that the WAL replay of that record type has never been tested correctly :-(. This must have been added after mid-Feb this year. I notice there are a few places where functionality is tested against temp tables, which may mask other non-recoverable issues in this and other rmgrs. We should make it standard practice to include only non-temp tables to cover functionality other than specific temp table commands. I've pointed out before that the regression tests are not particularly meant to provide an exhaustive test of WAL recovery. In this particular case, so far as I can tell the bug is only observable with full_page_writes turned off --- otherwise XLogInsert will invariably decide to log the full page, because it's going to see a zeroed-out LSN in the passed-in buffer. Yes, I was testing with full_page_writes = off at that point. -- Simon Riggs www.2ndQuadrant.com
Re: [BUGS] GIN needs tonic
On Tue, 2009-09-15 at 15:34 -0400, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: On Tue, 2009-09-15 at 14:31 -0400, Tom Lane wrote: I've pointed out before that the regression tests are not particularly meant to provide an exhaustive test of WAL recovery. In this particular case, so far as I can tell the bug is only observable with full_page_writes turned off --- otherwise XLogInsert will invariably decide to log the full page, because it's going to see a zeroed-out LSN in the passed-in buffer. Yes, I was testing with full_page_writes = off at that point. In further testing, I've found that the CVS HEAD regression tests actually will expose the error if you run them with full_page_writes off and then force a WAL replay. (8.4 would not have, though, because of the DROP TABLESPACE at the end of the run.) In any case, I stand by the opinion that the standard regression tests aren't really meant to exercise WAL recovery. Yes, that's exactly how I located the first bug. Sorry if that wasn't clear. BTW, there's more than one bug here :-(. Heikki found one, but the code is also attaching the buffer indicator to the wrong rdata entry --- the record header, not the workspace, is what gets suppressed if the full page is logged. Well, whether or not they were meant to, they do, and to a much greater extent than any other body of tests that I'm aware of. Even if we had another test suite, I would still expect recovery to handle a run of the full regression tests on the primary without problems. I look for multiple ways of testing and that is just one. I guess if you or another committer spends some time writing a test framework that is useful and that you can trust, I'm sure many people will add to it. That'll be true for any of the major/complex areas not covered by public test suites: concurrency, recovery and the optimizer.
-- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] GIN needs tonic
On Tue, 2009-09-15 at 12:09 -0400, Tom Lane wrote: I'm working on this now. Thanks to you and Heikki for fixing this while I worked around it. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
[BUGS] GIN needs tonic
In recovery of GIN operations using CVS HEAD I see consistently TRAP: FailedAssertion("!(((bool) ((spcNode) != ((Oid) 0))))", File: "tablespace.c", Line: 116) Looking at the code, the new-list-page WAL record is a GIN record type, and at line 115 in gin/ginfast.c I see that the value of the node is unset. That means XLOG_GIN_INSERT_LISTPAGE always has spcNode == 0 and so triggers the assertion failure. -- Simon Riggs www.2ndQuadrant.com -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4879: bgwriter fails to fsync the file in recovery mode
On Fri, 2009-06-26 at 16:48 -0400, Tom Lane wrote: * I find the RecoveryInProgress test in XLogNeedsFlush rather dubious. Why is it okay to disable that? For at least one of the two callers (SetHintBits) it seems like the safe answer is true not false. This doesn't matter too much yet but it will for hot standby. IIUC you think it is safest to *not* write hint bits during Hot Standby? So we would treat all commits as async commits? -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4879: bgwriter fails to fsync the file in recovery mode
On Thu, 2009-06-25 at 12:55 +, Fujii Masao wrote: The following bug has been logged online: Bug reference: 4879 Logged by: Fujii Masao Email address: masao.fu...@gmail.com PostgreSQL version: 8.4dev Operating system: RHEL5.1 x86_64 Description:bgwriter fails to fsync the file in recovery mode Details: Looking at it now. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4879: bgwriter fails to fsync the file in recovery mode
On Thu, 2009-06-25 at 12:55 +, Fujii Masao wrote: The restartpoint by bgwriter in recovery mode caused the following error. ERROR: could not fsync segment 0 of relation base/11564/16422_fsm: No such file or directory I think I see a related bug also. register_dirty_segment() run by Startup process doesn't differentiate correctly whether the bgwriter is active. md.c line 1194. So it will continue to register fsync requests into its own private pendingOpsTable when it should be forwarding them to bgwriter. When bgwriter runs mdsync() the contents of startup's pendingOpsTable will be ignored. That looks to me to be a serious bug. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4879: bgwriter fails to fsync the file in recovery mode
On Thu, 2009-06-25 at 17:35 +0300, Heikki Linnakangas wrote: Heikki Linnakangas wrote: Hmm, what happens when the startup process performs a write, and bgwriter is not running? Do the fsync requests queue up in the shmem queue until the end of recovery when bgwriter is launched? I guess I'll have to try it out... Oh dear, doesn't look good. The startup process has a pendingOpsTable of its own. bgwriter won't fsync() files that the startup process has written itself. That needs to be fixed, or you can lose data when an archive recovery crashes after a restartpoint. Yes, that's what I see also. Patch attached. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support

Index: src/backend/access/transam/xlog.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.343
diff -c -r1.343 xlog.c
*** src/backend/access/transam/xlog.c	11 Jun 2009 14:48:54 -0000	1.343
--- src/backend/access/transam/xlog.c	25 Jun 2009 15:00:13 -0000
***************
*** 5472,5479 ****
  	 * process in addition to postmaster!
  	 */
  	if (InArchiveRecovery && IsUnderPostmaster)
  		SendPostmasterSignal(PMSIGNAL_RECOVERY_STARTED);
! 
  	/*
  	 * main redo apply loop
  	 */
--- 5472,5482 ----
  	 * process in addition to postmaster!
  	 */
  	if (InArchiveRecovery && IsUnderPostmaster)
+ 	{
  		SendPostmasterSignal(PMSIGNAL_RECOVERY_STARTED);
! 		SetForwardFsyncRequests();
! 	}
! 
  	/*
  	 * main redo apply loop
  	 */

Index: src/backend/storage/smgr/md.c
===================================================================
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/storage/smgr/md.c,v
retrieving revision 1.146
diff -c -r1.146 md.c
*** src/backend/storage/smgr/md.c	11 Jun 2009 14:49:02 -0000	1.146
--- src/backend/storage/smgr/md.c	25 Jun 2009 15:02:26 -0000
***************
*** 141,146 ****
--- 141,147 ----
  static HTAB *pendingOpsTable = NULL;
  static List *pendingUnlinks = NIL;
+ static bool remember_fsync_requests_locally = false;
  
  static CycleCtr mdsync_cycle_ctr = 0;
  static CycleCtr mdckpt_cycle_ctr = 0;
***************
*** 199,204 ****
--- 200,206 ----
  							  100L, &hash_ctl,
  							  HASH_ELEM | HASH_FUNCTION | HASH_CONTEXT);
+ 		remember_fsync_requests_locally = true;
  		pendingUnlinks = NIL;
  	}
  }
***************
*** 1180,1185 ****
--- 1182,1202 ----
  }
  
  /*
+  * InArchiveRecovery we need to forward requests to the bgwriter.
+  */
+ void
+ SetForwardFsyncRequests(void)
+ {
+ 	remember_fsync_requests_locally = false;
+ 
+ 	/*
+ 	 * In case caller has any pending ops
+ 	 */
+ 	if (pendingOpsTable)
+ 		mdsync();
+ }
+ 
+ /*
   * register_dirty_segment() -- Mark a relation segment as needing fsync
   *
   * If there is a local pending-ops table, just make an entry in it for
***************
*** 1191,1197 ****
  static void
  register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
  {
! 	if (pendingOpsTable)
  	{
  		/* push it into local pending-ops table */
  		RememberFsyncRequest(reln->smgr_rnode, forknum, seg->mdfd_segno);
--- 1208,1214 ----
  static void
  register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg)
  {
! 	if (remember_fsync_requests_locally)
  	{
  		/* push it into local pending-ops table */
  		RememberFsyncRequest(reln->smgr_rnode, forknum, seg->mdfd_segno);
***************
*** 1219,1225 ****
  static void
  register_unlink(RelFileNode rnode)
  {
! 	if (pendingOpsTable)
  	{
  		/* push it into local pending-ops table */
  		RememberFsyncRequest(rnode, MAIN_FORKNUM, UNLINK_RELATION_REQUEST);
--- 1236,1242 ----
  static void
  register_unlink(RelFileNode rnode)
  {
! 	if (remember_fsync_requests_locally)
  	{
  		/* push it into local pending-ops table */
  		RememberFsyncRequest(rnode, MAIN_FORKNUM, UNLINK_RELATION_REQUEST);
***************
*** 1377,1383 ****
  void
  ForgetRelationFsyncRequests(RelFileNode rnode, ForkNumber forknum)
  {
! 	if (pendingOpsTable)
  	{
  		/* standalone backend or startup process: fsync state is local */
  		RememberFsyncRequest(rnode, forknum, FORGET_RELATION_FSYNC);
--- 1394,1400 ----
  void
  ForgetRelationFsyncRequests(RelFileNode rnode, ForkNumber forknum)
  {
! 	if (remember_fsync_requests_locally)
  	{
  		/* standalone backend or startup process: fsync state is local */
  		RememberFsyncRequest(rnode, forknum, FORGET_RELATION_FSYNC);
***************
*** 1416,1422 ****
  	rnode.spcNode = 0;
  	rnode.relNode = 0;
  
! 	if (pendingOpsTable)
  	{
  		/* standalone backend or startup process: fsync state is local */
  		RememberFsyncRequest(rnode, InvalidForkNumber, FORGET_DATABASE_FSYNC);
--- 1433,1439 ----
  	rnode.spcNode = 0;
  	rnode.relNode = 0;
  
! 	if (remember_fsync_requests_locally)
  	{
  		/* standalone backend or startup process: fsync state is local */
  		RememberFsyncRequest(rnode, InvalidForkNumber, FORGET_DATABASE_FSYNC);
Re: [BUGS] BUG #4879: bgwriter fails to fsync the file in recovery mode
On Thu, 2009-06-25 at 17:02 +0300, Heikki Linnakangas wrote: I think the real problem is this in mdunlink(): /* Register request to unlink first segment later */ if (!isRedo && forkNum == MAIN_FORKNUM) register_unlink(rnode); When we replay the unlink of the relation, we don't tell the bgwriter about it. Normally we do, so the bgwriter knows that if the fsync() fails with ENOENT, it's OK since the file was deleted. It's tempting to just remove the !isRedo condition, but then we have another problem: if bgwriter hasn't been started yet, and the shmem queue is full, we get stuck in register_unlink() trying to send the message and failing. In archive recovery, we always start bgwriter at the beginning of WAL replay. In crash recovery, we don't start bgwriter until the end of WAL replay. So we could change the !isRedo condition to !InArchiveRecovery. It's not a very clean solution, but it's simple. That seems to work for me, though I have some doubts as to the way two-phase commit is coded. 2PC seems to assume that if a file still exists we must be in recovery and it's OK to ignore. Clean? We've changed the conditions under which the unlink needs to be registered, and !InArchiveRecovery defines the changed conditions precisely. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4879: bgwriter fails to fsync the file in recovery mode
On Thu, 2009-06-25 at 11:31 -0400, Tom Lane wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: Heikki Linnakangas wrote: Hmm, what happens when the startup process performs a write, and bgwriter is not running? Do the fsync requests queue up in the shmem queue until the end of recovery when bgwriter is launched? I guess I'll have to try it out... Oh dear, doesn't look good. The startup process has a pendingOpsTable of its own. bgwriter won't fsync() files that the startup process has written itself. That needs to be fixed, or you can lose data when an archive recovery crashes after a restartpoint. Ouch. I'm beginning to think that the best thing is to temporarily revert the change that made bgwriter active during recovery. That seems the safest course, to avoid derailing the schedule. Please define temporarily. Will it be re-enabled in 8.4.1, assuming we find an acceptable fix? It's obviously not been adequately thought through or tested. Hmmm... -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4879: bgwriter fails to fsync the file in recovery mode
On Thu, 2009-06-25 at 18:12 +0300, Heikki Linnakangas wrote: A short fix would be to have bgwriter do the shutdown checkpoint instead in archive recovery. I don't recall if there was a reason it wasn't coded like that to begin with, though. I think the problem was that it was coded both ways at different stages of patch evolution. The decision to retain the shutdown checkpoint by the startup process was taken in January, IIRC. Having the startup process issue this

    if (InArchiveRecovery)
        RequestCheckpoint(CHECKPOINT_IS_SHUTDOWN | CHECKPOINT_FORCE | CHECKPOINT_WAIT);
    else
        ...

should make the startup process wait for bgwriter to complete the checkpoint. But we need to set LocalRecoveryInProgress appropriately also. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4879: bgwriter fails to fsync the file in recovery mode
On Thu, 2009-06-25 at 18:57 +0300, Heikki Linnakangas wrote: I came up with the attached patch, which includes Simon's patch to have all fsync requests forwarded to bgwriter during archive recovery. To fix the startup checkpoint issue, startup process requests a forced restartpoint, which will flush any fsync requests bgwriter has accumulated, before doing the actual checkpoint in the startup process. That looks fine. This is completely untested still, but does anyone immediately see any more problems? Nothing seen. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4879: bgwriter fails to fsync the file in recovery mode
On Thu, 2009-06-25 at 12:33 -0400, Tom Lane wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: This is completely untested still, but does anyone immediately see any more problems? Isn't SetForwardFsyncRequests going to cause the final mdsync to fail? Yes it will, as it is now. I didn't read the parts of the patch that were described as being from my patch. Heikki, please tell me if you change things... it means I have to be both patch author and reviewer. Perhaps just separate patches, so we can review them independently. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4879: bgwriter fails to fsync the file in recovery mode
On Thu, 2009-06-25 at 12:43 -0400, Tom Lane wrote: What about revert the patch? That's probably just as dangerous. Much easier to just avoid that state altogether, if you must. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4879: bgwriter fails to fsync the file in recovery mode
On Thu, 2009-06-25 at 19:37 +0300, Heikki Linnakangas wrote: Tom Lane wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: This is completely untested still, but does anyone immediately see any more problems? Isn't SetForwardFsyncRequests going to cause the final mdsync to fail? Yes, I just noticed that myself, testing it for the first time. That check needs to be suppressed in the startup checkpoint. Better to do it as it was in my patch. We can turn off/on then. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4879: bgwriter fails to fsync the file in recovery mode
On Thu, 2009-06-25 at 13:25 -0400, Tom Lane wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: It's tempting to just remove the !isRedo condition, but then we have another problem: if bgwriter hasn't been started yet, and the shmem queue is full, we get stuck in register_unlink() trying to send the message and failing. In archive recovery, we always start bgwriter at the beginning of WAL replay. In crash recovery, we don't start bgwriter until the end of wAL replay. So we could change the !isRedo condition to !InArchiveRecovery. It's not a very clean solution, but it's simple. I looked through the callers of smgrdounlink(), and found that FinishPreparedTransaction passes isRedo = true. I'm not sure at the moment what the reasoning behind that was, but it does seem to mean that checking InArchiveRecovery instead of isRedo may not be quite right. I think that's because FinishPreparedTransaction() implicitly assumes that if a file still exists we are in recovery. I can't comment on whether that is necessarily true for some reason, but it doesn't sound like it is a safe assumption. I would guess it's using isRedo as an implicit override rather than using the correct meaning of the variable. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4879: bgwriter fails to fsync the file in recovery mode
On Thu, 2009-06-25 at 14:46 -0400, Tom Lane wrote: On the other point: are we going to eliminate mdunlink's isRedo parameter? Maybe the better thing is to have its callers pass the value of InArchiveRecovery? We have one caller that is using a parameter incorrectly. It seems we should correct the caller rather than change the API and potentially affect all callers. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4879: bgwriter fails to fsync the file in recovery mode
On Thu, 2009-06-25 at 21:59 +0300, Heikki Linnakangas wrote: I think my initial analysis of this bug was bogus. Perhaps, but the pendingOpsTable issue is a serious one, so we haven't wasted our time by fixing it. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4879: bgwriter fails to fsync the file in recovery mode
On Thu, 2009-06-25 at 15:10 -0400, Tom Lane wrote: Are you (Heikki and Simon) in a position to finish these things today? Certainly will try. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4879: bgwriter fails to fsync the file in recovery mode
On Thu, 2009-06-25 at 15:10 -0400, Tom Lane wrote: So to summarize the state of play, it seems we have these issues:

* need to delete startup process's local pendingOpsTable once bgwriter is launched, so that requests go to bgwriter instead

Need to ensure that fsync requests are directed to the process that will act on the fsync requests.

* need to push end-of-recovery checkpoint into bgwriter

That's probably the easiest thing to do, but the issue is that we must fsync all files mentioned in the pendingOpsTable in *any* process that has been accumulating such requests.

* need to do something about InRecovery tests that will now be executed in bgwriter

Yes

* need to fix mistaken code in FinishPreparedTransaction

Yes

-- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4879: bgwriter fails to fsync the file in recovery mode
On Thu, 2009-06-25 at 22:29 +0300, Heikki Linnakangas wrote: Tom Lane wrote: While nosing around the problem areas, I think I've found yet another issue here. The global bool InRecovery is only maintained correctly in the startup process, which wasn't a problem before 8.4. However, if we are making the bgwriter execute the end-of-recovery checkpoint, there are multiple places where it is tested that are going to be executed by bgwriter. I think (but am not 100% sure) that these are all the at-risk references: XLogFlush, CheckPointMultiXact, CreateCheckPoint (2 places). Heikki's latest patch deals with the tests in CreateCheckPoint (rather klugily IMO) but not the others. I think it might be better to fix things so that InRecovery is maintained correctly in the bgwriter too. We could set InRecovery=true in CreateCheckPoint if it's a startup checkpoint, and reset it afterwards. That seems like a bad idea. As you said earlier, On Thu, 2009-06-25 at 23:15 +0300, Heikki Linnakangas wrote: Well, we have RecoveryInProgress() now that answers the question "is recovery still in progress in the system?". InRecovery now means "am I a process that's performing WAL replay?". so it would be a mistake to do as you propose above, because it changes the meaning of these well-defined parts of the system. * XLogFlush mentions InRecovery right at the end, and the correct usage there would be RecoveryInProgress() rather than InRecovery. * The usage of InRecovery in CheckPointMultiXact() should also be replaced with RecoveryInProgress(). * The two usages of InRecovery in CreateCheckPoint can also be replaced by RecoveryInProgress(). So, yes, there are some places where InRecovery is used in code executed by the bgwriter, but the correct fix is to use RecoveryInProgress(). It is a hack to try to set the InRecovery state flag in bgwriter, and one that would change the clear meaning of InRecovery, to no good purpose. Subsequent discussion on this subthread may no longer be relevant.
-- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4879: bgwriter fails to fsync the file in recovery mode
On Thu, 2009-06-25 at 22:29 +0300, Heikki Linnakangas wrote: Hmm, I see another small issue. We now keep track of the minimum recovery point. Whenever a data page is flushed, we set minimum recovery point to the LSN of the page in XLogFlush(), instead of fsyncing WAL like we do in normal operation. During the end-of-recovery checkpoint, however, RecoveryInProgress() returns false, so we don't update minimum recovery point in XLogFlush(). You're unlikely to be bitten by that in practice; you would need to crash during the end-of-recovery checkpoint, and then set the recovery target to an earlier point. It should be fixed nevertheless. RecoveryInProgress returns false only because you set LocalRecoveryInProgress at the start of the checkpoint, not at the end. It only needs to be set immediately prior to the call for XLogInsert() in CreateCheckpoint(). If we do that then XLogFlush() acts as advertised and we have no worries. Tom: Heikki's work on MinRecoveryPoint seems sound to me and altering it now seems like something too dangerous to attempt at this stage. I see no need; we have no new bug, just a minor point of how we code the fix to the bug we do have. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4879: bgwriter fails to fsync the file in recovery mode
On Thu, 2009-06-25 at 17:11 -0400, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: So, yes, there are some places where InRecovery is used in code executed by the bgwriter, but the correct fix is to use RecoveryIsInProgress(). Agreed, but this gets us no closer to solving the real problem, which is that when we perform the end-of-recovery checkpoint, we need to act like we are *not* in recovery anymore, for at least some purposes. Not for some purposes, just for *one* purpose: the end of recovery checkpoint needs to run XLogInsert() which has specific protection against being run during recovery. Most notably, to allow us to write a WAL entry at all; but I am suspicious that pretty much every InRecovery/RecoveryIsInProgress test that that checkpoint might execute should behave as if we're not in recovery. You are right to question whether we should revoke the patch. ISTM that we are likely to decrease code robustness by doing that at this stage. I think we have the problems pretty much solved now. If we want to increase robustness, I would suggest we add a recovery.conf parameter to explicitly enable use of bgwriter during recovery, off by default. In addition to the fixes being worked on currently. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4879: bgwriter fails to fsync the file in recovery mode
On Thu, 2009-06-25 at 17:17 -0400, Tom Lane wrote: Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: We do store the minimum recovery point required by the base backup in control file, but it's not advanced once the recovery starts. So if you start recovery, starting from say 123, let it progress to location 456, kill recovery and start it again, but only let it go up to 300, you end up with a corrupt database. I don't believe that you have the ability to do that. Once the recovery process is launched, the user does not have the ability to control where the WAL scan proceeds from. Or at least that's how it used to work, and if someone has tried to change it, it's broken for exactly this reason. The behavior you seem to have in mind would be completely disastrous from a performance standpoint, as we'd be writing and fsyncing pg_control constantly during a recovery. I wouldn't consider that a good idea from a reliability standpoint either --- the more writes to pg_control, the more risk of fatal corruption of that file. This is sounding more and more like something that needs to be reverted. AFAICS the problem Heikki is worried about exists in 8.2+. If you stop recovery, edit recovery.conf to an earlier recovery target and then re-run recovery, it is possible that data that would not exist until after the (new) recovery point has made its way to disk. The code in 8.4 does a few things to improve that, but the base problem persists and revoking code won't change that. We should describe the issue in the docs and leave it at that --- there is no particular reason to believe anybody would want to do such a thing. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4879: bgwriter fails to fsync the file in recovery mode
On Fri, 2009-06-26 at 00:37 +0300, Heikki Linnakangas wrote: Ok, I've committed the above fixes everyone agreed on. Sorry, but I haven't agreed to very much of what you just committed. There are some unnecessary and confusing hacks that really don't help anybody sort out why any of this has been done. At this stage of RC, I expected you to post the patch and have the other 2 people working actively on this issue review it before you commit. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4879: bgwriter fails to fsync the file in recovery mode
On Thu, 2009-06-25 at 18:40 -0400, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: On Fri, 2009-06-26 at 00:37 +0300, Heikki Linnakangas wrote: Ok, I've committed the above fixes everyone agreed on. At this stage of RC, I expected you to post the patch and have the other 2 people working actively on this issue review it before you commit. We're going to slip release a day to allow time to review this properly. Given that, I have no problem with the proposed fix being in CVS --- it makes it a shade easier for other people to test it. I wasn't suggesting that we post the patch and leave it for days, I was thinking more of the 3 people actively working together on the problem looking at the code before its committed. Describing it as stuff we agreed seems particularly strange in that light. On the patch, the kluge to set InRecovery is unnecessary and confusing. If I submitted that you'd kick my ass all around town. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4879: bgwriter fails to fsync the file in recovery mode
On Thu, 2009-06-25 at 19:16 -0400, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: On the patch, the kluge to set InRecovery is unnecessary and confusing. I'll look into a better way to do that tonight. Did you have any specific improvement in mind? Yes, all already mentioned on this thread. I'm unable to spend time on this tomorrow, but I think the main elements of the solution are there now. The bug was found in my feature, so mea culpa. Thanks to everybody for helping identify and fix it. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4879: bgwriter fails to fsync the file in recovery mode
On Thu, 2009-06-25 at 20:25 -0400, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: On Thu, 2009-06-25 at 19:16 -0400, Tom Lane wrote: Simon Riggs si...@2ndquadrant.com writes: On the patch, the kluge to set InRecovery is unnecessary and confusing. I'll look into a better way to do that tonight. Did you have any specific improvement in mind? Yes, all already mentioned on this thread. After looking at this a bit more, the easy solution mentioned upthread doesn't work. We want the end-of-recovery checkpoint to be able to write WAL (obviously), but it *must not* try to call TruncateMultiXact or TruncateSUBTRANS, because those subsystems aren't initialized yet. The check in XLogFlush needs to behave as though InRecovery were true too. So the idea of testing RecoveryInProgress() rather than InRecovery cannot handle all these cases. Yes it can, but only if LocalRecoveryInProgress is set immediately before XLogInsert() in CreateCheckPoint(). Then it is correct. However, I still agree with your thought that having InRecovery true only in the process that's applying WAL records is probably the cleanest definition. And it also seems to me that having crystal-clear definitions of these states is essential, because not being clear about them is exactly what got us into this mess. Yes What I am thinking is that instead of the hack of clearing LocalRecoveryInProgress to allow the current process to write WAL, we should have a separate test function WALWriteAllowed() with a state variable LocalWALWriteAllowed, and the hack should set that state without playing any games with LocalRecoveryInProgress. Then RecoveryInProgress() remains true during the end-of-recovery checkpoint and we can condition the TruncateMultiXact and TruncateSUBTRANS calls on that. Meanwhile the various places that check RecoveryInProgress to decide if WAL writing is allowed should call WALWriteAllowed() instead. No need.
I have not yet gone through all the code sites to make sure this is a consistent way to do it, but we clearly need more states than we have now if we are to avoid weird overloadings of the state meanings. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs