Hi,

I found the cause of the problem. I think that Horiguchi's discovery is a 
different problem...
The problem is in CreateRestartPoint. In recovery mode, PG should not write WAL,
because it does not know the location of the latest WAL file.
But in this case, PG (standby) writes a WAL file at a restartpoint during 
archive recovery.
In recovery mode, I think a restartpoint should write only MinRecoveryPoint.

Here is the standby's pg_xlog directory when the problem occurred.
[mitsu-ko@localhost postgresql-9.2.4-c]$ ls Standby/pg_xlog/
000000020000000000000003  000000020000000000000007  00000002000000000000000B  00000003.history
000000020000000000000004  000000020000000000000008  00000002000000000000000C  00000003000000000000000E
000000020000000000000005  000000020000000000000009  00000002000000000000000D  00000003000000000000000F
000000020000000000000006  00000002000000000000000A  00000002000000000000000E  archive_status

Here is the standby's log from when the problem occurred.
[Standby] 2013-04-26 04:26:44 EDT DEBUG:  00000: attempting to remove WAL segments older than log file 000000030000000000000002
[Standby] 2013-04-26 04:26:44 EDT LOCATION:  RemoveOldXlogFiles, xlog.c:3568
[Standby] 2013-04-26 04:26:44 EDT DEBUG:  00000: recycled transaction log file "000000010000000000000002"
[Standby] 2013-04-26 04:26:44 EDT LOCATION:  RemoveOldXlogFiles, xlog.c:3607
[Standby] 2013-04-26 04:26:44 EDT DEBUG:  00000: recycled transaction log file "000000020000000000000002"
[Standby] 2013-04-26 04:26:44 EDT LOCATION:  RemoveOldXlogFiles, xlog.c:3607
[Standby] 2013-04-26 04:26:44 EDT LOG:  00000: restartpoint complete: wrote 9 buffers (0.2%); 0 transaction log file(s) added, 0 removed, 2 recycled; write=0.601 s, sync=1.178 s, total=1.781 s; sync files=3, longest=1.176 s, average=0.392 s
[Standby] 2013-04-26 04:26:44 EDT LOCATION:  LogCheckpointEnd, xlog.c:7893
[Standby] 2013-04-26 04:26:44 EDT LOG:  00000: recovery restart point at 0/90FE448
[Standby] 2013-04-26 04:26:44 EDT DETAIL:  last completed transaction was at log time 2013-04-26 04:25:53.203725-04
[Standby] 2013-04-26 04:26:44 EDT LOCATION:  CreateRestartPoint, xlog.c:8601
[Standby] 2013-04-26 04:26:44 EDT LOG:  00000: restartpoint starting: xlog
[Standby] 2013-04-26 04:26:44 EDT LOCATION:  LogCheckpointStart, xlog.c:7821
cp: cannot stat `../arc/00000003000000000000000F': No such file or directory
[Standby] 2013-04-26 04:26:44 EDT DEBUG:  00000: could not restore file "00000003000000000000000F" from archive: return code 256
[Standby] 2013-04-26 04:26:44 EDT LOCATION:  RestoreArchivedFile, xlog.c:3323
[Standby] 2013-04-26 04:26:44 EDT LOG:  00000: unexpected pageaddr 0/2000000 in log file 0, segment 15, offset 0
[Standby] 2013-04-26 04:26:44 EDT LOCATION:  ValidXLOGHeader, xlog.c:4395
cp: cannot stat `../arc/00000003000000000000000F': No such file or directory
[Standby] 2013-04-26 04:26:44 EDT DEBUG:  00000: could not restore file "00000003000000000000000F" from archive: return code 256

During recovery, PG normally looks for the WAL file in the archive first.
If the proper WAL file is not there (restore_command fails), it searches the 
pg_xlog directory next.
Normally, pg_xlog does not contain a WAL file with that name.
But in this case, pg_xlog does contain a WAL file with the proper name (although 
its contents are wrong and invalid).
So recovery fails and the standby is broken.
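
To illustrate why the file in pg_xlog is rejected, here is a minimal sketch 
that compares a segment file name with the page address found in its first 
page header. It assumes the 9.2 naming scheme (8 hex digits each for timeline, 
log id, and segment) and the default 16 MB segment size. The type and function 
names are hypothetical helpers for this example, not PostgreSQL code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumption: default 16 MB WAL segment size, as in a stock 9.2 build. */
#define WAL_SEG_SIZE 0x1000000u

/* Hypothetical helper type, not a PostgreSQL structure. */
typedef struct
{
    uint32_t tli;    /* timeline ID (hex digits 1-8)    */
    uint32_t logid;  /* log file id (hex digits 9-16)   */
    uint32_t seg;    /* segment no. (hex digits 17-24)  */
} WalSegName;

/* Split a 24-hex-digit segment file name; returns 1 on success, 0 otherwise. */
int parse_wal_seg_name(const char *name, WalSegName *out)
{
    unsigned int tli, logid, seg;

    assert(name != NULL && out != NULL);
    /* Names such as "00000003.history" are not segment files. */
    if (strlen(name) != 24 || strspn(name, "0123456789ABCDEFabcdef") != 24)
        return 0;
    if (sscanf(name, "%8X%8X%8X", &tli, &logid, &seg) != 3)
        return 0;
    out->tli = tli;
    out->logid = logid;
    out->seg = seg;
    return 1;
}

/* Does the page address (xlogid/xrecoff) at offset 0 belong to this file? */
int page_addr_matches_name(const char *name, uint32_t xlogid, uint32_t xrecoff)
{
    WalSegName n;

    if (!parse_wal_seg_name(name, &n))
        return 0;
    return n.logid == xlogid && n.seg == xrecoff / WAL_SEG_SIZE;
}
```

For the file in the log above, page_addr_matches_name("00000003000000000000000F", 
0, 0x2000000) returns 0: the name says segment 15 (0x0F), but pageaddr 0/2000000 
belongs to segment 2. That would be consistent with one of the recycled 
segment-2 files having been renamed to the future segment name.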

So I changed CreateRestartPoint at the branch that decides whether to update 
only MinRecoveryPoint.
It seems to fix this problem, but the attached patch is a simple one.


Best regards,
--
NTT Open Source Software Center
Mitsumasa KONDO
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 5452ae1..ae4bcd8 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7371,7 +7371,8 @@ CreateRestartPoint(int flags)
         * restartpoint. It's assumed that flushing the buffers will do that as a
         * side-effect.
         */
-       if (XLogRecPtrIsInvalid(lastCheckPointRecPtr) ||
+       if (( ControlFile->state == DB_IN_ARCHIVE_RECOVERY && RecoveryInProgress()) ||
+               XLogRecPtrIsInvalid(lastCheckPointRecPtr) ||
                lastCheckPoint.redo <= ControlFile->checkPointCopy.redo)
        {
                ereport(DEBUG2,
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)