Re: [GENERAL] Minor bug/inconveniance with restore from backup, using PITR base backup and archived wal files
Tom Lane wrote: Tommy Gildseth <[EMAIL PROTECTED]> writes: After a bit of looking around, and with some help from the fine people in #postgresql on freenode, I think I figured out what was going on. The last wal archive file was 00010003009F, and after finishing recovery, postgresql created the file 00020003009F (ie. 0002 instead of 0001) in pg_xlog. It's customary for PG to "create" new XLOG segments by recycling old ones. The wal-files were archived read-only, and this file permission seemed to be carried over to the new file created by postgresql in pg_xlog, causing the cluster to fall over and die. I would say that the bug is in your restore script: it should have made sure that the files it copies into the xlog directory are given the right ownership/permissions. Well, the restore command(script) is simply copied from the suggestion in the manual (restore_command = 'cp /path/to/my/archived/wal/files/%f "%p"'). In my opinion, it's not very obvious that the last wal file needs read/write permissions set, and it's certainly not documented anywhere on http://www.postgresql.org/docs/current/static/continuous-archiving.html that I can see. There's also the matter of the inconsistency that postgresql knows to recycle *and* chmod the file if it's originally located in pg_xlog/ folder, but not if it's originally located in the wal files archive folder. I guess it's more of a gotcha than a bug per se. -- Tommy Gildseth -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Minor bug/inconveniance with restore from backup, using PITR base backup and archived wal files
Tommy Gildseth <[EMAIL PROTECTED]> writes: > ... problem came at the end of the recovery, after the log line: > [2008-09-23 15:33:14.764 CEST] [pgtest01] [:] [] [18393] [] LOG: archive > recovery complete > followed immediately after by this line: > [2008-09-24 13:04:52.168 CEST] pgtest01] [:] [] [13324] [] PANIC: > could not open file "pg_xlog/00020003009F" (log file 3, > segment 159): Permission denied > and, the cluster shutting down. > After a bit of looking around, and with some help from the fine people > in #postgresql on freenode, I think I figured out what was going on. > The last wal archive file was 00010003009F, and after > finishing recovery, postgresql created the file 00020003009F > (ie. 0002 instead of 0001) in pg_xlog. It's customary for PG to "create" new XLOG segments by recycling old ones. > The wal-files were > archived read-only, and this file permission seemed to be carried over > to the new file created by postgresql in pg_xlog, causing the cluster to > fall over and die. I would say that the bug is in your restore script: it should have made sure that the files it copies into the xlog directory are given the right ownership/permissions. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Minor bug/inconveniance with restore from backup, using PITR base backup and archived wal files
Tommy Gildseth wrote: I've recently been testing our backup/restore procedures, and discovered a minor inconvenience. Running 8.2.9 btw -- Tommy Gildseth -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Minor bug/inconveniance with restore from backup, using PITR base backup and archived wal files
I've recently been testing our backup/restore procedures, and discovered a minor inconvenience. I emptied out the data directory(on a test-box), and restored it from a backup. I made sure that pg_xlog and pg_xlog/archive_status was empty. I then set up the recovery.conf file in the root of the data directory, and pointed restore_command in the direction of the archived wal files. After starting postgresql, the cluster started restoring from the archive logs, and everything looks fine, and doing what it should. The problem came at the end of the recovery, after the log line: [2008-09-23 15:33:14.764 CEST] [pgtest01] [:] [] [18393] [] LOG: archive recovery complete followed immediately after by this line: [2008-09-24 13:04:52.168 CEST] pgtest01] [:] [] [13324] [] PANIC: could not open file "pg_xlog/00020003009F" (log file 3, segment 159): Permission denied and, the cluster shutting down. After a bit of looking around, and with some help from the fine people in #postgresql on freenode, I think I figured out what was going on. The last wal archive file was 00010003009F, and after finishing recovery, postgresql created the file 00020003009F (ie. 0002 instead of 0001) in pg_xlog. The wal-files were archived read-only, and this file permission seemed to be carried over to the new file created by postgresql in pg_xlog, causing the cluster to fall over and die. Changing permissions on the last wal-file in the archive directory to read/write made this problem go away. Moving the last wal file from the archive folder to pg_xlog/ or, if I had partial archive log files in pg_xlog, also made this problem go away, irrespective of whether the file was chmod read only, or read/write. Simply chmoding the wal-file in pg_xlog/, and trying to restart the cluster did not work, and I found no other way to recover from this, other than to start over again from the beginning. -- Tommy Gildseth -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general