Re: [GENERAL] Minor bug/inconveniance with restore from backup, using PITR base backup and archived wal files

2008-09-27 Thread Tommy Gildseth

Tom Lane wrote:
> Tommy Gildseth <[EMAIL PROTECTED]> writes:
>> After a bit of looking around, and with some help from the fine people
>> in #postgresql on freenode, I think I figured out what was going on.
>> The last wal archive file was 00010003009F, and after
>> finishing recovery, postgresql created the file 00020003009F
>> (ie. 0002 instead of 0001) in pg_xlog.
>
> It's customary for PG to "create" new XLOG segments by recycling old
> ones.
>
>> The wal-files were
>> archived read-only, and this file permission seemed to be carried over
>> to the new file created by postgresql in pg_xlog, causing the cluster to
>> fall over and die.
>
> I would say that the bug is in your restore script: it should have made
> sure that the files it copies into the xlog directory are given the
> right ownership/permissions.


Well, the restore command (script) is simply copied from the suggestion
in the manual (restore_command = 'cp /path/to/my/archived/wal/files/%f
"%p"'). In my opinion, it's not at all obvious that the last WAL file
needs read/write permissions set, and it's not documented anywhere on
http://www.postgresql.org/docs/current/static/continuous-archiving.html
that I can see.
There's also the inconsistency that postgresql knows to
recycle *and* chmod the file if it's originally located in the pg_xlog/
folder, but not if it comes from the WAL archive
folder. I guess it's more of a gotcha than a bug per se.
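
For anyone hitting the same gotcha, a restore script along the lines Tom
suggests might look like the sketch below. It is only an illustration, not
something from the manual: the script name restore_wal.sh, its install path,
the ARCHIVE_DIR variable and the choice of mode 600 are all assumptions.

```shell
#!/bin/sh
# restore_wal.sh (hypothetical name): copy one archived WAL file into place
# and make the copy writable by its owner. A plain cp would otherwise carry
# a read-only mode over from the archive to the new file.
#
# Assumed usage in recovery.conf (the install path is made up):
#   restore_command = '/usr/local/bin/restore_wal.sh %f "%p"'

# Archive location; the default is the placeholder path from the manual.
ARCHIVE_DIR="${ARCHIVE_DIR:-/path/to/my/archived/wal/files}"

restore_wal() {
    src="$ARCHIVE_DIR/$1"   # %f: file name the server asks for
    dst="$2"                # %p: destination path given by the server
    # A missing file has to produce a non-zero status so that the server
    # can tell it has reached the end of the archive.
    [ -f "$src" ] || return 1
    cp "$src" "$dst" || return 1
    # Owner read/write only, whatever the mode was in the archive.
    chmod 600 "$dst"
}

# Only act when called with the two arguments from restore_command.
if [ "$#" -eq 2 ]; then
    restore_wal "$1" "$2"
fi
```

The script has to run as the same OS user as the server for the chmod to
give the right result; fixing ownership as well would need more than this.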


--
Tommy Gildseth

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Minor bug/inconveniance with restore from backup, using PITR base backup and archived wal files

2008-09-25 Thread Tom Lane
Tommy Gildseth <[EMAIL PROTECTED]> writes:
> ... problem came at the end of the recovery, after the log line:

> [2008-09-23 15:33:14.764 CEST] [pgtest01] [:] [] [18393] [] LOG: archive 
> recovery complete

> followed immediately after by this line:

>   [2008-09-24 13:04:52.168 CEST] [pgtest01] [:] [] [13324] [] PANIC: 
> could not open file "pg_xlog/00020003009F" (log file 3, 
> segment 159): Permission denied

> and, the cluster shutting down.
> After a bit of looking around, and with some help from the fine people 
> in #postgresql on freenode, I think I figured out what was going on.
> The last wal archive file was 00010003009F, and after 
> finishing recovery, postgresql created the file 00020003009F 
> (ie. 0002 instead of 0001) in pg_xlog.

It's customary for PG to "create" new XLOG segments by recycling old
ones.

> The wal-files were 
> archived read-only, and this file permission seemed to be carried over 
> to the new file created by postgresql in pg_xlog, causing the cluster to 
> fall over and die.

I would say that the bug is in your restore script: it should have made
sure that the files it copies into the xlog directory are given the
right ownership/permissions.

regards, tom lane



Re: [GENERAL] Minor bug/inconveniance with restore from backup, using PITR base backup and archived wal files

2008-09-25 Thread Tommy Gildseth

Tommy Gildseth wrote:
> I've recently been testing our backup/restore procedures, and discovered
> a minor inconvenience.


Running 8.2.9 btw


--
Tommy Gildseth



[GENERAL] Minor bug/inconveniance with restore from backup, using PITR base backup and archived wal files

2008-09-25 Thread Tommy Gildseth
I've recently been testing our backup/restore procedures, and discovered 
a minor inconvenience.


I emptied out the data directory (on a test box) and restored it from a
backup. I made sure that pg_xlog and pg_xlog/archive_status were empty.
I then set up the recovery.conf file in the root of the data directory,
and pointed restore_command at the archived WAL files.


After starting postgresql, the cluster started restoring from the
archive logs, and everything looked fine and did what it should. The
problem came at the end of the recovery, after the log line:


[2008-09-23 15:33:14.764 CEST] [pgtest01] [:] [] [18393] [] LOG: archive 
recovery complete


followed immediately after by this line:

 [2008-09-24 13:04:52.168 CEST] [pgtest01] [:] [] [13324] [] PANIC: 
could not open file "pg_xlog/00020003009F" (log file 3, 
segment 159): Permission denied


and the cluster shut down.
After a bit of looking around, and with some help from the fine people
in #postgresql on freenode, I think I figured out what was going on.
The last wal archive file was 00010003009F, and after
finishing recovery, postgresql created the file 00020003009F
(ie. 0002 instead of 0001) in pg_xlog. The WAL files were
archived read-only, and this file permission seemed to be carried over
to the new file created by postgresql in pg_xlog, causing the cluster to
fall over and die. Changing permissions on the last WAL file in the
archive directory to read/write made this problem go away. Moving the
last WAL file from the archive folder to pg_xlog/, or having partial
archive log files in pg_xlog, also made this problem go away,
irrespective of whether the file was chmod'ed read-only or read/write.
Simply chmod'ing the WAL file in pg_xlog/ and trying to restart the
cluster did not work, and I found no way to recover from this
other than to start over again from the beginning.
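
As a side note on the file names: the PANIC message speaks of "log file 3,
segment 159", and the names of the last archived segment and the new one
differ only in the leading timeline field. A small sketch of how such a name
is put together, assuming the 8.x layout of three 8-digit hexadecimal fields
(timeline, log file, segment); wal_segment_name is a hypothetical helper, not
a PostgreSQL tool:

```shell
#!/bin/sh
# wal_segment_name TIMELINE LOG SEGMENT
# Hypothetical helper: print the WAL segment file name for the given numbers,
# assuming three zero-padded 8-digit hex fields as in the 8.x releases.
wal_segment_name() {
    printf '%08X%08X%08X\n' "$1" "$2" "$3"
}

wal_segment_name 1 3 159   # timeline 1: 00000001000000030000009F
wal_segment_name 2 3 159   # timeline 2: 00000002000000030000009F
```

The second name is the file the server tried to open in pg_xlog/ after
switching to the new timeline at the end of recovery.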



--
Tommy Gildseth
