[ADMIN] Recovering a deleted database problem

2007-01-05 Thread Andy Shellam (Mailing Lists)
Earlier this evening I made the usual mistake someone makes at some 
point in their lives - and dropped a database thinking I didn't need it, 
then realised later I did.
So, because I have DDL statement logging turned on, I could find the 
exact time/date it happened, and attempted to restore from my 
file-system level backup taken at 2am this morning, and rolled forward 
all my WAL logs archived throughout the day (98 files.)


In the recovery.conf, I specified the date/time from the log file that 
the database was dropped and set recovery_target_inclusive so it would 
not include this transaction.


However the restore has finished, and PostgreSQL thinks the database is 
there, but the relevant data directory in base is missing - so it's 
removed the file-system objects but not the system database entry.
I've checked the base backup, and this directory is in the backup, hence 
it has been removed at some point during the restore.


What I'm going to do now is to set the recovery target to about a minute 
earlier to make sure the transaction has not started when the recovery 
finishes - but I'm just asking if I'm missing something obvious, as this 
is the first time I've done a restore from WAL logs.


(Note, after writing this, I tried restoring to a minute earlier (ie. 
18:57:40) and still have the same problem.
As a quick fix, I copied the base/35290 directory from the backup before 
I had run the recovery - does anyone know any caveats to doing this, as 
the DB seems to be working OK?)


My recovery.conf is:

# PostgreSQL database recovery config file

restore_command = 'cp /path/to/wal/archive/%f %p'
recovery_target_time = '2007-01-04 18:58:40 -00:00'
recovery_target_inclusive = 'false'

The log entry where I discovered the date/time is:

2007-01-04 18:58:40 GMT 84.45.66.158 postgres postgresql - LOG:  
statement: DROP DATABASE [dbname];


The error I get when I try to connect to [dbname] after the restore is:

FATAL: database [dbname] does not exist
DETAIL: The database subdirectory base/35290 is missing.

But the [dbname] database is still in the system catalogues:

/usr/local/pgsql/bin/psql -U postgresql -d postgres -c select datname 
from pg_database;

   datname

postgres
[db1]
template1
template0
[dbname]
[db2]
[db3]
[db4]
(8 rows)


This is PostgreSQL 8.1.5 on FreeBSD 6.1.

Many thanks,

--
Andy Shellam
NetServe Support Team

the Mail Network
an alternative in a standardised world


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [ADMIN] Recovering a deleted database problem

2007-01-05 Thread Tom Lane
Andy Shellam (Mailing Lists) [EMAIL PROTECTED] writes:
 (Note, after writing this, I tried restoring to a minute earlier (ie. 
 18:57:40) and still have the same problem.

The PITR recovery process in effect rolls forward until it finds
a transaction-commit record = the specified time.  Now for normal
database operations, stopping just short of the commit of the
transaction is enough to ensure that the transaction has no effect.
But for the XLOG_DBASE_DROP record, not so --- replaying that means
rm -rf base/whatever.  So you've got to make sure the replay stops
before it reaches that record, and that means you need a stop time
= the commit time of some *prior* transaction.  I suppose this was
a slow time of day and you didn't have any other commits in the prior
minute :-( ... so take another look in the log and see what was the
last commit before that, and use that time.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [ADMIN] Recovering a deleted database problem

2007-01-05 Thread Andy Shellam (Mailing Lists)
Thanks for the info Tom, too much data will have been entered into the 
other databases in the cluster by now so I cannot give it another shot 
on that server, plus all of yesterday's WAL logs will have been purged 
by now by the daily backup routine.


Is it enough to simply have re-copied in the base/xxx directory from the 
base backup, after the PITR recovery had completed (obviously any 
changes made to that database since the base backup won't have been 
restored but thankfully it's backed up nightly and doesn't change too 
often :-) )  All CRUD operations seem to be working on that database OK 
and the app that (I now know) uses it hasn't complained.


What I'll probably do is try to simulate the same process again on a 
different machine to get myself a bit more familiar.  Is there any other 
situations you can think of where this may also be relevant, or is it 
just when dropping a complete database?


Many thanks,

Andy.

Tom Lane wrote:

Andy Shellam (Mailing Lists) [EMAIL PROTECTED] writes:
  
(Note, after writing this, I tried restoring to a minute earlier (ie. 
18:57:40) and still have the same problem.



The PITR recovery process in effect rolls forward until it finds
a transaction-commit record = the specified time.  Now for normal
database operations, stopping just short of the commit of the
transaction is enough to ensure that the transaction has no effect.
But for the XLOG_DBASE_DROP record, not so --- replaying that means
rm -rf base/whatever.  So you've got to make sure the replay stops
before it reaches that record, and that means you need a stop time
= the commit time of some *prior* transaction.  I suppose this was
a slow time of day and you didn't have any other commits in the prior
minute :-( ... so take another look in the log and see what was the
last commit before that, and use that time.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

!DSPAM:37,459e6a32137101648020742!


  



--
Andy Shellam
NetServe Support Team

the Mail Network
an alternative in a standardised world



---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [ADMIN] Recovering a deleted database problem

2007-01-05 Thread Tom Lane
Andy Shellam (Mailing Lists) [EMAIL PROTECTED] writes:
 Is it enough to simply have re-copied in the base/xxx directory from the 
 base backup, after the PITR recovery had completed (obviously any 
 changes made to that database since the base backup won't have been 
 restored but thankfully it's backed up nightly and doesn't change too 
 often :-) )

Well, I'd be a little worried about whether the base backup was
self-consistent, but if it was taken at a time where the DB was idle
then you can probably get away with this.

 What I'll probably do is try to simulate the same process again on a 
 different machine to get myself a bit more familiar.  Is there any other 
 situations you can think of where this may also be relevant, or is it 
 just when dropping a complete database?

AFAIK the only operations that have non-rollbackable side effects are
CREATE/DROP DATABASE and CREATE/DROP TABLESPACE.  For any of these,
you'd end up with inconsistent state if you try to stop replay just
before the commit record.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq