Thanks Craig! Below is what I did:
1. pg_start_backup()
2. rsync the data dir
3. pg_stop_backup()

I believe the backup is valid because I was able to bring up the cluster
without any issues (of course with data loss).

+ve signs:
- I am able to bring up the cluster with the online backup, but only with
  some loss of data.

-ve signs, and things to watch for closely while testing backups:
- pg_clog files were not synced. I suspect they were being written at the
  time of the backup; I might have rsynced the data dir while the pg_clog
  files were half filled.
- Though the WAL archives are there, Postgres does not try to recover
  beyond the timestamp at which pg_clog was missing.
- Even if I replace the missing pg_clog files (which I did), Postgres asks
  for the corresponding WAL archive files.

Yes. What I learnt is that we need to ensure that all the pg_clog files
are fully copied as of the backup time. We cannot afford to miss any of
them.

Thanks
Venkat

On Wed, Aug 31, 2011 at 5:46 AM, Craig Ringer <ring...@ringerc.id.au> wrote:
> On 30/08/2011 6:59 PM, Venkat Balaji wrote:
>
>> Hello Everyone,
>>
>> I have a situation here -
>>
>> I am trying to restore the production online backup and recover the same.
>>
>> - I had initially rsynced the data directory (excluding pg_log) and then
>> tarred and zipped the same
>
> Did you do that after pg_start_backup() or on a stopped database server?
>
> If you did it on a running database server without first running
> pg_start_backup(), your backup is invalid.
>
> Personally I like to take my base backups from an LVM snapshot of the
> datadir just to be extra safe. That isn't necessary, though, and a regular
> rsync or tar or whatever of a datadir after pg_start_backup() is fine.
>
> Remember to run pg_stop_backup() afterwards.
>
>> - I got an error "unable to read <filename> from pg_clog location"
>> (file size is around 160K)
>
> ... from PostgreSQL, when you tried to start it?
>
> What emitted that error message?
>> What I understood is that rsync somehow missed syncing the files in
>> "pg_clog", so I manually copied the missing pg_clog file from production
>> and tried recovery.
>
> That won't work. You need a consistent snapshot of all the files in the
> data dir. You cannot just mix and match copies taken at different times.
>
> For efficiency reasons PostgreSQL will recycle used clog files. You can't
> just copy a file over and hope that because it has the same name, it still
> contains the data you want.
>
> Your backup *failed* at the point where you got an incomplete copy of the
> data directory.
>
>> Do I need to get that particular WAL archive which is from before the
>> online backup time?
>
> No, you need to get the missing clog files. If you cannot do that, try
> using pg_resetxlog, but be aware that it may lose transactions and can
> potentially cause corruption of tables and indexes.
>
>> By this experience, what I understand is that PostgreSQL stores committed
>> and uncommitted transactions in pg_xlog / WAL archive files, and
>> information (not the transaction data) about transaction commit status
>> is stored in pg_clog. Am I correct?
>
> That sounds right to me, but I don't know as much about how Pg stores
> things as I should.
>
>> I am in the process of designing a disaster recovery plan for our
>> production systems.
>
> Congratulations!
>
> Be extremely glad this didn't happen in a real recovery scenario. This is
> a marvellous example of why you should always test your backups - you
> actually did, and found a problem that would've been a critical issue if
> the backup were actually needed.
>
> --
> Craig Ringer
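For reference, the base-backup sequence discussed above (pg_start_backup,
rsync the data dir, pg_stop_backup) can be sketched roughly as the script
below. This is a hedged sketch, not a production script: the paths, the
backup label, and the psql user are placeholders I made up, and it assumes
the exclusive-backup API of the PostgreSQL versions current in this thread.
It cannot run without a live cluster; it only illustrates the ordering.

```shell
#!/bin/sh
# Sketch of the online base-backup sequence from this thread.
# PGDATA and BACKUP_DIR are placeholders - adjust for your system.
set -e
PGDATA=/var/lib/pgsql/data      # placeholder: your data directory
BACKUP_DIR=/backups/base        # placeholder: destination for the copy

# 1. Tell PostgreSQL a base backup is starting (performs a checkpoint).
psql -U postgres -c "SELECT pg_start_backup('base_backup');"

# 2. Copy the entire data directory. Excluding pg_log (the plain-text
#    server logs) is safe; excluding or only partially copying pg_clog
#    (transaction commit status) is not - a half-copied pg_clog file is
#    exactly what broke the recovery described in this thread.
rsync -a --exclude 'pg_log/*' "$PGDATA/" "$BACKUP_DIR/"

# 3. Tell PostgreSQL the backup is finished. In a real script this should
#    run even if the rsync fails partway (e.g. via a trap), so the server
#    is not left in backup mode.
psql -U postgres -c "SELECT pg_stop_backup();"
```

The similarly named pg_log and pg_clog directories are easy to confuse:
pg_log holds discardable text logs, while pg_clog is part of the cluster's
state and must be captured completely and consistently in the backup.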