Re: [Slony1-general] Please HELP - URGENT - transaction wraparound error

John Sidney-Woollett Sun, 30 Oct 2005 10:54:30 -0800

Andrew

You're right, I realised afterwards that they're not full vacuums.

There was another database (mail_lxtreme) that was unused (as far as I can tell) and which was not being vacuumed:


 SELECT datname, age(datfrozenxid) FROM pg_database;
   datname    |     age
--------------+-------------
 mail_lxtreme | -2074187459
 bp_live      |  1079895636
 template1    |  1076578064
 template0    | -2074187459
(4 rows)
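
A minimal sketch of how that output could be checked automatically. The one-billion threshold is an assumption taken from the usual advice to vacuum every database at least once per billion transactions, not a value from this thread, and the function names are hypothetical. A negative age is only flagged for investigation here; the sketch does not try to decide whether it means data loss.

```python
# Illustrative only: classify rows returned by
#   SELECT datname, age(datfrozenxid) FROM pg_database;
# The threshold below is an assumption, not a value from this thread.
VACUUM_DUE_AGE = 1_000_000_000


def classify_xid_age(age, due_age=VACUUM_DUE_AGE):
    """Return a coarse status for one database's age(datfrozenxid)."""
    if age < 0:
        # age() came back negative; this needs a human to look at it.
        return "NEGATIVE"
    if age >= due_age:
        # More transactions than the assumed safe interval since the
        # last database-wide vacuum.
        return "OVERDUE"
    return "OK"


def report(rows, due_age=VACUUM_DUE_AGE):
    """Map each database name to its status for (datname, age) rows."""
    return {name: classify_xid_age(age, due_age) for name, age in rows}


# The four rows shown in the query output above.
rows = [
    ("mail_lxtreme", -2074187459),
    ("bp_live", 1079895636),
    ("template1", 1076578064),
    ("template0", -2074187459),
]

for name, status in report(rows).items():
    print(f"{name:<14}{status}")
```

Keeping the threshold as a parameter makes it easy to alert earlier, for example at half the assumed interval.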

In the end, I did a moveset on all 6 sets from the (damaged) master to the slave. Then I shut down slon and postgres on the old master, deleted its data dir, and re-initdb'd it.


I removed the replication info on the surviving node by doing an uninstall.

I created a new cluster and subscribed to get all the data back onto the rebuilt server.

Later, when I'm feeling less tired, I'll switch over to reinstate the former master (as it is a much faster server).

I'm going to be a lot more careful when I add databases to ensure that they are always vacuumed periodically.

I'm also going to add a new nagios script to scan serverlog for any WARN or ERROR messages for the current day - this way I should get notice of a problem before it becomes a disaster!
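
A rough sketch of what such a check might look like. It assumes each serverlog line starts with an ISO date (YYYY-MM-DD), which depends on how the log prefix is configured on the server; the function names and the severity mapping (WARNING maps to the Nagios WARNING state, anything worse to CRITICAL) are illustrative choices, not an existing plugin.

```python
import datetime
import re

# Nagios plugin exit codes (standard plugin convention).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Severity tags PostgreSQL writes in front of a message.
SEVERITY = re.compile(r"\b(WARNING|ERROR|FATAL|PANIC):")


def scan_log(lines, day):
    """Return (exit_code, matching_lines) for log lines stamped with `day`.

    Assumption: every line of interest begins with the date in ISO form.
    Lines without that prefix (other days, continuation lines) are skipped.
    """
    prefix = day.isoformat()
    hits = []
    worst = OK
    for line in lines:
        if not line.startswith(prefix):
            continue
        match = SEVERITY.search(line)
        if not match:
            continue
        hits.append(line.rstrip("\n"))
        level = WARNING if match.group(1) == "WARNING" else CRITICAL
        worst = max(worst, level)
    return worst, hits


def main(path, today=None):
    """Print a one-line Nagios status and return the exit code."""
    today = today or datetime.date.today()
    try:
        with open(path, encoding="utf-8", errors="replace") as handle:
            code, hits = scan_log(handle, today)
    except OSError as exc:
        print(f"UNKNOWN - cannot read {path}: {exc}")
        return UNKNOWN
    label = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}[code]
    print(f"{label} - {len(hits)} matching line(s) in {path}")
    return code
```

A wrapper script would call `sys.exit(main(path))` with the real serverlog location so that Nagios sees the exit code.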


Thanks for your feedback.

John

Andrew Sullivan wrote:

On Sun, Oct 30, 2005 at 09:00:12AM +0000, John Sidney-Woollett wrote:

over 2 billion transactions
DETAIL:  You may have already suffered transaction-wraparound data loss.

We have cronscripts that perform FULL vacuums



Not on all your databases.  And anyway

# vacuum template1 every sunday
35 2 * * 7 /usr/local/pgsql/bin/vacuumdb --analyze --verbose template1

# vacuum live DB every day
35 5 * * * /usr/local/bin/psql -c "vacuum verbose analyze" -d bp_live -U postgres --output /home/postgres/cronscripts/live/vacuumfull.log



Those aren't full vacuums.  There must be some database in there that
you're not telling us about.  Do you have anything other than
template0, template1, and bp_live?  Also, has template0 always been
frozen?

2) What can I do to recover the data?



Nothing, save for restoring from old backups.

I can failover to the slave server, but what do I need to do to rebuild
the original database?



You'll need to rebuild it from scratch.  You could do a switchover
instead, but I think that's risky in this case.

Should I failover now?!! And then start rebuilding the old master
database (using slon, I presume)?



That's what I'd do.  It's just like adding a new node.

How do I stop this EVER happening again??!!!



Well, _something_ didn't get vacuumed in time.  Better find out what
that was.  I'm also extremely surprised you didn't see the warnings
in time -- are you sure you're not overlooking something important in
your logs?

A

