[HACKERS] crash / data recovery issues

2008-02-06 Thread Robert Treat
I'm trying to do some data recovery on an 8.1.9 system.  The brief history is
that the system crashed and attempted xlog replay, but the replay failed.  I ran
pg_resetxlog to get something that would start up, and it now looks as if the
indexes on pg_class have become corrupt (i.e. REINDEX claims there are duplicate
rows, which do not show up when doing count() manipulations on the data).  As it
turns out, I can't drop these indexes either (the system refuses with a message
that the indexes are needed by the system).  This has kind of left the system in
an unworkable state.

I've tried to do a pg_dump, but I get a "schema with OID 96568 does not exist"
error.  The database has a number (~100) of temp schemas in it, so I suspected
that the problem was some object with broken dependencies referencing a temp
schema, but I looked through pg_depend for any referencing objects and found
none.  I also checked the respective *namespace columns of pg_type, pg_proc,
pg_class, pg_constraint, pg_operator, pg_opclass and pg_conversion, and found
no matches there either.  Any suggestions on what else might cause this, or how
to get past it?
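
For reference, the catalog check described above can be sketched as a small script that prints one query per catalog. This is only an illustration, not the exact queries that were run: the helper name `namespace_check_queries` is made up, and the column names are the standard *namespace columns of the catalogs listed above.

```python
# Catalogs named in the message, mapped to their schema-reference column.
NAMESPACE_COLUMNS = {
    "pg_type": "typnamespace",
    "pg_proc": "pronamespace",
    "pg_class": "relnamespace",
    "pg_constraint": "connamespace",
    "pg_operator": "oprnamespace",
    "pg_opclass": "opcnamespace",
    "pg_conversion": "connamespace",
}


def namespace_check_queries(schema_oid):
    """Return one SQL query per catalog listing rows that point at schema_oid.

    Hypothetical helper: it only builds query text, it does not connect
    to any database.
    """
    oid = int(schema_oid)  # reject anything that is not a plain integer
    return [
        f"SELECT oid, * FROM {catalog} WHERE {column} = {oid};"
        for catalog, column in NAMESPACE_COLUMNS.items()
    ]


if __name__ == "__main__":
    # 96568 is the schema OID from the pg_dump error quoted above.
    for query in namespace_check_queries(96568):
        print(query)
```

Each printed statement can be pasted into psql; any row returned would be an object still pointing at the missing schema.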

I also did some digging to find the original error on xlog replay, and it
was "failed to re-find parent key in 763769 for split pages 21032/21033".
I'm wondering whether this is something you can actually push past with
pg_resetxlog, or whether I need to run pg_resetxlog and pass in values from
before that error point (I guess essentially letting pg_resetxlog do a
lookup)... Thoughts?

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] crash / data recovery issues

2008-02-06 Thread Alvaro Herrera
Robert Treat wrote:
> I'm trying to do some data recovery on an 8.1.9 system.  The brief history is
> that the system crashed and attempted xlog replay, but the replay failed.  I ran
> pg_resetxlog to get something that would start up, and it now looks as if the
> indexes on pg_class have become corrupt (i.e. REINDEX claims there are duplicate
> rows, which do not show up when doing count() manipulations on the data).  As it
> turns out, I can't drop these indexes either (the system refuses with a message
> that the indexes are needed by the system).  This has kind of left the system in
> an unworkable state.

You can work out of it by starting a standalone server with system
indexes disabled (postgres -O -P, I think) and do a REINDEX on it (the
form of it that reindexes all system indexes -- I think it's REINDEX
DATABASE).

> I also did some digging to find the original error on xlog replay, and it
> was "failed to re-find parent key in 763769 for split pages 21032/21033".
> I'm wondering whether this is something you can actually push past with
> pg_resetxlog, or whether I need to run pg_resetxlog and pass in values from
> before that error point (I guess essentially letting pg_resetxlog do a
> lookup)... Thoughts?

You should be able to get out of that by reindexing that index.
(Actually, after you do a pg_resetxlog I think the best is to pg_dump
the whole thing and reload it.  That gives you at least the assurance
that your FKs are not b0rked)

-- 
Alvaro Herrera                        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] crash / data recovery issues

2008-02-06 Thread Tom Lane
Robert Treat writes:
> I'm trying to do some data recovery on an 8.1.9 system.
> ...
> I also did some digging to find the original error on xlog replay, and it
> was "failed to re-find parent key in 763769 for split pages 21032/21033".

Hmm, the only known cause of that was fixed in 8.1.6.  Don't suppose you made
a copy of everything before destroying the evidence with pg_resetxlog?
If you did, any chance I could get access to it?

regards, tom lane



Re: [HACKERS] crash / data recovery issues

2008-02-06 Thread Robert Treat
On Wednesday 06 February 2008 13:56, Alvaro Herrera wrote:
> Robert Treat wrote:
> > it looks as if the indexes on pg_class have become corrupt (i.e. REINDEX
> > claims there are duplicate rows, which do not show up when doing count()
> > manipulations on the data).  As it turns out, I can't drop these indexes
> > either (the system refuses with a message that the indexes are needed by
> > the system).  This has kind of left the system in an unworkable state.
>
> You can work out of it by starting a standalone server with system
> indexes disabled (postgres -O -P, I think) and do a REINDEX on it (the
> form of it that reindexes all system indexes -- I think it's REINDEX
> DATABASE).


Sorry, I should have mentioned that I already tried the above under
postgres -d 1 -P -O -D /path/to/data, but the reindex still complains (whether
I reindex the pg_class indexes directly or do a REINDEX SYSTEM).
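
A minimal sketch of that invocation, for anyone trying to reproduce it. It only assembles the command line and the statement to type at the backend prompt; it does not start a server. The function name and the database name `mydb` are placeholders, not taken from the actual system, and the flags are the ones quoted in this thread for 8.1.

```python
def standalone_reindex_plan(datadir, dbname, debug_level=1):
    """Build the standalone-backend command line and the REINDEX statement.

    Hypothetical helper. The flags are the ones quoted in this thread:
      -d N  debug level
      -P    ignore system indexes when reading the catalogs
      -O    allow modification of system table structure
      -D    data directory
    The database name is passed as the final argument.
    """
    argv = ["postgres", "-d", str(debug_level), "-P", "-O", "-D", datadir, dbname]
    # Statement to enter at the standalone backend prompt.
    sql = f"REINDEX SYSTEM {dbname};\n"
    return argv, sql


if __name__ == "__main__":
    # "mydb" is a placeholder database name.
    argv, sql = standalone_reindex_plan("/path/to/data", "mydb")
    print(" ".join(argv))
    print(sql, end="")
```

Running it prints the command line and the statement; in the case described here, that statement is the step that fails for pg_class.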

Personally, I was surprised to find that it wouldn't let me drop the indexes
in this mode, but that's a different story.  Oh, probably worth noting: I am
able to reindex other system tables this way, just not pg_class.

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
