Re: [GENERAL] Corrupt database? 8.1/FreeBSD6.0

2007-01-18 Thread Alvaro Herrera
Tom Lane wrote: I wrote: ... but I suddenly fear that we've missed a fundamental point about pg_clog truncation. And WAL wraparound for that matter. To wit, a sufficiently long-lived temp table could contain old XIDs, and there's no way for anyone except the owning backend to clean them

Re: [GENERAL] Corrupt database? 8.1/FreeBSD6.0

2007-01-18 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes: Tom Lane wrote: But it seems that we need a band-aid for 8.1 and earlier. The simplest fix I can think of is for vacuum not to attempt to advance the datvacuumxid/datfrozenxid fields if it skipped over any temp tables of other backends. That's a bit

Re: [GENERAL] Corrupt database? 8.1/FreeBSD6.0

2007-01-13 Thread Jeff Amiel
Tom Lane [EMAIL PROTECTED] wrote: BTW, please don't do anything to try to correct the problem until we're pretty sure we understand how this happened --- we might ask you for more info. AFAICS this isn't having any bad effects except for bleats in your log file, so you can wait. Happened

Re: [GENERAL] Corrupt database? 8.1/FreeBSD6.0

2007-01-13 Thread Tom Lane
Jeff Amiel [EMAIL PROTECTED] writes: Jan 13 08:27:26 prod-app-1 postgres[92257]: [30259-1] jboss 92257 ERROR: could not access status of transaction 2107200825 Jan 13 08:27:26 prod-app-1 postgres[92257]: [30259-2] jboss 92257 DETAIL: could not open file pg_clog/07D9: No such file or

Re: [GENERAL] Corrupt database? 8.1/FreeBSD6.0

2007-01-13 Thread Jeff Amiel
Tom Lane [EMAIL PROTECTED] wrote: Really? Wow, *that's* an interesting thought. Is it likely that that temp table could contain many-hour-old data? Certainly...our connection pool used by jboss can have connections to postgres persisting for multiple days. (We're still looking for a way to

Re: [GENERAL] Corrupt database? 8.1/FreeBSD6.0

2007-01-13 Thread Tom Lane
I wrote: ... but I suddenly fear that we've missed a fundamental point about pg_clog truncation. And WAL wraparound for that matter. To wit, a sufficiently long-lived temp table could contain old XIDs, and there's no way for anyone except the owning backend to clean them out, or even

Re: [GENERAL] Corrupt database? 8.1/FreeBSD6.0

2007-01-12 Thread Jeff Amiel
Tom Lane [EMAIL PROTECTED] wrote:So apparently there's some row in template0 that didn't get marked committed before the pg_clog segment for it went away. Given 8.1's rather schizophrenic view of whether it can modify template0 or not, this is not too surprising, but I thought we'd put in some

Re: [GENERAL] Corrupt database? 8.1/FreeBSD6.0

2007-01-12 Thread Alvaro Herrera
Jeff Amiel wrote: Sure enough I did make autovacuum more agressive about 30-45 days ago (have to check the logs to find the exact date). Was originally whatever default settings that came out of the box with 8.1. Naptime is currently set to 60 seconds. Am I to assume that this

Re: [GENERAL] Corrupt database? 8.1/FreeBSD6.0

2007-01-12 Thread Jeff Amiel
Alvaro Herrera [EMAIL PROTECTED] wrote: Did you perchance connect to template0 (probably marking it as connectable in the process), made some modification there, and then mark it as non-connectable again, without executing VACUUM FREEZE on it? AFAICS we only execute VACUUM FREEZE on it, so we

Re: [GENERAL] Corrupt database? 8.1/FreeBSD6.0

2007-01-12 Thread Alvaro Herrera
Tom Lane wrote: Jeff Amiel [EMAIL PROTECTED] writes: We've downloaded/compiled pg_filedump, but are stumped as to what relation (or even what database) to start with. Turn up log_min_messages to DEBUG2 and you'll be able to see which table autovac is failing at. If I had to bet I'd bet

Re: [GENERAL] Corrupt database? 8.1/FreeBSD6.0

2007-01-12 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes: Hmm, that would mean an ANALYZE got done on template0, no? ... but AFAICT process_whole_db() always sets analyze=false. The thing that's bothering me is that I don't see any certainty that template0 is only processed via the process_whole_db() path. In

Re: [GENERAL] Corrupt database? 8.1/FreeBSD6.0

2007-01-12 Thread Alvaro Herrera
Tom Lane wrote: Alvaro Herrera [EMAIL PROTECTED] writes: Hmm, that would mean an ANALYZE got done on template0, no? ... but AFAICT process_whole_db() always sets analyze=false. The thing that's bothering me is that I don't see any certainty that template0 is only processed via the

Re: [GENERAL] Corrupt database? 8.1/FreeBSD6.0

2007-01-12 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes: Hmm, as far as I can tell, the database entry would not be created merely by a vacuum. The only way to create a database entry in pgstat is by calling pgstat_recv_tabstat(); and pgstat_report_tabstat is only called in postgres.c (not invoked via

Re: [GENERAL] Corrupt database? 8.1/FreeBSD6.0

2007-01-12 Thread Tom Lane
Jeff Amiel [EMAIL PROTECTED] writes: Any ideas on how should I move forward? Well, if the problem is indeed in pg_statistic, it'll be easy to repair (think TRUNCATE...). Have you turned up the logging level to find out? BTW, please don't do anything to try to correct the problem until we're

Re: [GENERAL] Corrupt database? 8.1/FreeBSD6.0

2007-01-12 Thread Jeff Amiel
Tom Lane [EMAIL PROTECTED] wrote: Well, if the problem is indeed in pg_statistic, it'll be easy to repair (think TRUNCATE...). Have you turned up the logging level to find out? BTW, please don't do anything to try to correct the problem until we're pretty sure we understand how this happened

[GENERAL] Corrupt database? 8.1/FreeBSD6.0

2007-01-11 Thread Jeff Amiel
PostgreSQL 8.1.2 on i386-portbld-freebsd6.0, compiled by GCC cc (GCC) 3.4.4 [FreeBSD] 20050518 Started seeing this in the logs this afternoon...scaring the begeezus out of me. Jan 11 19:20:19 prod-app-1 postgres[1752]: [5-1] 1752 ERROR: could not access status of transaction 1924023481 Jan

Re: [GENERAL] Corrupt database? 8.1/FreeBSD6.0

2007-01-11 Thread Jeff Amiel
Looking backwards in the logs we see it a few other times this month... (Autovacuum occurring just prior)...same transaction ID How could it be the same transaction ID from several days prior? Jan 2 03:05:04 prod-app-1 postgres[8524]: [4-1] 8524 LOG: autovacuum: processing database template0

Re: [GENERAL] Corrupt database? 8.1/FreeBSD6.0

2007-01-11 Thread Tom Lane
Jeff Amiel [EMAIL PROTECTED] writes: PostgreSQL 8.1.2 on i386-portbld-freebsd6.0, compiled by GCC cc (GCC) 3.4.4 [FreeBSD] 20050518 Jan 2 03:05:04 prod-app-1 postgres[8524]: [4-1] 8524 LOG: autovacuum: processing database template0 Jan 2 03:05:05 prod-app-1 postgres[8524]: [5-1] 8524

Re: [GENERAL] Corrupt database? 8.1/FreeBSD6.0

2007-01-11 Thread Tom Lane
Jeff Amiel [EMAIL PROTECTED] writes: We've downloaded/compiled pg_filedump, but are stumped as to what relation (or even what database) to start with. Turn up log_min_messages to DEBUG2 and you'll be able to see which table autovac is failing at. If I had to bet I'd bet on