[ADMIN] Errors on VACUUM
I received the following error from a VACUUM ANALYZE: NOTICE: FlushRelationBuffers(place, 454): block 231 is referenced (private 0, global 4) FATAL 1: VACUUM (vc_repair_frag): FlushRelationBuffers returned -2 and psql lost the connection right after that. This was repeatable, after the first occurrence I re-started the postmaster and tried again, same result. The particular table in question is static data, that is it was originally filled in once by COPY and there were no subsequent inserts or deletes. But I had just added a column to the table and updated it from another table to eliminate a frequently-done join, that was the reason for doing the VACUUM. Since I had the text file with the complete table contents from the original COPY, I decided to re-build the table, so next I did DROP TABLE, and I got this: NOTICE: Buffer Leak: [004] (freeNext=-3, freePrev=-3, relname=place_pid, blockNum=1, flags=0xc, refcount=2 -1) NOTICE: Buffer Leak: [005] (freeNext=-3, freePrev=-3, relname=place, blockNum=231, flags=0xc, refcount=4 -1) NOTICE: Buffer Leak: [008] (freeNext=-3, freePrev=-3, relname=place, blockNum=85, flags=0xc, refcount=3 -1) NOTICE: Buffer Leak: [011] (freeNext=-3, freePrev=-3, relname=place, blockNum=0, flags=0xc, refcount=2 -1) . . repeating about 20 more times with blockNum varying, and finally ending with the table being dropped. I then re-built the table from scratch and everything seems to be fine now, VACUUM no longer gives errors. Here is my system configuration: PostgreSQL 7.0.2 RedHat Linux 7.0 P-III 800, 768MB RAM, 80GB disk So I'm wondering what happened here, what might have caused the original error, if something else is potentially still corrupted, and if I should maybe re-build the whole database to be safe? Advice much appreciated! Bob Smith Hammett Edison, Inc. [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [ADMIN] Errors on VACUUM
Bob Smith [EMAIL PROTECTED] writes: PostgreSQL 7.0.2 Time to update ... I'm rather surprised that restarting the postmaster didn't make the error go away, but it's unlikely that anyone will care to investigate --- unless you can reproduce the problem in 7.1.3 or later. regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [ADMIN] Errors on VACUUM
On Friday, March 15, 2002, at 09:53 , Tom Lane wrote: Bob Smith [EMAIL PROTECTED] writes: PostgreSQL 7.0.2 Time to update ... I know, I know... But I'm about two weeks from bringing a whole new server on-line with the latest version of Postgres, so I don't want to hassle with upgrading the old one. I just need to keep it limping along for a little while longer. I'm rather surprised that restarting the postmaster didn't make the error go away, but it's unlikely that anyone will care to investigate --- unless you can reproduce the problem in 7.1.3 or later. So this isn't an error that would be caused by some sort of file problem? The server got new hard drives a few months ago and files went through a backup/restore with tar, I didn't use pg_dump/pg_restore, so it occurred to me that something might have been scrambled. But until this VACUUM failure, the only problem I've _ever_ had, before or after the drive upgrade, was one occurrence of the Buffer Leak error which I couldn't reproduce. Well, I guess I just keep my fingers crossed for a couple of weeks until I have my new server up. Thanks! Bob Smith Hammett Edison, Inc. [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [ADMIN] Errors on VACUUM
Bob Smith [EMAIL PROTECTED] writes: I'm rather surprised that restarting the postmaster didn't make the error go away, but it's unlikely that anyone will care to investigate --- unless you can reproduce the problem in 7.1.3 or later. So this isn't an error that would be caused by some sort of file problem? AFAICT what you have there is just dangling buffer reference counts; ie, something didn't release its reference to a disk page buffer in shared memory. We used to have quite a number of bugs of that ilk, though I've not heard of such lately. They're fairly harmless normally. It's not clear to me why a restart (which would naturally clear shared memory) didn't fix the problem. Perhaps you had some configuration on disk that tickled the original buffer-leak bug again each time you restarted; but I dunno what. VACUUM itself was never prone to such errors, so it's hard to see how a restart and immediate VACUUM would show the problem. Were you doing other stuff in between? Have you got any interesting stuff (like functional indexes, maybe) on this table? regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [ADMIN] Errors on VACUUM
On Friday, March 15, 2002, at 03:07 , Tom Lane wrote: It's not clear to me why a restart (which would naturally clear shared memory) didn't fix the problem. Er, um, (humble look), I just studied my syslog, and in fact the postmaster _didn't_ restart. I tried to restart it using the Red Hat control script from /etc/rc.d/init.d, but apparently that script lied to me when it said the restart succeeded. The syslog clearly shows the postmaster didn't actually restart until about 10 minutes later, _after_ the second error. So I was talking to the same postmaster every time. Arrggh. I'm reassured now that this is just a passing memory management issue and not a symptom of a more serious problem, I'm sure the error wouldn't have happened the second time if the postmaster had in fact restarted. Everything looks fine now so I'm going to stop worrying about it. Thanks much for your help! Bob Smith Hammett Edison, Inc. [EMAIL PROTECTED] ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster