Re: [HACKERS] [bug fix] Suppress "autovacuum: found orphan temp table" message

MauMau Tue, 22 Jul 2014 06:15:24 -0700

From: "Andres Freund" <and...@2ndquadrant.com>

On 2014-07-22 19:13:56 +0900, MauMau wrote:
But this is true if restart_after_crash = on in postgresql.conf, because the crash restart only occurs in that case. However, in HA cluster, whether it is shared-disk or replication, restart_after_crash is set to off, isn't it?
In almost all setups I've seen it's set to on, even in HA scenarios.

I'm afraid that's because people don't notice the existence or purpose of this parameter. The 9.1 release note says:

Add restart_after_crash setting which disables automatic server restart after a backend crash (Robert Haas)
This allows external cluster management software to control whether the database server restarts or not.

Reading this, I guess the parameter was introduced, and should be used, for HA environments controlled by the clusterware. Restarting the database server on the same machine may fail, or the restarted server may fail again, due to the broken hardware components, so I guess it was considered better to let the clusterware determine what to do.

Moreover, as the comment says, the behavior of keeping leftover temp files
is for debugging by developers.  It's not helpful for users, is it?  I
thought messages of DEBUG level are more appropriate, because the behavior is
for debugging purposes.
GRR. That doesn't change the fact that there'll be files left over after
a crash restart.

Yes... that's a source of headache. But please understand that there's a problem -- trying to leave temp relations just for debugging is causing a flood of messages, which the customer is actually concerned about.

I think you're making lots of noise over a trivial log message.

Maybe so, and I hope so. I may be too nervous about what the customer will ask and/or request next. If they request something similar to what I proposed here, let me consult you again.

Could you please reconsider this?


No. Just removing a warning isn't the way to solve this. If you want to
improve things you'll actually need to improve things not just stick
your head into the sand.

I have a few ideas below, but none of them seems better than the original proposal. What do you think?

1. startup process deletes the catalog entries and data files of leftover temp relations at the end of recovery.
This is probably difficult, impossible or undesirable, because the startup process cannot access system catalogs. Even if it's possible, it is against the developers' desire to leave temp relation files for debugging.

2. autovacuum launcher deletes the catalog entries and data files of leftover temp relations during its initialization.
This may be possible, but it is against the developers' desire to leave temp relation files for debugging.

3. Emit the "orphan temp relation" message only when the associated data file actually exists.
autovacuum workers check if the temp relation file is left over with stat(). If not, delete the catalog entry in pg_class silently.
This sounds reasonable because the purpose of the message is to notify users of potential disk space shortage. In the streaming replication case, no data files should exist on the promoted new primary, so no messages should be emitted.
However, in the shared-disk HA cluster case, the temp relation files are left over on the shared disk, so this fix doesn't improve anything.

4. Emit the "orphan temp relation" message only when restart_after_crash is on.

i.e.
 ereport(restart_after_crash ? LOG : DEBUG1, ...
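
To make ideas 3 and 4 a bit more concrete, here is a minimal standalone sketch in plain C. It is not a patch against autovacuum.c and does not use any PostgreSQL headers. The names SketchLogLevel, orphan_message_level() and temp_relation_file_exists() are hypothetical, and relpath is assumed to be an already-built path to the temp relation's data file. Treating any stat() failure other than ENOENT as "the file may still exist" is also my assumption, chosen so that a catalog entry is not dropped silently when the check itself is inconclusive.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/stat.h>

/* Stand-ins for the server's LOG and DEBUG1 levels (illustrative only). */
typedef enum
{
    SKETCH_DEBUG1,
    SKETCH_LOG
} SketchLogLevel;

/*
 * Idea 4: report the orphan temp relation at LOG only when
 * restart_after_crash is on; otherwise demote the message to DEBUG1.
 */
static SketchLogLevel
orphan_message_level(bool restart_after_crash)
{
    return restart_after_crash ? SKETCH_LOG : SKETCH_DEBUG1;
}

/*
 * Idea 3: check with stat() whether the temp relation's data file is
 * still on disk.  Returns false only when stat() reports ENOENT; any
 * other failure is treated as "may exist" (an assumption of this sketch).
 */
static bool
temp_relation_file_exists(const char *relpath)
{
    struct stat st;

    if (stat(relpath, &st) == 0)
        return true;

    return errno != ENOENT;
}
```

In a real worker, the result of temp_relation_file_exists() would decide whether to emit the message at all, and orphan_message_level() would supply the first argument of the ereport() call shown above.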


Regards
MauMau



