I added an Assert(DatabasePath != NULL) to
RelationCacheInitFilePreInvalidate() and didn't see a single crash when
running the tests. I thought that adding a "VACUUM FREEZE pg_class"
in the recovery tests (where there is a standby) ought to do it, but it
doesn't. What's the deal there?
Here are some proposed changes. Some of these comment edits are WIP :-)
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/utils/cache/inval.c b/src/backend/utils/cache/inval.c
index 0d6100fb08..8a8620943f 100644
--- a/src/backend/utils/cache/inval.c
+++ b/src/backend/utils/cache/inval.c
@@ -872,6 +872,8 @@
ProcessCommittedInvalidationMessages(SharedInvalidationMessage *msgs,
int
nmsgs, bool RelcacheInitFileInval,
Oid
dbid, Oid tsid)
{
+ Assert(InRecovery);
+
if (nmsgs <= 0)
return;
@@ -884,12 +886,13 @@
ProcessCommittedInvalidationMessages(SharedInvalidationMessage *msgs,
dbid);
/*
- * RelationCacheInitFilePreInvalidate, when the invalidation
message
- * is for a specific database, requires DatabasePath to be set,
but we
- * should not use SetDatabasePath during recovery, since it is
- * intended to be used only once by normal backends. Hence, a
quick
- * hack: set DatabasePath directly then unset after use.
+ * When the invalidation message is for a specific database,
+ * RelationCacheInitFilePreInvalidate requires DatabasePath to
be set,
+ * but we're not allowed to use SetDatabasePath during recovery
(since
+ * it is intended to be used by normal backends). Hence, a
quick hack:
+ * set DatabasePath directly then unset after use.
*/
+ Assert(!DatabasePath); /* don't clobber an existing value */
if (OidIsValid(dbid))
DatabasePath = GetDatabasePath(dbid, tsid);
diff --git a/src/backend/utils/cache/relcache.c
b/src/backend/utils/cache/relcache.c
index aa4427724d..71b2212cbd 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1934,6 +1934,11 @@ RelationIdGetRelation(Oid relationId)
RelationClearRelation(rd, true);
/*
+ * Normal entries are valid by now -- except nailed
ones when
+ * loaded before relcache initialization. There isn't
enough
+ * infrastructure yet to do pg_class lookups, but we
need their
+ * rd_rel entries to be updated, so we let these
through.
+ */
* Normally entries need to be valid here, but before
the relcache
* has been initialized, not enough infrastructure
exists to
* perform pg_class lookups. The structure of such
entries doesn't
@@ -2346,8 +2351,7 @@ RelationClearRelation(Relation relation, bool rebuild)
RelationCloseSmgr(relation);
/*
- * Treat nailed-in system relations separately, they always need to be
- * accessible, so we can't blow them away.
+ * We cannot blow away nailed-in relations, so treat them especially.
*/
if (relation->rd_isnailed)
{
@@ -5942,7 +5946,8 @@ write_relcache_init_file(bool shared)
* wrote out was up-to-date.)
*
* This mustn't run concurrently with the code that unlinks an init file
- * and sends SI messages, so grab a serialization lock for the duration.
+ * and sends SI messages (see RelationCacheInitFilePreInvalidate), so
grab
+ * a serialization lock for the duration.
*/
LWLockAcquire(RelCacheInitLock, LW_EXCLUSIVE);
@@ -6061,6 +6066,10 @@ RelationHasUnloggedIndex(Relation rel)
* changed one or more of the relation cache entries that are kept in the
* local init file.
*
+ * When DatabasePath is set, both the init file for that database and the
+ * shared (global) init files are to be removed; otherwise only the latter is.
+ * This is useful during recovery (XXX really?)
+ *
* To be safe against concurrent inspection or rewriting of the init file,
* we must take RelCacheInitLock, then remove the old init file, then send
* the SI messages that include relcache inval for such relations, and then
@@ -6180,9 +6189,9 @@ unlink_initfile(const char *initfilename, int elevel)
{
if (unlink(initfilename) < 0)
{
- /* It might not be there, but log any error other than ENOENT */
+ /* It might not be there, but report any error other than
ENOENT */
if (errno != ENOENT)
- ereport(ERROR,
+ ereport(elevel,
(errcode_for_file_access(),
errmsg("could not remove cache file
\"%s\": %m",
initfilename)));