I added an Assert(DatabasePath != NULL) to
RelationCacheInitFilePreInvalidate() and didn't see a single crash when
running the tests.  I thought that adding a "VACUUM FREEZE pg_class"
in the recovery tests (where there is a standby) ought to do it, but it
doesn't.  What's the deal there?

Here are some proposed changes.  Some of these comment edits are WIP :-)

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/utils/cache/inval.c b/src/backend/utils/cache/inval.c
index 0d6100fb08..8a8620943f 100644
--- a/src/backend/utils/cache/inval.c
+++ b/src/backend/utils/cache/inval.c
@@ -872,6 +872,8 @@ 
ProcessCommittedInvalidationMessages(SharedInvalidationMessage *msgs,
                                                                         int 
nmsgs, bool RelcacheInitFileInval,
                                                                         Oid 
dbid, Oid tsid)
 {
+       Assert(InRecovery);
+
        if (nmsgs <= 0)
                return;
 
@@ -884,12 +886,13 @@ 
ProcessCommittedInvalidationMessages(SharedInvalidationMessage *msgs,
                         dbid);
 
                /*
-                * RelationCacheInitFilePreInvalidate, when the invalidation 
message
-                * is for a specific database, requires DatabasePath to be set, 
but we
-                * should not use SetDatabasePath during recovery, since it is
-                * intended to be used only once by normal backends.  Hence, a 
quick
-                * hack: set DatabasePath directly then unset after use.
+                * When the invalidation message is for a specific database,
+                * RelationCacheInitFilePreInvalidate requires DatabasePath to 
be set,
+                * but we're not allowed to use SetDatabasePath during recovery 
(since
+                * it is intended to be used by normal backends).  Hence, a 
quick hack:
+                * set DatabasePath directly then unset after use.
                 */
+               Assert(!DatabasePath);  /* don't clobber an existing value */
                if (OidIsValid(dbid))
                        DatabasePath = GetDatabasePath(dbid, tsid);
 
diff --git a/src/backend/utils/cache/relcache.c 
b/src/backend/utils/cache/relcache.c
index aa4427724d..71b2212cbd 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -1934,6 +1934,11 @@ RelationIdGetRelation(Oid relationId)
                                RelationClearRelation(rd, true);
 
                        /*
+                        * Normal entries are valid by now -- except nailed 
ones when
+                        * loaded before relcache initialization.  There isn't 
enough
+                        * infrastructure yet to do pg_class lookups, but we 
need their
+                        * rd_rel entries to be updated, so we let these 
through.
+                        */
                         * Normally entries need to be valid here, but before 
the relcache
                         * has been initialized, not enough infrastructure 
exists to
                         * perform pg_class lookups. The structure of such 
entries doesn't
@@ -2346,8 +2351,7 @@ RelationClearRelation(Relation relation, bool rebuild)
        RelationCloseSmgr(relation);
 
        /*
-        * Treat nailed-in system relations separately, they always need to be
-        * accessible, so we can't blow them away.
+        * We cannot blow away nailed-in relations, so treat them especially.
         */
        if (relation->rd_isnailed)
        {
@@ -5942,7 +5946,8 @@ write_relcache_init_file(bool shared)
         * wrote out was up-to-date.)
         *
         * This mustn't run concurrently with the code that unlinks an init file
-        * and sends SI messages, so grab a serialization lock for the duration.
+        * and sends SI messages (see RelationCacheInitFilePreInvalidate), so 
grab
+        * a serialization lock for the duration.
         */
        LWLockAcquire(RelCacheInitLock, LW_EXCLUSIVE);
 
@@ -6061,6 +6066,10 @@ RelationHasUnloggedIndex(Relation rel)
  * changed one or more of the relation cache entries that are kept in the
  * local init file.
  *
+ * When DatabasePath is set, both the init file for that database and the
+ * shared (global) init files are to be removed; otherwise only the latter is.
+ * This is useful during recovery (XXX really?)
+ *
  * To be safe against concurrent inspection or rewriting of the init file,
  * we must take RelCacheInitLock, then remove the old init file, then send
  * the SI messages that include relcache inval for such relations, and then
@@ -6180,9 +6189,9 @@ unlink_initfile(const char *initfilename, int elevel)
 {
        if (unlink(initfilename) < 0)
        {
-               /* It might not be there, but log any error other than ENOENT */
+               /* It might not be there, but report any error other than 
ENOENT */
                if (errno != ENOENT)
-                       ereport(ERROR,
+                       ereport(elevel,
                                        (errcode_for_file_access(),
                                         errmsg("could not remove cache file 
\"%s\": %m",
                                                        initfilename)));

Reply via email to