Hi,

On 2018-05-29 19:14:51 -0400, Alvaro Herrera wrote:
> I added an Assert(DatabasePath != NULL) to
> RelationCacheInitFilePreInvalidate() and didn't see a single crash when
> running the tests.  I thought that adding a "VACUUM FREEZE pg_class"
> in the recovery tests (where there is a standby) ought to do it, but it
> doesn't.  What's the deal there?
> 
> Here are some proposed changes.  Some of these comment edits are WIP :-)

I'm a bit confused by these changes - there seems to be some that look
like a borked diff?  And a number of others look pretty unrelated?

> -- 
> Álvaro Herrera                https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

> diff --git a/src/backend/utils/cache/inval.c b/src/backend/utils/cache/inval.c
> index 0d6100fb08..8a8620943f 100644
> --- a/src/backend/utils/cache/inval.c
> +++ b/src/backend/utils/cache/inval.c
> @@ -872,6 +872,8 @@ 
> ProcessCommittedInvalidationMessages(SharedInvalidationMessage *msgs,
>                                                                        int 
> nmsgs, bool RelcacheInitFileInval,
>                                                                        Oid 
> dbid, Oid tsid)
>  {
> +     Assert(InRecovery);
> +

Idk, seems unrelated.

>       if (nmsgs <= 0)
>               return;
>  
> @@ -884,12 +886,13 @@ 
> ProcessCommittedInvalidationMessages(SharedInvalidationMessage *msgs,
>                        dbid);
>  
>               /*
> -              * RelationCacheInitFilePreInvalidate, when the invalidation 
> message
> -              * is for a specific database, requires DatabasePath to be set, 
> but we
> -              * should not use SetDatabasePath during recovery, since it is
> -              * intended to be used only once by normal backends.  Hence, a 
> quick
> -              * hack: set DatabasePath directly then unset after use.
> +              * When the invalidation message is for a specific database,
> +              * RelationCacheInitFilePreInvalidate requires DatabasePath to 
> be set,
> +              * but we're not allowed to use SetDatabasePath during recovery 
> (since
> +              * it is intended to be used by normal backends).  Hence, a 
> quick hack:
> +              * set DatabasePath directly then unset after use.
>                */
> +             Assert(!DatabasePath);  /* don't clobber an existing value */
>               if (OidIsValid(dbid))
>                       DatabasePath = GetDatabasePath(dbid, tsid);

Same.


> diff --git a/src/backend/utils/cache/relcache.c 
> b/src/backend/utils/cache/relcache.c
> index aa4427724d..71b2212cbd 100644
> --- a/src/backend/utils/cache/relcache.c
> +++ b/src/backend/utils/cache/relcache.c
> @@ -1934,6 +1934,11 @@ RelationIdGetRelation(Oid relationId)
>                               RelationClearRelation(rd, true);
>  
>                       /*
> +                      * Normal entries are valid by now -- except nailed 
> ones when
> +                      * loaded before relcache initialization.  There isn't 
> enough
> +                      * infrastructure yet to do pg_class lookups, but we 
> need their
> +                      * rd_rel entries to be updated, so we let these 
> through.
> +                      */
>                        * Normally entries need to be valid here, but before 
> the relcache
>                        * has been initialized, not enough infrastructure 
> exists to
>                        * perform pg_class lookups. The structure of such 
> entries doesn't
> @@ -2346,8 +2351,7 @@ RelationClearRelation(Relation relation, bool rebuild)
>       RelationCloseSmgr(relation);

This sure looks like it's a syntax error? So I guess you might not have
staged the removal ports of the diff?


>       /*
> -      * Treat nailed-in system relations separately, they always need to be
> -      * accessible, so we can't blow them away.
> +      * We cannot blow away nailed-in relations, so treat them especially.
>        */

The former seems just as accurate, and is basically just the already
existing comment?


>       if (relation->rd_isnailed)
>       {
> @@ -5942,7 +5946,8 @@ write_relcache_init_file(bool shared)
>        * wrote out was up-to-date.)
>        *
>        * This mustn't run concurrently with the code that unlinks an init file
> -      * and sends SI messages, so grab a serialization lock for the duration.
> +      * and sends SI messages (see RelationCacheInitFilePreInvalidate), so 
> grab
> +      * a serialization lock for the duration.
>        */
>       LWLockAcquire(RelCacheInitLock, LW_EXCLUSIVE);

Unrelated?


> @@ -6061,6 +6066,10 @@ RelationHasUnloggedIndex(Relation rel)
>   * changed one or more of the relation cache entries that are kept in the
>   * local init file.
>   *
> + * When DatabasePath is set, both the init file for that database and the
> + * shared (global) init files are to be removed; otherwise only the latter 
> is.
> + * This is useful during recovery (XXX really?)
> + *

I'm confused?


>   * To be safe against concurrent inspection or rewriting of the init file,
>   * we must take RelCacheInitLock, then remove the old init file, then send
>   * the SI messages that include relcache inval for such relations, and then
> @@ -6180,9 +6189,9 @@ unlink_initfile(const char *initfilename, int elevel)
>  {
>       if (unlink(initfilename) < 0)
>       {
> -             /* It might not be there, but log any error other than ENOENT */
> +             /* It might not be there, but report any error other than 
> ENOENT */
>               if (errno != ENOENT)
> -                     ereport(ERROR,
> +                     ereport(elevel,
>                                       (errcode_for_file_access(),
>                                        errmsg("could not remove cache file 
> \"%s\": %m",
>                                                       initfilename)));

Included the elevel inclusion. Can't parse the difference between log
and report here?

Greetings,

Andres Freund

Reply via email to