On 2015-06-01 14:22:32 -0400, Robert Haas wrote:

> commit d33b4eb0167f465edb00bd6c0e1bcaa67dd69fe9
> Author: Robert Haas <rh...@postgresql.org>
> Date:   Fri May 29 14:35:53 2015 -0400
> 
>     foo

Hehe!

> diff --git a/src/backend/access/transam/multixact.c 
> b/src/backend/access/transam/multixact.c
> index 9568ff1..aca829d 100644
> --- a/src/backend/access/transam/multixact.c
> +++ b/src/backend/access/transam/multixact.c
> @@ -199,8 +199,9 @@ typedef struct MultiXactStateData
>       MultiXactOffset nextOffset;
>  
>       /*
> -      * Oldest multixact that is still on disk.  Anything older than this
> -      * should not be consulted.  These values are updated by vacuum.
> +      * Oldest multixact that may still be referenced from a relation.
> +      * Anything older than this should not be consulted.  These values are
> +      * updated by vacuum.
>        */
>       MultiXactId oldestMultiXactId;
>       Oid                     oldestMultiXactDB;
> @@ -213,6 +214,18 @@ typedef struct MultiXactStateData
>        */
>       MultiXactId lastCheckpointedOldest;
>  
> +     /*
> +      * This is the oldest file that actually exist on the disk.  This value
> +      * is initialized by scanning pg_multixact/offsets, and subsequently
> +      * updated each time we complete a truncation.  We need a flag to
> +      * indicate whether this has been initialized yet.
> +      */
> +     MultiXactId oldestMultiXactOnDisk;
> +     bool            oldestMultiXactOnDiskValid;
> +
> +     /* Has TrimMultiXact been called yet? */
> +     bool            didTrimMultiXact;

I'm not really convinced tying things closer to having done trimming is
easier to understand than tying things to recovery having finished.

E.g.
        if (did_trim)
                oldestOffset = GetOldestReferencedOffset(oldest_datminmxid);
imo is harder to understand than if (!InRecovery).

Maybe we could just name it finishedStartup and rename the functions 
accordingly?

> @@ -1956,12 +1971,6 @@ StartupMultiXact(void)
>        */
>       pageno = MXOffsetToMemberPage(offset);
>       MultiXactMemberCtl->shared->latest_page_number = pageno;
> -
> -     /*
> -      * compute the oldest member we need to keep around to avoid old member
> -      * data overrun.
> -      */
> -     DetermineSafeOldestOffset(MultiXactState->oldestMultiXactId);
>  }

Maybe it's worthwhile to add a 'NB: At this stage the data directory is
not yet necessarily consistent' StartupMultiXact's comments, to avoid
reintroducing problems like this?

>       /*
> +      * We can read this without a lock, because it only changes when nothing
> +      * else is running.
> +      */
> +     did_trim = MultiXactState->didTrimMultiXact;

Err, Hot Standby? It might be ok to not lock, but the comment is
definitely wrong. I'm inclined to simply use locking, this doesn't look
sufficiently critical performancewise.

> +static MultiXactOffset
> +GetOldestReferencedOffset(MultiXactId oldestMXact)
> +{
> +     MultiXactId             earliest;
> +     MultiXactOffset oldestOffset;
> +
> +     /*
> +      * Because of bugs in early 9.3.X and 9.4.X releases (see comments in
> +      * TrimMultiXact for details), oldest_datminmxid might point to a
> +      * nonexistent multixact.  If so, use the oldest one that actually 
> +      * exists.  Anything before this can't be successfully used anyway.
> +      */
> +     earliest = GetOldestMultiXactOnDisk();
> +     if (MultiXactIdPrecedes(oldestMXact, earliest))
> +             oldestMXact = earliest;

Hm. If GetOldestMultiXactOnDisk() gets the starting point by scanning
the disk it'll always get one at a segment boundary, right? I'm not sure
that's actually ok; because the value at the beginning of the segment
can very well end up being a 0, as MaybeExtendOffsetSlru() will have
filled the page with zeros.

I think that should be harmless, the worst that can happen is that
oldestOffset errorneously is 0, which should be correct, even though
possibly overly conservative, in these cases.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to