On Fri, 29 Mar 2019 at 15:29, Andres Freund <and...@anarazel.de> wrote:
> On 2019-03-29 09:37:11 +0000, Simon Riggs wrote:
> > While trying to understand this, I see there is an even better way to
> > optimize this. Since we are removing dead index tuples, we could alter the
> > killed index tuple interface so that it returns the xmax of the tuple
> > being marked as killed, rather than just a boolean to say it is dead.
>
> Wouldn't that quite possibly result in additional and unnecessary
> conflicts? Right now the page level horizon is computed whenever the
> page is actually reused, rather than when an item is marked as
> deleted. As it stands right now, the computed horizons are commonly very
> "old", because of that delay, leading to lower rates of conflicts.

I wasn't suggesting we change when the horizon is calculated, so no change
there. The idea was to cache the data for later use, replacing the hint bit
with a hint xid. That won't change the rate of conflicts, up or down - but
it does avoid I/O.

> > Indexes can then mark the killed tuples with the xmax that killed them
> > rather than just a hint bit. This is possible since the index tuples
> > are dead and cannot be used to follow the htid to the heap, so the
> > htid is redundant and so the block number of the tid could be
> > overwritten with the xmax, zeroing the itemid. Each killed item we
> > mark with its xmax means one less heap fetch we need to perform when
> > we delete the page - it's possible we optimize that away completely by
> > doing this.
>
> That's far from a trivial feature imo. It seems quite possible that we'd
> end up with increased overhead, because the current logic can get away
> with only doing hint bit style writes - but would that be true if we
> started actually replacing the item pointers? Because I don't see any
> guarantee they couldn't cross a page boundary etc? So I think we'd need
> to do WAL logging during index searches, which seems prohibitively
> expensive.

I don't see that.
I was talking about reusing the first 4 bytes of an index tuple's
ItemPointerData, which is the first field of an index tuple. Index tuples
are MAXALIGNed, so I can't see how that would ever cross a page boundary.

> And I'm also doubtful it's worth it because:
>
> > Since this point of the code is clearly going to be a performance issue
> > it seems like something we should do now.
>
> I've tried quite a bit to find a workload where this matters, but after
> avoiding redundant buffer accesses by sorting, and prefetching I was
> unable to do so. What workload do you see where this would be really be
> bad? Without the performance optimization I'd found a very minor
> regression by trying to force the heap visits to happen in a pretty
> random order, but after sorting that went away. I'm sure it's possible
> to find a case on overloaded rotational disks where you'd find a small
> regression, but I don't think it'd be particularly bad.

The code can do literally hundreds of random I/Os per 8192-byte block.
What happens with 16kB or 32kB blocks? "Small regression"?

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services