On 2015-04-15 08:42:33 -0400, Simon Riggs wrote:
> > Because it makes subsequent accesses to the page cheaper.
> 
> Cheaper for whom?

Everyone, including subsequent readers. Following HOT chains in read-mostly
workloads can be really expensive. If you have a workload with a 'hot'
value range that's frequently updated, but that range moves over time, you
can easily end up with heavily chained tuples that won't be touched by a
writer again any time soon.
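
A rough sketch of that kind of workload (table and column names made up,
just to illustrate the access pattern, not a benchmark):

  -- a counter table with some free space per page so updates can stay HOT
  CREATE TABLE counters (id int PRIMARY KEY, val int) WITH (fillfactor = 70);
  INSERT INTO counters SELECT g, 0 FROM generate_series(1, 100000) g;

  -- phase 1: ids 1..100 are the hot range, each row is HOT-updated many times
  UPDATE counters SET val = val + 1 WHERE id BETWEEN 1 AND 100;
  -- ... repeated a number of times ...

  -- phase 2: the hot range moves on (say to ids 101..200); the pages from
  -- phase 1 now only see readers, which have to walk the HOT chains on
  -- every access until something finally prunes those pages
  SELECT sum(val) FROM counters WHERE id BETWEEN 1 AND 100;

  -- the chains are visible with pageinspect, e.g.
  -- SELECT lp, t_ctid FROM heap_page_items(get_raw_page('counters', 0));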

And writers will often not yet be able to prune the page, because there
are still live readers for the older versions (like other updaters).
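
A sketch of that situation, continuing the made-up table from above:

  -- session A, a long-running reader:
  BEGIN ISOLATION LEVEL REPEATABLE READ;
  SELECT count(*) FROM counters;  -- snapshot still sees the old row versions

  -- session B, a writer hitting the same page:
  UPDATE counters SET val = val + 1 WHERE id = 1;
  -- adds yet another version to the chain, but cannot prune the older ones
  -- because session A's snapshot may still need them

  -- only after session A finishes can the dead versions be pruned, by
  -- whoever touches the page next (opportunistic pruning) or by VACUUM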

> > Of
> > course, that applies in all cases, but when the page is already dirty,
> > the cost of pruning it is probably quite small - we're going to have
> > to write the page anyway, and pruning it before it gets evicted
> > (perhaps even by our scan) will be cheaper than writing it now and
> > writing it again after it's pruned.  When the page is clean, the cost
> > of pruning is significantly higher.
> 
> "We" aren't going to have to write the page, but someone will.

If it's already dirty, that doesn't change at all. *Not* pruning at that
moment will often actually *increase* the total number of writes to the
OS, because the pruning will then happen on the next write access or
vacuum - by which point the page might already have been written out and
marked clean again.
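
To make that concrete: an UPDATE dirties the page, a later scan prunes it
while it's still dirty, and the page is written out once, already pruned.
If the scan skips the prune instead, the page is written out once
unpruned, and the next writer or VACUUM has to dirty it again just to
prune it - a second write for the same amount of pruning work.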

I don't really see the downside to this suggestion.

> The actions you suggest are reasonable and should ideally be the role
> of a background process. But that doesn't mean in the absence of that
> we should pay the cost in the foreground.

I'm not sure that's true. A background process will either cause
additional read IO to find worthwhile pages, or it won't find worthwhile
pages at all because they've already been paged out.

Greetings,

Andres Freund

