On 4/22/15 1:51 PM, Kevin Grittner wrote:
(1) WAL log the insert.
(2) Write the tuple.
(3) Hint and rewrite the tuple.
(4) WAL log the freeze of the tuple.
(5) Rewrite the frozen tuple.
(6) WAL-log the delete.
(7) Rewrite the deleted tuple.
(8) Prune and rewrite the page.
(9) Free line pointers and rewrite the page.

If I was lucky some of the writes could be combined in cache
because they happened close enough together. Also, one could hope
that not too much of the WAL-logging involved full page writes to
the WAL -- again, keeping steps close together in time helps with
that.

This is why I like the idea of methods that tell us where we need to do cleanup... they provide us with a rough ability to track what tuples are in what part of their lifecycle. The VM helps with this a small amount, but really it only applies after 1 and 6; it doesn't help us with any other portions.

Having a way to track recently created tuples would allow us to be much more efficient with 1-3, and with aggressive freezing, 1-5. A way to track recently deleted tuples would help with 6-7, possibly 6-9 if no indexes.

If we doubled the size of the VM, that would let us track 4 states for each page:

- Page has newly inserted tuples
- Page has newly deleted tuples
- Page is all visible
- Page is frozen

though as discussed elsewhere, we could probably combine all visible and frozen.

The win from doing this would be easily knowing what pages need hinting (newly inserted) and pruning (newly deleted). Unfortunately we still wouldn't know whether we could do real work without visiting the page itself, but I suspect that for many workloads just having newly inserted/deleted would be a serious win.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to