On 04/23/2015 05:52 PM, Jim Nasby wrote:
On 4/23/15 2:42 AM, Heikki Linnakangas wrote:
On 04/22/2015 09:24 PM, Robert Haas wrote:
Yeah.  We have a serious need to reduce the size of our on-disk
format.  On a TPC-C-like workload Jan Wieck recently tested, our data
set was 34% larger than another database at the beginning of the test,
and 80% larger by the end of the test.  And we did twice the disk
writes.  See "The Elephants in the Room.pdf" at
https://sites.google.com/site/robertmhaas/presentations

Meh. Adding an 8-byte header to every 8k block would add 0.1% to the
disk size. No doubt it would be nice to reduce our disk footprint, but
the page header is not the elephant in the room.
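
As a quick check of that 0.1% figure, here is a minimal sketch of the arithmetic. It assumes the default 8192-byte block size; the constant names are illustrative only and are not PostgreSQL identifiers.

```python
# Assumption: default PostgreSQL block size of 8 kB (8192 bytes).
BLOCK_SIZE = 8192
# The hypothetical extra per-page header discussed above.
EXTRA_HEADER = 8

# Fraction of each block consumed by the additional header bytes.
overhead = EXTRA_HEADER / BLOCK_SIZE
print(f"{overhead:.4%}")  # prints 0.0977%
```

So the extra header costs just under one tenth of a percent of each block, which is small next to the 34% to 80% size difference cited earlier in the thread.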

I've often wondered if there was some way we could consolidate XMIN/XMAX
from multiple tuples at the page level; that could be a big win for OLAP
environments where most of your tuples belong to a pretty small range of
XIDs. In many workloads you could have 80%+ of the tuples in a table
having a single inserting XID.

It would be doable for xmin - IIRC someone even posted a patch for that years ago - but xmax (and ctid) are difficult. When a tuple is inserted, xmax is basically just a reservation for a value that will be filled in later. You have no idea what that value will be, and you can't influence it, and when it's time to delete or update the row, you *must* have space for that xmax. So we can't opportunistically use that space for anything else, or compress it.
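
To get a rough sense of what sharing xmin at the page level could save, here is a back-of-the-envelope sketch. It assumes a 4-byte xmin per tuple and an 8192-byte block, ignores alignment padding and the cost of storing the shared value, and uses made-up tuples-per-page counts. The function name is hypothetical, and the result is best read as an upper bound rather than a measurement.

```python
def xmin_share_of_page(tuples_per_page, block_size=8192, xid_bytes=4):
    """Fraction of a page occupied by per-tuple xmin fields.

    Assumes one 4-byte transaction ID per tuple and ignores alignment
    padding, so the real saving from sharing xmin may be smaller.
    """
    return tuples_per_page * xid_bytes / block_size

# Illustrative tuple densities, not measured values.
for n in (50, 100, 200):
    print(n, f"{xmin_share_of_page(n):.1%}")
# prints:
# 50 2.4%
# 100 4.9%
# 200 9.8%
```

The estimate covers xmin only. As argued above, the xmax slot has to stay reserved in every tuple, so it cannot be counted as reclaimable space.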

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
