On 23/04/15 17:24, Heikki Linnakangas wrote:
On 04/23/2015 05:52 PM, Jim Nasby wrote:
On 4/23/15 2:42 AM, Heikki Linnakangas wrote:
On 04/22/2015 09:24 PM, Robert Haas wrote:
Yeah.  We have a serious need to reduce the size of our on-disk
format.  On a TPC-C-like workload Jan Wieck recently tested, our data
set was 34% larger than another database at the beginning of the test,
and 80% larger by the end of the test.  And we did twice the disk
writes.  See "The Elephants in the Room.pdf" at
https://sites.google.com/site/robertmhaas/presentations

Meh. Adding an 8-byte header to every 8k block would add 0.1% to the
disk size. No doubt it would be nice to reduce our disk footprint, but
the page header is not the elephant in the room.
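
As a quick check of the 0.1% figure, here is a minimal sketch of the arithmetic. It assumes the default 8192-byte block size; the names BLOCK_SIZE, EXTRA_HEADER and header_overhead_percent are made up for illustration.

```python
BLOCK_SIZE = 8192    # assumed default block size, in bytes
EXTRA_HEADER = 8     # the hypothetical extra per-page header, in bytes


def header_overhead_percent(extra_bytes: int, block_size: int) -> float:
    """Return the extra header as a percentage of the block size."""
    return 100.0 * extra_bytes / block_size


overhead = header_overhead_percent(EXTRA_HEADER, BLOCK_SIZE)
print(f"{overhead:.3f}%")  # prints 0.098%
```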

I've often wondered if there was some way we could consolidate XMIN/XMAX
from multiple tuples at the page level; that could be a big win for OLAP
environments where most of your tuples belong to a pretty small range of
XIDs. In many workloads you could have 80%+ of the tuples in a table
having a single inserting XID.
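
To make the idea concrete, here is a rough sketch of what page-level xmin consolidation could look like: the page keeps one table of distinct inserting XIDs, and each tuple stores a small slot index instead of a full XID. Everything here is an assumption for illustration only (the PageXminTable name, the 4-byte XID, the 1-byte slot index, which in turn assumes at most 256 distinct XIDs per page); it is not an existing or proposed PostgreSQL layout.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PageXminTable:
    """Hypothetical page-level table of distinct inserting XIDs."""
    xids: List[int] = field(default_factory=list)

    def slot_for(self, xmin: int) -> int:
        """Return the slot index for xmin, adding it if not yet present."""
        try:
            return self.xids.index(xmin)
        except ValueError:
            self.xids.append(xmin)
            return len(self.xids) - 1


def inline_bytes(tuple_xmins: List[int], xid_size: int = 4) -> int:
    """Space used when every tuple stores its own xmin."""
    return len(tuple_xmins) * xid_size


def consolidated_bytes(tuple_xmins: List[int], xid_size: int = 4,
                       slot_size: int = 1) -> int:
    """Space used by one page-level XID table plus a slot index per tuple."""
    table = PageXminTable()
    for xmin in tuple_xmins:
        table.slot_for(xmin)
    return len(table.xids) * xid_size + len(tuple_xmins) * slot_size


# 100 tuples on a page: 80 share one inserting XID, 20 have distinct XIDs.
xmins = [1000] * 80 + list(range(1001, 1021))
print(inline_bytes(xmins))        # prints 400
print(consolidated_bytes(xmins))  # prints 184
```

The saving depends entirely on how skewed the XID distribution is; with every tuple inserted by a different transaction, the table plus slot indexes would be larger than storing xmin inline.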

It would be doable for xmin - IIRC someone even posted a patch for that
years ago - but xmax (and ctid) is difficult. When a tuple is inserted,
xmax is basically just a reservation for the value that will be put
there later. You have no idea what that value is, and you can't
influence it, and when it's time to delete/update the row, you *must*
have the space for that xmax. So we can't opportunistically use the
space for anything else, or compress them or anything like that.


That depends. If we are going to change the page format, we could move
xmax into the page header as a map of ctid -> xmax (with no entries for
tuples that have no xmax), or keep a bitmap there of the tuples that
have an xmax, etc. Basically, instead of storing xmax (and potentially
other info) inline in each tuple, we would keep it in the header only
for the tuples that need it. That might have bad performance side
effects, of course, but there are definitely ways of doing things
differently that we could explore.
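
A minimal sketch of the two variants combined, a per-page bitmap of "has xmax" bits plus a sparse map holding entries only for tuples that actually have one. The PageXmaxMap class, its methods and the 0-based offsets are hypothetical, and the sketch deliberately ignores the hard part raised above: guaranteeing that the page has room for a new entry when a row is deleted or updated.

```python
from typing import Dict, Optional


class PageXmaxMap:
    """Hypothetical per-page xmax storage: a bitmap plus a sparse map."""

    def __init__(self, max_tuples: int) -> None:
        # One bit per tuple slot on the page; a set bit means "has an xmax".
        self.has_xmax = bytearray((max_tuples + 7) // 8)
        # Entries exist only for tuples that were deleted or updated.
        self.xmax_by_offset: Dict[int, int] = {}

    def set_xmax(self, offset: int, xid: int) -> None:
        """Record that the tuple at this (0-based) offset has xmax = xid."""
        self.has_xmax[offset // 8] |= 1 << (offset % 8)
        self.xmax_by_offset[offset] = xid

    def get_xmax(self, offset: int) -> Optional[int]:
        """Return the tuple's xmax, or None if it was never set."""
        if self.has_xmax[offset // 8] & (1 << (offset % 8)):
            return self.xmax_by_offset[offset]
        return None


page = PageXmaxMap(max_tuples=16)
page.set_xmax(3, 1234)
print(page.get_xmax(3))  # prints 1234
print(page.get_xmax(4))  # prints None
```

The bitmap lets a reader rule out an xmax with a single bit test before touching the map, which is one possible way to limit the lookup cost for live tuples.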

--
 Petr Jelinek                  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

