Re: [HACKERS] Replication identifiers, take 4

Heikki Linnakangas Fri, 17 Apr 2015 10:13:28 -0700

On 04/17/2015 12:04 PM, Simon Riggs wrote:
> On 17 April 2015 at 09:54, Andres Freund <[email protected]> wrote:
>
>> Hrmpf. Says the person that used a lot of padding, without much
>> discussion, for the WAL level infrastructure making pg_rewind more
>> maintainable.
>
> Sounds bad. What padding are we talking about?

In the new WAL format, the data chunks are stored unaligned, without padding, to save space. The new format is quite different to the old one, so it's not straightforward to compare how much that saved. The fixed-size XLogRecord header is 8 bytes shorter in the new format, because it doesn't have the xl_len field anymore. But the same information is stored elsewhere in the record, where it takes 2 or 5 bytes (XLogRecordDataHeaderShort/Long).

But it's a fair point that we could've just made small adjustments to the old format, without revamping every record type and the way the block information is stored, and that the space saving of the new format should be compared with that instead, for a fair comparison.

As an example, one simple thing we could've done with the old format: remove xl_len, and store the length in place of the two unused padding bytes instead, as long as it fits in 16 bits. For longer records, set a flag and store it right after XLogRecord header. For practically all WAL records, that would've shrunk XLogRecord from 32 to 24 bytes, and made each record 8 bytes shorter.
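
To make the idea concrete, here is a rough sketch of that short/long length encoding. It is only an illustration under stated assumptions: the flag name LEN_IS_LONG, the one-byte flags field, and the 3-byte/7-byte layout are all made up for this example and are not the XLogRecord layout of 9.4 or master.

```python
import struct

# Hypothetical flag bit, not a real PostgreSQL constant.
LEN_IS_LONG = 0x01


def encode_length(rec_len: int) -> bytes:
    """Encode a record length as: flags byte, 16-bit slot, optional 32-bit length.

    The 16-bit slot stands in for the two formerly unused padding bytes.
    """
    if rec_len < 0 or rec_len > 0xFFFFFFFF:
        raise ValueError("length out of range for this sketch")
    if rec_len <= 0xFFFF:
        # Common case: the length fits in the 16-bit slot, no extra bytes.
        return struct.pack("<BH", 0, rec_len)
    # Rare case: set the flag and store the full length after the header part.
    return struct.pack("<BHI", LEN_IS_LONG, 0, rec_len)


def decode_length(buf: bytes) -> int:
    """Inverse of encode_length()."""
    flags, short_len = struct.unpack_from("<BH", buf, 0)
    if flags & LEN_IS_LONG:
        (long_len,) = struct.unpack_from("<I", buf, 3)
        return long_len
    return short_len


if __name__ == "__main__":
    for n in (100, 0xFFFF, 70000):
        enc = encode_length(n)
        print(n, len(enc), decode_length(enc))
```

In this sketch only records longer than 65535 bytes pay for the extra four bytes, which matches the point that practically all WAL records would take the short path.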

I ran the same pgbench test Andres used, with scale 10, and 50000 transactions, and compared the WAL size between master and 9.4:


master: 20738352
9.4: 23915800

According to pg_xlogdump, there were 301153 WAL records. If you take the 9.4 figure, and imagine that we had saved those 8 bytes on each WAL record, 9.4 would've been 21506576 bytes instead. So yeah, we could've achieved much of the WAL savings with that much smaller change. That's a useful thing to compare with.
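
The arithmetic behind that estimate can be reproduced with a few lines. The numbers are the ones quoted in this mail; the variable names are just for illustration, and the "share" figure is a back-of-the-envelope ratio, not a measurement.

```python
# Figures quoted above (wal_level=minimal, pgbench scale 10, 50000 transactions).
WAL_BYTES_94 = 23915800
WAL_BYTES_MASTER = 20738352
WAL_RECORDS = 301153        # record count reported by pg_xlogdump
SAVED_PER_RECORD = 8        # bytes saved per record by the hypothetical tweak

# 9.4 WAL size if every record had been 8 bytes shorter.
hypothetical_94 = WAL_BYTES_94 - WAL_RECORDS * SAVED_PER_RECORD

# How much of the 9.4 -> master reduction the small change would have covered.
actual_saving = WAL_BYTES_94 - WAL_BYTES_MASTER
hypothetical_saving = WAL_BYTES_94 - hypothetical_94
share = hypothetical_saving / actual_saving

print("hypothetical 9.4 WAL bytes:", hypothetical_94)
print("share of actual saving: {:.1%}".format(share))
```

By this rough measure the small header change alone would account for about three quarters of the difference between 9.4 and master in this test.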

BTW, those numbers are with wal_level=minimal. With wal_level=logical, the WAL size from the same test on master was 26503520 bytes. That's quite a bump. Looking at pg_xlogdump output, it seems that it's all because the commit records are wider.
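
To put a number on the bump, here is the same kind of quick calculation. The per-transaction figure assumes, purely for illustration, that the extra bytes are spread evenly over the 50000 pgbench transactions with one commit record each; that split was not measured.

```python
# Master WAL sizes from the same pgbench run, as quoted above.
MASTER_MINIMAL = 20738352   # wal_level=minimal
MASTER_LOGICAL = 26503520   # wal_level=logical
TRANSACTIONS = 50000

extra_bytes = MASTER_LOGICAL - MASTER_MINIMAL
relative_increase = extra_bytes / MASTER_MINIMAL

# Assumption: one commit record per transaction carries all the extra bytes.
extra_per_transaction = extra_bytes / TRANSACTIONS

print("extra WAL bytes:", extra_bytes)
print("relative increase: {:.1%}".format(relative_increase))
print("extra bytes per transaction: {:.0f}".format(extra_per_transaction))
```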


- Heikki



--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
