Re: [HACKERS] XLogInsert scaling, revisited

Heikki Linnakangas Mon, 08 Jul 2013 02:18:12 -0700

Ok, I've committed this patch now. Finally, phew!

I think I've addressed all your comments about the comments. I moved some of the comments around: I split up the large one near the top of the file, moving its paragraphs closer to the code where they apply.

Regarding your performance-related worries: you have good thoughts on how to improve things, but let's wait until we see some evidence that there is a problem, before any further optimizations.

I fixed one bug related to aligning the WAL buffers. The patch assumes WAL buffers to be aligned at a full XLOG_BLCKSZ boundary, but did not enforce it. That was already happening on platforms with O_DIRECT, which is why I didn't notice that in testing, but it would've failed on others.
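To illustrate the alignment requirement above, here is a minimal sketch of over-allocating a buffer area and rounding its start up to a block boundary. It is not the committed code: SKETCH_XLOG_BLCKSZ, sketch_align_up() and sketch_alloc_wal_buffers() are hypothetical names for this example, the block size is assumed to be 8192 (the default XLOG_BLCKSZ), and plain malloc() stands in for whatever allocator the server actually uses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed block size for this sketch; 8192 is the default XLOG_BLCKSZ. */
#define SKETCH_XLOG_BLCKSZ 8192

/* Round ptr up to the next multiple of align (align must be a power of two). */
static char *
sketch_align_up(char *ptr, uintptr_t align)
{
    uintptr_t p = (uintptr_t) ptr;

    return (char *) ((p + align - 1) & ~(align - 1));
}

/*
 * Allocate room for nbuffers blocks plus one spare block, so that the
 * returned start address can be moved forward to a block boundary.
 * *raw_out receives the unaligned pointer that must later be passed to free().
 * Returns NULL if the allocation fails.
 */
static char *
sketch_alloc_wal_buffers(size_t nbuffers, char **raw_out)
{
    char *raw = malloc((nbuffers + 1) * SKETCH_XLOG_BLCKSZ);

    if (raw == NULL)
        return NULL;
    *raw_out = raw;
    return sketch_align_up(raw, SKETCH_XLOG_BLCKSZ);
}
```

The spare block is what makes the rounding safe: moving the start forward by at most one block minus one byte still leaves nbuffers whole blocks inside the allocation.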

I just remembered one detail that I'm not sure has been mentioned on the mailing list yet. Per the commit message:

This has one user-visible change: switching to a new WAL segment with
pg_switch_xlog() now fills the remaining unused portion of the
segment with zeros. This potentially adds some overhead, but it has
been a very common practice by DBA's to clear the "tail" of the
segment with an external pg_clearxlogtail utility anyway, to make the
WAL files compress better. With this patch, it's no longer necessary
to do that.
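As a rough illustration of what "filling the remaining unused portion with zeros" means, the following sketch zeroes the tail of an in-memory segment image. It is only a sketch under assumptions: sketch_zero_segment_tail() and its parameters are hypothetical names, and the committed code operates on WAL buffers rather than on a single flat array like this.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Zero everything in seg[] from offset 'used' up to seg_size.
 * 'seg_size' and 'used' are hypothetical parameters for this sketch.
 * Returns the number of bytes that were zeroed.
 */
static size_t
sketch_zero_segment_tail(char *seg, size_t seg_size, size_t used)
{
    if (used >= seg_size)
        return 0;               /* segment is already full, nothing to clear */

    memset(seg + used, 0, seg_size - used);
    return seg_size - used;
}
```

A long run of zero bytes is close to the best case for a general-purpose compressor, which is why a cleared tail makes archived WAL files compress better than a tail holding leftover data.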

I simplified the handling of xlogInsertingAt per discussion, and added the memory barrier to GetXLogBuffer(). I ran again the pgbench tests I did earlier with the now-committed version of the patch (except for some comment changes). The results are here:


http://hlinnaka.iki.fi/xloginsert-scaling/xloginsert-scale-26/

I tested three different workloads, with different numbers of "slots", ranging from 1 to 1000. The tests were run on a 32-core machine, in a VM. As the baseline, I used a fresh checkout from master branch, with this one-line patch: http://www.postgresql.org/message-id/519a938a.1070...@vmware.com. That patch adjusts the spinlock delay loop, which happens to make a big difference on this box. We'll have to revisit and apply that patch separately, but I think that's the correct baseline to test this xloginsert scaling patch against.


nobranch
--------

This is the "pgbench -N" workload. Throughput is mainly limited by flushing the WAL at commits. The patch makes no big difference here, which is good. The point of the test is to demonstrate that the patch doesn't make WAL flushing noticeably more expensive.


nobranch-sync-commit-off
------------------------

Same as above, but with synchronous_commit=off. Here the patch helps somewhat. WAL insertion doesn't seem to be the biggest limiting factor in this test, but it's nice to see some benefit.


xlogtest
--------

The workload in this test is a single INSERT statement that inserts a lot of rows: "INSERT INTO foo:client_id SELECT 1 FROM generate_series(1,100) a, generate_series(1,100) b". Each client inserts to a separate table, to eliminate as much lock contention as possible, making the WAL insertion bottleneck as serious as possible (although I'm not sure how much difference that makes). This is pretty much a best-case scenario for this patch.

This test shows a big gain from the patch, as it should. The peak performance goes from about 35 TPS to 100 TPS. With the patch, I suspect the test saturates the I/O subsystem at that point. I think it could go higher with better I/O hardware.

All in all, I'm satisfied enough with this to commit. The default number of insertion slots, 8, seems to work fine for all the workloads on this box. We may have to adjust that or other details later, but what it needs now is more testing by people with different hardware.

Thanks to everyone involved for the review and testing! And if you can, please review the patch as committed once more.


- Heikki


