Re: [HACKERS] sorted writes for checkpoints

Greg Smith Fri, 29 Oct 2010 16:32:23 -0700

Itagaki Takahiro wrote:

When I submitted the patch, I tested it on disk-based RAID-5 machine:
http://archives.postgresql.org/pgsql-hackers/2007-06/msg00541.php
But there were no additional benchmarking reports at that time. We still
need benchmarking before we re-examine the feature. For example, SSD and
SSD-RAID was not popular at that time, but now they might be considerable.

I did multiple rounds of benchmarking that, just none of it showed any improvement so I didn't bother reporting them in detail. I have recently figured out why the performance testing I did of that earlier patch probably failed to produce useful results on my system when I was testing it back then though. It relates to trivia around how ext3 handles fsync that's well understood now (the whole cache flushes out when one comes in), but wasn't back then yet.

We have a working set of patches here that both rewrite the checkpoint logic to avoid several larger problems with how it works now, as well as adding instrumentation that makes it possible to directly measure and graph whether methods such as sorting writes provide any improvement or not to the process. My hope is to have those all ready for initial submission as part of CommitFest 2010-11, as the main feature addition from myself toward improving 9.1.

I have a bunch of background information about this I'm presenting at PGWest next week, after which I'll start populating the wiki with more details and begin packaging the code too. I had hoped to revisit the checkpoint sorting details after that. Jeff or yourself are welcome to try your own tests in that area, I could use the help. But I think my measurement patches will help you with that considerably once I release them in another couple of weeks. Seeing a graph of latency sync times for each file is very informative for figuring out whether a change did something useful, more so than just staring at total TPS results. Such latency graphs are what I've recently started to do here, with some server-side changes that then feed into gnuplot.
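
For anyone who wants to try a rough version of the per-file measurement before those patches exist, here is a minimal user-space sketch in Python. It is not the server-side instrumentation described above; the function names, file names and output layout are all made up for illustration. It only times fsync() on each file and prints rows that gnuplot could read.

```python
import os
import tempfile
import time


def timed_fsync(path):
    """Return the seconds spent inside fsync() for one file."""
    fd = os.open(path, os.O_RDWR)
    try:
        start = time.perf_counter()
        os.fsync(fd)
        return time.perf_counter() - start
    finally:
        os.close(fd)


def sync_latencies(paths):
    """Map each path to its fsync latency in seconds."""
    return {path: timed_fsync(path) for path in paths}


def make_demo_files(directory, count=3, size=4096):
    """Create small zero-filled files (hypothetical stand-ins for data files)."""
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"segment_{i}.dat")
        with open(path, "wb") as f:
            f.write(b"\0" * size)
        paths.append(path)
    return paths


def gnuplot_rows(latencies):
    """Format 'index latency_ms' rows, one per file, ordered by path."""
    ordered = sorted(latencies.items())
    return [f"{i} {secs * 1000.0:.3f}" for i, (_, secs) in enumerate(ordered)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        files = make_demo_files(tmp)
        for row in gnuplot_rows(sync_latencies(files)):
            print(row)
```

Saving the printed rows to a file gives gnuplot two columns to plot (file index against milliseconds). On a filesystem that flushes the whole cache on the first fsync, one would expect the first file to absorb most of the time, but that depends entirely on the system under test.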

The idea of making something like the sorting logic into a pluggable hook seems like a waste of time to me, particularly given that the earlier implementation really needed to be allocated a dedicated block of shared memory to work well IMHO (and I believe that's still the case). That area isn't where the real problems are at here anyway, especially on large memory systems. How the sync logic works is the increasingly troublesome part of the checkpoint code, because the problem it has to deal with grows proportionately to the size of the write cache on the system. Typical production servers I deal with have about 8X as much RAM now as they did in 2007 when I last investigated write sorting. Regular hard drives sure haven't gotten 8X faster since then, and battery-backed caches (which used to have enough memory to absorb a large portion of a checkpoint burst) have at best doubled in size.
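
To make the write-sorting idea concrete for readers new to the thread, here is a toy Python sketch. It is not the 2007 patch and not PostgreSQL code; `DirtyBuffer`, `relfile` and the helper names are hypothetical. It assumes "sorting" simply means ordering dirty buffers by file and then by block number before writing them out, and it counts how often the write stream moves from one file to another.

```python
from typing import List, NamedTuple


class DirtyBuffer(NamedTuple):
    relfile: str  # hypothetical identifier for the file a block belongs to
    block: int    # block number within that file


def sorted_write_order(buffers: List[DirtyBuffer]) -> List[DirtyBuffer]:
    """Order buffers by file, then by block, so each file is written in one run."""
    return sorted(buffers, key=lambda b: (b.relfile, b.block))


def count_file_switches(order: List[DirtyBuffer]) -> int:
    """Count how many times consecutive writes target a different file."""
    return sum(1 for prev, cur in zip(order, order[1:])
               if prev.relfile != cur.relfile)


if __name__ == "__main__":
    dirty = [
        DirtyBuffer("a", 5),
        DirtyBuffer("b", 1),
        DirtyBuffer("a", 2),
        DirtyBuffer("b", 0),
        DirtyBuffer("a", 3),
    ]
    print("unsorted switches:", count_file_switches(dirty))
    print("sorted switches:", count_file_switches(sorted_write_order(dirty)))
```

Fewer switches in this toy model says nothing about real throughput or sync latency; whether the ordering helps is exactly what has to be measured on actual hardware.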


--
Greg Smith, 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD
PostgreSQL Training, Services and Support  www.2ndQuadrant.us



