On Mon, Jan 28, 2013 at 12:16:24AM +0000, Peter Geoghegan wrote:
> On 27 January 2013 02:31, Noah Misch <n...@leadboat.com> wrote:
> > I did a few more benchmarks along the spectrum.
> 
> > So that's a nice 27-53% improvement, fairly similar to the pattern for your
> > laptop pgbench numbers.
> 
> I presume that this applies to a tpc-b benchmark (the pgbench
> default). Note that the really compelling numbers that I reported in
> that blog post (where there is an increase of over 80% in transaction
> throughput at lower client counts) occur with an insert-based
> benchmark (i.e. a maximally commit-bound workload).

Correct.  The pgbench default workload is already rather friendly toward
commit_delay, so I wanted to stay away from even-friendlier tests.

Would you commit the pgbench-tools data for the graphs in that blog post to
the same git repository?  I couldn't readily tell what was happening below 16
clients, because the graphed data points blend together.

> >> !   <para>
> >> !    Since the purpose of <varname>commit_delay</varname> is to allow
> >> !    the cost of each flush operation to be more effectively amortized
> >> !    across concurrently committing transactions (potentially at the
> >> !    expense of transaction latency), it is necessary to quantify that
> >> !    cost when altering the setting.  The higher that cost is, the more
> >> !    effective <varname>commit_delay</varname> is expected to be in
> >> !    increasing transaction throughput.  The
> >
> > That's true for spinning disks, but I suspect it does not hold for storage
> > with internal parallelism, notably virtualized storage.  Consider an iSCSI
> > configuration with high bandwidth and high latency.  When network latency is
> > the limiting factor, will sending larger requests less often still help?
> 
> Well, I don't like to speculate about things like that, because it's
> just too easy to be wrong. That said, it doesn't immediately occur to
> me why the statement that you've highlighted wouldn't be true of
> virtualised storage that has the characteristics you describe. Any
> kind of latency at flush time means that clients idle, which means
> that the CPU is potentially not kept fully busy for a greater amount
> of wall time, where it might otherwise be kept more busy.

On further reflection, I retract the comment.  Whatever internal parallelism
the storage may have, PostgreSQL issues WAL fsyncs serially.
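
To make the amortization trade-off concrete, here is a toy model I put
together (my own sketch; the 8ms flush cost and 4ms delay are assumed round
numbers, not measurements, and it ignores the piggybacking on other backends'
flushes that happens even without the delay).  Each backend either pays the
full flush cost itself or sleeps for commit_delay and then shares a single
serial flush with its siblings:

    /* Toy commit_delay amortization model; assumed costs, not measured. */
    #include <stdio.h>

    int main(void)
    {
        const double flush_us = 8000.0; /* assumed fsync cost, 7200rpm disk */
        const double delay_us = 4000.0; /* commit_delay value under discussion */
        int          n;

        for (n = 1; n <= 32; n *= 2)
            printf("%2d committers: %5.0f us/commit alone, %5.0f us/commit grouped\n",
                   n, flush_us, delay_us + flush_us / n);
        return 0;
    }

With one committer the delay is pure overhead; by four committers the grouped
cost already beats the solo cost.  That is the latency-for-throughput trade
the wal.sgml text describes.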

> > One would be foolish to run a performance-sensitive workload like those in
> > question, including the choice to have synchronous_commit=on, on spinning
> > disks with no battery-backed write cache.  A cloud environment is more
> > credible, but my benchmark showed no gain there.
> 
> In an everyday sense you are correct. It would typically be fairly
> senseless to run an application that was severely limited by
> transaction throughput like this, when a battery-backed cache could be
> used at the cost of a couple of hundred dollars. However, it's quite
> possible to imagine a scenario in which the economics favoured using
> commit_delay instead. For example, I am aware that at Facebook, a
> similar Facebook-flavoured-MySQL setting (sync_binlog_timeout_usecs)
> is used. Furthermore, it might not be obvious that fsync speed is an
> issue in practice. Setting commit_delay to 4,000 has seemingly no
> downside on my laptop - it *positively* affects both average and
> worst-case transaction latency - so with spinning disks, it probably
> would actually be sensible to set it and forget it, regardless of
> workload.

I agree that commit_delay is looking like a safe bet for spinning disks.

> I attach a revision that I think addresses your concerns. I've
> polished it a bit further too - in particular, my elaborations about
> commit_delay have been concentrated at the end of wal.sgml, where they
> belong. I've also removed the reference to XLogInsert, because, since
> all XLogFlush call sites are now covered by commit_delay, XLogInsert
> isn't particularly relevant.

I'm happy with this formulation.

> I have also increased the default time that pg_test_fsync runs - I
> think that the kind of variability commonly seen in its output, that
> you yourself have reported, justifies doing so in passing.

On the EBS configuration with volatile fsync timings, the variability didn't
go away with 15s runs.  On systems with stable fsync times, 15s was no better
than 2s.  Absent some particular reason to believe 5s is better than 2s, I
would leave it alone.
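
Incidentally, the volatility is easy to observe outside pg_test_fsync with a
bare write-and-fsync loop like the one below (my own throwaway program, not
pg_test_fsync's code; the file name and iteration count are arbitrary).  On
storage like that EBS configuration, the min/max spread it reports should
stay wide however long it runs:

    /* Bare fsync-latency probe, in the spirit of pg_test_fsync. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/time.h>
    #include <unistd.h>

    int main(void)
    {
        char      buf[8192];
        int       fd = open("fsync_probe.out", O_WRONLY | O_CREAT, 0600);
        double    min_us = 1e12, max_us = 0, sum_us = 0;
        const int iters = 1000;
        int       i;

        if (fd < 0)
        {
            perror("open");
            return 1;
        }
        memset(buf, 'x', sizeof(buf));

        for (i = 0; i < iters; i++)
        {
            struct timeval t0, t1;
            double         us;

            gettimeofday(&t0, NULL);
            if (pwrite(fd, buf, sizeof(buf), 0) != (ssize_t) sizeof(buf) ||
                fsync(fd) != 0)
            {
                perror("pwrite/fsync");
                return 1;
            }
            gettimeofday(&t1, NULL);

            us = (t1.tv_sec - t0.tv_sec) * 1e6 +
                 (t1.tv_usec - t0.tv_usec);
            if (us < min_us)
                min_us = us;
            if (us > max_us)
                max_us = us;
            sum_us += us;
        }
        printf("fsync: min %.0f us  avg %.0f us  max %.0f us  (%d calls)\n",
               min_us, sum_us / iters, max_us, iters);
        close(fd);
        unlink("fsync_probe.out");
        return 0;
    }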

I'm marking this patch Ready for Committer, qualified with a recommendation to
adopt only the wal.sgml changes.

Thanks,
nm

