Re: [PERFORM] strange pgbench results (as if blocked at the end)

2011-08-15 Thread Greg Smith

On 08/14/2011 01:33 PM, Tom Lane wrote:

(The changes in question were purely for performance, and involved a
conversion from write barriers in the block layer to flush+fua, whatever
that is.)  Furthermore, this affected every filesystem not only ext4,
so it really entirely fails to support Greg's argument.
   


FUA is Force Unit Access, shorthand for forcing write-through.  If 
you're staring at this from the perspective I am, where I assume that 
every line of new filesystem code always takes a while from release 
until it's stable, it still concerns me.  I suspect that both of these 
code paths--the write barrier one and the flush+FUA one--are fine on 
XFS.  One of the reasons XFS was so terrible in its early years was 
because it took a long time to get those parts in particular correct, 
which was complicated by how badly drives lied about everything back then.


If these changes shift around which section of the ext4 write path code 
are exercised, it could easily be the case that it moves fsync related 
work (which has been one of the buggiest parts of that filesystem) from 
code that has been tested heavily to parts that haven't been hit as hard 
yet by users.  My paranoid raving about this particular bug isn't 
strongly supported; I'll concede that with the additional data you've 
provided.  But I don't think it's completely unfounded either.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] strange pgbench results (as if blocked at the end)

2011-08-14 Thread Scott Marlowe
On Sun, Aug 14, 2011 at 6:51 AM,  t...@fuzzy.cz wrote:

 I've increased the test duration to 10 minutes, decreased the
 checkpoint timeout to 4 minutes and a checkpoint is issued just before
 the pgbench. That way the starting position should be more or less the
 same for all runs.

Also look at increasing checkpoint completion target to something
closer to 1. 0.8 is a nice starting place.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] strange pgbench results (as if blocked at the end)

2011-08-14 Thread tv
On Sun, 14 Aug 2011 07:15:00 -0600, Scott Marlowe
scott.marl...@gmail.com wrote:
 On Sun, Aug 14, 2011 at 6:51 AM,  t...@fuzzy.cz wrote:

 I've increased the test duration to 10 minutes, decreased the
 checkpoint timeout to 4 minutes and a checkpoint is issued just before
 the pgbench. That way the starting position should be more or less the
 same for all runs.
 
 Also look at increasing checkpoint completion target to something
 closer to 1. 0.8 is a nice starting place.

Yes, I've increased that already:

checkpoints_segments=64 
checkpoints_completion_target=0.9

Tomas

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] strange pgbench results (as if blocked at the end)

2011-08-14 Thread Tom Lane
t...@fuzzy.cz writes:
 On 13 Srpen 2011, 5:09, Greg Smith wrote:
 And I keep seeing too many data corruption issues on ext4 to recommend
 anyone use it yet for PostgreSQL, that's why I focused on XFS.  ext4
 still needs at least a few more months before all the bug fixes it's
 gotten in later kernels are backported to the 2.6.32 versions deployed
 in RHEL6 and Debian Squeeze, the newest Linux distributions my customers
 care about right now.  On RHEL6 for example, go read
 http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/6.1_Technical_Notes/kernel.html
 , specifically BZ#635199, and you tell me if that sounds like it's
 considered stable code yet or not.  The block layer will be updated in
 future kernels to provide this more efficient mechanism of ensuring
 ordering...these future block layer improvements will change some kernel
 interfaces...  Yikes, that does not inspire confidence to me.

 XFS is naturally much more mature / stable than EXT4, but I'm not quite
 sure I want to judge the stability of code based on a comment in release
 notes. As I understand it, the comment says something like things are
 not working as efficiently as it should, we'll improve that in the
 future and it relates to the block layer as a whole, not just specific
 file systems. But I don't have access to the bug #635199, so maybe I
 missed something.

I do ;-).  The reason for the tech note was to point out that RHEL6.1
would incorporate backports of upstream kernel changes that broke the
ABI for loadable kernel modules, compared to what it had been in
RHEL6.0.  That's of great interest to third-party software developers
who add device or filesystem drivers to RHEL, but I don't think it
speaks at all to whether the code is unstable from a user's standpoint.
(The changes in question were purely for performance, and involved a
conversion from write barriers in the block layer to flush+fua, whatever
that is.)  Furthermore, this affected every filesystem not only ext4,
so it really entirely fails to support Greg's argument.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] strange pgbench results (as if blocked at the end)

2011-08-14 Thread Greg Smith

On 08/14/2011 08:51 AM, t...@fuzzy.cz wrote:

I've increased the test duration to 10 minutes, decreased the
checkpoint timeout to 4 minutes and a checkpoint is issued just before
the pgbench. That way the starting position should be more or less the
same for all runs.
   


That's basically what I settled on for pgbench-tools.  Force a 
checkpoint just before the test, so the beginning of each run is aligned 
more consistently, then run for long enough that you're guaranteed at 
least one checkpoint finishes[1] (and you might see more than one if you 
fill checkpoint_segments fast enough).  I never bothered trying to 
compress that test cycle down by decreasing checkpoint_timeout.  There's 
already too many things you need to do in order to get this test working 
well, and I didn't want to include a change I'd never recommend people 
make on a production server in the mix.


[1] If your checkpoint behavior goes pathological, for example the 
extended checkpoints possible when the background writer fsync queue 
fills, it's not actually guaranteed that the checkpoint will finish 
within 5 minutes after it starts.  So a 10 minute run doesn't assure 
you'll a checkpoint begin and end in all circumstances, but it is the 
expected case.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] strange pgbench results (as if blocked at the end)

2011-08-12 Thread Tomas Vondra
Hi,

I've run a lot of pgbench tests recently (trying to compare various fs,
block sizes etc.), and I've noticed several really strange results.

Eeach benchmark consists of three simple steps:

1) set-up the database
2) read-only run (10 clients, 5 minutes)
3) read-write run (10 clients, 5 minutes)

with a short read-only warm-up (1 client, 1 minute) before each run.

I've run nearly 200 of these, and in about 10 cases I got something that
looks like this:

http://www.fuzzy.cz/tmp/pgbench/tps.png
http://www.fuzzy.cz/tmp/pgbench/latency.png

i.e. it runs just fine for about 3:40 and then something goes wrong. The
bench should take 5:00 minutes, but it somehow locks, does nothing for
about 2 minutes and then  all the clients end at the same time. So instead
of 5 minutes the run actually takes about 6:40.

The question is what went wrong - AFAIK there's nothing else running on
the machine that could cause this. I'm looking for possible culprits -
I'll try to repeat this run and see if it happens again.

The pgbench log is available here (notice the 10 lines at the end, those
are the 10 blocked clients) along with the postgres.log

http://www.fuzzy.cz/tmp/pgbench/pgbench.log.gz
http://www.fuzzy.cz/tmp/pgbench/pg.log

Ignore the immediate shutdown request warning (once the benchmark is
over, I don't need it anymore. Besides that there's just a bunch of
pgstat wait timeout warnings (which makes sense, because the pgbench run
does a lot of I/O).

I'd understand a slowdown, but why does it block?

I'm using PostgreSQL 9.0.4, the machine has 2GB of RAM and 1GB of shared
buffers. I admit the machine might be configured a bit differently (e.g.
smaller shared buffers) but I've seen just about 10 such strange results
out of 200 runs, so I doubt this is the cause.

I was thinking about something like autovacuum, but I'd expect that to
happen much more frequently (same config, same workload, etc.). And it
happens with just some file systems.

For example for ext3/writeback, the STDDEV(latency) looks like this
(x-axis represents PostgreSQL block size, y-axis fs block size):

  http://www.fuzzy.cz/tmp/pgbench/ext3-writeback.png

while for ext4/journal:

  http://www.fuzzy.cz/tmp/pgbench/ext4-journal.png

thanks
Tomas


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] strange pgbench results (as if blocked at the end)

2011-08-12 Thread Craig Ringer

On 13/08/2011 7:37 AM, Tomas Vondra wrote:


I'd understand a slowdown, but why does it block?


My first guess is that you're having checkpoint issues. Try enabling 
logging of checkpoints and checkpoint timings and see if anything there 
lines up with the pause you encounter.


--
Craig Ringer


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] strange pgbench results (as if blocked at the end)

2011-08-12 Thread Greg Smith

On 08/12/2011 07:37 PM, Tomas Vondra wrote:

I've run nearly 200 of these, and in about 10 cases I got something that
looks like this:

http://www.fuzzy.cz/tmp/pgbench/tps.png
http://www.fuzzy.cz/tmp/pgbench/latency.png

i.e. it runs just fine for about 3:40 and then something goes wrong. The
bench should take 5:00 minutes, but it somehow locks, does nothing for
about 2 minutes and then  all the clients end at the same time. So instead
of 5 minutes the run actually takes about 6:40.
   


You need to run tests like these for 10 minutes to see the full cycle of 
things; then you'll likely see them on most runs, instead of only 5%.  
It's probably the case that some of your tests are finishing before the 
first checkpoint does, which is why you don't see the bad stuff every time.


The long pauses are most likely every client blocking once the 
checkpoint sync runs.  When those fsync calls go out, Linux will freeze 
for quite a while there on ext3.  In this example, the drop in TPS/rise 
in latency at around 50:30 is either the beginning of a checkpoint or 
the dirty_background_ratio threshold in Linux being exceeded; they tend 
to happen around the same time.  It executes the write phase for a bit, 
then gets into the sync phase around 51:40.  You can find a couple of 
examples just like this on my giant test set around what was committed 
as the fsync compaction feature in 9.1, all at 
http://www.2ndquadrant.us/pgbench-results/index.htm


The one most similar to your case is 
http://www.2ndquadrant.us/pgbench-results/481/index.html  Had that test 
only run for 5 minutes, it would have looked just like yours, ending 
after the long pause that's in the middle on my run.  The freeze was 
over 3 minutes long in that example.  (My server has a fairly fast disk 
subsystem, probably faster than what you're testing, but it also has 8GB 
of RAM that it can dirty to more than make up for it).


In my tests, I switched from ext3 to XFS to get better behavior.  You 
got the same sort of benefit from ext4.  ext3 just doesn't handle its 
write cache filling and then having fsync calls execute very well.  I've 
given up on that as an unsolvable problem; improving behavior on XFS and 
ext4 are the only problems worth worrying about now to me.


And I keep seeing too many data corruption issues on ext4 to recommend 
anyone use it yet for PostgreSQL, that's why I focused on XFS.  ext4 
still needs at least a few more months before all the bug fixes it's 
gotten in later kernels are backported to the 2.6.32 versions deployed 
in RHEL6 and Debian Squeeze, the newest Linux distributions my customers 
care about right now.  On RHEL6 for example, go read 
http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/6.1_Technical_Notes/kernel.html 
, specifically BZ#635199, and you tell me if that sounds like it's 
considered stable code yet or not.  The block layer will be updated in 
future kernels to provide this more efficient mechanism of ensuring 
ordering...these future block layer improvements will change some kernel 
interfaces...  Yikes, that does not inspire confidence to me.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us