(2013/06/28 0:08), Robert Haas wrote:
> On Tue, Jun 25, 2013 at 4:28 PM, Heikki Linnakangas
> <hlinnakan...@vmware.com> wrote:
> I'm pretty sure Greg Smith tried the fixed-sleep thing before and
> it didn't work that well. I have also tried it and the resulting
> behavior was unimpressive. It makes checkpoints take a long time to
> complete even when there's very little data to flush out to the OS,
> which is annoying; and when things actually do get ugly, the sleeps
> aren't long enough to matter. See the timings Kondo-san posted
> downthread: 100 ms delays aren't going to let the system recover in any
> useful way when the fsync can take 13 s for one file. On a system
> that's badly weighed down by I/O, the fsync times are often
> *extremely* long - 13 s is far from the worst you can see. You have
> to give the system a meaningful time to recover from that, allowing
> other processes to make meaningful progress before you hit it again,
> or system performance just goes down the tubes. Greg's test, IIRC,
> used 3 s sleeps rather than your proposal of 100 ms, but it still
> wasn't enough.
Yes. In the write phase, the checkpointer writes numerous 8KB dirty pages, one
per SyncOneBuffer() call, so a tiny (100 ms) sleep works well there. But in the
fsync phase, the checkpointer flushes dozens of relation files, one per fsync()
call, so a tiny sleep does not help; a longer sleep is needed for I/O
performance to recover. As a good estimate of the best sleep time, we can use
the duration of the previous fsync. And if we want to avoid individual fsyncs
taking very long, we could reduce the relation segment size, whose default
maximum is 1GB, to something smaller.
Back to the subject. Here are our patches' test results. The fsync + write
patch did not perform well in the previous results, so I re-ran the benchmark
under the same conditions. This time it seems to perform better than before.
* Performance result in DBT-2 (WH340)
               |     TPS    90%tile   Average    Maximum
---------------+------------------------------------------
original_0.7   | 3474.62  18.348328     5.739  36.977713
original_1.0   | 3469.03  18.637865     5.842  41.754421
fsync          | 3525.03  13.872711     5.382  28.062947
write          | 3465.96  19.653667     5.804  40.664066
fsync + write  | 3586.85  14.459486     4.960  27.266958
Heikki's patch | 3504.3   19.731743     5.761  38.33814
* HTML result in DBT-2
http://pgstatsinfo.projects.pgfoundry.org/RESULT/
In the attached text, I also describe the timing of each checkpoint. The fsync
patch seemed to take longer than the patches without it. However, the
checkpoints still finish on schedule, within checkpoint_timeout and the
allowable time. I think the most important thing in the fsync phase is not
that the checkpoint finishes quickly, but that pages are definitely and
reliably written out by the end of the checkpoint. So I don't think my fsync
patch is misbehaving.
My write patch still holds a lot of riddles, so I will try to investigate
objective results and a theory for its effect.
Best regards,
--
Mitsumasa KONDO
NTT Open Source Software Center
* Checkpoint duration
# original_0.7
 instid |           start            | flags | num_buffers | xlog_added | xlog_removed | xlog_recycled | write_duration | sync_duration | total_duration
--------+----------------------------+-------+-------------+------------+--------------+---------------+----------------+---------------+----------------
     14 | 2013-06-19 15:15:24.658+09 | xlog  |         281 |          0 |            0 |             0 |         29.038 |          2.69 |         31.798
     14 | 2013-06-19 15:17:13.212+09 | xlog  |         177 |          0 |            0 |           300 |           17.9 |         0.886 |         18.818
     14 | 2013-06-19 15:18:45.525+09 | xlog  |         306 |          0 |            0 |           300 |          30.72 |         4.011 |          35.11
     14 | 2013-06-19 15:20:26.951+09 | xlog  |         215 |          0 |            0 |           300 |         21.952 |         2.148 |         24.197
     14 | 2013-06-19 15:21:56.425+09 | xlog  |         182 |          0 |            0 |           300 |         18.527 |         6.323 |         25.069
     14 | 2013-06-19 15:27:26.074+09 | xlog  |       15770 |          0 |            0 |           300 |        335.431 |        80.269 |        420.405
     14 | 2013-06-19 15:42:26.272+09 | time  |       82306 |          0 |            0 |           300 |         209.34 |       119.159 |        333.762
     14 | 2013-06-19 15:57:26.025+09 | time  |       88965 |          0 |            0 |           247 |        211.095 |       125.846 |        340.752
     14 | 2013-06-19 16:12:26.328+09 | time  |       89523 |          0 |            0 |           285 |         229.02 |       108.515 |        345.551
# original_1.0
 instid |           start            | flags | num_buffers | xlog_added | xlog_removed | xlog_recycled | write_duration | sync_duration | total_duration
--------+----------------------------+-------+-------------+------------+--------------+---------------+----------------+---------------+----------------
     15 | 2013-06-19 18:51:03.262+09 | xlog  |         274 |          1 |            0 |             0 |         27.935 |          4.88 |         32.973
     15 | 2013-06-19 18:53:03.651+09 | xlog  |         177 |          0 |            0 |           300 |         18.018 |         0.065 |         18.101
     15 | 2013-06-19 18:54:46.626+09 | xlog  |         193 |          0 |            0 |           300 |         20.092 |         2.958 |         23.354
     15 | 2013-06-19 18:56:20.763+09 | xlog  |         175 |          0 |            0 |           300 |         18.014 |         1.408 |         19.554
     15 | 2013-06-19 18:58:01.892+09 | xlog  |         312 |          0 |            0 |           300 |         31.865 |         5.587 |         37.808
     15 | 2013-06-19 19:03:40.925+09 | xlog  |       13634 |          0 |            0 |           300 |        388.832 |       109.608 |        500.519
     15 | 2013-06-19 19:18:40.411+09 | time  |       70392 |          0 |            0 |           300 |        253.666 |       128.721 |        384.249
     15 | 2013-06-19 19:33:40.637+09 | time  |       76265 |          0 |            0 |           248 |        283.764 |       119.091 |        404.647
     15 | 2013-06-19 19:48:40.506+09 | time  |       75496 |          0 |            0 |           287 |         274.34 |       114.556 |        391.385
# fsync
 instid |           start            | flags | num_buffers | xlog_added | xlog_removed | xlog_recycled | write_duration | sync_duration | total_duration
--------+----------------------------+-------+-------------+------------+--------------+---------------+----------------+---------------+----------------
     11 | 2013-06-18 19:56:21.866+09 | xlog  |         282 |          0 |            0 |             0 |          28.33 |          9.91 |         38.316
     11 | 2013-06-18 19:58:20.465+09 | xlog  |         177 |          0 |            0 |           300 |         17.864 |         0.038 |         17.923
     11 | 2013-06-18 19:59:59.295+09 | xlog  |         308 |          0 |            0 |           300 |          32.72 |         0.955 |         33.847
     11 | 2013-06-18 20:01:36.861+09 | xlog  |         176 |          0 |            0 |           300 |         18.316 |         1.252 |         19.671
     11 | 2013-06-18 20:03:20.633+09 | xlog  |         193 |          0 |            0 |           300 |         19.377 |           5.9 |         25.397
     11 | 2013-06-18 20:08:54.581+09 | xlog  |       16237 |          0 |            0 |           300 |        323.383 |       117.579 |        443.089
     11 | 2013-06-18 20:23:55.079+09 | time  |       84550 |          0 |            0 |           300 |         216.88 |        253.37 |        471.428
     11 | 2013-06-18 20:38:55.476+09 | time  |       90258 |          0 |            0 |           252 |        212.919 |       251.897 |         467.43
     11 | 2013-06-18 20:53:55.197+09 | time  |       90143 |          0 |            0 |           290 |        221.421 |        182.39 |         405.84
# write
 instid |           start            | flags | num_buffers | xlog_added | xlog_removed | xlog_recycled | write_duration | sync_duration | total_duration
--------+----------------------------+-------+-------------+------------+--------------+---------------+----------------+---------------+----------------
     13 | 2013-06-19 11:06:14.782+09 | xlog  |         276 |          0 |            0 |             0 |         55.824 |         1.354 |         57.442
     13 | 2013-06-19 11:08:14.969+09 | xlog  |         177 |          0 |            0 |           300 |         35.937 |         1.377 |         37.551
     13 | 2013-06-19 11:10:01.292+09 | xlog  |         191 |          0 |            0 |           300 |         39.748 |         0.902 |         40.815
     13 | 2013-06-19 11:11:34.999+09 | xlog  |         176 |          0 |            0 |           300 |          35.57 |         2.307 |         38.144
     13 | 2013-06-19 11:13:13.473+09 | xlog  |         318 |          0 |            0 |           300 |         59.634 |          4.84 |         65.094
     13 | 2013-06-19 11:18:55.673+09 | xlog  |       15074 |          0 |            0 |           300 |        308.682 |        61.291 |        372.143
     13 | 2013-06-19 11:33:55.581+09 | time  |       81561 |          0 |            0 |           300 |         205.58 |       127.541 |        335.132
     13 | 2013-06-19 11:48:55.744+09 | time  |       86785 |          0 |            0 |           250 |        205.029 |       125.171 |        332.581
     13 | 2013-06-19 12:03:55.158+09 | time  |       85341 |          0 |            0 |           279 |        224.362 |        115.16 |        341.441
# fsync + write
 instid |           start            | flags | num_buffers | xlog_added | xlog_removed | xlog_recycled | write_duration | sync_duration | total_duration
--------+----------------------------+-------+-------------+------------+--------------+---------------+----------------+---------------+----------------
     29 | 2013-06-28 17:42:14.596+09 | xlog  |         382 |          0 |            0 |             0 |         76.914 |         1.556 |         78.798
     29 | 2013-06-28 17:44:14.659+09 | xlog  |         204 |          0 |            0 |           300 |         41.202 |          0.04 |          41.45
     29 | 2013-06-28 17:45:51.112+09 | xlog  |         177 |          0 |            0 |           300 |         36.303 |         7.832 |         44.212
     29 | 2013-06-28 17:47:30.505+09 | xlog  |         191 |          0 |            0 |           300 |         38.345 |         9.071 |         47.498
     29 | 2013-06-28 17:49:04.617+09 | xlog  |         176 |          0 |            0 |           300 |         35.583 |          5.21 |         40.842
     29 | 2013-06-28 17:54:23.963+09 | xlog  |       14687 |          0 |            0 |           300 |        329.002 |       156.741 |        487.057
     29 | 2013-06-28 18:09:23.696+09 | time  |       70923 |          0 |            0 |           300 |        252.866 |       225.503 |        485.679
     29 | 2013-06-28 18:24:23.115+09 | time  |       73688 |          0 |            0 |           266 |         261.14 |       201.537 |        463.719
     29 | 2013-06-28 18:39:23.687+09 | time  |       74176 |          0 |            0 |           290 |        282.175 |       204.735 |        490.374
# Heikki's patch
 instid |           start            | flags | num_buffers | xlog_added | xlog_removed | xlog_recycled | write_duration | sync_duration | total_duration
--------+----------------------------+-------+-------------+------------+--------------+---------------+----------------+---------------+----------------
     27 | 2013-06-27 15:13:53.649+09 | xlog  |         281 |          0 |            0 |             0 |         28.439 |          7.57 |         37.868
     27 | 2013-06-27 15:16:04.97+09  | xlog  |         190 |          0 |            0 |           300 |          19.35 |        10.778 |         30.192
     27 | 2013-06-27 15:17:35.991+09 | xlog  |         298 |          0 |            0 |           300 |         30.429 |         9.752 |         40.345
     27 | 2013-06-27 15:19:12.277+09 | xlog  |         176 |          0 |            0 |           300 |         18.628 |          4.87 |         23.545
     27 | 2013-06-27 15:20:53.464+09 | xlog  |         199 |          0 |            0 |           300 |         20.091 |         6.127 |          26.81
     27 | 2013-06-27 15:26:23.164+09 | xlog  |       16858 |          0 |            0 |           300 |        304.246 |        69.067 |        375.417
     27 | 2013-06-27 15:41:24.004+09 | time  |       89007 |          0 |            0 |           300 |        221.157 |       112.179 |        334.745
     27 | 2013-06-27 15:56:24.85+09  | time  |       95096 |          0 |            0 |           253 |          198.5 |       114.686 |        314.726
     27 | 2013-06-27 16:11:24.974+09 | time  |       89461 |          0 |            0 |           281 |        192.671 |       111.444 |        306.373
     27 | 2013-06-27 16:26:24.401+09 | time  |      133734 |          0 |            0 |           285 |        445.744 |        39.818 |        486.122
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers