Hello Tomas,

At on of the pgcon unconference sessions a couple days ago, I presented
a bunch of benchmark results comparing performance with different
data/WAL block size. Most of the OLTP results showed significant gains
(up to 50%) with smaller (4k) data pages.

You wrote something about SSD a long time ago, but the link is now dead:

http://www.fuzzy.cz/en/articles/ssd-benchmark-results-read-write-pgbench/

See also:

http://www.cybertec.at/postgresql-block-sizes-getting-started/
http://blog.coelho.net/database/2014/08/08/postgresql-page-size-for-SSD.html

[...]

The other important factor is the native SSD page, which is similar to
sectors on HDD. SSDs however don't allow in-place updates, and have to
reset/rewrite of the whole native page. It's actually more complicated,
because the reset happens at a much larger scale (~8MB block), so it
does matter how quickly we "dirty" the data. The consequence is that
using data pages smaller than the native page (depends on the device,
but seems 4K is the common value) either does not help or actually hurts
the write performance.

All the SSD results show this behavior - the Optane and Samsung nicely
show that 4K is much better (in random write IOPS) than 8K, but 1-2K
pages make it worse.

Yep. ISTM that uou should also consider the underlying FS block size. Ext4 uses 4 KiB by default, so if you write 2 KiB it will write 4 KiB anyway.

There is no much doubt that with SSD we should reduce the default page size. There are some negative impacts (eg more space is lost because of headers and the number of tuples that can be fitted), but I guess the should be an overall benefit. It would help a lot if it would be possible to initdb with a different block size, without recompiling.

--
Fabien.


Reply via email to