Re: WAL prefetch

Konstantin Knizhnik Fri, 15 Jun 2018 00:40:28 -0700



On 15.06.2018 07:36, Amit Kapila wrote:

On Fri, Jun 15, 2018 at 12:16 AM, Stephen Frost wrote:

I have tested wal_prefetch on two powerful servers with 24 cores, 3 TB NVMe
RAID 10 storage and 256 GB of RAM, connected using InfiniBand.
The speed of synchronous replication between the two nodes increased from
56k TPS to 60k TPS (on pgbench with scale 1000).

I'm also surprised that it wasn't a larger improvement.

Seems like it would make sense to implement in core using
posix_fadvise(), perhaps in the wal receiver and in RestoreArchivedFile
or nearby..  At least, that's the thinking I had when I was chatting w/
Sean.

Doing it in-core certainly has some advantages, such as that it can easily
reuse the existing xlog code rather than trying to make a copy as is
currently done in the patch, but I think it also depends on whether this is
really a win in a number of common cases or just a win in some
limited cases.

I completely agree. That was my main concern: in which use cases this
prefetch will be efficient. If "full_page_writes" is on (which is the safe,
default value), then the first update of a page since the last checkpoint
is written to WAL as a full page, and applying it does not require reading
any data from disk. If this page is updated multiple times in subsequent
transactions, then most likely it will still be present in the OS file
cache, unless the checkpoint interval exceeds the OS cache size (the amount
of free memory in the system). So if these conditions are satisfied, it
looks like prefetch is not needed. And this seems to be true for most real
configurations: the checkpoint interval is rarely set larger than hundreds
of gigabytes, and modern servers usually have more RAM.

But once this condition is not satisfied and the lag is larger than the
size of the OS cache, prefetch may be inefficient, because prefetched pages
may be evicted from the OS cache before they are actually accessed by the
redo process. In this case extra synchronization between the prefetch and
replay processes is needed, so that prefetch does not move too far ahead of
the replayed LSN.
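To make the synchronization idea concrete, here is a minimal sketch of such
a throttle. It is only an illustration, not code from the extension:
wal_pos_t, prefetch_may_advance() and max_distance are hypothetical names,
and the distance limit is assumed to be chosen well below the OS cache size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit WAL position (LSN). */
typedef uint64_t wal_pos_t;

/*
 * Return true if the prefetcher may process the record at prefetch_pos,
 * i.e. it is less than max_distance bytes of WAL ahead of replay_pos.
 * When this returns false the prefetcher should wait for replay to advance.
 */
static bool
prefetch_may_advance(wal_pos_t prefetch_pos, wal_pos_t replay_pos,
                     uint64_t max_distance)
{
    assert(max_distance > 0);

    /* Not ahead of replay at all: there is never a reason to wait. */
    if (prefetch_pos <= replay_pos)
        return true;

    return prefetch_pos - replay_pos < max_distance;
}
```

The prefetch process would check this before each record and sleep (or
wait on a latch) while it returns false.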

It is not a problem to integrate this code into Postgres core and run it in
a background worker. I do not think that performing prefetch in the wal
receiver process itself is a good idea: it may slow down the receiving of
changes from the master. And if the code is moved into core, I really can
throw away the cut&pasted code. But it is easier to experiment with an
extension than with a patch to Postgres core. And I have published this
extension to make it possible to perform experiments and check whether it
is useful on real workloads.



--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
