Hi,

On 2023-01-27 14:24:51 +0900, Masahiko Sawada wrote:
> If I'm understanding this result correctly, it seems to me that your
> patch works well with the WAL DIO patch (WALDIO vs. WAL DIO & WAL
> BUFFERS READ), but there seems no visible performance gain with only
> your patch (HEAD vs. WAL BUFFERS READ). So it seems to me that your
> patch should be included in the WAL DIO patch rather than applying it
> alone. Am I missing something?

We already support using DIO for WAL - it's just restricted in a way that
makes it practically not usable. And the reason for that is precisely that
walsenders need to read the WAL. See get_sync_bit():

	/*
	 * Optimize writes by bypassing kernel cache with O_DIRECT when using
	 * O_SYNC and O_DSYNC. But only if archiving and streaming are disabled,
	 * otherwise the archive command or walsender process will read the WAL
	 * soon after writing it, which is guaranteed to cause a physical read if
	 * we bypassed the kernel cache. We also skip the
	 * posix_fadvise(POSIX_FADV_DONTNEED) call in XLogFileClose() for the same
	 * reason.
	 *
	 * Never use O_DIRECT in walreceiver process for similar reasons; the WAL
	 * written by walreceiver is normally read by the startup process soon
	 * after it's written. Also, walreceiver performs unaligned writes, which
	 * don't work with O_DIRECT, so it is required for correctness too.
	 */
	if (!XLogIsNeeded() && !AmWalReceiverProcess())
		o_direct_flag = PG_O_DIRECT;

Even if that weren't the case, splitting up bigger commits in incrementally
committable chunks is a good idea.

Greetings,

Andres Freund
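
To make the restriction in get_sync_bit() concrete, here is a minimal standalone sketch of the same decision. It is not PostgreSQL code: every sketch_/SKETCH_ name is hypothetical, the flag value is a placeholder for PG_O_DIRECT, and it assumes XLogIsNeeded() amounts to "wal_level is above minimal".

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the wal_level settings (not the real enum). */
typedef enum
{
	SKETCH_WAL_LEVEL_MINIMAL,
	SKETCH_WAL_LEVEL_REPLICA,
	SKETCH_WAL_LEVEL_LOGICAL
} SketchWalLevel;

/* Placeholder flag value; the real code ORs in PG_O_DIRECT. */
#define SKETCH_O_DIRECT 0x1

/*
 * Models XLogIsNeeded(), assuming it is true whenever archiving or
 * streaming could be enabled, i.e. wal_level is above minimal.
 */
static bool
sketch_xlog_is_needed(SketchWalLevel wal_level)
{
	return wal_level >= SKETCH_WAL_LEVEL_REPLICA;
}

/*
 * Returns the direct I/O flag only when nobody is expected to read the
 * WAL back soon after it is written: no archiver or walsender (wal_level
 * minimal), and the writer is not the walreceiver.
 */
static int
sketch_o_direct_flag(SketchWalLevel wal_level, bool am_walreceiver)
{
	if (!sketch_xlog_is_needed(wal_level) && !am_walreceiver)
		return SKETCH_O_DIRECT;
	return 0;
}
```

Under these assumptions, only the combination of minimal wal_level and a non-walreceiver process yields the direct I/O flag, which is why the existing DIO support is unavailable on any setup with archiving or streaming replication.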