On 2023-06-30 18:37:39 -0400, Bruce Momjian wrote:
> On Sat, Jul 1, 2023 at 12:21:03AM +0200, Tomas Vondra wrote:
> > On 6/30/23 23:53, Bruce Momjian wrote:
> > > For a 4kB write, to say it is not partially written would be to require
> > > the operating system to guarantee that the 4kB write is not split into
> > > smaller writes which might each be atomic because smaller atomic writes
> > > would not help us.
> >
> > Right, that's the dance we do to protect against torn pages. But Andres
> > suggested that if you have modern storage and configure it correctly,
> > writing with 4kB pages would be atomic. So we wouldn't need to do this
> > FPI stuff, eliminating pretty significant source of write amplification.
>
> I agree the hardware is atomic for 4k writes, but do we know the OS
> always issues 4k writes?

When using a sector size of 4K you *can't* make smaller writes via normal paths. The addressing unit is the sector. The details obviously differ between storage protocols, but you pretty much always just specify a start sector and a number of sectors to be operated on. Obviously the kernel could read 4k, modify 512 bytes in-memory, and then write 4k back, but that shouldn't be a danger here, since what reaches the device is still a whole 4k sector. There might also be debug interfaces that allow reading/writing in different increments, but that's not something that would happen during normal operation.
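
To make the addressing point concrete, here is a rough sketch (not taken from any kernel or driver code) of how a byte range maps to a "start sector, number of sectors" request. The 4096-byte sector size and the names `byte_range_to_sectors` and `needs_read_modify_write` are made up for illustration.

```python
from typing import NamedTuple

# Assumed logical sector size for this sketch; real devices report their own.
SECTOR_SIZE = 4096


class SectorRange(NamedTuple):
    start_sector: int
    num_sectors: int


def byte_range_to_sectors(offset: int, length: int,
                          sector_size: int = SECTOR_SIZE) -> SectorRange:
    """Return the whole-sector range that covers [offset, offset + length)."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    first = offset // sector_size
    last = (offset + length - 1) // sector_size
    return SectorRange(first, last - first + 1)


def needs_read_modify_write(offset: int, length: int,
                            sector_size: int = SECTOR_SIZE) -> bool:
    """True if the byte range does not start and end on sector boundaries.

    In that case the caller has to read the covering sectors, patch them in
    memory, and write whole sectors back; the request itself cannot describe
    anything smaller than one sector.
    """
    return offset % sector_size != 0 or length % sector_size != 0


if __name__ == "__main__":
    # A 512-byte update inside the third sector still becomes a 1-sector write.
    print(byte_range_to_sectors(8192, 512))
    print(needs_read_modify_write(8192, 512))
```

In this model a 512-byte change can only be expressed as a write of the full sector containing it, which is the read/modify/write case mentioned above.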