g...@lexort.com (Greg Troxel) writes:

>When you run dd with bs=64k and then bs=1m, how different are the
>results?  (I believe raw requests happen accordingly, vs MAXPHYS for fs
>etc. access.)
'raw requests' are split into MAXPHYS-size chunks. While using bs=1m reduces the syscall overhead somewhat, the major effect is that the system issues requests for all 16 chunks (1M / MAXPHYS) concurrently. 16 chunks is also the maximum, so between bs=1m and bs=2m the only difference is the reduced syscall overhead.

The filesystem can do something similar: asynchronous writes are also issued in parallel, and for reading it may choose to read ahead blocks to optimize I/O requests, again for up to 16 chunks.

In reality, large contiguous I/O rarely happens, and the faster your drive is, the more significant the current UVM overhead (e.g. mapping buffers) becomes.

A larger MAXPHYS also reduces SATA command overhead; that's up to 10% for SATA3 (6Gbps) that you might gain, assuming you manage to do large contiguous I/O.

NVMe is a different thing. While the hardware command overhead is negligible, you can mitigate software overhead by using larger chunks for I/O, and the gain can be much higher, at least for raw I/O.