On Tue, 10 Jun 2025, Nathan Bossart wrote:
I also wrote a couple of test programs to show the difference between fseeko-ing and fread-ing through a file with various sizes. On a Linux machine, I see this:

 log2(n) | fseeko  | fread
---------+---------+-------
       1 | 109.288 | 5.528
       2 |  54.881 | 2.848
       3 |   27.65 | 1.504
       4 |  13.953 | 0.834
       5 |     7.1 |  0.49
       6 |   3.665 | 0.322
       7 |   1.944 | 0.244
       8 |   1.085 | 0.201
       9 |   0.658 | 0.185
      10 |   0.443 | 0.175
      11 |   0.253 | 0.171
      12 |   0.102 | 0.162
      13 |   0.075 |  0.13
      14 |   0.061 | 0.114
      15 |   0.054 |   0.1

So, fseeko() starts winning around 4096 bytes. On macOS, the differences aren't quite as dramatic, but 4096 bytes is the break-even point there, too. I imagine there's a buffer around that size somewhere...
Thank you for benchmarking! Before answering in more depth, I'm curious: what read/seek pattern do you see at the system call level (as shown by strace)? In pg_restore it was a constant loop of read(4K)-lseek(8-16K).
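
For illustration, here is a minimal sketch of the kind of loop being compared: read a fixed-size chunk, then skip a gap either by seeking or by reading and discarding. This is neither the actual test program nor pg_restore code. The names walk_file, skip_with_fseeko and skip_with_fread are made up for this example, and the 4096-byte scratch buffer is an assumption.

```c
#define _FILE_OFFSET_BITS 64    /* 64-bit off_t on 32-bit Linux */
#include <stdio.h>
#include <sys/types.h>

/* Skip n bytes forward by seeking. Returns 0 on success, -1 on error. */
static int skip_with_fseeko(FILE *fp, off_t n)
{
    return fseeko(fp, n, SEEK_CUR);
}

/* Skip n bytes forward by reading and discarding them.
 * Returns 0 on success, -1 on EOF or read error. */
static int skip_with_fread(FILE *fp, off_t n)
{
    char buf[4096];             /* assumed scratch size, not from the thread */

    while (n > 0) {
        size_t want = n < (off_t) sizeof(buf) ? (size_t) n : sizeof(buf);
        size_t got = fread(buf, 1, want, fp);

        if (got == 0)
            return -1;
        n -= (off_t) got;
    }
    return 0;
}

/* Hypothetical driver: read `chunk` bytes (capped at 4096), then skip `gap`
 * bytes, until a full chunk can no longer be read or a skip fails.
 * Returns the number of full chunks read. */
static long walk_file(FILE *fp, size_t chunk, off_t gap, int use_seek)
{
    char buf[4096];
    long chunks = 0;

    if (chunk > sizeof(buf))
        chunk = sizeof(buf);

    while (fread(buf, 1, chunk, fp) == chunk) {
        chunks++;
        if ((use_seek ? skip_with_fseeko(fp, gap)
                      : skip_with_fread(fp, gap)) != 0)
            break;
    }
    return chunks;
}
```

Calling walk_file(fp, 4096, 8192, 1) from a small main() and running it under "strace -e trace=read,lseek" should show how stdio translates the fseeko() variant into system calls. Passing 0 as the last argument gives the fread() variant for comparison.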
Dimitris
