Hi,

On 2023-07-11 09:09:43 +0200, Jakub Wartak wrote:
> On Mon, Jul 10, 2023 at 6:24 PM Andres Freund <and...@anarazel.de> wrote:
> >
> > Hi,
> >
> > On 2023-07-03 11:53:56 +0200, Jakub Wartak wrote:
> > > Out of curiosity I've tried and it is reproducible as you have stated :
> > > XFS
> > > @ 4.18.0-425.10.1.el8_7.x86_64:
> > >...
> > > According to iostat and blktrace -d /dev/sda -o - | blkparse -i - output ,
> > > the XFS issues sync writes while ext4 does not, xfs looks like constant
> > > loop of sync writes (D) by kworker/2:1H-kblockd:
> >
> > That clearly won't go well. It's not reproducible on newer systems,
> > unfortunately :(. Or well, fortunately maybe.
> >
> >
> > I wonder if a trick to avoid this could be to memorialize the fact that we
> > bulk extended before and extend by that much going forward? That'd avoid the
> > swapping back and forth.
>
> I haven't seen this thread [1] "Question on slow fallocate", from XFS
> mailing list being mentioned here (it was started by Masahiko), but I
> do feel it contains very important hints even challenging the whole
> idea of zeroing out files (or posix_fallocate()). Please especially
> see Dave's reply.

I think that's just due to the reproducer being a bit too minimal and the
actual problem being addressed not being explained.


> He also argues that posix_fallocate() != fallocate(). What's interesting is
> that it's by design and newer kernel versions should not prevent such
> behaviour, see my testing result below.

I think the problem there was that I was not targeting a different file
between the different runs, somehow assuming the test program would be taking
care of that.

I don't think the test program actually tests things in a particularly useful
way - it does fallocate()s in 8k chunks - which postgres never does.


> All I can add is that those kernel versions (4.18.0) seem to be very
> popular across customers (RHEL, Rocky) right now and that I've tested
> on most recent available one (4.18.0-477.15.1.el8_8.x86_64) using
> Masahiko test.c and still got 6-7x slower time when using XFS on that
> kernel. After installing kernel-ml (6.4.2) the test.c result seems to
> be the same (note it occurs only when 1st allocating space, but of
> course it doesn't if the same file is rewritten/"reallocated"):

test.c really doesn't reproduce postgres behaviour in any meaningful way,
using it as a benchmark is a bad idea.

Greetings,

Andres Freund
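
For illustration only, a minimal sketch of the two points above: each run
targets a different, freshly created file, and allocating in 8k chunks is
compared against one larger extension. This is neither test.c nor postgres
code. The helper names (extend_in_chunks, run), the 64-block total and the
"bulk" size are assumptions made up for the example, and 8192 is only the
default postgres block size. Python's os.posix_fallocate() wraps
posix_fallocate(), so this does not exercise fallocate() directly.

```python
import os
import tempfile
import time

BLCKSZ = 8192  # default postgres block size; an assumption for this sketch


def extend_in_chunks(path, total_bytes, chunk_bytes):
    """Create a new file and grow it to total_bytes using
    posix_fallocate() calls of at most chunk_bytes each.
    Returns the number of posix_fallocate() calls made."""
    calls = 0
    # O_EXCL guarantees we never reuse a file left over from an earlier run.
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_RDWR, 0o600)
    try:
        offset = 0
        while offset < total_bytes:
            length = min(chunk_bytes, total_bytes - offset)
            os.posix_fallocate(fd, offset, length)
            offset += length
            calls += 1
    finally:
        os.close(fd)
    return calls


def run(total_bytes=64 * BLCKSZ, directory=None):
    """Run both variants, each against its own fresh file.
    Pass directory= to place the files on the filesystem under test."""
    results = {}
    with tempfile.TemporaryDirectory(dir=directory) as tmpdir:
        for label, chunk in (("8k-chunks", BLCKSZ), ("bulk", total_bytes)):
            path = os.path.join(tmpdir, label)  # different file per run
            start = time.perf_counter()
            calls = extend_in_chunks(path, total_bytes, chunk)
            elapsed = time.perf_counter() - start
            results[label] = {
                "calls": calls,
                "size": os.path.getsize(path),
                "seconds": elapsed,
            }
    return results


if __name__ == "__main__":
    for label, info in run().items():
        print(label, info)
```

By default the files land in the system temporary directory, which may well
be tmpfs, so any timing is only meaningful when directory= points at a mount
of the filesystem being examined (e.g. XFS). Even then the numbers say
nothing about postgres itself, only about the allocation pattern.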