On Sat, Mar 30, 2024 at 10:39 AM Thomas Munro <thomas.mu...@gmail.com> wrote: > On Sat, Mar 30, 2024 at 4:53 AM Tomas Vondra > <tomas.von...@enterprisedb.com> wrote: > > ... Maybe there should be some flag to force > > issuing fadvise even for sequential patterns, perhaps at the tablespace > > level? ... > > Yeah, I've wondered about trying harder to "second guess" the Linux > RA. At the moment, read_stream.c detects *exactly* sequential reads > (see seq_blocknum) to suppress advice, but if we knew/guessed the RA > window size, we could (1) detect it with the same window that Linux > will use to detect it, and (2) [new realisation from yesterday's > testing] we could even "tickle" it to wake it up in certain cases > where it otherwise wouldn't, by temporarily using a smaller > io_combine_limit if certain patterns come along. I think that sounds > like madness (I suspect that any place where the latter would help is > a place where you could turn RA up a bit higher for the same effect > without weird kludges), or another way to put it would be to call it > "overfitting" to the pre-existing quirks; but maybe it's a future > research idea...
I guess I missed a step when responding that suggestion: I don't think we could have an "issue advice always" flag, because it doesn't seem to work out as well as letting the kernel do it, and a global flag like that would affect everything else including sequential scans (once the streaming seq scan patch goes in). But suppose we could do that, maybe even just for BHS. In my little test yesterday had to issue a lot of them, patched eic=4, to beat the kernel's RA with unpatched eic=0: eic unpatched patched 0 4172 9572 1 30846 10376 2 18435 5562 4 18980 3503 So if we forced fadvise to be issued with a GUC, it still wouldn't be good enough in this case. So we might need to try to understand what exactly is waking the RA up for unpatched but not patched, and try to tickle it by doing a little less I/O combining (for example just setting io_combine_limit=1 gives the same number for eic=0, a major clue), but that seems to be going down a weird path, and tuning such a copying algorithm seems too hard.