On Sat, Mar 30, 2024 at 10:39 AM Thomas Munro <thomas.mu...@gmail.com> wrote:
> On Sat, Mar 30, 2024 at 4:53 AM Tomas Vondra
> <tomas.von...@enterprisedb.com> wrote:
> > ... Maybe there should be some flag to force
> > issuing fadvise even for sequential patterns, perhaps at the tablespace
> > level? ...
>
> Yeah, I've wondered about trying harder to "second guess" the Linux
> RA.  At the moment, read_stream.c detects *exactly* sequential reads
> (see seq_blocknum) to suppress advice, but if we knew/guessed the RA
> window size, we could (1) detect it with the same window that Linux
> will use to detect it, and (2) [new realisation from yesterday's
> testing] we could even "tickle" it to wake it up in certain cases
> where it otherwise wouldn't, by temporarily using a smaller
> io_combine_limit if certain patterns come along.  I think that sounds
> like madness (I suspect that any place where the latter would help is
> a place where you could turn RA up a bit higher for the same effect
> without weird kludges), or another way to put it would be to call it
> "overfitting" to the pre-existing quirks; but maybe it's a future
> research idea...

I guess I missed a step when responding that suggestion: I don't think
we could have an "issue advice always" flag, because it doesn't seem
to work out as well as letting the kernel do it, and a global flag
like that would affect everything else including sequential scans
(once the streaming seq scan patch goes in).  But suppose we could do
that, maybe even just for BHS.  In my little test yesterday had to
issue a lot of them, patched eic=4, to beat the kernel's RA with
unpatched eic=0:

eic unpatched patched
0        4172    9572
1       30846   10376
2       18435    5562
4       18980    3503

So if we forced fadvise to be issued with a GUC, it still wouldn't be
good enough in this case.  So we might need to try to understand what
exactly is waking the RA up for unpatched but not patched, and try to
tickle it by doing a little less I/O combining (for example just
setting io_combine_limit=1 gives the same number for eic=0, a major
clue), but that seems to be going down a weird path, and tuning such a
copying algorithm seems too hard.


Reply via email to