Hi,

On 2/15/26 01:13, Alexandre Felipe wrote:
> Hi,
> I decided to test this PR.
>
> I didn't take much time to go through the thread or the code in detail
> yet. But I have my first benchmark results and I would like to share.
>

I'm quite confused by the scripts you shared, they seem incomplete. The
run_regression.py is meant to call purge_cache.sh (which is missing),
and the run_benchmark tries to call all sorts of missing .sql scripts.
So how do we use that?

> EXPERIMENT
>
> I tested [CF 4351] v10 - Index Prefetching
>
> I created a table with 100k rows and
> Sequential is, as guessed, 1,2,3,4 (indexed)
> Periodic is a quasi random (i * jump) % num_rows, where gcd(jump,
> num_rows) = 1, guarantee that there are no repeated entries (indexed)
> Random is a `row_number() over (order by random())` (indexed)
> The payload is a fixed 200 character long string, just to make it more
> realistic.
>
> For the tests, I disable sorting, sequential scans, index only scans and
> bitmap scans.
> Since buffer cache always has a significant impact on the query
> performance, I shuffled the tests, and tried to adjust for the number of
> buffer hit/read, but later I found that the best way to control that was
> to use a table small enough to be entirely held in cache, and evict the
> buffers.
>

That seems a bit bizarre. The whole point of index prefetching is better
I/O scheduling (ahead of time), but if you "control" the impact of cache
by making sure everything is cached, that kinda defeats the whole thing.

A table that is just 24MB and fits into buffers is a bit useless. It
means that even with a random pattern (which is generally about the best
case for prefetching), only about ~1/30 of the heap fetches will require
I/O. Each page has ~32 items, but only the first item fetched from each
page will incur an I/O; the rest hit a page that is already cached.

> * off: buffers are kept in cache
> * pg: buffers evicted from postgres pg_buffercache_evict from
> pg_buffercache extension.
> * os: supported only in python, I separated the buffer eviction in
> purge_cache as it requires sudo (tested only in MacOS).
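
To put rough numbers on the caching concern, here is a back-of-the-envelope
sketch in Python. It is only an illustration under stated assumptions: the
8kB page size is the PostgreSQL default, the ~32 rows per page is the
estimate used in this mail (not measured), rows are assumed to sit on pages
in insertion order, and a page is assumed to stay cached once it has been
read. The names (io_fraction, ROWS_PER_PAGE, ...) are made up for this
example and do not come from the shared scripts.

```python
import random

PAGE_SIZE = 8192      # PostgreSQL default block size, in bytes
NUM_ROWS = 100_000    # table size from the quoted experiment
ROWS_PER_PAGE = 32    # rough estimate from this mail, not a measurement


def io_fraction(order, rows_per_page=ROWS_PER_PAGE):
    """Fraction of heap fetches that need I/O.

    Assumes every page starts evicted and stays cached after its first
    read, so only the first fetch touching a page counts as I/O.
    """
    seen_pages = set()
    fetches = 0
    for row in order:
        fetches += 1
        seen_pages.add(row // rows_per_page)  # assumed row -> page mapping
    return len(seen_pages) / fetches


sequential = list(range(NUM_ROWS))
shuffled = sequential[:]
random.Random(0).shuffle(shuffled)  # fixed seed, just for repeatability

num_pages = -(-NUM_ROWS // ROWS_PER_PAGE)  # ceiling division
table_mib = num_pages * PAGE_SIZE / (1024 * 1024)

print(f"pages={num_pages}, size={table_mib:.1f} MiB")
print(f"sequential: {io_fraction(sequential):.4f}")
print(f"random:     {io_fraction(shuffled):.4f}")
```

Under these assumptions the access order does not matter once the whole
table has been visited: both patterns end up with one I/O per page, i.e.
about 1 in 32 fetches, which is why such a small table says little about
prefetching.
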
>
> I varied
> * max_parallel_workers_per_gather (although I guess it wasn't exploited),
> * enable_index_prefetch
> * the column used as sorting key and, as a result, the index used.
> * and buffer eviction mode.
>
> Running from python with psycopg
>

On what kind of hardware? How much variance is in the results?


regards

-- 
Tomas Vondra
