On Fri, Jun 21, 2019 at 1:21 AM Thomas Munro <thomas.mu...@gmail.com> wrote:
> I ran into someone with a system where big queries scanning 8GB+ of
> all-in-cache data took consistently ~2.5x longer on a primary server
> than on a replica.  Both servers had concurrent activity on them but
> plenty of spare capacity and similar specs.  After some investigation
> it turned out that on the primary there were (1) some select()
> syscalls waiting for 1ms, which might indicate contended
> SpinLockAcquire() back-offs, and (2) a huge amount of time spent in:
>
> + 93,31% 0,00% postgres postgres [.] index_getnext
> + 93,30% 0,00% postgres postgres [.] index_fetch_heap
> + 81,66% 0,01% postgres postgres [.] heap_page_prune_opt
> + 75,85% 0,00% postgres postgres [.] TransactionIdLimitedForOldSnapshots
> + 75,83% 0,01% postgres postgres [.] RelationHasUnloggedIndex
> + 75,79% 0,00% postgres postgres [.] RelationGetIndexList
> + 75,79% 75,78% postgres postgres [.] list_copy

On my laptop, all prewarmed, no concurrency, the mere existence of 10
brin indexes causes a sequential scan to take ~5% longer and an
uncorrelated index scan to take ~45% longer (correlated index scans
don't suffer).  Here's a draft patch for v13 that fixes that problem
by caching the result of RelationHasUnloggedIndex().

Reproducer scripts also attached.  I ran them with shared_buffers=8GB,
old_snapshot_threshold=10s and pg_prewarm installed.

I didn't try to look into the complaint about suspected spinlock contention.

-- 
Thomas Munro
https://enterprisedb.com

Attachment: 0001-Cache-the-result-of-RelationHasUnloggedIndex.patch
Description: Binary data

Attachment: test-seqscan.sql
Description: Binary data

Attachment: test-indexscan.sql
Description: Binary data

Reply via email to