Hi,

On 2020-06-09 19:24:15 -0400, Tom Lane wrote:
> Robert Haas <robertmh...@gmail.com> writes:
> > On Tue, Jun 9, 2020 at 1:59 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> >> When I went through the existing spinlock stanzas, the only thing that
> >> really made me acutely uncomfortable was the chunk in pg_stat_statement's
> >> pgss_store(), lines 1386..1438 in HEAD.
>
> > I mean, what would be wrong with having an LWLock per pgss entry?

+1

> Hmm, maybe nothing.  I'm accustomed to thinking of them as being
> significantly more expensive than spinlocks, but maybe we've narrowed
> the gap enough that that's not such a problem.

They do add a few cycles (IIRC ~30 or so, last time I measured a
specific scenario) of latency to acquisition, but it's not a large
amount. The only case where acquisition is noticeably slower, in my
experiments, is when there's "just the right amount" of contention.
There, spinning instead of entering the kernel can be good.

I've mused about adding a small amount of spinning to lwlock
acquisition before. But so far working on reducing contention seemed
the better route.

Funnily enough, lwlock *release*, even when there are no waiters, has a
somewhat noticeable performance difference on x86 (and other TSO
platforms) compared to spinlock release. For spinlock release we can
just use a plain write and a compiler barrier, whereas lwlock release
needs to use an atomic operation. I think that's hard, but not
impossible, to avoid for a userspace reader-writer lock.

It would be a nice experiment to make spinlocks a legacy wrapper around
rwlocks. I think if we added 2-3 optimizations (optimize for
exclusive-only locks, a short amount of spinning, possibly inline
functions for "fast path" acquisition/release) that'd be better for
nearly all situations. And in the situations where it's not, the loss
would be pretty darn small.

Greetings,

Andres Freund
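
The release-path difference described above can be sketched roughly as follows. This is only an illustration using C11 atomics, not PostgreSQL's actual s_lock.h or lwlock.c code; every name (demo_spinlock, demo_rwlock, DEMO_EXCLUSIVE, and so on) is hypothetical, and waiter queues, sleeping, and shared-count overflow are ignored.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical spinlock: one word, 0 = free, 1 = held. */
typedef struct { atomic_uint held; } demo_spinlock;

static bool demo_spin_try_acquire(demo_spinlock *l)
{
    /* Acquisition needs an atomic read-modify-write. */
    return atomic_exchange_explicit(&l->held, 1u, memory_order_acquire) == 0u;
}

static void demo_spin_release(demo_spinlock *l)
{
    /* Only the holder writes this word, so a release store is enough.
     * On TSO hardware such as x86 that is typically an ordinary store
     * plus a compiler barrier. */
    atomic_store_explicit(&l->held, 0u, memory_order_release);
}

/* Hypothetical reader-writer lock: the low bits count shared holders,
 * one high bit marks an exclusive holder. */
#define DEMO_EXCLUSIVE (1u << 31)
typedef struct { atomic_uint state; } demo_rwlock;

static bool demo_rw_try_acquire_shared(demo_rwlock *l)
{
    unsigned old = atomic_load_explicit(&l->state, memory_order_relaxed);
    while (!(old & DEMO_EXCLUSIVE)) {
        /* On failure, old is refreshed and the loop re-checks it. */
        if (atomic_compare_exchange_weak_explicit(&l->state, &old, old + 1u,
                memory_order_acquire, memory_order_relaxed))
            return true;
    }
    return false;
}

static bool demo_rw_try_acquire_exclusive(demo_rwlock *l)
{
    unsigned expected = 0u;
    return atomic_compare_exchange_strong_explicit(&l->state, &expected,
            DEMO_EXCLUSIVE, memory_order_acquire, memory_order_relaxed);
}

static void demo_rw_release_shared(demo_rwlock *l)
{
    /* Other shared holders may modify the same word concurrently, so
     * a plain store could lose their updates: release has to be an
     * atomic read-modify-write. */
    atomic_fetch_sub_explicit(&l->state, 1u, memory_order_release);
}

static void demo_rw_release_exclusive(demo_rwlock *l)
{
    /* Kept as an atomic subtract so the release path is uniform. */
    atomic_fetch_sub_explicit(&l->state, DEMO_EXCLUSIVE, memory_order_release);
}

/* Single-threaded walk-through of both locks; returns 0 on success. */
static int demo_walkthrough(void)
{
    demo_spinlock s;
    demo_rwlock rw;
    atomic_init(&s.held, 0u);
    atomic_init(&rw.state, 0u);

    if (!demo_spin_try_acquire(&s)) return 1;
    if (demo_spin_try_acquire(&s)) return 2;        /* already held */
    demo_spin_release(&s);
    if (!demo_spin_try_acquire(&s)) return 3;       /* free again */
    demo_spin_release(&s);

    if (!demo_rw_try_acquire_shared(&rw)) return 4;
    if (!demo_rw_try_acquire_shared(&rw)) return 5; /* readers coexist */
    if (demo_rw_try_acquire_exclusive(&rw)) return 6;
    demo_rw_release_shared(&rw);
    demo_rw_release_shared(&rw);
    if (!demo_rw_try_acquire_exclusive(&rw)) return 7;
    if (demo_rw_try_acquire_shared(&rw)) return 8;  /* writer excludes */
    demo_rw_release_exclusive(&rw);

    assert(atomic_load(&rw.state) == 0u);
    return 0;
}
```

In this sketch the "optimize for exclusive-only locks" idea would correspond to a lock whose state word is only ever written by its single holder, in which case the exclusive release could in principle become a release store like demo_spin_release(). Whether that holds for a real lock with waiter bookkeeping is exactly the hard part noted above.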