On 2018-05-16 22:11:22 -0400, Tom Lane wrote:
> David Rowley <david.row...@2ndquadrant.com> writes:
> > On 17 May 2018 at 11:00, Andres Freund <and...@anarazel.de> wrote:
> >> Wonder if we shouldn't just cache an estimated relation size in the
> >> relcache entry till then. For planning purposes we don't need to be
> >> accurate, and usually activity that drastically expands relation size
> >> will trigger relcache activity before long. Currently there's plenty
> >> workloads where the lseeks(SEEK_END) show up pretty prominently.
> >
> > While I'm in favour of speeding that up, I think we'd get complaints
> > if we used a stale value.
>
> Yeah, that scares me too. We'd then be in a situation where (arguably)
> any relation extension should force a relcache inval. Not good.
> I do not buy Andres' argument that the value is noncritical, either ---
> particularly during initial population of a table, where the size could
> go from zero to something-significant before autoanalyze gets around
> to noticing.

I don't think every extension needs to force a relcache inval. It'd
instead be perfectly reasonable to define a rule that an inval is
triggered whenever crossing a 10% relation size boundary. That'll lead
to invalidations for the first few pages, but much less frequently
later.

> I'm a bit skeptical of the idea of maintaining an accurate relation
> size in shared memory, too. AIUI, a lot of the problem we see with
> lseek(SEEK_END) has to do with contention inside the kernel for access
> to the single-point-of-truth where the file's size is kept. Keeping
> our own copy would eliminate kernel-call overhead, which can't hurt,
> but it won't improve the contention angle.

A syscall is several hundred instructions. An unlocked read - which'll
be sufficient in many cases, given that the value can quickly be out
of date anyway - is a few cycles. Even with a barrier you're talking a
few dozen cycles. So I can't see how it'd not improve the contention.

But the main reason for keeping it in shmem is less the lseek avoidance
- although that's nice, context switches aren't great - but to make
relation extension need far less locking.

Greetings,

Andres Freund
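
To make the 10% boundary rule above a bit more concrete, here is a minimal
sketch of the check. It is not taken from PostgreSQL; the struct
RelSizeEstimate and the function size_change_needs_inval() are hypothetical
names, and the actual sending of the invalidation is left out. It only shows
why such a rule fires on every page for a tiny relation and rarely once the
relation has grown.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical bookkeeping: the relation size (in blocks) that was last
 * advertised to other backends via an invalidation.
 */
typedef struct RelSizeEstimate
{
    uint32_t advertised_blocks;
} RelSizeEstimate;

/*
 * Return true if new_blocks differs from the advertised size by more than
 * 10%, and in that case remember new_blocks as the new advertised size.
 * The caller would be responsible for actually issuing the invalidation.
 */
static bool
size_change_needs_inval(RelSizeEstimate *est, uint32_t new_blocks)
{
    uint64_t old = est->advertised_blocks;
    uint64_t diff = (new_blocks > old) ? new_blocks - old : old - new_blocks;

    /* "diff > 10% of old" in integer arithmetic; 64 bits avoid overflow */
    if (diff * 10 > old)
    {
        est->advertised_blocks = new_blocks;
        return true;
    }
    return false;
}
```

With this formulation a relation growing from 0 to 10 blocks triggers on each
extension, while one sitting at 100 advertised blocks only triggers once it
reaches 111.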