On 2/27/24 14:58, James Pang wrote: > Postgresql 14.8, Redhat8. looks like have to create extend statistics > on indexed and joined columns to make join filters pushed down to secondary > index scan in nestloop, and the shared buffer hits show big difference. > > is it expected ? >
It's hard to say what exactly is happening in the example query (I'd have to do some debugging, but that's impossible without a reproducer), but I think it's mostly expected. My guess is that without the stats the optimizer sees this: -> Index Scan using idx_mtgattndlstext_mmmuuid_uid on mtgxxxxxxxext b (cost=0.56..2.78 rows=1 width=133) (actual time=0.016..0.698 rows=2142 loops=2142) and so decides there's no point in pushing down more conditions to the index scan (because it already returns just 1 row). But there's some sort of correlation / skew, and it returns 2142 rows. With the extended stats it realizes pushing down more conditions makes sense, because doing that in index scan is cheaper than having to read the heap pages etc. So it does that. So yeah, this seems exactly the improvement you'd expect from extended stats. Why do you think this would not be expected? regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company