Every now and again, I will encounter an unexplained long-running query.

It’s a head-scratcher, because a query that has been running for
20 minutes (not blocked) completes in about 500ms when run independently.

I can only assume the problem query ran against the table(s) at a time
when they were in need of a VACUUM ANALYZE. I’m guessing the tables had
seen a significant amount of change and simply had out-of-date
statistics.

How can I avoid this?
The auto-vacuum daemon is doing its thing, but there is always going to be
an opportunity for a query to sneak in against a table that has recently
seen a large amount of change but has not yet been analyzed.

On the application side, we can explicitly issue a VACUUM ANALYZE after
each bulk operation - and often that is precisely what we do.
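For illustration (placeholder table and file names), something along these
lines:

    -- the point is the explicit statistics refresh right after the load
    COPY my_bulk_table FROM '/path/to/load.csv' WITH (FORMAT csv);
    VACUUM ANALYZE my_bulk_table;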

But I am keenly aware that this cannot be performed within a transaction,
which means there is always a (small) window in which a query can still
plan against stale statistics.
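To sketch the window I mean (same placeholder names):

    BEGIN;
    COPY my_bulk_table FROM '/path/to/load.csv' WITH (FORMAT csv);
    -- VACUUM ANALYZE my_bulk_table;  -- ERROR: VACUUM cannot run inside a transaction block
    COMMIT;
    -- only here, after COMMIT, can the statistics be refreshed, so a query
    -- arriving in between still plans against the old statistics
    VACUUM ANALYZE my_bulk_table;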

Are there any other best practices that can mitigate this kind of problem?

It’s rare, sure, but I don’t like sweeping issues like this under the rug.

I’m on PG 9.6; perhaps there have been planner improvements since then that
might reduce the incidence of these (rare) issues.

Any advice appreciated, thanks.
