When there are a heavily updated table and a large table at the same time,
ANALYZE against the large table disturbs other autovacuums and HOT updates.
In my workload, ANALYZE takes long time (1min at statistics_target = 10,
and 5min at 100), but the updated table needs to be vacuumed every 30 seconds
because seqscans run on the table repeatedly.

ANALYZE is a "transaction". As long as long transactions are alive,
VACUUM cannot remove dead tuples deleted after those transaction began.
HOT also cannot work under long transactions. We will be happy if VACUUM
can get along with long transactions, but it requires some kinds of complex
managements of vacuum horizon. I have no idea for it...


So, I'm targeting only ANALZYE for now by changing ANALYZE to be ignored
by VACUUM. It is just same as VACUUM that has already been ignored by other
VACUUMs since version 8.2.

My proposal is splitting ANALYZEs with use_own_xacts (by VACUUM ANALYZE,
autovacuum or database-wide analyze) to two transactions:
   T1: acquire_sample_rows()
   T2: compute_stats() and update_attstats()
and set PROC_IN_VACUUM during T1.
T1 takes long time because read pages to sample rows.
T2 is relatively short because stats are calculated in memory, without i/o.
T2 is needed for consistent snapshot because user-defined functions
in expression indexes might use it.

Is it an acceptable approach? Or am I missing something?

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match

Reply via email to