Re: [HACKERS] ANALYZE to be ignored by VACUUM

2008-02-20 Thread Dawid Kuroczko
On Feb 19, 2008 8:31 AM, ITAGAKI Takahiro [EMAIL PROTECTED] wrote:
> Tom Lane [EMAIL PROTECTED] wrote:
>
> > ITAGAKI Takahiro [EMAIL PROTECTED] writes:
> > > In my workload, ANALYZE takes a long time (1min at statistics_target = 10,
> > > and 5min at 100), but the updated table needs to be vacuumed every 30 seconds
> > > because seqscans run on the table repeatedly.
> >
> > There is something *seriously* wrong with that.  If vacuum can complete
> > in under 30 seconds, how can analyze take a minute?  (I'm also wondering
> > whether you'll still need such frantic vacuuming with HOT...)
>
> There are two tables here:
>   [S] A small table that is frequently updated and seqscan-ed
>   [L] A large table that takes a long time to analyze
>
> The table [S] should be vacuumed every 30 seconds, because dead tuples
> seriously affect the performance of the seqscans. HOT and autovacuum are
> very useful here *unless* a long transaction begins.
> Analyzing [L] takes 1-5 minutes, and neither HOT nor vacuum works
> during that time. I would like to use statistics_target = 100 for more
> accurate statistics, but I am using 10 instead to avoid the long
> transactions caused by ANALYZE.

I am sure the idea is not original, yet still I would like to know how hard
it would be to support local (per-table) oldest visible XIDs.

I mean, when a transaction starts, we need to keep all tuples with xmin >=
oldest_xid in all tables, because who knows which tables that transaction
will want to touch.

But then again, there is a relatively large list of cases where we don't need
to hold back vacuum on _all_ relations.  These include:

SELECTs in auto-commit mode -- provided the SELECT is not something
fancy (no non-immutable PL-functions), we just need to keep a snapshot of
the affected tables.

DMLs in auto-commit mode -- provided no PL-functions or triggers are in
effect.

WITH HOLD CURSORS.  Yes, I know, a WITH HOLD cursor will at the first COMMIT
create a copy of the rows still to be returned (which can take a looong time in
some cases), but perhaps it would be possible to just lock the table against
vacuuming and skip the temporary store (a sketch of this case follows the list).

And lots of others, when done in auto-commit mode, like ALTER TABLE, CREATE
TABLE AS SELECT, COPY, etc...
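
A minimal sketch of the WITH HOLD case (the table and cursor names are just
placeholders; the COMMIT is where the copying happens):

    CREATE TABLE big_table AS                    -- stand-in for a big table
        SELECT g AS id FROM generate_series(1, 1000000) AS g;

    BEGIN;
    DECLARE c CURSOR WITH HOLD FOR SELECT * FROM big_table;
    FETCH 100 FROM c;    -- reads directly from big_table
    COMMIT;              -- the rest of the result set is copied into a
                         -- temporary store so the cursor survives the
                         -- transaction; for a big result this is the slow part
    FETCH 100 FROM c;    -- now served from the temporary store
    CLOSE c;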

I am sure that such an idea isn't original.  What are the main obstacles
to making it happen, except time & money? :)

   Regards,
  Dawid



Re: [HACKERS] ANALYZE to be ignored by VACUUM

2008-02-20 Thread ITAGAKI Takahiro

Dawid Kuroczko [EMAIL PROTECTED] wrote:

> I am sure the idea is not original, yet still I would like to know how hard
> it would be to support local (per-table) oldest visible XIDs.
>
> I mean, when a transaction starts, we need to keep all tuples with xmin >=
> oldest_xid in all tables, because who knows which tables that transaction
> will want to touch.

Per-table oldest XID management sounds good! You mean that transactions
that touch no tables do not affect vacuums at all, right?
If so, the solution could resolve the pg_start_backup problem, too.

I feel it is enough for the standard maintenance commands.
Another solution might be needed for user-defined long transactions, though.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center





Re: [HACKERS] ANALYZE to be ignored by VACUUM

2008-02-19 Thread Gregory Stark
ITAGAKI Takahiro [EMAIL PROTECTED] writes:

> 4. ANALYZE finishes in a short time.
>    It is OK for VACUUM to take a long time because it is not an ordinary
>    transaction, but ANALYZE should not take long. That requires a cleverer
>    statistics algorithm. A statistics target of 10 is not enough for
>    pg_stats.n_distinct. We seem to estimate n_distinct too low for
>    clustered (ordered) tables.

Unfortunately no constant size sample is going to be enough for reliable
n_distinct estimates. To estimate n_distinct you really have to see a
percentage of the table, and to get good estimates that percentage has to be
fairly large.

There was a paper with a nice algorithm posted a while back which required
only constant memory but it depended on scanning the entire table. I think to
do n_distinct estimates we'll need some statistics which are either gathered
opportunistically whenever a seqscan happens or maintained by an index.
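
For what it's worth, the underestimate on clustered tables is easy to reproduce
with something like this (a rough sketch; the table, column, and sizes are made
up). Because equal values sit together on the sampled pages, few values appear
exactly once in the sample, so the estimator concludes it has already seen most
of the distinct values:

    CREATE TABLE clustered_t AS
        SELECT g / 10 AS grp          -- ~100,000 distinct values, in physical order
        FROM generate_series(1, 1000000) AS g;

    ALTER TABLE clustered_t ALTER COLUMN grp SET STATISTICS 10;
    ANALYZE clustered_t;

    SELECT n_distinct FROM pg_stats
     WHERE tablename = 'clustered_t' AND attname = 'grp';
    -- typically far below the true value

    SELECT count(DISTINCT grp) FROM clustered_t;   -- true value: 100001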

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!



Re: [HACKERS] ANALYZE to be ignored by VACUUM

2008-02-19 Thread ITAGAKI Takahiro

Gregory Stark [EMAIL PROTECTED] wrote:

> > 4. ANALYZE finishes in a short time.
>
> There was a paper with a nice algorithm posted a while back which required
> only constant memory but it depended on scanning the entire table. I think to
> do n_distinct estimates we'll need some statistics which are either gathered
> opportunistically whenever a seqscan happens or maintained by an index.

VACUUM would be another good time to run such an algorithm, because it does
a seqscan. If we do so, we would need to separate the analyzing transaction
into a sampling transaction and an updating-stats transaction, to keep
vacuums ignorable transactions.

However, VACUUM will not do a seqscan once we have a Dead Space Map or
Segment Visibility Map. We will need incremental statistics updating
if reliable n_distinct estimation requires many samples.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center





Re: [HACKERS] ANALYZE to be ignored by VACUUM

2008-02-18 Thread ITAGAKI Takahiro
Tom Lane [EMAIL PROTECTED] wrote:

> ITAGAKI Takahiro [EMAIL PROTECTED] writes:
> > In my workload, ANALYZE takes a long time (1min at statistics_target = 10,
> > and 5min at 100), but the updated table needs to be vacuumed every 30 seconds
> > because seqscans run on the table repeatedly.
>
> There is something *seriously* wrong with that.  If vacuum can complete
> in under 30 seconds, how can analyze take a minute?  (I'm also wondering
> whether you'll still need such frantic vacuuming with HOT...)

There are two tables here:
  [S] A small table that is frequently updated and seqscan-ed
  [L] A large table that takes a long time to analyze

The table [S] should be vacuumed every 30 seconds, because dead tuples
seriously affect the performance of the seqscans. HOT and autovacuum are
very useful here *unless* a long transaction begins.
Analyzing [L] takes 1-5 minutes, and neither HOT nor vacuum works
during that time. I would like to use statistics_target = 100 for more
accurate statistics, but I am using 10 instead to avoid the long
transactions caused by ANALYZE.

Basically, the above is about avoiding needless long transactions.
Aside from ANALYZE, pg_start_backup() is also a long transactional
command. It can take checkpoint_timeout * checkpoint_completion_target
(2.5 min. at the default settings) at worst. Users can avoid their own
long transactions, but have no choice but to use the maintenance
commands that we provide.
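
For example, with the default settings the arithmetic looks like this (a rough
sketch; the backup label is arbitrary and the numbers are defaults, not
measurements):

    SHOW checkpoint_timeout;              -- 5min by default
    SHOW checkpoint_completion_target;    -- 0.5 by default
    -- worst case for the spread checkpoint inside pg_start_backup():
    --   300 s * 0.5 = 150 s, i.e. about 2.5 minutes
    SELECT pg_start_backup('nightly');    -- may block for about that long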


> > So, I'm targeting only ANALYZE for now by changing ANALYZE to be ignored
> > by VACUUM.
>
> I think we need to understand what the real problem is with your test
> case.  This proposal seems very messy/ugly to me, and I'm unconvinced
> that it solves anything.

I think there are some direct or indirect solutions:

1. VACUUM removes recently dead tuples under some circumstances.
   For example, tuples updated twice after a long transaction begins:
   the oldest version can still be seen by the old long transaction and
   the newest version can be seen by new transactions, but the intermediate
   version is invisible to all transactions (see the sketch after this list).

2. ANALYZE does not disturb vacuuming of other tables. (my first proposal)
   We know that ANALYZE does not touch other tables during its sampling phase,
   so we can treat analyzing transactions the same as PROC_IN_VACUUM xacts.
   The same can be said for pg_start_backup; a non-transactional
   start-backup command might be better.

3. Recover the density of tuples; i.e., auto-CLUSTER.
   If performance recovers after a long transaction ends, the problem
   will not be so serious. It would be better if autovacuum invoked
   CLUSTER when required, and if we could run CLUSTER concurrently.

4. ANALYZE finishes in a short time.
   It is OK for VACUUM to take a long time because it is not an ordinary
   transaction, but ANALYZE should not take long. That requires a cleverer
   statistics algorithm. A statistics target of 10 is not enough for
   pg_stats.n_distinct. We seem to estimate n_distinct too low for
   clustered (ordered) tables.
   There might be a matter for research in the calculation of n_distinct.
   Also, this cannot resolve the problem with pg_start_backup.
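
The situation in point 1 looks like this (a sketch; the table and column names
are made up, and the two sessions are shown one after another):

    -- setup
    CREATE TABLE s_table (id int PRIMARY KEY, val text);
    INSERT INTO s_table VALUES (1, 'v0');

    -- Session A: a long transaction (in reality, the ANALYZE of the big table [L])
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    SELECT count(*) FROM s_table;       -- takes a snapshot and keeps it

    -- Session B: meanwhile, the same row is updated twice
    UPDATE s_table SET val = 'v1' WHERE id = 1;   -- version v0 -> v1
    UPDATE s_table SET val = 'v2' WHERE id = 1;   -- version v1 -> v2
    -- v0 is still visible to session A's snapshot.
    -- v2 is visible to new transactions.
    -- v1 is visible to nobody, yet today's VACUUM keeps it, because its xmax
    -- is newer than the oldest transaction still running (session A).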


1 or 3 might be the more universal approaches, but I think 2 would be an
improvement independent of them.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center





[HACKERS] ANALYZE to be ignored by VACUUM

2008-02-15 Thread ITAGAKI Takahiro
When there are a heavily updated table and a large table at the same time,
ANALYZE against the large table disturbs autovacuum and HOT updates on other tables.
In my workload, ANALYZE takes a long time (1min at statistics_target = 10,
and 5min at 100), but the updated table needs to be vacuumed every 30 seconds
because seqscans run on the table repeatedly.

ANALYZE is a transaction. As long as long transactions are alive,
VACUUM cannot remove dead tuples deleted after those transactions began.
HOT also cannot work under long transactions. We would be happy if VACUUM
could get along with long transactions, but that requires some kind of complex
management of the vacuum horizon. I have no idea how to do it...
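
The effect is easy to see with something like this (the table names are made
up; two sessions):

    -- Session 1: the long ANALYZE of the large table
    ANALYZE VERBOSE large_table;

    -- Session 2, while session 1 is still running:
    VACUUM VERBOSE small_table;
    -- reports something like "N dead row versions cannot be removed yet",
    -- because the ANALYZE runs as an ordinary transaction and holds back
    -- the oldest-xmin horizon for every table, not just large_table.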


So, I'm targeting only ANALYZE for now by changing ANALYZE to be ignored
by VACUUM. This is just the same as VACUUM itself, which has already been
ignored by other VACUUMs since version 8.2.

My proposal is to split ANALYZEs run with use_own_xacts (by VACUUM ANALYZE,
autovacuum, or database-wide analyze) into two transactions:
   T1: acquire_sample_rows()
   T2: compute_stats() and update_attstats()
and to set PROC_IN_VACUUM during T1.
T1 takes a long time because it reads pages to sample rows.
T2 is relatively short because the stats are calculated in memory, without I/O.
T2 still needs a consistent snapshot because user-defined functions
in expression indexes might use it.
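
As an example of the case that forces T2 to keep a snapshot (all names here are
made up): an expression index whose function reads another table, so computing
statistics for the index expression over the sampled rows needs a snapshot:

    CREATE TABLE categories (id int PRIMARY KEY, name text);
    CREATE TABLE products   (prod_id int, cat_id int);

    CREATE FUNCTION lookup_category(int) RETURNS text AS
        'SELECT name FROM categories WHERE id = $1'
        LANGUAGE sql IMMUTABLE;   -- declared IMMUTABLE so it can be used in an
                                  -- index, even though it actually reads a table

    CREATE INDEX products_cat_idx ON products (lookup_category(cat_id));

    ANALYZE products;   -- computes stats for the index expression, which means
                        -- evaluating lookup_category() for the sampled rows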

Is it an acceptable approach? Or am I missing something?

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center




Re: [HACKERS] ANALYZE to be ignored by VACUUM

2008-02-15 Thread Tom Lane
ITAGAKI Takahiro [EMAIL PROTECTED] writes:
> In my workload, ANALYZE takes a long time (1min at statistics_target = 10,
> and 5min at 100), but the updated table needs to be vacuumed every 30 seconds
> because seqscans run on the table repeatedly.

There is something *seriously* wrong with that.  If vacuum can complete
in under 30 seconds, how can analyze take a minute?  (I'm also wondering
whether you'll still need such frantic vacuuming with HOT...)

> So, I'm targeting only ANALYZE for now by changing ANALYZE to be ignored
> by VACUUM.

I think we need to understand what the real problem is with your test
case.  This proposal seems very messy/ugly to me, and I'm unconvinced
that it solves anything.

regards, tom lane
