Mary Edie Meredith wrote:
> I certainly don't claim that it is appropriate to force customers into a
> full analysis, particularly if random sampling versus a full scan of the
> data reveals little to no performance difference in the plans. Being
> able to sample accurately is _very_ nice for large tables.
>
> For our testing purposes, however, consistent results are extremely
> important. We have observed that a small difference in the plan for one
> of the 22 queries can cause a difference in the DBT-3 results. If this
> happens, a small performance difference between runs on two Linux
> kernels may appear to be due to the kernels, when in fact it is due to
> the plan change.
>
> We know that the plans are _exactly_ the same if the data in the
> pg_statistics table is the same from run to run (all other things being
> equal). So what we need is identical optimizer costs (pg_statistics)
> for the same table data on each run.
>
> I feel certain that the pg_statistics table will be identical from run
> to run if analyze looks at every row. Thus our hope to find a way to
> get that.
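One way to check the premise above (identical statistics contents from run to run) is to snapshot the statistics catalog after each ANALYZE and diff the snapshots. A rough sketch, assuming a database named `dbt3` (a placeholder name; note the catalog is actually spelled `pg_statistic`):

```shell
# Snapshot the planner statistics after the first run's ANALYZE.
psql -d dbt3 -c "ANALYZE;"
psql -d dbt3 -A -t -c \
  "SELECT * FROM pg_statistic ORDER BY starelid, staattnum;" > stats_run1.txt

# ... reload the data and re-ANALYZE for the second run ...

psql -d dbt3 -A -t -c \
  "SELECT * FROM pg_statistic ORDER BY starelid, staattnum;" > stats_run2.txt

# diff exits 0 (and prints nothing) when the statistics match exactly.
diff stats_run1.txt stats_run2.txt && echo "statistics identical"
```

If the diff is empty, any remaining run-to-run variation in the benchmark cannot be blamed on the sampled statistics.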
Actually, if you are using GEQO (many tables in a join), the optimizer itself will randomly try plans --- even worse than random statistics. We do have:

	#geqo_random_seed = -1		# -1 = use variable seed

which lets you force a specific random seed for testing purposes. I wonder if that could be extended to control VACUUM randomization too. Right now it just controls GEQO, and in fact it gets reset on every optimizer run. I wonder if you could just poke a srandom(10) into src/backend/commands/analyze.c.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                    |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073