Re: [PERFORM] 500x speed-down: Wrong statistics!

2006-01-11 Thread Alessandro Baretta

Tom Lane wrote:
> Alessandro Baretta <[EMAIL PROTECTED]> writes:
>> I have no clue as to how or why the statistics were wrong
>> yesterday--as I vacuum-analyzed continuously out of lack of any better
>> idea--and I was stupid enough to re-timestamp everything before
>> selecting from pg_stats.
>
> Too bad.  I would be interested to find out how, if the stats were
> up-to-date, the thing was still getting the row estimate so wrong.
> If you manage to get the database back into its prior state please
> do send along the pg_stats info.


I have some more information on this issue, which clears PostgreSQL's planner of
all suspicion. I am observing severe corruption of the bookkeeping fields managed
by the xdbs rule/trigger complex. I am unable to pinpoint the cause right now,
but the effect is that after running a few hours' test on the end-user
application (which never interacts directly with the xdbs_* fields, and thus
cannot possibly mangle them) most tuples (the older ones, apparently) get their
timestamps set to NULL. Before vacuum-analyzing the table, yesterday's
statistics were in effect, and the planner used the appropriate indexes. Now,
after vacuum-analyzing the table, the pg_stats row for the xdbs_modified field
no longer exists (!), and the planner has reverted to the Nested Loop / Seq Scan
join strategy. Hence, all the vacuum-analyzing I was doing when complaining
about the planner was actually collecting completely screwed statistics, and
this is why the ALTER TABLE ... SET STATISTICS 1000 did not help at all!
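
(A minimal sketch of how one might confirm this kind of corruption and refresh
the statistics -- assuming the table and column names used in this thread; none
of these statements appear in the original messages:)

-- Count tuples whose bookkeeping timestamp has been nulled out;
-- count(column) skips NULLs, so the difference is the NULL count.
SELECT count(*) AS total_rows,
       count(*) - count(xdbs_modified) AS null_timestamps
FROM articolo;

-- Inspect the statistics the planner actually sees for the column.
SELECT null_frac, n_distinct, most_common_vals
FROM pg_stats
WHERE tablename = 'articolo' AND attname = 'xdbs_modified';

-- Raise the per-column statistics target and re-analyze; this only helps
-- once the underlying data is no longer being mangled.
ALTER TABLE articolo ALTER COLUMN xdbs_modified SET STATISTICS 1000;
ANALYZE articolo;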


Ok. I plead guilty and ask for the clemency of the court. I'll pay my debt to
society with a long term of pl/pgsql code debugging...


Alex



Re: [PERFORM] 500x speed-down: Wrong statistics!

2006-01-10 Thread Tom Lane
Alessandro Baretta <[EMAIL PROTECTED]> writes:
> I have no clue as to how or why the statistics were wrong
> yesterday--as I vacuum-analyzed continuously out of lack of any better
> idea--and I was stupid enough to re-timestamp everything before
> selecting from pg_stats.

Too bad.  I would be interested to find out how, if the stats were
up-to-date, the thing was still getting the row estimate so wrong.
If you manage to get the database back into its prior state please
do send along the pg_stats info.

regards, tom lane



Re: [PERFORM] 500x speed-down: Wrong statistics!

2006-01-09 Thread Alessandro Baretta

Tom Lane wrote:
> Alessandro Baretta <[EMAIL PROTECTED]> writes:
>> Tom Lane wrote:
>>> I'm curious to see how many rows the planner thinks this will produce,
>>> and whether it will use the index or not.
>> dmd-freerp-1-alex=# EXPLAIN ANALYZE SELECT * FROM articolo WHERE
>> articolo.xdbs_modified > '2006-01-08 18:25:00+01';
>>                                  QUERY PLAN
>> ---------------------------------------------------------------------------
>>  Index Scan using articolo_modified_index on articolo  (cost=0.00..3914.91 rows=17697 width=653) (actual time=0.032..0.032 rows=0 loops=1)
>>    Index Cond: (xdbs_modified > '2006-01-08 18:25:00'::timestamp without time zone)
>>  Total runtime: 0.150 ms
>> (3 rows)
>
> Well, there's your problem: 17697 rows estimated vs 0 actual.  With a
> closer row estimate it would've probably done the right thing for the
> join problem.


Hmmm, what you are telling me is very interesting, Tom. So, let me see if I got
this straight: the first 'rows=...' figure in the EXPLAIN ANALYZE output gives
me the estimate, while the second gives the actual cardinality of the selected
record set. Correct? If this is true, two questions arise: why is the estimated
number of rows completely wrong, and why, given such a large estimated record
set, does PostgreSQL schedule an Index Scan as opposed to a Seq Scan?
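
(That reading is correct. Annotating the plan line quoted above -- the
annotations are added here and are not part of psql's output:)

 Index Scan using articolo_modified_index on articolo
   (cost=0.00..3914.91 rows=17697 width=653)   <- rows: the planner's estimate
   (actual time=0.032..0.032 rows=0 loops=1)   <- rows: measured at run time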




> How many rows are really in the table, anyway?  Could we see the
> pg_stats row for articolo.xdbs_modified?

dmd-freerp-1-alex=# select count(*) from articolo;
 count
---
 53091
(1 row)

dmd-freerp-1-alex=# explain analyze select * from articolo;
                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Seq Scan on articolo  (cost=0.00..1439.91 rows=53091 width=653) (actual time=0.013..151.189 rows=53091 loops=1)
 Total runtime: 295.152 ms
(2 rows)

Now let me get the pg_stats for xdbs_modified.

dmd-freerp-1-alex=# select * from pg_stats where tablename = 'articolo' and
attname = 'xdbs_modified';
 schemaname | tablename |    attname    | null_frac | avg_width | n_distinct |       most_common_vals       | most_common_freqs | histogram_bounds | correlation
------------+-----------+---------------+-----------+-----------+------------+------------------------------+-------------------+------------------+-------------
 public     | articolo  | xdbs_modified |         0 |         8 |          1 | {2006-01-10 08:12:58.605327} | {1}               |                  |           1
(1 row)

For the sake of simplicity I have re-timestamped all tuples in the table with
the current timestamp, as you can see above. Now, obviously, the planner must
estimate ~0 rows for queries posing a selection condition on xdbs_modified, for
any value other than 2006-01-10 08:12:58.605327.
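(The re-timestamping statement itself is not shown in the thread; a minimal
sketch of what it presumably looked like, assuming a plain UPDATE followed by
a fresh round of statistics collection:)

-- Hypothetical reconstruction, not part of the original messages:
-- stamp every tuple with the current time, then re-gather statistics
-- so the planner sees the new single-valued distribution.
UPDATE articolo SET xdbs_modified = now();
ANALYZE articolo;

Let me try selecting from articolo first.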


dmd-freerp-1-alex=# EXPLAIN ANALYZE SELECT * FROM articolo WHERE
articolo.xdbs_modified > '2006-01-10 18:25:00+01';
                                                     QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
 Index Scan using articolo_modified_index on articolo  (cost=0.00..2.01 rows=1 width=653) (actual time=0.139..0.139 rows=0 loops=1)
   Index Cond: (xdbs_modified > '2006-01-10 18:25:00'::timestamp without time zone)
 Total runtime: 0.257 ms
(3 rows)

The planner produces a sensible estimate of the number of rows and consequently 
chooses the appropriate query plan. Now, the join.


dmd-freerp-1-alex=# explain analyze SELECT * FROM articolo JOIN ubicazione
USING (id_ente, id_produttore, id_articolo) WHERE articolo.id_ente = 'dmd' AND
allarme IS NULL AND manutenzione IS NULL AND articolo.xdbs_modified >
'2006-01-10 18:25:00+01';
                                                              QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..5.05 rows=1 width=1146) (actual time=0.043..0.043 rows=0 loops=1)
   ->  Index Scan using articolo_modified_index on articolo  (cost=0.00..2.02 rows=1 width=653) (actual time=0.035..0.035 rows=0 loops=1)
         Index Cond: (xdbs_modified > '2006-01-10 18:25:00'::timestamp without time zone)
         Filter: (id_ente = 'dmd'::text)
   ->  Index Scan using ubicazione_fkey_articolo on ubicazione  (cost=0.00..3.02 rows=1 width=536) (never executed)
         Index Cond: (('dmd'::text = ubicazione.id_ente) AND ("outer".id_produttore = ubicazione.id_produttore) AND ("outer".id_articolo = ubicazione.id_articolo))
         Filter: ((allarme IS NULL) AND (manutenzione IS NULL))
 Total runtime: 0.382 ms
(8 rows)

Dear Tom, you're my hero! I have no clue as to how or why the statistics were 
wrong yesterday--as I vacuum-analyzed continuously out of lack of any