Re: [GENERAL] help understanding analyze

2006-12-09 Thread Luca Ferrari
On Saturday 09 December 2006 03:48 Tom Lane's cat, walking on the keyboard, 
wrote:
 Well, CLUSTER does not guarantee that the data remains sorted --- as
 soon as you do any updates it won't be anymore.  So the planner can
 never assume that a plain seqscan delivers correctly sorted output.

And when the cluster is rebuilt? I mean, in theory, a clustered index should 
be sorted at any point in time, which means inserts and updates are costly 
because the index has to be re-sorted each time, doesn't it? For me, at least 
in theory, a clustered index is always sorted. Now, assuming that my table 
rarely changes (the number of people hired/fired is very low!), it makes sense 
to me to use a clustered index, because I would rarely pay the insert/update 
cost but should get better performance. Maybe I am misunderstanding something...


 The real question you should be asking in the above case is why it
 didn't use an indexscan on that index, and the answer is probably
 that you didn't ANALYZE.  VACUUM does not update the statistics
 about index correlation.


I did run analyze, and the explain shows me the seq scan and then a sort. The 
only difference I've seen between a plain vacuum and an analyze is that the 
seq. scan cost changes, but the final cost (i.e., seq. scan and sort) is the 
same with or without the index. This is the point I cannot understand. 
And of course, as you stated, the problem is that the system is not 
considering the index I created (of course I can suggest it within the select 
statement), and I don't know why.
Any explanation?

Thanks,
Luca

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] help understanding analyze

2006-12-09 Thread Martijn van Oosterhout
On Sat, Dec 09, 2006 at 11:35:39AM +0100, Luca Ferrari wrote:
 And when the cluster is rebuilt? I mean, in theory, a clustered index should 
 be sorted at any point in time, which means inserts and updates are costly 
 because the index has to be re-sorted each time, doesn't it? For me, at least 
 in theory, a clustered index is always sorted. Now, assuming that my table 
 rarely changes (the number of people hired/fired is very low!), it makes sense 
 to me to use a clustered index, because I would rarely pay the insert/update 
 cost but should get better performance. Maybe I am misunderstanding something...

I think you're confused about what CLUSTER does. There is no such
thing as a clustered index in PostgreSQL. An index is always organised in
some way; a b-tree index holds its keys in sorted order. When you cluster
a table, it rearranges the data so it is in the same order as the index.
But it's not kept that way: the index is kept sorted but the data is not.
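
A minimal sketch of the behaviour described above, assuming the person table from the original question. The index name person_surname_name_idx and the sample rows are made up for illustration. The system column ctid gives the physical position of each row version, so it shows where rows actually sit in the table.

```sql
-- Hypothetical setup mirroring the table from the original question.
CREATE TABLE person (
    id      varchar PRIMARY KEY,
    surname varchar,
    name    varchar
);
CREATE INDEX person_surname_name_idx ON person (surname, name);

INSERT INTO person (id, surname, name) VALUES
    ('1', 'Rossi',   'Mario'),
    ('2', 'Bianchi', 'Anna'),
    ('3', 'Verdi',   'Luigi');

-- Rewrite the table once, in index order.
-- (Releases current in 2006 spell this:
--  CLUSTER person_surname_name_idx ON person;)
CLUSTER person USING person_surname_name_idx;

-- ctid = (page, item): physical position of each row version.
SELECT ctid, surname, name FROM person;

-- An update writes a new row version at a different position,
-- so the physical order stops matching (surname, name) order.
UPDATE person SET name = 'Anna Maria' WHERE id = '2';

SELECT ctid, surname, name FROM person;
```

Comparing the two ctid listings should show the updated row at a new position, while the index itself still returns rows in sorted order. Running CLUSTER again would restore the physical ordering until the next change.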

 I did run analyze, and the explain shows me the seq scan and then a sort. The 
 only difference I've seen between a plain vacuum and an analyze is that the 
 seq. scan cost changes, but the final cost (i.e., seq. scan and sort) is the 
 same with or without the index. This is the point I cannot understand. 

If it doesn't say Index Scan it's not using the index.

At a guess, your table is not big enough to make an index worthwhile. If
your table is only a few pages long, it's just not efficient to look up
an index first.

If you post the results of EXPLAIN ANALYZE we can tell you for sure.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


Re: [GENERAL] help understanding analyze

2006-12-09 Thread Tom Lane
Martijn van Oosterhout <kleptog@svana.org> writes:
 At a guess, your table is not big enough to make an index worthwhile. If
 your table is only a few pages long, it's just not efficient to look up
 an index first.
 If you post the results of EXPLAIN ANALYZE we can tell you for sure.

Actually, it would be interesting to see EXPLAIN ANALYZE for the query
both with enable_sort = on and enable_sort = off.  That would show what
the planner thinks the relative costs are as well as what the true costs
are.
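
The comparison suggested above could be run roughly as follows. This is a sketch that assumes the person table and an index on (surname, name) already exist; the query is the one from the original question.

```sql
-- Assumes the person table and an index on (surname, name) already exist.

-- 1. Default settings: shows the plan the planner prefers.
SET enable_sort = on;
EXPLAIN ANALYZE SELECT * FROM person ORDER BY surname, name;

-- 2. Discourage explicit sort steps for this session. This does not
--    forbid sorting; it only makes the planner avoid a sort when
--    another plan (such as an index scan) can deliver the ordering.
SET enable_sort = off;
EXPLAIN ANALYZE SELECT * FROM person ORDER BY surname, name;

-- Restore the default for the rest of the session.
RESET enable_sort;
```

In each plan, the cost= figures are the planner's estimates and the actual time= figures are the measured run times, so the two outputs together show both the estimated and the real difference between the sort plan and the alternative.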

regards, tom lane


Re: [GENERAL] help understanding analyze

2006-12-08 Thread Bill Moran
Luca Ferrari [EMAIL PROTECTED] wrote:

 Hi all,
 excuse me for this trivial question, but here's my doubt:
 create table person(id varchar, surname varchar, name varchar)
 with id primary key. Now, the query:
 select * from person order by surname,name
 gives me an explain output that is sequential scan + sort, as I expected. 
 After that I built an index on surname,name (clustered) and ran vacuum to 
 update statistics. Then I ran the query again and got the same results (scan 
 + sort) with the same time.
 Now my trivial question is: why another sort? The index is clustered so the 
 database should not need to sort the output, or am I using the tools wrongly?
 Can someone explain that to me?

I doubt that the planner has any way to know that the table, at any point
in time, is still 100% clustered.  If even one row has been added since
the cluster was done, the table will need to be re-sorted.

Might be an optimization that could be done, except that I expect there
will be very few cases where it will actually make a difference.  How
often do you have a table that never changes and can always be assured
of being in index order?

-Bill


Re: [GENERAL] help understanding analyze

2006-12-08 Thread Tom Lane
Luca Ferrari [EMAIL PROTECTED] writes:
 excuse me for this trivial question, but here's my doubt:
 create table person(id varchar, surname varchar, name varchar)
 with id primary key. Now, the query:
 select * from person order by surname,name
 gives me an explain output that is sequential scan + sort, as I expected. 
 After that I built an index on surname,name (clustered) and ran vacuum to 
 update statistics. Then I ran the query again and got the same results (scan 
 + sort) with the same time.
 Now my trivial question is: why another sort? The index is clustered so the 
 database should not need to sort the output, or am I using the tools wrongly?

Well, CLUSTER does not guarantee that the data remains sorted --- as
soon as you do any updates it won't be anymore.  So the planner can
never assume that a plain seqscan delivers correctly sorted output.

The real question you should be asking in the above case is why it
didn't use an indexscan on that index, and the answer is probably
that you didn't ANALYZE.  VACUUM does not update the statistics
about index correlation.
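
As a rough sketch of how to check this, assuming the person table from the question: the standard pg_stats view exposes the correlation statistic that ANALYZE gathers for each column.

```sql
-- Assumes the person table from the original question.
-- Plain VACUUM does not gather column statistics such as correlation;
-- ANALYZE (or VACUUM ANALYZE) does.
ANALYZE person;

-- correlation close to 1 or -1 means the physical row order closely
-- follows the column's sort order; close to 0 means it does not.
SELECT attname, correlation
FROM pg_stats
WHERE tablename = 'person'
  AND attname IN ('surname', 'name');

-- Re-check which plan the planner picks with fresh statistics.
EXPLAIN SELECT * FROM person ORDER BY surname, name;
```

If the plan still shows a Seq Scan followed by a Sort after this, the planner is estimating that plan to be cheaper than an index scan for the current table size.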

regards, tom lane
