Re: [GENERAL] table full scan or index full scan?

2009-10-12 Thread Martijn van Oosterhout
On Sun, Oct 11, 2009 at 10:01:52PM -0500, Peter Hunsberger wrote:
 This is interesting, I just ran a similar issue the other day.
 Clearly there is a wide range of read / write scenarios that Postgres
 should be able to cover.  These days, I have a lot of designs leaning
 more toward the data warehouse side of the operational spectrum as
 opposed to the high transaction scenario and I specifically design DB
 management strategies with the knowledge that writes will happen far
 less than reads in our applications.  Is this an area where
 optimizations are considered hard in Postrgres or hopefully, just
 something that is on the todo list but just no one has gotten around
 to yet?  

We consider any optimisation that is feasible. Unfortunatly, the
number of rows in a table is a fairly hard number to get in the
general case because it depends on who is asking (different
transactions may get different answers).

Similarly, are accurate table summary stats possible someday
 or are they considered close to impossible in order to eliminate race
 conditions and lock contention scenarios?

It is possible, it's just not cheap in the general case. The usual
approach is to keep a table that tracks the number of rows. By using
deltas you can make it lockfree. These are however costs most
applications do't need. If you know in your case that the data never
changes, just cache the result somewhere. That will be infinitly more
efficient than any other method.

If you are happy with estimates, they are there and are kept reasonably
uptodate.

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 Please line up in a tree and maintain the heap invariant while 
 boarding. Thank you for flying nlogn airlines.


signature.asc
Description: Digital signature


Re: [GENERAL] table full scan or index full scan?

2009-10-12 Thread peixubin
 I understood , thanks.
--- 09年10月12日,周一, Martijn van Oosterhout klep...@svana.org 写道:

发件人: Martijn van Oosterhout klep...@svana.org
主题: Re: [GENERAL] table full scan or index full scan?
收件人: Peter Hunsberger peter.hunsber...@gmail.com
抄送: Scott Marlowe scott.marl...@gmail.com,  ?? 
peixu...@yahoo.com.cn, pgsql-general@postgresql.org
日期: 2009年10月12日,周一,下午2:25

On Sun, Oct 11, 2009 at 10:01:52PM -0500, Peter Hunsberger wrote:
 This is interesting, I just ran a similar issue the other day.
 Clearly there is a wide range of read / write scenarios that Postgres
 should be able to cover.  These days, I have a lot of designs leaning
 more toward the data warehouse side of the operational spectrum as
 opposed to the high transaction scenario and I specifically design DB
 management strategies with the knowledge that writes will happen far
 less than reads in our applications.  Is this an area where
 optimizations are considered hard in Postrgres or hopefully, just
 something that is on the todo list but just no one has gotten around
 to yet?  

We consider any optimisation that is feasible. Unfortunatly, the
number of rows in a table is a fairly hard number to get in the
general case because it depends on who is asking (different
transactions may get different answers).

Similarly, are accurate table summary stats possible someday
 or are they considered close to impossible in order to eliminate race
 conditions and lock contention scenarios?

It is possible, it's just not cheap in the general case. The usual
approach is to keep a table that tracks the number of rows. By using
deltas you can make it lockfree. These are however costs most
applications do't need. If you know in your case that the data never
changes, just cache the result somewhere. That will be infinitly more
efficient than any other method.

If you are happy with estimates, they are there and are kept reasonably
uptodate.

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 Please line up in a tree and maintain the heap invariant while 
 boarding. Thank you for flying nlogn airlines.



  ___ 
  好玩贺卡等你发,邮箱贺卡全新上线! 
http://card.mail.cn.yahoo.com/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] table full scan or index full scan?

2009-10-11 Thread 旭斌 裴
I have a 30,000,000 records table, counts the record number to need for 40 
seconds. 
The table has a primary key on column id;

perf=# explain select count(*) from test;
...
-
Aggregate (cost=603702.80..603702.81 rows=1 width=0)
  - Seq scan on test (cost=0.00..527681.04 rows=30408704 width=0)
...
perf=# select count(*) from test;
count

30408704

perf=#


The
postgresql database uses the table full scan.but in oracle, the similar
SQL uses the index full scanning,speed quickly many than postgresql.  

postgresql's optimizer whether to have the necessity to make the adjustment? 
 

postgresql version:8.3.7

OS : ubuntu 9

kernel:2.6.28-15-generic x86_64







  ___ 
  好玩贺卡等你发,邮箱贺卡全新上线! 
http://card.mail.cn.yahoo.com/

Re: [GENERAL] table full scan or index full scan?

2009-10-11 Thread Greg Smith

On Mon, 12 Oct 2009, ?? ? wrote:


perf=# select count(*) from test;


In PostgreSQL, if you're selecting every record from the table for a count 
of them, you have to visit them all no matter what.  The most efficient 
way to do that is with a full table scan.  Using an index instead requires 
more disk I/O, because you have to read both the index blocks and the disk 
blocks.



The postgresql database uses the table full scan.but in oracle, the similar SQL 
uses the index full
scanning,speed quickly many than postgresql.  


Some other database systems can do just an index scan instead to compute 
aggregates like count, but even there the rules are pretty complicated; 
http://www.jlcomp.demon.co.uk/faq/count_rows.html covers a lot of the 
material there for Oracle's implementation.  Unfortunately this particular 
optimization isn't available in Postgres yet, and you'll only switch to an 
index scan if you're running a query that only selects a small number of 
records where an index on the condition you're checking for exists.


There's some information about alternative ways to solve this problem at 
http://wiki.postgresql.org/wiki/Slow_Counting


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] table full scan or index full scan?

2009-10-11 Thread Scott Marlowe
Real quick, plain text is preferred on these lists over html.  I don't
care myself so much.

On Sun, Oct 11, 2009 at 7:17 PM, 旭斌 裴 peixu...@yahoo.com.cn wrote:

 I have a 30,000,000 records table, counts the record number to need for 40 
 seconds.
 The table has a primary key on column id;

 perf=# explain select count(*) from test;
 ...
 -
 Aggregate (cost=603702.80..603702.81 rows=1 width=0)
   - Seq scan on test (cost=0.00..527681.04 rows=30408704 width=0)
 ...
 perf=# select count(*) from test;
 count
 
 30408704

 perf=#


 The postgresql database uses the table full scan.but in oracle, the similar 
 SQL uses the index full scanning,speed quickly many than postgresql.

Yep, PostgreSQL isn't Oracle.  It's a trade off.  In pgsql indexes
don't contain visibility info, so all index lookups have to eventually
hit the table itself.  So you either do indexlookup - table lookup,
repeat as many times as you have index lookups or you just hit the
table since you gotta go there anyway.

On the  bright side, this makes updates faster since you don't have to
lock both table and index and write to both at the same time anymore.

 postgresql's optimizer whether to have the necessity to make the adjustment?

Sorry, it's an architectural difference.  Are you testing in a
realistic scenario including both reads and writes to the database to
see if postgresql is faster overall and identify problem areas that
pop up there?

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] table full scan or index full scan?

2009-10-11 Thread Peter Hunsberger
2009/10/11 Scott Marlowe scott.marl...@gmail.com:
 The postgresql database uses the table full scan.but in oracle, the similar 
 SQL uses the index full scanning,speed quickly many than postgresql.

 Yep, PostgreSQL isn't Oracle.  It's a trade off.  In pgsql indexes
 don't contain visibility info, so all index lookups have to eventually
 hit the table itself.  So you either do indexlookup - table lookup,
 repeat as many times as you have index lookups or you just hit the
 table since you gotta go there anyway.

 On the  bright side, this makes updates faster since you don't have to
 lock both table and index and write to both at the same time anymore.

 postgresql's optimizer whether to have the necessity to make the adjustment?

 Sorry, it's an architectural difference.  Are you testing in a
 realistic scenario including both reads and writes to the database to
 see if postgresql is faster overall and identify problem areas that
 pop up there?


This is interesting, I just ran a similar issue the other day.
Clearly there is a wide range of read / write scenarios that Postgres
should be able to cover.  These days, I have a lot of designs leaning
more toward the data warehouse side of the operational spectrum as
opposed to the high transaction scenario and I specifically design DB
management strategies with the knowledge that writes will happen far
less than reads in our applications.  Is this an area where
optimizations are considered hard in Postrgres or hopefully, just
something that is on the todo list but just no one has gotten around
to yet?  Similarly, are accurate table summary stats possible someday
or are they considered close to impossible in order to eliminate race
conditions and lock contention scenarios?

-- 
Peter Hunsberger

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general