Re: [GENERAL] table full scan or index full scan?

2009-10-12 Thread peixubin
 I understood , thanks.
--- 09年10月12日,周一, Martijn van Oosterhout  写道:

发件人: Martijn van Oosterhout 
主题: Re: [GENERAL] table full scan or index full scan?
收件人: "Peter Hunsberger" 
抄送: "Scott Marlowe" , " ??" 
, pgsql-general@postgresql.org
日期: 2009年10月12日,周一,下午2:25

On Sun, Oct 11, 2009 at 10:01:52PM -0500, Peter Hunsberger wrote:
> This is interesting, I just ran a similar issue the other day.
> Clearly there is a wide range of read / write scenarios that Postgres
> should be able to cover.  These days, I have a lot of designs leaning
> more toward the data warehouse side of the operational spectrum as
> opposed to the high transaction scenario and I specifically design DB
> management strategies with the knowledge that writes will happen far
> less than reads in our applications.  Is this an area where
> optimizations are considered hard in Postrgres or hopefully, just
> something that is on the todo list but just no one has gotten around
> to yet?  

We consider any optimisation that is feasible. Unfortunatly, "the
number of rows in a table" is a fairly hard number to get in the
general case because it depends on who is asking (different
transactions may get different answers).

>Similarly, are accurate table summary stats possible someday
> or are they considered close to impossible in order to eliminate race
> conditions and lock contention scenarios?

It is possible, it's just not cheap in the general case. The usual
approach is to keep a table that tracks the number of rows. By using
deltas you can make it lockfree. These are however costs most
applications do't need. If you know in your case that the data never
changes, just cache the result somewhere. That will be infinitly more
efficient than any other method.

If you are happy with estimates, they are there and are kept reasonably
uptodate.

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while 
> boarding. Thank you for flying nlogn airlines.



  ___ 
  好玩贺卡等你发,邮箱贺卡全新上线! 
http://card.mail.cn.yahoo.com/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] table full scan or index full scan?

2009-10-11 Thread Martijn van Oosterhout
On Sun, Oct 11, 2009 at 10:01:52PM -0500, Peter Hunsberger wrote:
> This is interesting, I just ran a similar issue the other day.
> Clearly there is a wide range of read / write scenarios that Postgres
> should be able to cover.  These days, I have a lot of designs leaning
> more toward the data warehouse side of the operational spectrum as
> opposed to the high transaction scenario and I specifically design DB
> management strategies with the knowledge that writes will happen far
> less than reads in our applications.  Is this an area where
> optimizations are considered hard in Postrgres or hopefully, just
> something that is on the todo list but just no one has gotten around
> to yet?  

We consider any optimisation that is feasible. Unfortunatly, "the
number of rows in a table" is a fairly hard number to get in the
general case because it depends on who is asking (different
transactions may get different answers).

>Similarly, are accurate table summary stats possible someday
> or are they considered close to impossible in order to eliminate race
> conditions and lock contention scenarios?

It is possible, it's just not cheap in the general case. The usual
approach is to keep a table that tracks the number of rows. By using
deltas you can make it lockfree. These are however costs most
applications do't need. If you know in your case that the data never
changes, just cache the result somewhere. That will be infinitly more
efficient than any other method.

If you are happy with estimates, they are there and are kept reasonably
uptodate.

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while 
> boarding. Thank you for flying nlogn airlines.


signature.asc
Description: Digital signature


Re: [GENERAL] table full scan or index full scan?

2009-10-11 Thread Peter Hunsberger
2009/10/11 Scott Marlowe :
>> The postgresql database uses the table full scan.but in oracle, the similar 
>> SQL uses the index full scanning,speed quickly many than postgresql.
>
> Yep, PostgreSQL isn't Oracle.  It's a trade off.  In pgsql indexes
> don't contain visibility info, so all index lookups have to eventually
> hit the table itself.  So you either do indexlookup -> table lookup,
> repeat as many times as you have index lookups or you just hit the
> table since you gotta go there anyway.
>
> On the  bright side, this makes updates faster since you don't have to
> lock both table and index and write to both at the same time anymore.
>
>> postgresql's optimizer whether to have the necessity to make the adjustment?
>
> Sorry, it's an architectural difference.  Are you testing in a
> realistic scenario including both reads and writes to the database to
> see if postgresql is faster overall and identify problem areas that
> pop up there?
>

This is interesting, I just ran a similar issue the other day.
Clearly there is a wide range of read / write scenarios that Postgres
should be able to cover.  These days, I have a lot of designs leaning
more toward the data warehouse side of the operational spectrum as
opposed to the high transaction scenario and I specifically design DB
management strategies with the knowledge that writes will happen far
less than reads in our applications.  Is this an area where
optimizations are considered hard in Postrgres or hopefully, just
something that is on the todo list but just no one has gotten around
to yet?  Similarly, are accurate table summary stats possible someday
or are they considered close to impossible in order to eliminate race
conditions and lock contention scenarios?

-- 
Peter Hunsberger

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] table full scan or index full scan?

2009-10-11 Thread Scott Marlowe
Real quick, plain text is preferred on these lists over html.  I don't
care myself so much.

On Sun, Oct 11, 2009 at 7:17 PM, 旭斌 裴  wrote:
>
> I have a 30,000,000 records table, counts the record number to need for 40 
> seconds.
> The table has a primary key on column id;
>
> perf=# explain select count(*) from test;
> ...
> -
> Aggregate (cost=603702.80..603702.81 rows=1 width=0)
>   -> Seq scan on test (cost=0.00..527681.04 rows=30408704 width=0)
> ...
> perf=# select count(*) from test;
> count
> 
> 30408704
>
> perf=#
>
>
> The postgresql database uses the table full scan.but in oracle, the similar 
> SQL uses the index full scanning,speed quickly many than postgresql.

Yep, PostgreSQL isn't Oracle.  It's a trade off.  In pgsql indexes
don't contain visibility info, so all index lookups have to eventually
hit the table itself.  So you either do indexlookup -> table lookup,
repeat as many times as you have index lookups or you just hit the
table since you gotta go there anyway.

On the  bright side, this makes updates faster since you don't have to
lock both table and index and write to both at the same time anymore.

> postgresql's optimizer whether to have the necessity to make the adjustment?

Sorry, it's an architectural difference.  Are you testing in a
realistic scenario including both reads and writes to the database to
see if postgresql is faster overall and identify problem areas that
pop up there?

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] table full scan or index full scan?

2009-10-11 Thread Greg Smith

On Mon, 12 Oct 2009, ?? ? wrote:


perf=# select count(*) from test;


In PostgreSQL, if you're selecting every record from the table for a count 
of them, you have to visit them all no matter what.  The most efficient 
way to do that is with a full table scan.  Using an index instead requires 
more disk I/O, because you have to read both the index blocks and the disk 
blocks.



The postgresql database uses the table full scan.but in oracle, the similar SQL 
uses the index full
scanning,speed quickly many than postgresql.  


Some other database systems can do just an index scan instead to compute 
aggregates like count, but even there the rules are pretty complicated; 
http://www.jlcomp.demon.co.uk/faq/count_rows.html covers a lot of the 
material there for Oracle's implementation.  Unfortunately this particular 
optimization isn't available in Postgres yet, and you'll only switch to an 
index scan if you're running a query that only selects a small number of 
records where an index on the condition you're checking for exists.


There's some information about alternative ways to solve this problem at 
http://wiki.postgresql.org/wiki/Slow_Counting


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general