Re: [PERFORM] Optimizer internals

2006-06-23 Thread Bruno Wolff III
On Thu, Jun 15, 2006 at 15:38:32 -0400,
  John Vincent [EMAIL PROTECTED] wrote:
 Any suggestions? FYI the original question wasn't meant as a poke at
 comparing PG to MySQL to DB2. I'm not making any value judgements either
 way. I'm just trying to understand how we can use it the best way possible.
 
 
 Actually we just thought about something. With PG, we can create an index
 that is a SUM of the column we're indexing, no? We're going to test this in
 a few hours. Would that be able to be satisfied by an index scan?

No, that won't work. While you can make indexes on functions of a row, you
can't make indexes on aggregate functions.

You might find making a materialized view of the information you want can
help with performance. The issues with sum are pretty much the same ones
as with count. You can find a couple different ways of doing materialized
views for count in the archives. There is a simple way of doing it that
doesn't work well with lots of concurrent updates and a more complicated
method that does work well with lots of concurrent updates.
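
For example, the simple method applied to sum might look like this (a
minimal sketch, assuming a hypothetical fact table fact(amount); every
writer serializes on the single summary row, which is exactly why this
version doesn't work well with lots of concurrent updates):

CREATE TABLE fact_sum (total numeric NOT NULL);
INSERT INTO fact_sum SELECT coalesce(sum(amount), 0) FROM fact;

CREATE FUNCTION fact_sum_maint() RETURNS trigger AS $$
BEGIN
    -- Keep the one-row summary in sync with every change to fact.
    IF TG_OP = 'INSERT' THEN
        UPDATE fact_sum SET total = total + NEW.amount;
    ELSIF TG_OP = 'DELETE' THEN
        UPDATE fact_sum SET total = total - OLD.amount;
    ELSE
        UPDATE fact_sum SET total = total - OLD.amount + NEW.amount;
    END IF;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER fact_sum_trig AFTER INSERT OR UPDATE OR DELETE ON fact
    FOR EACH ROW EXECUTE PROCEDURE fact_sum_maint();

Then SELECT total FROM fact_sum replaces the full-table SUM.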



Re: [PERFORM] Optimizer internals

2006-06-22 Thread John Vincent
Any suggestions? FYI the original question wasn't meant as a poke at comparing PG to MySQL to DB2. I'm not making any value judgements either way. I'm just trying to understand how we can use it the best way possible.
Actually we just thought about something. With PG, we can create an index that is a SUM of the column we're indexing, no? We're going to test this in a few hours. Would that be able to be satisfied by an index scan?



Re: [PERFORM] Optimizer internals

2006-06-17 Thread Jim Nasby

On Jun 16, 2006, at 8:43 AM, Jonah H. Harris wrote:

Yes, this is certainly the most noticeable case.  This is one reason
I'm behind the freespace patch.  Unfortunately, a lot of inexperienced
people use VACUUM FULL and don't understand why VACUUM is *generally*
better (to free up block-level freespace and update the FSM), assuming
they have enough hard disk space for the database.


Another reason to turn autovac on by default in 8.2...


That and of course the visibility bitmap that has been
much-discussed

I'd certainly like to see it.


What's the hold-up on this? I thought there were some technical  
issues that had yet to be resolved?


BTW, I'll point out that DB2 and MSSQL didn't switch to MVCC until  
their most recent versions.

--
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461





Re: [PERFORM] Optimizer internals

2006-06-16 Thread Greg Stark
Mark Lewis [EMAIL PROTECTED] writes:

 On Thu, 2006-06-15 at 14:05 -0400, John Vincent wrote:
  Now I've been told by our DBA that we should have been able to wholly
  satisfy that query via the indexes.
 
 DB2 can satisfy the query using only indexes because DB2 doesn't do
 MVCC.

Well it's more subtle than that. DB2 most certainly does provide MVCC
semantics, as do Oracle, MSSQL, and any other serious SQL implementation.

But there are different ways to implement MVCC and every database makes
decisions that have pros and cons. Postgres's implementation has some big
benefits over others (no rollback segments, no expensive recovery operations,
fast inserts and updates) but it also has disadvantages (periodic vacuums and
indexes don't cover the data).

The distinction you're looking for here is sometimes called optimistic
versus pessimistic space management. (Not locking, that's something else.)
Postgres is pessimistic -- it treats every transaction as if it might be
rolled back. Oracle and most others are optimistic -- they assume every
transaction will be committed, and store information elsewhere to implement
MVCC and recover in case it's rolled back. The flip side is that Oracle and
others like it have to do a lot of extra footwork if you query data that
hasn't been committed yet. That footwork has performance implications.

-- 
greg




Re: [PERFORM] Optimizer internals

2006-06-16 Thread Jonah H. Harris

On 16 Jun 2006 07:23:26 -0400, Greg Stark [EMAIL PROTECTED] wrote:

The flip side is that Oracle and others like it have to
do a lot of extra footwork if you query data
that hasn't been committed yet. That footwork
has performance implications.


Not disagreeing here at all, but considering that Oracle, DB2, and SQL
Server, et al. have proven themselves to perform extremely well under
heavy load (in multiple benchmarks), the overhead of an UNDO
implementation has a calculable break-even point.

Feel free to debate it, but the optimistic approach adopted by nearly
every commercial database vendor is *generally* a better approach for
OLTP.

Consider Weikum & Vossen (p. 442):

"We also need to consider the extra work that the recovery algorithm
incurs during normal operation.  This is exactly the catch with the
class of no-undo/no-redo algorithms.  By and large, they come at the
expense of a substantial overhead during normal operations that may
increase the execution cost per transaction by a factor of two or even
higher.  In other words, it reduces the achievable transaction
throughput of a given server configuration by a factor of two or more."

Now, if we're considering UPDATES (the worst case for PostgreSQL's
current MVCC architecture), then this is (IMHO) a true statement.
There aren't many *successful* commercial databases that incur the
additional overhead of creating another version of the record, marking
the old one as having been updated, inserting N-number of new index
entries to point to said record, and having to WAL-log all
aforementioned changes.  I have yet to see any successful commercial
RDBMS using some sort of no-undo algorithm that doesn't suffer the
"factor of two or more" performance reduction.  However, if you
consider an INSERT or DELETE in PostgreSQL, those are implemented much
better than in most commercial database systems due to PostgreSQL's
MVCC design.  I've done a good amount of research on enhancing
PostgreSQL's MVCC in UPDATE conditions and believe there is a nice
happy medium for us.

/me waits for the obligatory and predictable "the benchmarks are
flawed" response.

--
Jonah H. Harris, Software Architect | phone: 732.331.1300
EnterpriseDB Corporation            | fax: 732.331.1301
33 Wood Ave S, 2nd Floor            | [EMAIL PROTECTED]
Iselin, New Jersey 08830            | http://www.enterprisedb.com/



Re: [PERFORM] Optimizer internals

2006-06-16 Thread Greg Stark
Jonah H. Harris [EMAIL PROTECTED] writes:

 Now, if we're considering UPDATES (the worst case for PostgreSQL's
 current MVCC architecture), then this is (IMHO) a true statement.
 There aren't many *successful* commercial databases that incur the
 additional overhead of creating another version of the record, marking
 the old one as having been updated, inserting N-number of new index
 entries to point to said record, and having to WAL-log all
 aforementioned changes.  

Well Oracle has to do almost all that same work, it's just doing it in a
separate place called a rollback segment. There are pros and cons, especially
when it comes to indexes, but also when it comes to what happens when the
new record is larger than the old one.

 I've done a good amount of research on enhancing PostgreSQL's MVCC in UPDATE
 conditions and believe there is a nice happy medium for us.

IMHO the biggest problem Postgres has is when you're updating a lot of records
in a table with little free space. Postgres has to keep jumping back and forth
between the old records it's reading in and the new records it's writing out.
That can in theory turn a simple linear update scan into an O(n^2) operation.
In practice read-ahead and caching should help, but I'm not clear to what
extent.

That, and of course the much-discussed visibility bitmap, which might let
vacuum avoid visiting every page and allow index scans to skip checking
visibility info for some pages, would be major wins.

 /me waits for the obligatory and predictable "the benchmarks are
 flawed" response.

I wouldn't say the benchmarks are flawed, but I also don't think you can point
to any specific design feature and say it's essential just on the basis of
bottom-line results. You have to look at the actual benefit of the specific wins.

Oracle and the others all implement tons of features intended to optimize
applications like the benchmarks (and the benchmarks specifically, of course :)
that have huge effects on the results. Partitioned tables, materialized views,
etc. allow algorithmic improvements that do much more than any low-level
optimizations can do.

-- 
greg




Re: [PERFORM] Optimizer internals

2006-06-16 Thread Jonah H. Harris

On 16 Jun 2006 09:21:01 -0400, Greg Stark [EMAIL PROTECTED] wrote:

Well Oracle has to do almost all that same work, it's just doing it in a
separate place called a rollback segment.


Well, it's not really the same work.  The process by which Oracle
manages UNDO is actually pretty simple and efficient, but complex in
its implementation.  There have also been some significant performance
improvements in this area in both 9i and 10g.


There are pros and cons especially where it comes
to indexes, but also where it comes to what happens
when the new record is larger than the old one.


Certainly, you want to avoid row chaining at all costs, which is why
PCTFREE is there.  I have researched update-in-place for PostgreSQL
and can avoid row-chaining... so I think we can get the same benefit
without the management and administration cost.


IMHO the biggest problem Postgres has is when you're
updating a lot of records in a table with little free space.


Yes, this is certainly the most noticeable case.  This is one reason
I'm behind the freespace patch.  Unfortunately, a lot of inexperienced
people use VACUUM FULL and don't understand why VACUUM is *generally*
better (to free up block-level freespace and update the FSM), assuming
they have enough hard disk space for the database.


That and of course the visibility bitmap that has been
much-discussed


I'd certainly like to see it.


I wouldn't say the benchmarks are flawed, but I also
don't think you can point to any specific design
feature and say it's essential just on the basis of
bottom-line results. You have to look at the actual
benefit of the specific wins.


True.

--
Jonah H. Harris, Software Architect | phone: 732.331.1300
EnterpriseDB Corporation            | fax: 732.331.1301
33 Wood Ave S, 2nd Floor            | [EMAIL PROTECTED]
Iselin, New Jersey 08830            | http://www.enterprisedb.com/



Re: [PERFORM] Optimizer internals

2006-06-15 Thread Mark Lewis
On Thu, 2006-06-15 at 14:05 -0400, John Vincent wrote:
 Now I've been told by our DBA that we should have been able to wholly
 satisfy that query via the indexes.

DB2 can satisfy the query using only indexes because DB2 doesn't do
MVCC.

Although MVCC is generally a win in terms of making the database easier
to use and applications less brittle, it also means that the database
must inspect the visibility information for each row before it can
answer a query.  For most types of queries this isn't a big deal, but
for count(*) type queries, it slows things down.

Since adding the visibility information to indexes would make them
significantly more expensive to use and maintain, it isn't done.
Therefore, each row has to be fetched from the main table anyway.

Since in this particular query you are counting all rows of the
table, PG must fetch each row from the main table regardless, so the
sequential scan is much faster because it avoids traversing the index
and performing random read operations.
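
To illustrate (hypothetical million-row table; the plan shape is what
matters, the cost figures here are made up):

EXPLAIN SELECT count(*) FROM fact;
                              QUERY PLAN
----------------------------------------------------------------------
 Aggregate  (cost=22450.00..22450.01 rows=1 width=0)
   ->  Seq Scan on fact  (cost=0.00..19950.00 rows=1000000 width=0)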

-- Mark Lewis



Re: [PERFORM] Optimizer internals

2006-06-15 Thread John Vincent
On 6/15/06, Mark Lewis [EMAIL PROTECTED] wrote:

 DB2 can satisfy the query using only indexes because DB2 doesn't do
 MVCC. Although MVCC is generally a win in terms of making the database
 easier to use and applications less brittle, it also means that the
 database must inspect the visibility information for each row before
 it can answer a query. For most types of queries this isn't a big
 deal, but for count(*) type queries, it slows things down.

Mark, thanks for the answer. My DBAs just got this look on their face
when I showed them. It's not like they couldn't have investigated this
information themselves but I think the light finally came on. One
question that we came up with is how does this affect other aggregate
functions like MAX, MIN, SUM and whatnot? Being that this is our data
warehouse, we use these all the time. As I've said previously, I didn't
know a human could generate some of the queries we've passed through
this system.

 Since adding the visibility information to indexes would make them
 significantly more expensive to use and maintain, it isn't done.
 Therefore, each row has to be fetched from the main table anyway.
 Since in this particular query you are counting all rows of the
 table, PG must fetch each row from the main table regardless, so the
 sequential scan is much faster because it avoids traversing the index
 and performing random read operations.

 -- Mark Lewis


Re: [PERFORM] Optimizer internals

2006-06-15 Thread Mark Lewis
On Thu, 2006-06-15 at 14:46 -0400, John Vincent wrote:

 One question that we came up with is how does this affect other
 aggregate functions like MAX, MIN, SUM and whatnot? Being that this is
 our data warehouse, we use these all the time. As I've said
 previously, I didn't know a human could generate some of the queries
 we've passed through this system. 

Previously, MIN and MAX would also run slowly, for the same reason as
COUNT(*).  But there really isn't a need for that, since you can still
get a big speedup by scanning the index in order, looking up each row
and stopping as soon as you find a visible one.

This has been fixed so newer versions of PG will run quickly and use the
index for MIN and MAX.  I don't remember which version had that change;
it might not be until 8.2.  You can dig the archives to find out for
sure. 

For older versions of PG before the fix, you can make MIN and MAX run
quickly by rewriting them in the following form:

SELECT column FROM table ORDER BY column LIMIT 1;
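
For MAX, the analogous rewrite is the descending form; the IS NOT NULL
filter preserves MAX's behavior of ignoring NULLs, which would otherwise
sort first in descending order:

SELECT column FROM table WHERE column IS NOT NULL
ORDER BY column DESC LIMIT 1;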

Unfortunately SUM is in the same boat as COUNT; in order for it to
return a meaningful result it must inspect visibility information for
all of the rows.

-- Mark



Re: [PERFORM] Optimizer internals

2006-06-15 Thread John Vincent
On 6/15/06, Mark Lewis [EMAIL PROTECTED] wrote:

 Unfortunately SUM is in the same boat as COUNT; in order for it to
 return a meaningful result it must inspect visibility information for
 all of the rows.

 -- Mark

Well, this is interesting news to say the least. We went with
PostgreSQL for our warehouse because we needed the advanced features
that MySQL didn't have at the time (views/sprocs).

It sounds like we almost need another fact table for the places that
we do SUM (which is not a problem, just an additional map). If I'm
interpreting this all correctly, we can't force PG to bypass a
sequence scan even if we know our data is stable because of the MVCC
aspect. In our case, as with most warehouses (except those that do
rolling loads during the day), we only write data to it for about 5
hours at night in batch.

Any suggestions? FYI the original question wasn't meant as a poke at
comparing PG to MySQL to DB2. I'm not making any value judgements
either way. I'm just trying to understand how we can use it the best
way possible.

If anyone from the bizgres team is watching, have they done any work
in this area?

Thanks.
John


Re: [PERFORM] Optimizer internals

2006-06-15 Thread Scott Marlowe
On Thu, 2006-06-15 at 14:21, John Vincent wrote:
 On 6/15/06, Mark Lewis [EMAIL PROTECTED] wrote:
 Unfortunately SUM is in the same boat as COUNT; in order for it to
 return a meaningful result it must inspect visibility information for
 all of the rows.
 
 -- Mark
 
 Well, this is interesting news to say the least. We went with
 PostgreSQL for our warehouse because we needed the advanced features
 that MySQL didn't have at the time (views/sprocs). 
 
 It sounds like we almost need another fact table for the places that
 we do SUM (which is not a problem, just an additional map). If I'm
 interpreting this all correctly, we can't force PG to bypass a
 sequence scan even if we know our data is stable because of the MVCC
 aspect. In our case, as with most warehouses (except those that do
 rolling loads during the day), we only write data to it for about 5
 hours at night in batch. 
 
 Any suggestions? FYI the original question wasn't meant as a poke at
 comparing PG to MySQL to DB2. I'm not making any value judgements
 either way. I'm just trying to understand how we can use it the best
 way possible. 
 
 If anyone from the bizgres team is watching, have they done any work
 in this area? 

This might help:

http://jonathangardner.net/PostgreSQL/materialized_views/matviews.html

Since you're doing a data warehouse, I would think materialized views
would be a natural addition anyway.



Re: [PERFORM] Optimizer internals

2006-06-15 Thread John Vincent
Any suggestions? FYI the original question wasn't meant as a poke at comparing PG to MySQL to DB2. I'm not making any value judgements either way. I'm just trying to understand how we can use it the best way possible.
If anyone from the bizgres team is watching, have they done any work in this area? Thanks. John

Actually we just thought about something. With PG, we can create
an index that is a SUM of the column we're indexing, no? We're going to
test this in a few hours. Would that be able to be satisfied by an
index scan?
Also, we're looking at the link provided for the materialized views in PG. Thanks.


Re: [PERFORM] Optimizer internals

2006-06-15 Thread Jim C. Nasby
On Thu, Jun 15, 2006 at 03:43:09PM -0400, John Vincent wrote:
 Any suggestions? FYI the original question wasn't meant as a poke at
 comparing PG to MySQL to DB2. I'm not making any value judgements either
 way. I'm just trying to understand how we can use it the best way possible.
 
 If anyone from the bizgres team is watching, have they done any work in
 this area?
 
 Thanks.
 John
 
 
 Actually we just thought about something. With PG, we can create an index
 that is a SUM of the column we're indexing, no? We're going to test this in
 a few hours. Would that be able to be satisfied by an index scan?
 
 Also, we're looking at the link provided for the materialized views in PG.
 
 Thanks.

decibel=# create index test on i ( sum(i) );
ERROR:  cannot use aggregate function in index expression
decibel=# 
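
For contrast, an index on a non-aggregate expression of a single row is
allowed (illustrative continuation of the same session, assuming column
i is numeric):

decibel=# create index test on i ( abs(i) );
CREATE INDEX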

BTW, there have been a number of proposals to negate the effect of not
having visibility info in indexes. Unfortunately, none of them have come
to fruition yet, mostly because it's a very difficult problem to solve.
But it is something that the community would like to see happen.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [PERFORM] Optimizer internals

2006-06-15 Thread John Vincent
 decibel=# create index test on i ( sum(i) );
 ERROR:  cannot use aggregate function in index expression
 decibel=#

 BTW, there have been a number of proposals to negate the effect of not
 having visibility info in indexes. Unfortunately, none of them have
 come to fruition yet, mostly because it's a very difficult problem to
 solve. But it is something that the community would like to see happen.

 --
 Jim C. Nasby, Sr. Engineering Consultant    [EMAIL PROTECTED]
 Pervasive Software    http://pervasive.com    work: 512-231-6117
 vcard: http://jim.nasby.net/pervasive.vcf    cell: 512-569-9461

Yeah, we got the same thing when we tried it.

I thought about the whole thing on the way home and the downside is
that we might have to ditch pgsql. As far as implementing it, it might
make sense to translate READ UNCOMMITTED to that new functionality. If
the default isolation level stays the current level, the people who
need it can use it via WITH UR or somesuch.

I know it's not that easy but it's an idea. I'm also thinking that the
table inheritance we're going to be taking advantage of in 8.1 on the
new server might make the sequence scan less of an issue. The only
reason the sequence scan really blows is that we have a single table
with 220M rows and growing.
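
For what it's worth, a minimal sketch of the inheritance-based
partitioning mentioned above (hypothetical table and column names;
8.1's constraint_exclusion setting must be on for the planner to skip
non-matching children):

CREATE TABLE fact (id bigint, ts date, amount numeric);

-- One child per month, each carrying a CHECK constraint the planner
-- can use to prove the child is irrelevant to a query:
CREATE TABLE fact_2006_06 (CHECK (ts >= '2006-06-01' AND ts < '2006-07-01'))
    INHERITS (fact);
CREATE TABLE fact_2006_07 (CHECK (ts >= '2006-07-01' AND ts < '2006-08-01'))
    INHERITS (fact);

SET constraint_exclusion = on;

-- With a WHERE clause matching the constraints, only fact and
-- fact_2006_06 are scanned instead of the whole 220M-row heap:
SELECT sum(amount) FROM fact WHERE ts >= '2006-06-01' AND ts < '2006-07-01';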




Re: [PERFORM] Optimizer internals

2006-06-15 Thread Mischa Sandberg

Mark Lewis wrote:

On Thu, 2006-06-15 at 14:05 -0400, John Vincent wrote:

Now I've been told by our DBA that we should have been able to wholly
satisfy that query via the indexes.



DB2 can satisfy the query using only indexes because DB2 doesn't do
MVCC.


You can get pretty much the same effect with materialized views.
Create a table that LOOKS like the index (just those columns),
with a foreign key relationship to the original table (cascade delete),
and have the after-insert trigger on the main table write a row to the derived 
table.
Now (index and) query the skinny table.

Advantage of these tables: you can cluster them regularly,
because doing so doesn't hard-lock the main table.
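
A minimal sketch of that design, with hypothetical names (it assumes
the main table has a primary key id for the cascade to hang off):

CREATE TABLE fact_skinny (
    id      bigint REFERENCES fact (id) ON DELETE CASCADE,
    amount  numeric
);

-- After-insert trigger copies just the indexed column(s) over:
CREATE FUNCTION fact_skinny_ins() RETURNS trigger AS $$
BEGIN
    INSERT INTO fact_skinny (id, amount) VALUES (NEW.id, NEW.amount);
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER fact_skinny_maint AFTER INSERT ON fact
    FOR EACH ROW EXECUTE PROCEDURE fact_skinny_ins();

CREATE INDEX fact_skinny_amount_idx ON fact_skinny (amount);

-- Reclustering the skinny table doesn't hard-lock the main table:
CLUSTER fact_skinny_amount_idx ON fact_skinny;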

--
Engineers think that equations approximate reality.
Physicists think that reality approximates the equations.
Mathematicians never make the connection.
