Re: [PERFORM] planner/optimizer question

2004-05-02 Thread Jochem van Dieten
Manfred Koizar said: > > As Tom has explained in a nearby message his concern is that -- > unlike dead-to-all -- visible-to-all starts as false, is set to true > at some point in time, and is eventually set to false again. > Problems arise if one backend wants to set visible-to-all to true > while

Re: [PERFORM] planner/optimizer question

2004-05-02 Thread Manfred Koizar
On Sat, 01 May 2004 13:18:04 +0200, Jochem van Dieten <[EMAIL PROTECTED]> wrote: >Tom Lane wrote: >> Oh really? I think you need to think harder about the transition >> conditions. Indeed. >> >> Dead-to-all is reasonably safe to treat as a hint bit because *it does >> not ever need to be undone

Re: [PERFORM] planner/optimizer question

2004-05-01 Thread Tom Lane
Jochem van Dieten <[EMAIL PROTECTED]> writes: > The moment the heap tuple is updated/deleted the visible-to-all > flag needs to be set to false again in all indexes. This is > critical, Exactly. This gets you out of the hint-bit semantics and into a ton of interesting problems, such as race con

Re: [PERFORM] planner/optimizer question

2004-05-01 Thread Gary Doades
On 1 May 2004 at 13:18, Jochem van Dieten wrote: > Yes, really :-) > > When a tuple is inserted the visible-to-all flag is set to false. > The effect of this is that every index scan that finds this tuple > has to visit the heap to verify visibility. If it turns out the > tuple is not only vis

Re: [PERFORM] planner/optimizer question

2004-05-01 Thread Jochem van Dieten
Tom Lane wrote: Manfred Koizar <[EMAIL PROTECTED]> writes: Yes, the visible-to-all flag would be set as a by-product of an index scan, if the heap tuple is found to be visible to all active transactions. This update is non-critical Oh really? I think you need to think harder about the transition

Re: [PERFORM] planner/optimizer question

2004-04-30 Thread Tom Lane
Manfred Koizar <[EMAIL PROTECTED]> writes: > Yes, the visible-to-all flag would be set as a by-product of an index > scan, if the heap tuple is found to be visible to all active > transactions. This update is non-critical Oh really? I think you need to think harder about the transition condition

Re: [PERFORM] planner/optimizer question

2004-04-30 Thread Manfred Koizar
On Fri, 30 Apr 2004 19:46:24 +0200, Jochem van Dieten <[EMAIL PROTECTED]> wrote: >> While the storage overhead could be reduced to 1 bit (not a joke) > >You mean adding an isLossy bit and only where it is set the head >tuple has to be checked for visibility, if it is not set the head >tuple does

Re: [PERFORM] planner/optimizer question

2004-04-30 Thread Kris Jurka
On Fri, 30 Apr 2004, Gary Doades wrote: > Yes, you're right and I have done this just to prove to myself that it > is the index scan that is the bottleneck. I have some complex SQL that > executes very quickly with Postgres, similar to MSSQL, but the index > scans in most of those only touch a f

Re: [PERFORM] planner/optimizer question

2004-04-30 Thread Gary Doades
On 30 Apr 2004 at 9:37, Kevin Barnard wrote: > > I was always under the impression that MSSQL used leaf and row level locking and > therefore > was not a concurrent, in the same sense that postgres is, database. It would still > allow for > concurrent connections and such but updates will get

Re: [PERFORM] planner/optimizer question

2004-04-30 Thread Gary Doades
On 30 Apr 2004 at 8:32, Jeff wrote: > > A better comparision query may be a simple "select a from mytable > where a between foo and bar" to get an index scan. In that case its a > straight up, vanilla index scan. Nothing else getting in the way. > Yes, you're right and I have done this ju

Re: [PERFORM] planner/optimizer question

2004-04-30 Thread Jochem van Dieten
Manfred Koizar wrote: On Wed, 28 Apr 2004 09:05:04 -0400, Tom Lane <[EMAIL PROTECTED]> wrote: [ ... visibility information in index tuples ... ] Storing that information would at least double the overhead space used for each index tuple. The resulting index bloat would significantly slow index ope

Re: [PERFORM] planner/optimizer question

2004-04-30 Thread Jeff
On Apr 30, 2004, at 3:01 AM, Gary Doades wrote: [ pg query plan, etc ] I wonder if other parts of the plan are affecting the speed. I've recently run into a case where a merge join plan was chosen for this query, which took 11 seconds to execute. Forcing it to pick a nested loop join dropped it

Re: [PERFORM] planner/optimizer question

2004-04-30 Thread Gary Doades
On 30 Apr 2004 at 7:26, Dennis Bjorklund wrote: > On Fri, 30 Apr 2004, Gary Doades wrote: > > > I should have also pointed out that MSSQL reported that same index scan > > as taking 65% of the overall query time. It was just "faster". The > > overall query took 103ms in MSSQL. > > Are your results

Re: [PERFORM] planner/optimizer question

2004-04-29 Thread Gary Doades
On 29 Apr 2004 at 19:17, Tom Lane wrote: > Josh Berkus <[EMAIL PROTECTED]> writes: > > Certainly the fact that MSSQL is essentially a single-user database makes > > things easier for them. > > Our recent testing (cf the "Xeon" thread) says that the interlocking we > do to make the world safe for

Re: [PERFORM] planner/optimizer question

2004-04-29 Thread Dennis Bjorklund
On Fri, 30 Apr 2004, Gary Doades wrote: > I should have also pointed out that MSSQL reported that same index scan > as taking 65% of the overall query time. It was just "faster". The > overall query took 103ms in MSSQL. Are your results based on a single client accessing the database and no conc

Re: [PERFORM] planner/optimizer question

2004-04-29 Thread Tom Lane
Josh Berkus <[EMAIL PROTECTED]> writes: > Certainly the fact that MSSQL is essentially a single-user database makes > things easier for them. Our recent testing (cf the "Xeon" thread) says that the interlocking we do to make the world safe for multiple backends has a fairly high cost (at least on

Re: [PERFORM] planner/optimizer question

2004-04-29 Thread Gary Doades
> > Having picked out an index scan as being the highest time user I > concentrated on that in this case and compared the same index scan on > MSSQL. At least MSSQL reported it as an index scan on the same index > for the same number of rows. > I should have also pointed out that MSSQL rep

Re: [PERFORM] planner/optimizer question

2004-04-29 Thread Gary Doades
On 29 Apr 2004 at 17:54, Tom Lane wrote: > "Gary Doades" <[EMAIL PROTECTED]> writes: > > In this example the statistics don't matter. > > Don't they? > > A prior poster mentioned that he thought MSSQL tries to keep all its > indexes in memory.  I wonder whether you are giving Postgres a fair

Re: [PERFORM] planner/optimizer question

2004-04-29 Thread Tom Lane
"Gary Doades" <[EMAIL PROTECTED]> writes: > In this example the statistics don't matter. Don't they? A prior poster mentioned that he thought MSSQL tries to keep all its indexes in memory. I wonder whether you are giving Postgres a fair chance to do the same. What postgresql.conf settings are y

Re: [PERFORM] planner/optimizer question

2004-04-29 Thread Gary Doades
On 29 Apr 2004 at 13:54, Josh Berkus wrote: > Gary, > > > It's also quite possble the MSSQL simply has more efficient index scanning > implementation that we do.They've certainly had incentive; their storage > system sucks big time for random lookups and they need those fast indexes. > (

Re: [PERFORM] planner/optimizer question

2004-04-29 Thread Josh Berkus
Gary, > In this example the statistics don't matter. The plans used were the same for > MSSQL and Postgres. I was trying to eliminate the difference in plans > between the two, which obviously does make a difference, sometimes in > MSSQL favour and sometimes the other way round. Both systems,

Re: [PERFORM] planner/optimizer question

2004-04-29 Thread Gary Doades
On 29 Apr 2004 at 15:35, Kenneth Marshall wrote: > Did you try to cluster based on the index? > > --Ken Yes, This speeds up the index scan a little (12%). This to me just reinforces the overhead that subsequently having to go and fetch the data tuple actually has on the performance. Cheers, G

Re: [PERFORM] planner/optimizer question

2004-04-29 Thread Gary Doades
> It's also entirely possible your indices are using inaccurate > statistical information. Have you ANALYZEd recently? > In this example the statistics don't matter. The plans used were the same for MSSQL and Postgres. I was trying to eliminate the difference in plans between the two, which o

Re: [PERFORM] planner/optimizer question

2004-04-29 Thread Rosser Schwarz
while you weren't looking, Gary Doades wrote: > Recently I have been looking at raw performance (CPU, IO) > rather than the plans. I have some test queries that (as far > as I can determine) use the same access plans on PostgreSQL > and SQLServer. Getting to the detail, an index scan of an > ind

Re: [PERFORM] planner/optimizer question

2004-04-29 Thread Rod Taylor
> I would be nice to get a feel for how much performance loss would be incurred in > maintaining the index flags against possible performance gains for getting the data > back > out again. I guess the real question is, why maintain index flags and not simply drop the index entry altogether? A

Re: [PERFORM] planner/optimizer question

2004-04-29 Thread Gary Doades
> > I guess the real question is, why maintain index flags and not simply > drop the index entry altogether? > > A more interesting case would be to have the backend process record > index tuples that it would invalidate (if committed), then on commit > send that list to a garbage collection proc

Re: [PERFORM] planner/optimizer question

2004-04-29 Thread Gary Doades
On 29 Apr 2004 at 19:03, Manfred Koizar wrote: > While the storage overhead could be reduced to 1 bit (not a joke) we'd > still have the I/O overhead of locating and updating index tuples for > every heap tuple deleted/updated. But this is what a lot of DBMSs do and seem to do well enough. I can

Re: [PERFORM] planner/optimizer question

2004-04-29 Thread Manfred Koizar
On Wed, 28 Apr 2004 09:05:04 -0400, Tom Lane <[EMAIL PROTECTED]> wrote: >> [ ... visibility information in index tuples ... ] >Storing that information would at least double the overhead space used >for each index tuple. The resulting index bloat would significantly >slow index operations by requ

Re: [PERFORM] planner/optimizer question

2004-04-28 Thread Tom Lane
Manfred Koizar <[EMAIL PROTECTED]> writes: > On Wed, 28 Apr 2004 07:35:41 +0100, "Gary Doades" <[EMAIL PROTECTED]> > wrote: >> Why is there an entry in the index for a row if the row is not valid? > Because whether a row is seen as valid or not lies in the eye of the > transaction looking at it.

Re: [PERFORM] planner/optimizer question

2004-04-28 Thread Manfred Koizar
On Wed, 28 Apr 2004 07:35:41 +0100, "Gary Doades" <[EMAIL PROTECTED]> wrote: >Why is there an entry in the index for a row if the row is not valid? Because whether a row is seen as valid or not lies in the eye of the transaction looking at it. Full visibility information is stored in the heap tu

Re: [PERFORM] planner/optimizer question

2004-04-28 Thread Christopher Kings-Lynne
I can understand the performance loss on non-selects for keeping the index validity state tracking the row validity, but would that outweigh the performance gains on selects? Depends on your mix of selects to non selects I guess, but other database systems seem to imply that keeping the index o

Re: [PERFORM] planner/optimizer question

2004-04-28 Thread Gary Doades
I can understand the performance loss on non-selects for keeping the index validity state tracking the row validity, but would that outweigh the performance gains on selects? Depends on your mix of selects to non selects I guess, but other database systems seem to imply that keeping the index o

Re: [PERFORM] planner/optimizer question

2004-04-28 Thread Christopher Kings-Lynne
Why is there an entry in the index for a row if the row is not valid? Wouldn't it be better for the index entry validity to track the row validity. If a particular data value for a query (join, where etc.) can be satisfied by the index entry itself this would be a big performance gain. For SELEC

Re: [PERFORM] planner/optimizer question

2004-04-27 Thread Gary Doades
I know you will shoot me down, but... Why is there an entry in the index for a row if the row is not valid? Wouldn't it be better for the index entry validity to track the row validity. If a particular data value for a query (join, where etc.) can be satisfied by the index entry itself this wou

Re: [PERFORM] planner/optimizer question

2004-04-27 Thread Tom Lane
[EMAIL PROTECTED] writes: > ... Wouldn't the most efficient plan be to scan the index regardless > of crm_id because the only columns needed are in the index? No. People coming from other databases often have the misconception that queries can be answered by looking only at an index. That is nev

Re: [PERFORM] planner/optimizer question

2004-04-27 Thread Atesz
Hi, You should try the next queries: select support_person_id from ticket_crm_map where crm_id = 7 GROUP BY support_person_id; select support_person_id from ticket_crm_map where crm_id = 1 GROUP BY support_person_id; It can use the 'ticket_crm_map_crm_id_suppid' index. Generally the Postgres u

[PERFORM] planner/optimizer question

2004-04-26 Thread brad-pgperf
Hi, I have a query which I think should be using an index all of the time but postgres only uses the index part of the time. The index (ticket_crm_map_crm_id_suppid) has the where clause column (crm_id) listed first followed by the selected column (support_person_id). Wouldn't the most effi