Reuven--
a few years back this same scenario happened for me working on a project with a
large Oracle DB which held clinical data
We needed to implement a join condition but the column we were using wa not
using an Index scan
so since the cardinality of the results was extraordinary we re-implemented the
column to use EXISTS (for large result sets)
*following the omnipresent example used by Oracle books where IN is used for
columns with low cardinality *
and following the docs from EXISTS where the EXISTS clause ALWAYS has to return
something
In our case we were joining on a doctorid with IN (which made no sense as there
were millions of PatientIDs) to find all patients whose doctors
were in PPO's so to increase performance we changed the IN clause for the
column with high cardinality (doctorid) to EXISTS
select p.PATIENT_NAME from PATIENT p where p.doctorid in (select doctorid from
ppo_table)
/******there are many doctorids in this ppo table so we will change to
EXISTS********/
select p.PATIENT_NAME from PATIENT p
where exists (select 0 from ppo_table ppo where p.doctorID =
ppo.doctorID);
Shalom/
Martin--
---------------------------------------------------------------------------
This e-mail message (including attachments, if any) is intended for the use of
the individual or entity to which it is addressed and may contain information
that is privileged, proprietary , confidential and exempt from disclosure. If
you are not the intended recipient, you are notified that any dissemination,
distribution or copying of this communication is strictly prohibited.
---------------------------------------------------------------------------
Le présent message électronique (y compris les pièces qui y sont annexées, le
cas échéant) s'adresse au destinataire indiqué et peut contenir des
renseignements de caractère privé ou confidentiel. Si vous n'êtes pas le
destinataire de ce document, nous vous signalons qu'il est strictement interdit
de le diffuser, de le distribuer ou de le reproduire.
----- Original Message -----
From: "Alvaro Herrera" <[EMAIL PROTECTED]>
To: "Reuven M. Lerner" <[EMAIL PROTECTED]>
Cc: "Webb Sprague" <[EMAIL PROTECTED]>; <[email protected]>
Sent: Tuesday, March 06, 2007 4:19 PM
Subject: Re: [GENERAL] Database slowness -- my design, hardware, or both?
> Reuven M. Lerner escribió:
>> Hi, Webb Sprague. You wrote:
>> >... but I see two seq scans in your explain in a loop -- this is
>> >probably not good. If you can find a way to rewrite the IN clause
>> >(either de-normalizing through triggers to save whatever you need on
>> >an insert and not have to deal with a set, or by using except in the
>> >query, or someing else more creative)...
>> I would normally agree that an IN clause is a place to worry -- except
>> that I'm using IN on a very small table, with about 4-5 rows. That
>> might indeed affect things, and I could certainly pull out these values
>> into a Perl variable that I then interpolate literally into the SQL.
>> However, I have to assume that this wouldn't affect things all that much.
>
> Don't assume -- measure. I had a query which ran orders of magnitude
> faster because I interpolated the constant list in the big query. The
> table from which the interpolated values were being extracted had about
> 30 rows or so.
>
> --
> Alvaro Herrera http://www.CommandPrompt.com/
> The PostgreSQL Company - Command Prompt, Inc.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
>
---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?
http://archives.postgresql.org/