Re: [SQL] query speed joining tables

Tomasz Myrta Mon, 13 Jan 2003 13:58:33 -0800

Christopher Smith wrote:

I have 4 tables that I need to query... 3 of the tables are links by the
field userid.


below are table sql fragments. as well as the query.  The subquery
causes very high

CPU usages.  It typically returns ~3000 matches. Is there another way to
rewrite this?

SELECT user_login.userid FROM user_login,user_details_p,user_match_details
 WHERE user_login.userid = user_details_p.userid AND
user_details_p.userid = user_match_details.userid AND
user_details_p.gender ='W' AND
user_details_p.seekgender ='M' AND
user_details_p.age >=18 AND
user_details_p.age <=50 AND
user_match_details.min_age <= 30 AND
user_match_details.max_age >= 30 AND
user_details_p.ethnictype = 'Caucasian (White)' AND
strpos(user_match_details.ethnicity,'Asian') !=0 AND
user_details_p.zipcode in (select zips_max.destination from zips_max
where zips_max.origin='90210' )
order by user_login.last_login desc;

explicit joins show better idea of your query and helps postgres choosing indexing.

select userid
from
user_login
join user_details using (userid)
join user_match_details using (userid)
where
user_details_p.gender ='W' AND
user_details_p.seekgender ='M' AND
user_details_p.age between 18 and 50 and
user_match_details.min_age <= 30 AND
user_match_details.max_age >= 30 AND
user_details_p.ethnictype = 'Caucasian (White)' AND
strpos(user_match_details.ethnicity,'Asian') !=0 AND
user_details_p.zipcode in (select zips_max.destination from zips_max
where zips_max.origin='90210' )
order by user_login.last_login desc;

How can I help you with subquery if you didn't write even zips_max definition?!?

If origin is unique value in that table, you can change subquery into join on "from" list.

               Table "public.user_login"
   Column   |           Type           |   Modifiers
------------+--------------------------+---------------
 userid     | character varying(30)    | not null
 password   | character varying(30)    | not null
 email      | character varying(50)    | not null
 last_login | timestamp with time zone | not null
 Indexes: user_login_pkey primary key btree (userid),

Do you really need userid as varchar?
indexing on int4 or int8 would be much faster than varchar

Why do you have 3 tables? It looks like only one table would be enough. Remember, that null values don't take too much space.




               Table "public.user_details_p"
     Column      |           Type           |   Modifiers
-----------------+--------------------------+---------------
 userid          | character varying(30)    | not null
 gender          | character varying(1)     |
 age             | integer                  |
 height          | character varying(10)    |
 ethnicty pe      | character varying(30)    |
 education       | character varying(30)    |
 createdate      | timestamp with time zone | default now()
 zipcode         | character varying(5)     |
 birthdate       | date                     | default now()
 zodiac          | character varying(40)    |
 seekgender      | character varying(2)     |
Indexes: user_details_p_pkey primary key btree (userid),
         user_details_p_age_idx btree (age),
      &nb sp;  user_details_p_ethnic_idx btree (ethnictype),
         user_details_p_gender_idx btree (gender),

         user_details_p_last_login_idx btree (last_login),
         user_details_p_seekgender_idx btree (seekgender),
         user_details_p_state_idx btree (state)

There is too many indexes -
index on gender (2 possible values) is useless,
index on ethnic (how many values - I think not too many?) is possibly useless

Consider creating single index on several fields for queries like this:
select
...
where
user_details_p.gender ='W' AND
user_details_p.seekgender ='M' AND
user_details_p.age between 18 and 50 and

index on (age,seekgender,gender)

Foreign Key constraints: $1 FOREIGN KEY (userid) REFERENCES
user_login(userid) ON UPDATE NO ACTI
ON ON DELETE CASCADE


           Table "public.user_match_details"
      Column      |          Type          | Modifiers
------------------+------------------------+-----------
 userid           | character varying(30)  | not null
 soughtmate       | character varying(200) |
 ethnicity        | character varying(200) |
 marital_status   | character varying(200) |
 min_age          | integer                |
 max_age          | integer   &nbs p;            |
 city             | character varying(50)  |
 state            | character varying(2)   |
 zipcode          | integer                |
 match_distance   | integer                |
Indexes: user_match_details_pkey primary key btree (userid)
Foreign Key constraints: $1 FOREIGN KEY (userid) REFERENCES
user_login(userid) ON UPDATE NO ACTION ON DELETE CASCADE

            Table "public.zips_300"
   Column    |         Type         | Modifiers
-------------+----------------------+-----------
 origin      | character varying(5) |
 destination | character varying(5) |
Indexes: zips_300_origin_idx btree (origin)

If you need more specific answer, you have to add more information - how many records do you have in your tables and how many possible values do you use for example for zipcodes, ethnicity etc.

Regards,
Tomasz Myrta

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html

Re: [SQL] query speed joining tables

Reply via email to