I've got 2 tables, one with a list of users (has only about 5000 
entries) and another with a list of userids that have already been 
sent letters. I'm trying to efficiently join these two so I get every 
user who hasn't been sent a letter. The problem is, coupled with the 
5 other joins on the user table the query can take about 5 minutes to 
run. It's performing the joins and then reducing the list. 

The reduction of the list doesn't seem to be terribly efficient. Here 
are some strategies I've been looking at:

select id from users WHERE 
  id not in (select userid from sentletters where lettertype=1) AND
  aclgroup IN (1,2);                       

Seq Scan on users  (cost=0.00..217751.39 rows=5369 width=4)
  SubPlan
    ->  Seq Scan on sentletters  (cost=0.00..81.06 rows=4405 width=4)

select id from users WHERE not exists 
  (select userid from sentletters where lettertype=1 AND userid=id) 
  AND aclgroup IN (1,2);

Seq Scan on users  (cost=0.00..10980.07 rows=1 width=4)
  SubPlan
    ->  Index Scan using sentletters_userid_key on sentletters  
(cost=0.00..2.02 rows=1 width=4)

select id from users AS u 
  LEFT JOIN sentletters AS sl ON (lettertype=1 AND sl.userid=u.id) 
where sl.userid IS NULL AND u.aclgroup IN (1,2);

Hash Join  (cost=81.06..550.18 rows=5322 width=12)
  ->  Seq Scan on users u  (cost=0.00..152.53 rows=5322 width=4)
  ->  Hash  (cost=70.05..70.05 rows=4405 width=8)
        ->  Seq Scan on sentletters sl  (cost=0.00..70.05 rows=4405 
width=8)

All but the last which is an odd way to do it have nasty query plan 
and this isn't even talking about the joins.

I then need to join these results with a table that has about 200,000 
rows and is joined on the userid and some other key elements.

Any suggestions on this? The full query causing the problem is:
select u.id,u.firstname,u.surname,f2.strval,f3.strval,f4.strval,
       f5.strval,u2.firstname,u2.surname,m.strval 
from 
  users as u 
  JOIN dft_formdata as f1 ON 
    (u.id=f1.userid AND u.aclgroup IN (1,2) AND f1.formid=1 AND 
     f1.fieldid=1) 
  LEFT JOIN dft_formdata as f2 ON 
    (u.id=f2.userid AND f2.formid=1 AND f2.fieldid=2) 
  JOIN dft_formdata as f3 on 
    (u.id=f3.userid AND f3.formid=1 AND f3.fieldid=3) 
  JOIN dft_formdata as f4 on 
    (u.id=f4.userid AND f4.formid=1 AND f4.fieldid=5) 
  JOIN relations as r on 
    (u.id=r.relatedto AND r.type=2) 
  JOIN users as u2 on 
    (u2.id=r.userid AND u2.aclgroup=200) 
  JOIN dft_formdata as f5 on 
    (u.id=f5.userid AND f5.formid=1 AND f5.fieldid=4) 
  JOIN dft_formmvl as m on 
    (m.id=f5.intval and m.mvlid=1) 
  JOIN ft_formdata as f6 on 
    (u.id=f6.userid AND f6.formid=1 AND f6.fieldid=155 AND 
     f6.intval=3) 
  WHERE not exists 
    (select userid from sentletters WHERE userid=u.id);        

Nested Loop  (cost=0.00..11280.10 rows=1 width=164)
 ->  Nested Loop  (cost=0.00..11276.36 rows=1 width=160)
  ->  Nested Loop  (cost=0.00..11274.33 rows=1 width=144)
   ->  Nested Loop  (cost=0.00..11270.59 rows=1 width=124)
    ->  Nested Loop  (cost=0.00..11268.56 rows=1 width=96)
     ->  Nested Loop  (cost=0.00..10981.55 rows=1 width=88)
      ->  Nested Loop  (cost=0.00..10977.82 rows=1 width=72)
       ->  Nested Loop  (cost=0.00..10974.10 rows=1 width=56)
        ->  Nested Loop  (cost=0.00..10970.37 rows=1 width=32)
         ->  Seq Scan on users u  (cost=0.00..10966.65 rows=1 
width=28)
          SubPlan
           ->  Index Scan using sentletters_userid_key on 
sentletters  (cost=0.00..2.01 rows=1 width=4)
         ->  Index Scan using dft_formdata_userid_field on 
dft_formdata f1  (cost=0.00..3.71 rows=1 width=4)
        ->  Index Scan using dft_formdata_userid_field on 
dft_formdata f2  (cost=0.00..3.71 rows=1 width=24)
       ->  Index Scan using dft_formdata_userid_field on dft_formdata 
f3  (cost=0.00..3.71 rows=1 width=16)
      ->  Index Scan using dft_formdata_userid_field on dft_formdata 
f4  (cost=0.00..3.71 rows=1 width=16)
     ->  Seq Scan on relations r  (cost=0.00..185.43 rows=5079 
width=8)
    ->  Index Scan using users_pkey on users u2  (cost=0.00..2.02 
rows=1 width=28)
   ->  Index Scan using dft_formdata_userid_field on dft_formdata f5  
(cost=0.00..3.71 rows=1 width=20)
  ->  Index Scan using dft_formmvl_pkey on dft_formmvl m  
(cost=0.00..2.02 rows=1 width=16)
 ->  Index Scan using dft_formdata_userid_field on dft_formdata f6  
(cost=0.00..3.71 rows=1 width=4)


Yes I know the query itself is really nasty but I think 5 minutes is 
a little excessive.

-Michael
_________________________________________________________________
     http://fastmail.ca/ - Fast Free Web Email for Canadians
---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly

Reply via email to