It looks like it is just a difference in data volume.  We are re-working the 
query to see what that will do.

Thanks for the suggestions.

-K


On 5/14/10 2:23 PM, "Adams, Keaton" <keaton_ad...@mcafee.com> wrote:

OK,

So if I change the NOT IN clause the query will run with a MERGE JOIN and 
complete in about 20 seconds.  Have a look at the logic I am following and see 
if it makes sense.  Might this just be a case where because there is more data 
in one DB compared to another (even though the counts are "close"), that's the 
difference in this whole thing?

-K


postgres@> more badquery.sql
EXPLAIN SELECT substring(users.email from '^.*\@') || domains.domain as email,
                users.customer_id,
                users.password,
                p.policy_set_id,
                users.user_id,
                domains.auth_type
         FROM
               (
                 SELECT d.customer_id, d.domain, d.domain_id, d.auth_type, 
d.active from mxl_domain d
                 UNION ALL
                 SELECT d.customer_id, da.domain, da.domain_id, d.auth_type, 
da.active from mxl_domain d, mxl_domain_alias da WHERE da.domain_id = 
d.domain_id
               ) as domains,
               (
                 SELECT u.email, u.customer_id, u.user_id, u.domain_id, 
u.password, u.active from mxl_user u
                 UNION ALL
                 SELECT ua.email, u.customer_id, u.user_id, u.domain_id, 
u.password, ua.active from mxl_user u, mxl_user_alias ua WHERE ua.user_id = 
u.user_id
               ) as users,
                wds_policy_set p
         WHERE  users.customer_id  = p.id
         AND    users.customer_id  = domains.customer_id
         AND    users.domain_id    = domains.domain_id
         AND    p.default_flag = 1
         AND    p.web_access_flag = 1
         AND    p.scope  = 3
         AND    domains.active = 1
         AND    users.active != 0
         AND    p.active = 1
         AND    users.user_id NOT IN (SELECT user_id FROM mxl_user_group mug 
WHERE mug.user_id = users.user_id);


postgres@ time psql -Upostgres -dmxl -fbadquery.sql
                                                                  QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=3522288.96..3534654.13 rows=245730 width=1564)
   Merge Cond: ((p.id = u.customer_id) AND (d.domain_id = u.domain_id))
   ->  Sort  (cost=38160.35..38471.30 rows=124381 width=536)
         Sort Key: p.id, d.domain_id
         ->  Hash Join  (cost=578.15..10203.83 rows=124381 width=536)
               Hash Cond: (d.customer_id = p.id)
               ->  Append  (cost=0.00..4989.45 rows=77541 width=528)
                     ->  Seq Scan on mxl_domain d  (cost=0.00..1810.88 
rows=64390 width=28)
                           Filter: (active = 1)
                     ->  Subquery Scan "*SELECT* 2"  (cost=2454.78..3178.57 
rows=13151 width=528)
                           ->  Hash Join  (cost=2454.78..3047.06 rows=13151 
width=33)
                                 Hash Cond: (da.domain_id = d.domain_id)
                                 ->  Seq Scan on mxl_domain_alias da  
(cost=0.00..296.39 rows=13151 width=25)
                                       Filter: (active = 1)
                                 ->  Hash  (cost=1649.90..1649.90 rows=64390 
width=12)
                                       ->  Seq Scan on mxl_domain d  
(cost=0.00..1649.90 rows=64390 width=12)
               ->  Hash  (cost=525.74..525.74 rows=4193 width=8)
                     ->  Seq Scan on wds_policy_set p  (cost=0.00..525.74 
rows=4193 width=8)
                           Filter: ((default_flag = 1) AND (web_access_flag = 
1) AND (active = 1) AND (scope = 3))
   ->  Materialize  (cost=3484128.61..3497039.21 rows=1032848 width=1044)
         ->  Sort  (cost=3484128.61..3486710.73 rows=1032848 width=1044)
               Sort Key: u.customer_id, u.domain_id
               ->  Append  (cost=0.00..2826808.61 rows=1032848 width=1044)
                     ->  Index Scan using mxl_user_domain_id_idx on mxl_user u  
(cost=0.00..2416377.66 rows=906921 width=52)
                           Filter: ((active <> 0) AND (NOT (subplan)))
                           SubPlan
                             ->  Index Scan using mxl_user_group_uid_idx on 
mxl_user_group mug  (cost=0.00..2.47 rows=1 width=4)
                                   Index Cond: (user_id = $0)
                     ->  Subquery Scan "*SELECT* 2"  (cost=8.23..410430.95 
rows=125927 width=1044)
                           Filter: (NOT (subplan))
                           ->  Merge Join  (cost=8.23..95536.63 rows=251854 
width=55)
                                 Merge Cond: (u.user_id = ua.user_id)
                                 ->  Index Scan using mxl_user_pkey on mxl_user 
u  (cost=0.00..77679.47 rows=1881318 width=28)
                                 ->  Index Scan using mxl_user_alias_uid_idx on 
mxl_user_alias ua  (cost=0.00..10109.21 rows=251854 width=31)
                                       Filter: (ua.active <> 0)
                           SubPlan
                             ->  Index Scan using mxl_user_group_uid_idx on 
mxl_user_group mug  (cost=0.00..2.47 rows=1 width=4)
                                   Index Cond: (user_id = $0)
(38 rows)


postgres@ time psql -Upostgres -dmxl -fbadquery.sql -okda.out

real    0m22.645s
user    0m1.565s
sys    0m0.246s
postgres@> wc -l kda.out
285563 kda.out




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Reply via email to