On Fri, Nov 20, 2009 at 11:41:50AM +0000, Dominic Hargreaves wrote:
> I'm migrating from an RT 2 install to an RT 3.8 install with around
> 170 privileged users (and around 90,000 total users).
> 
> I've done some initial testing with RT 3.8.6 and have observed
> that building the list of privileged users (in the "create new
> ticket" (ticket owner), "display ticket" (reminder owner) and
> "ticket search" (ticket owner) pages) takes around 20-30 seconds to run.
> 
> It performs two queries which are logged by my slow query logger:
> 
> LOG:  duration: 30570.089 ms  statement: SELECT DISTINCT main.* FROM Users 
> main CROSS JOIN ACL ACL_4 JOIN Principals Principals_1  ON ( Principals_1.id 
> = main.id ) JOIN CachedGroupMembers CachedGroupMembers_2  ON ( 
> CachedGroupMembers_2.MemberId = Principals_1.id ) JOIN Groups Groups_3  ON ( 
> Groups_3.id = CachedGroupMembers_2.GroupId )  WHERE (Principals_1.Disabled = 
> '0') AND (ACL_4.PrincipalType = Groups_3.Type) AND (Principals_1.id != '1') 
> AND (Principals_1.PrincipalType = 'User') AND (ACL_4.RightName = 'OwnTicket') 
> AND (Groups_3.Domain = 'RT::System-Role') AND ((ACL_4.ObjectType = 
> 'RT::Queue' AND ACL_4.ObjectId = 3) OR (ACL_4.ObjectType = 'RT::System'))  
> ORDER BY main.Name ASC 
> 
> LOG:  duration: 824.692 ms  statement: SELECT DISTINCT main.* FROM Users main 
> CROSS JOIN ACL ACL_2 JOIN Principals Principals_1  ON ( Principals_1.id = 
> main.id ) JOIN CachedGroupMembers CachedGroupMembers_3  ON ( 
> CachedGroupMembers_3.MemberId = Principals_1.id )  WHERE 
> (Principals_1.Disabled = '0') AND (ACL_2.PrincipalId = 
> CachedGroupMembers_3.GroupId) AND (Principals_1.id != '1') AND 
> (ACL_2.PrincipalType = 'Group') AND (Principals_1.PrincipalType = 'User') AND 
> (ACL_2.RightName = 'OwnTicket') AND ((ACL_2.ObjectType = 'RT::Queue' AND 
> ACL_2.ObjectId   = 3) OR (ACL_2.ObjectType = 'RT::System'))  ORDER BY 
> main.Name ASC 
> 
> The first, on our database, returns no entries at all, whilst the
> second returns the expected number of entries (the list of privileged
> users).
> 
> Concentrating on this very slow one and asking the query planner
> with EXPLAIN ANALYZE:
> 
>  Unique  (cost=147226.19..147269.59 rows=496 width=3382) (actual 
> time=30888.166..30888.166 rows=0 loops=1)
>    ->  Sort  (cost=147226.19..147227.43 rows=496 width=3382) (actual 
> time=30888.164..30888.164 rows=0 loops=1)
>          Sort Key: main.name, main.id, main.password, main.comments, 
> main.signature, main.emailaddress, main.freeformcontactinfo, 
> main.organization, main.realname, main.nickname, main.lang, 
> main.emailencoding, main.webencoding, main.externalcontactinfoid, 
> main.contactinfosystem, main.externalauthid, main.authsystem, main.gecos, 
> main.homephone, main.workphone, main.mobilephone, main.pagerphone, 
> main.address1, main.address2, main.city, main.state, main.zip, main.country, 
> main.timezone, main.pgpkey, main.creator, main.created, main.lastupdatedby, 
> main.lastupdated
>          Sort Method:  quicksort  Memory: 25kB
>          ->  Hash Join  (cost=76985.91..147203.98 rows=496 width=3382) 
> (actual time=30888.088..30888.088 rows=0 loops=1)
>                Hash Cond: ((groups_3.type)::text = 
> (acl_4.principaltype)::text)
>                ->  Hash Join  (cost=76933.22..147046.33 rows=3077 width=3388) 
> (actual time=30887.657..30887.657 rows=0 loops=1)
>                      Hash Cond: (cachedgroupmembers_2.groupid = groups_3.id)
>                      ->  Hash Join  (cost=57544.03..126361.67 rows=126470 
> width=3386) (actual time=1509.645..29963.426 rows=1278161 loops=1)
>                            Hash Cond: (cachedgroupmembers_2.memberid = 
> main.id)
>                            ->  Seq Scan on cachedgroupmembers 
> cachedgroupmembers_2  (cost=0.00..54962.02 rows=3357602 width=8) (actual 
> time=0.026..2268.758 rows=3357455 loops=1)
>                            ->  Hash  (cost=57500.57..57500.57 rows=3477 
> width=3386) (actual time=1509.465..1509.465 rows=92315 loops=1)
>                                  ->  Hash Join  (cost=52857.90..57500.57 
> rows=3477 width=3386) (actual time=993.708..1360.215 rows=92315 loops=1)
>                                        Hash Cond: (main.id = principals_1.id)
>                                        ->  Seq Scan on users main  
> (cost=0.00..3223.16 rows=92316 width=3382) (actual time=0.022..75.080 
> rows=92316 loops=1)
>                                        ->  Hash  (cost=51835.42..51835.42 
> rows=81798 width=4) (actual time=993.584..993.584 rows=92316 loops=1)
>                                              ->  Seq Scan on principals 
> principals_1  (cost=0.00..51835.42 rows=81798 width=4) (actual 
> time=0.036..914.611 rows=92316 loops=1)
>                                                    Filter: ((id <> 1) AND 
> (disabled = 0) AND ((principaltype)::text = 'User'::text))
>                      ->  Hash  (cost=18759.68..18759.68 rows=50361 width=10) 
> (actual time=0.092..0.092 rows=4 loops=1)
>                            ->  Bitmap Heap Scan on groups groups_3  
> (cost=1907.17..18759.68 rows=50361 width=10) (actual time=0.078..0.081 rows=4 
> loops=1)
>                                  Recheck Cond: ((domain)::text = 
> 'RT::System-Role'::text)
>                                  ->  Bitmap Index Scan on groups1  
> (cost=0.00..1894.58 rows=50361 width=0) (actual time=0.059..0.059 rows=4 
> loops=1)
>                                        Index Cond: ((domain)::text = 
> 'RT::System-Role'::text)
>                ->  Hash  (cost=52.39..52.39 rows=24 width=6) (actual 
> time=0.401..0.401 rows=97 loops=1)
>                      ->  Bitmap Heap Scan on acl acl_4  (cost=8.77..52.39 
> rows=24 width=6) (actual time=0.228..0.303 rows=97 loops=1)
>                            Recheck Cond: ((((rightname)::text = 
> 'OwnTicket'::text) AND ((objecttype)::text = 'RT::Queue'::text) AND (objectid 
> = 3)) OR (((rightname)::text = 'OwnTicket'::text) AND ((objecttype)::text = 
> 'RT::System'::text)))
>                            ->  BitmapOr  (cost=8.77..8.77 rows=24 width=0) 
> (actual time=0.205..0.205 rows=0 loops=1)
>                                  ->  Bitmap Index Scan on acl1  
> (cost=0.00..4.30 rows=4 width=0) (actual time=0.091..0.091 rows=2 loops=1)
>                                        Index Cond: (((rightname)::text = 
> 'OwnTicket'::text) AND ((objecttype)::text = 'RT::Queue'::text) AND (objectid 
> = 3))
>                                  ->  Bitmap Index Scan on acl1  
> (cost=0.00..4.45 rows=20 width=0) (actual time=0.113..0.113 rows=95 loops=1)
>                                        Index Cond: (((rightname)::text = 
> 'OwnTicket'::text) AND ((objecttype)::text = 'RT::System'::text))
>  Total runtime: 30892.368 ms
> 
> Here most of the time is spent in the hash join for:
> 
> Hash Cond: (cachedgroupmembers_2.memberid = main.id)
> 
> and this join eliminates all the remaining rows.
> 
> Clearly for us, this first join is useless, but can anyone comment on
> whether this is usual, and why it's so slow for us? 
> 
> We're running an essentially identical database configuration with
> another instance of 3.8.6, which has around 10 privileged users and so
> unsurprisingly doesn't have this problem, and the query planner is
> different here:
> 
>  Unique  (cost=638.22..638.31 rows=1 width=1086) (actual time=0.355..0.355 
> rows=0 loops=1)
>    ->  Sort  (cost=638.22..638.22 rows=1 width=1086) (actual 
> time=0.353..0.353 rows=0 loops=1)
>          Sort Key: main.name, main.id, main.password, main.comments, 
> main.signature, main.emailaddress, main.freeformcontactinfo, 
> main.organization, main.realname, main.nickname, main.lang, 
> main.emailencoding, main.webencoding, main.externalcontactinfoid, 
> main.contactinfosystem, main.externalauthid, main.authsystem, main.gecos, 
> main.homephone, main.workphone, main.mobilephone, main.pagerphone, 
> main.address1, main.address2, main.city, main.state, main.zip, main.country, 
> main.timezone, main.pgpkey, main.creator, main.created, main.lastupdatedby, 
> main.lastupdated
>          Sort Method:  quicksort  Memory: 25kB
>          ->  Nested Loop  (cost=0.00..638.21 rows=1 width=1086) (actual 
> time=0.300..0.300 rows=0 loops=1)
>                ->  Nested Loop  (cost=0.00..637.87 rows=1 width=8) (actual 
> time=0.299..0.299 rows=0 loops=1)
>                      ->  Nested Loop  (cost=0.00..635.76 rows=6 width=4) 
> (actual time=0.298..0.298 rows=0 loops=1)
>                            ->  Nested Loop  (cost=0.00..608.38 rows=3 
> width=4) (actual time=0.296..0.296 rows=0 loops=1)
>                                  ->  Seq Scan on acl acl_4  (cost=0.00..16.40 
> rows=2 width=6) (actual time=0.031..0.219 rows=3 loops=1)
>                                        Filter: (((rightname)::text = 
> 'OwnTicket'::text) AND ((((objecttype)::text = 'RT::Queue'::text) AND 
> (objectid = 3)) OR ((objecttype)::text = 'RT::System'::text)))
>                                  ->  Index Scan using groups1 on groups 
> groups_3  (cost=0.00..294.15 rows=147 width=10) (actual time=0.024..0.024 
> rows=0 loops=3)
>                                        Index Cond: (((groups_3.domain)::text 
> = 'RT::System-Role'::text) AND ((groups_3.type)::text = 
> (acl_4.principaltype)::text))
>                            ->  Index Scan using cachedgroupmembers3 on 
> cachedgroupmembers cachedgroupmembers_2  (cost=0.00..9.06 rows=5 width=8) 
> (never executed)
>                                  Index Cond: (cachedgroupmembers_2.groupid = 
> groups_3.id)
>                      ->  Index Scan using principals_pkey on principals 
> principals_1  (cost=0.00..0.34 rows=1 width=4) (never executed)
>                            Index Cond: (principals_1.id = 
> cachedgroupmembers_2.memberid)
>                            Filter: ((principals_1.id <> 1) AND 
> (principals_1.disabled = 0::smallint) AND ((principals_1.principaltype)::text 
> = 'User'::text))
>                ->  Index Scan using users_pkey on users main  
> (cost=0.00..0.33 rows=1 width=1086) (never executed)
>                      Index Cond: (main.id = principals_1.id)
>  Total runtime: 0.602 ms
> 
> cachedgroupmembers and users have the same indexes on both systems
> (I'm not sure whether that's relevant).
> 
> Both systems are Debian lenny, RT 3.8.6, Postgres 8.3.
> 
> If anyone has any other advice about running RT 3.8 on postgres
> I'd be interested too - I've added a couple of extra indexes:
> 
> CREATE INDEX Groups3 ON Groups (LOWER(Domain), LOWER(Type));
> CREATE INDEX users5 ON users (LOWER(emailaddress));
> 
> which are missing from the default installation which have
> cut down some other common slow queries (the former
> is already in an RT ticket:
> 
> http://rt3.fsck.com/Ticket/Display.html?id=13056
> 
> )
> 
> Thanks,
> Dominic.
> 
Hi Dominic,

First, do you have $UseSQLForACLChecks set? I know that that
is a new option and there may still be performance tuning that
needs to be done to have it work well. We run RT 3.8.5 on a
PostgreSQL 8.4.1 database with 25K users and about 400 privileged
users and we do not see a performance problem. Would you mind
posting your postgres.conf changes from the default values as
well as the indexes you have defined for the tables involved.
Also, what is your statistics target for your tables?

Regards,
Ken
_______________________________________________
http://lists.bestpractical.com/cgi-bin/mailman/listinfo/rt-users

Community help: http://wiki.bestpractical.com
Commercial support: sa...@bestpractical.com


Discover RT's hidden secrets with RT Essentials from O'Reilly Media. 
Buy a copy at http://rtbook.bestpractical.com

Reply via email to