[ https://issues.apache.org/jira/browse/AMBARI-9334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305300#comment-14305300 ]
Jonathan Hurley commented on AMBARI-9334: ----------------------------------------- You cannot use DISTINCT on stage as it contains BLOB data. > Ambari StageDAO.findByCommandStatuses causes Postgress HIGH CPU > --------------------------------------------------------------- > > Key: AMBARI-9334 > URL: https://issues.apache.org/jira/browse/AMBARI-9334 > Project: Ambari > Issue Type: Bug > Components: ambari-server > Affects Versions: 1.6.0, 1.6.1, 1.7.0 > Environment: RHEL 6.4/6.5 > postgresql-server-8.4.13-1.el6_3.x86_64 > postgresql-8.4.13-1.el6_3.x86_64 > postgresql-libs-8.4.13-1.el6_3.x86_64 > ambari-agent-1.6.1-98.x86_64 > ambari-log4j-1.6.1.98-1.noarch > ambari-server-1.6.1-98.noarch > Reporter: Greg Senia > Assignee: Jonathan Hurley > Priority: Critical > Fix For: 2.0.0 > > Attachments: AMBARI-9334.patch, AMBARI-9334.patch.2, Screen Shot > 2015-01-26 at 12.18.56 PM.png, Screen Shot 2015-01-26 at 12.19.26 PM.png, > StageDAO-1.6.1.patch, StageDAO.java > > > The following code that generates a query causes postgres to use lots of CPU > espcially if the Ambari DB grows over time. We reduced CPU by 30-40% by > fixing the code below. > Before: > public List<StageEntity> findByCommandStatuses(Collection<HostRoleStatus> > statuses) { > TypedQuery<StageEntity> query = > entityManagerProvider.get().createQuery("SELECT stage " + > "FROM StageEntity stage WHERE stage.stageId IN (SELECT hrce.stageId FROM " + > "HostRoleCommandEntity hrce WHERE stage.requestId = hrce.requestId and > hrce.status IN ?1 ) " + > "ORDER BY stage.requestId, stage.stageId", StageEntity.class); > return daoUtils.selectList(query, statuses); > } > After: > @RequiresSession > public List<StageEntity> findByCommandStatuses(Collection<HostRoleStatus> > statuses) { > TypedQuery<StageEntity> query = > entityManagerProvider.get().createQuery("SELECT stage "+ > "FROM StageEntity stage, HostRoleCommandEntity hrce " + > "WHERE stage.requestId = hrce.requestId AND stage.stageId = hrce.stageId and > hrce.status IN ?1 " + > "ORDER BY stage.requestId, stage.stageId", StageEntity.class); > return daoUtils.selectList(query, statuses); > } > Before EXPLAIN ANALYZE: > ambari=> explain analyze SELECT t0.stage_id, t0.cluster_host_info, > t0.cluster_id, t0.log_info, t0.request_context, t0.request_id FROM stage t0 > WHERE t0.stage_id IN (SELECT t1.stage_id FROM host_role_command t1 WHERE > ((t0.request_id = t1.request_id) AND (t1.status IN > ('QUEUED','IN_PROGRESS','PENDING')))) ORDER BY t0.request_id, t0.stage_id > ; > QUERY PLAN > > ------------------------------------------------------------------------------------------------------------------------------------ > Sort (cost=7407488.50..7407492.69 rows=1676 width=894) (actual > time=55418.086..55418.086 rows=0 loops=1) > Sort Key: t0.request_id, t0.stage_id > Sort Method: quicksort Memory: 25kB > -> Seq Scan on stage t0 (cost=0.00..7407398.75 rows=1676 width=894) > (actual time=55418.081..55418.081 rows=0 loops=1) > Filter: (SubPlan 1) > SubPlan 1 > -> Seq Scan on host_role_command t1 (cost=0.00..4418.07 rows=1 > width=8) (actual time=16.514..16.514 rows=0 loops=3353) > Filter: (($0 = request_id) AND ((status)::text = ANY > ('{QUEUED,IN_PROGRESS,PENDING}'::text[]))) > Total runtime: 55418.123 ms > (9 rows) > After: Explain Analyze: > ambari=> explain analyze SELECT t0.stage_id, t0.cluster_host_info, > t0.cluster_id, t0.log_info, t0.request_context, t0.request_id FROM stage t0, > host_role_command t1 WHERE ((t0.request_id = t1.request_id) AND t0.stage_id > =t1.stage_id and (t1.status IN ('QUEUED','IN_PROGRESS','PENDING'))) ORDER BY > t0.stage_id,t0.request_id > ; > QUERY PLAN > > -------------------------------------------------------------------------------------------------------------------------------- > Sort (cost=4346.51..4346.52 rows=2 width=894) (actual time=53.605..53.605 > rows=0 loops=1) > Sort Key: t0.stage_id, t0.request_id > Sort Method: quicksort Memory: 25kB > -> Nested Loop (cost=0.00..4346.50 rows=2 width=894) (actual > time=53.596..53.596 rows=0 loops=1) > -> Seq Scan on host_role_command t1 (cost=0.00..4338.22 rows=1 > width=16) (actual time=53.595..53.595 rows=0 loops=1) > Filter: ((status)::text = ANY > ('{QUEUED,IN_PROGRESS,PENDING}'::text[])) > -> Index Scan using stage_pkey on stage t0 (cost=0.00..8.27 rows=1 > width=894) (never executed) > Index Cond: ((t0.stage_id = t1.stage_id) AND (t0.request_id = > t1.request_id)) > Total runtime: 53.654 ms > (9 rows) -- This message was sent by Atlassian JIRA (v6.3.4#6332)