[
https://issues.apache.org/jira/browse/AMBARI-10050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14359297#comment-14359297
]
Hadoop QA commented on AMBARI-10050:
------------------------------------
{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12704221/AMBARI-10050.patch
against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author
tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include
any new or modified tests.
Please justify why no new tests are needed for this
patch.
Also please list what manual steps were performed to
verify this patch.
{color:green}+1 javac{color}. The applied patch does not increase the
total number of javac compiler warnings.
{color:green}+1 release audit{color}. The applied patch does not increase
the total number of release audit warnings.
{color:red}-1 core tests{color}. The test build failed in ambari-server
Test results:
https://builds.apache.org/job/Ambari-trunk-test-patch/2015//testReport/
Console output:
https://builds.apache.org/job/Ambari-trunk-test-patch/2015//console
This message is automatically generated.
> Querying For Requests By Task Status Has Poor Performance
> ---------------------------------------------------------
>
> Key: AMBARI-10050
> URL: https://issues.apache.org/jira/browse/AMBARI-10050
> Project: Ambari
> Issue Type: Bug
> Components: ambari-server
> Affects Versions: 2.0.0
> Reporter: Jonathan Hurley
> Assignee: Jonathan Hurley
> Priority: Critical
> Fix For: 2.0.0
>
> Attachments: AMBARI-10050.patch
>
>
> When querying for the requests that are either IN_PROGRESS, FAILED or
> COMPLETED, the query being used is inefficient and can cause a wait of up to
> 10 minutes in a cluster where there is a large number of stages and tasks.
> {{HostRoleCommandDAO.getRequestsByTaskStatus(...)}}
> This SQL seems overly complex for what it is. Removing the nested SELECT
> seems like a great way to reduce the query time:
> {code}
> SELECT DISTINCT task.request_id as request_id
> FROM host_role_command task WHERE task.status IN ( 'COMPLETED', 'FAILED',
> 'TIMEDOUT', 'ABORTED')
> ORDER BY task.request_id ASC;
> {code}
> ... or if we want to keep the NOT IN
> {code}
> SELECT DISTINCT task.request_id as task_id
> FROM host_role_command task WHERE task.status NOT IN ( 'QUEUED',
> 'IN_PROGRESS',
> 'PENDING', 'HOLDING',
> 'HOLDING_FAILED',
> 'HOLDING_TIMEOUT' )
> ORDER BY task.request_id ASC
> LIMIT 1000
> {code}
> But to be honest, my suggestion is to rewrite this as a simple query that
> matches from an {{EnumSet}} found in {{HostRoleStatus}}.
> Essentially, the problem here is that {{getRequestsByStatus}} is trying to do
> the calculation work to determine the request status from task status all in
> SQL. This method should be broken out into 2 SQL queries:
> - My above query for IN_PROGRESS or FAILED requests
> - A new query for COMPLETED that looks for any requests where all tasks have
> completed.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)