[ 
https://issues.apache.org/jira/browse/CONNECTORS-584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529590#comment-13529590
 ] 

Karl Wright commented on CONNECTORS-584:
----------------------------------------

r1420515 for the final fix.  Using multiple criteria for ORDER BY was what was 
breaking MySQL.  However, I need to verify that the fix does not break any 
other database; if so we will need an abstraction.

                
> Crawling with MySQL does not use indexes for order-by on critical queries
> -------------------------------------------------------------------------
>
>                 Key: CONNECTORS-584
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-584
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Framework core
>    Affects Versions: ManifoldCF 1.0.1
>            Reporter: Karl Wright
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 1.1
>
>
> The following ORDER-BY query is taking a long time on MySQL:
> {code}
> --------------------------------------------------------
> # Time: 121204 16:25:40
> # User@Host: manifoldcf[manifoldcf] @ localhost [127.0.0.1]
> # Query_time: 7.240532  Lock_time: 0.000204 Rows_sent: 1200  Rows_examined: 
> 611091
> SET timestamp=1354605940;
> SELECT 
> t0.id,t0.jobid,t0.dochash,t0.docid,t0.status,t0.failtime,t0.failcount,t0.priorityset
>  FROM jobqueue t0 WHERE t0.status IN ('P','G') AND t0.checkaction='R' AND 
> t0.checktime<=1354605932817 AND EXISTS(SELECT 'x' FROM jobs t1 WHERE 
> t1.status IN ('A','a') AND t1.id=t0.jobid AND t1.priority=5) AND NOT 
> EXISTS(SELECT 'x' FROM jobqueue t2 WHERE t2.dochash=t0.dochash AND t2.status 
> IN ('A','F','a','f','D','d') AND t2.jobid!=t0.jobid) AND NOT EXISTS(SELECT 
> 'x' FROM prereqevents t3,events t4 WHERE t0.id=t3.owner AND 
> t3.eventname=t4.name) ORDER BY t0.docpriority ASC,t0.status 
> ASC,t0.checkaction ASC,t0.checktime ASC LIMIT 1200;
> # Time: 121204 16:25:44
> # User@Host: manifoldcf[manifoldcf] @ localhost [127.0.0.1]
> # Query_time: 3.064339  Lock_time: 0.000084 Rows_sent: 1  Rows_examined: 
> 406359
> SET timestamp=1354605944;
> SELECT docpriority,jobid,dochash,docid FROM jobqueue t0 WHERE status IN 
> ('P','G') AND checkaction='R' AND checktime<=1354605932817 AND EXISTS(SELECT 
> 'x' FROM jobs t1 WHERE t1.status IN ('A','a') AND t1.id=t0.jobid)  ORDER BY 
> docpriority ASC,status ASC,checkaction ASC,checktime ASC LIMIT 1;
> -------------------------------------------------------
> {code}
> I wonder if the queries appropriately use index of the table. 
> As a result of EXPLAIN against the slow query, there was filesort.
> There seems to be some conditions that MySQL does not use index depending on 
> ORDER BY:
>  - Executing ORDER BY against multiple keys
>  - When keys selected from records are different from keys used by ORDER BY
> Since filesort was happening, fully scanning records should be having MCF 
> slower.
> Do you think this could happen even in PostgreSQL or HSQLDB?
> Do you think queries could be modified to use index appropriately?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to