[ https://issues.apache.org/jira/browse/SOLR-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14142529#comment-14142529 ]
Steve Rowe edited comment on SOLR-5986 at 9/21/14 5:46 PM:
-----------------------------------------------------------

Patch (see detailed info about my changes at a new review request I created: https://reviews.apache.org/r/25882/):

* Renamed {{QueryTimeoutBase}} to {{QueryTimeout}}, {{QueryTimeout}} to {{QueryTimeoutImpl}}, and {{SolrQueryTimeout}} to {{SolrQueryTimeoutImpl}}
* Changed timeout checks to use subtraction rather than direct comparison, to handle overflow (see {{nanoTime()}} javadocs)
* Beefed up javadocs
* Now parsing the {{timeAllowed}} request param as a long (added this capability to {{SolrParams}})
* Added a cloud test
* Added testing of very large {{timeAllowed}} values, and negative values too

I beasted all 4 tests 20 times each; all passed.
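For readers unfamiliar with the overflow point in the second bullet, here is a minimal sketch of why subtraction is preferred over direct comparison when working with {{System.nanoTime()}} values. It is not the code from the patch: the class name {{NanoTimeoutSketch}} and the methods {{shouldExit}} and {{shouldExitNaive}} are hypothetical, and the "clock" values are hand-picked to simulate a counter that is about to wrap.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; names do not come from the SOLR-5986 patch.
final class NanoTimeoutSketch {

    /** Overflow-safe check: looks at the sign of the difference. */
    static boolean shouldExit(long nowNanos, long timeoutAtNanos) {
        return nowNanos - timeoutAtNanos > 0;
    }

    /** Direct comparison, shown only for contrast; breaks when the deadline wraps. */
    static boolean shouldExitNaive(long nowNanos, long timeoutAtNanos) {
        return nowNanos > timeoutAtNanos;
    }

    public static void main(String[] args) {
        // Pretend nanoTime() returned a value just below Long.MAX_VALUE.
        long start = Long.MAX_VALUE - 10;
        // A 1 ms deadline computed from it wraps around to a negative number.
        long timeoutAt = start + TimeUnit.MILLISECONDS.toNanos(1);
        // 5 ns after start: the deadline has not been reached yet.
        long now = start + 5;

        System.out.println(shouldExit(now, timeoutAt));      // prints false
        System.out.println(shouldExitNaive(now, timeoutAt)); // prints true (a spurious timeout)
    }
}
```

The subtraction form stays correct as long as the two readings are less than roughly 292 years apart, which is the guarantee the {{nanoTime()}} javadocs describe.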
> Don't allow runaway queries from harming Solr cluster health or search performance
> ----------------------------------------------------------------------------------
>
>                 Key: SOLR-5986
>                 URL: https://issues.apache.org/jira/browse/SOLR-5986
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>            Reporter: Steve Davids
>            Assignee: Anshum Gupta
>            Priority: Critical
>             Fix For: 4.10
>
>         Attachments: SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch,
> SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch,
> SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch
>
>
> The intent of this ticket is to have all distributed search requests stop
> wasting CPU cycles on requests that have already timed out or are so
> complicated that they won't be able to execute. We have come across a case
> where a nasty wildcard query within a proximity clause was causing the
> cluster to enumerate terms for hours even though the query timeout was set to
> minutes. This caused a noticeable slowdown within the system, which made us
> restart the replicas that happened to service that one request. In the worst
> case, users with a relatively low zk timeout value will have nodes start
> dropping from the cluster due to long GC pauses.
> [~amccurry] built a mechanism into Apache Blur to help with the issue in
> BLUR-142 (see commit comment for code, though look at the latest code on the
> trunk for newer bug fixes).
> Solr should be able to either prevent these problematic queries from running
> by some heuristic (possibly estimated size of heap usage) or be able to
> execute a thread interrupt on all query threads once the time threshold is
> met.
> This issue mirrors what others have discussed on the mailing list:
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200903.mbox/%3c856ac15f0903272054q2dbdbd19kea3c5ba9e105b...@mail.gmail.com%3E