[ https://issues.apache.org/jira/browse/SOLR-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14158609#comment-14158609 ]
Hoss Man commented on SOLR-5986:
--------------------------------

FYI: this assertion (modified by r1627622/r1627635) has failed in jenkins several times since it was committed...

{noformat}
1498993     shalin     // test group query
1627635     anshum     // TODO: Remove this? This doesn't make any real sense now that timeAllowed might trigger early
1627635     anshum     //       termination of the request during Terms enumeration/Query expansion.
1627635     anshum     //       During such an exit, partial results isn't supported as it wouldn't make any sense.
1627635     anshum     // Increasing the timeAllowed from 1 to 100 for now.
1498993     shalin     queryPartialResults(upShards, upClients,
1498993     shalin         "q", "*:*",
1498993     shalin         "rows", 100,
1498993     shalin         "fl", "id," + i1,
1498993     shalin         "group", "true",
1498993     shalin         "group.query", t1 + ":kings OR " + t1 + ":eggs",
1498993     shalin         "group.limit", 10,
1498993     shalin         "sort", i1 + " asc, id asc",
1627635     anshum         CommonParams.TIME_ALLOWED, 100,
1498993     shalin         ShardParams.SHARDS_INFO, "true",
1498993     shalin         ShardParams.SHARDS_TOLERANT, "true");
{noformat}

example: http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-Linux/11221/

{noformat}
Error Message:

Request took too long during query expansion. Terminating request.

Stack Trace:

org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Request took too long during query expansion. Terminating request.
        at __randomizedtesting.SeedInfo.seed([377AFD4F005F159A:B69C7357770075A6]:0)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:570)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:215)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:211)
        at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:91)
        at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:301)
        at org.apache.solr.TestDistributedSearch.queryPartialResults(TestDistributedSearch.java:596)
        at org.apache.solr.TestDistributedSearch.doTest(TestDistributedSearch.java:499)
        at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:875)
{noformat}

I don't fully understand what anshum meant by this TODO, and I think he's offline for the next few days, so I went ahead and commented this out with a link back to this jira for him to look at before resolving this jira.

> Don't allow runaway queries from harming Solr cluster health or search
> performance
> ----------------------------------------------------------------------------------
>
>                 Key: SOLR-5986
>                 URL: https://issues.apache.org/jira/browse/SOLR-5986
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>            Reporter: Steve Davids
>            Assignee: Anshum Gupta
>            Priority: Critical
>             Fix For: 5.0
>
>         Attachments: SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch,
> SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch,
> SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch, SOLR-5986.patch,
> SOLR-5986.patch
>
>
> The intent of this ticket is to have all distributed search requests stop
> wasting CPU cycles on requests that have already timed out or are so
> complicated that they won't be able to execute.
> We have come across a case where a nasty wildcard query within a proximity
> clause was causing the cluster to enumerate terms for hours even though the
> query timeout was set to minutes. This caused a noticeable slowdown within
> the system, which made us restart the replicas that happened to service that
> one request. In the worst-case scenario, users with a relatively low zk
> timeout value will have nodes start dropping from the cluster due to long GC
> pauses.
> [~amccurry] built a mechanism into Apache Blur to help with the issue in
> BLUR-142 (see commit comment for code, though look at the latest code on the
> trunk for newer bug fixes).
> Solr should be able to either prevent these problematic queries from running
> by some heuristic (possibly estimated size of heap usage) or be able to
> execute a thread interrupt on all query threads once the time threshold is
> met. This issue mirrors what others have discussed on the mailing list:
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200903.mbox/%3c856ac15f0903272054q2dbdbd19kea3c5ba9e105b...@mail.gmail.com%3E

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
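
For anyone following the thread, here is a minimal sketch of the general idea in the issue description: check a time budget cooperatively on every step of an enumeration and abort once it is exceeded, rather than letting the enumeration run for hours. This is not the code from the SOLR-5986 patches or from BLUR-142. The names TimeLimitedEnumeration, DeadlineIterator, TimeExceededException and consume are hypothetical and exist only for this illustration, and the plain string iterator stands in for a real terms enumeration.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.LongSupplier;

final class TimeLimitedEnumeration {

    /** Hypothetical exception type; not a Solr or Lucene class. */
    static final class TimeExceededException extends RuntimeException {
        TimeExceededException(String message) {
            super(message);
        }
    }

    /** Wraps an iterator and checks a deadline before every step. */
    static final class DeadlineIterator<T> implements Iterator<T> {
        private final Iterator<T> delegate;
        private final LongSupplier clock; // nanosecond clock, injectable for tests
        private final long deadline;

        DeadlineIterator(Iterator<T> delegate, long budgetNanos, LongSupplier clock) {
            this.delegate = delegate;
            this.clock = clock;
            this.deadline = clock.getAsLong() + budgetNanos;
        }

        private void checkDeadline() {
            // The subtraction form stays correct even if the counter wraps around.
            if (clock.getAsLong() - deadline > 0) {
                throw new TimeExceededException("Time budget exceeded during enumeration");
            }
        }

        @Override
        public boolean hasNext() {
            checkDeadline();
            return delegate.hasNext();
        }

        @Override
        public T next() {
            checkDeadline();
            return delegate.next();
        }
    }

    /** Counts the terms enumerated within the budget, or throws if it runs out. */
    static int consume(Iterator<String> terms, long budgetNanos, LongSupplier clock) {
        Iterator<String> limited = new DeadlineIterator<>(terms, budgetNanos, clock);
        int count = 0;
        while (limited.hasNext()) {
            limited.next();
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        Iterator<String> terms = List.of("kings", "eggs", "queens").iterator();
        // 100 ms budget, the same number the test above passes as timeAllowed.
        int n = consume(terms, 100_000_000L, System::nanoTime);
        System.out.println("enumerated " + n + " terms");
    }
}
```

Note that in this sketch an exceeded budget surfaces as an exception with no partial count returned, which is the same shape as the failure in the jenkins run above: the request is terminated outright instead of coming back with partial results.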