[ 
https://issues.apache.org/jira/browse/SOLR-15252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17369704#comment-17369704
 ] 

Walter Underwood commented on SOLR-15252:
-----------------------------------------

I see that this has been closed, but it is a feature we would use.

I've had deep paging (by bots) cause a Solr outage multiple times, both at 
Netflix and at Chegg. Yes, we put in defenses in the middle tier, but Solr 
really should defend against this. It was a denial of service vulnerability in 
1.3 and it is still a vulnerability in 8.7.

I would be fine with a max_row or deepest_row parameter that returns a 400 
Bad Request for any request where start+rows exceeds the threshold.
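
A minimal sketch of that check, for illustration only. The class name, the
method name, and the deepestRow argument are hypothetical and are not part of
any Solr API; the default of 500 is just the middle-tier limit mentioned in
this comment. The sum is computed as a long so that rows=Integer.MAX_VALUE
cannot overflow and slip past the limit.

```java
public final class DeepPagingGuard {
    /** Hypothetical default, borrowed from the middle-tier limit in this comment. */
    public static final long DEFAULT_DEEPEST_ROW = 500L;

    public static final int OK = 200;
    public static final int BAD_REQUEST = 400;

    private DeepPagingGuard() {}

    /**
     * Returns the HTTP status a request would receive under the proposed rule:
     * 400 when start + rows is greater than deepestRow, 200 otherwise.
     * Widening to long first avoids int overflow for very large values.
     */
    public static int statusFor(int start, int rows, long deepestRow) {
        long deepestRequested = (long) start + (long) rows;
        return deepestRequested > deepestRow ? BAD_REQUEST : OK;
    }

    public static void main(String[] args) {
        // A normal first page is allowed.
        System.out.println(statusFor(0, 10, DEFAULT_DEEPEST_ROW));
        // A request reaching exactly the limit is still allowed.
        System.out.println(statusFor(490, 10, DEFAULT_DEEPEST_ROW));
        // One row past the limit is rejected.
        System.out.println(statusFor(491, 10, DEFAULT_DEEPEST_ROW));
        // The classic "give me everything" request is rejected.
        System.out.println(statusFor(0, Integer.MAX_VALUE, DEFAULT_DEEPEST_ROW));
    }
}
```

In a real deployment the threshold would presumably come from configuration
or a request parameter rather than a constant.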

Every time this has happened, it has been a huge pain to debug. A small number 
of queries, like 100, can take down a large cluster. Because it is just a few 
queries, it is not caught by normal bot defenses and it is hard to spot in logs.

Our middle-tier limit is 500. That is plenty of results for our use cases. I 
can imagine different limits for different uses (web, mobile), but we don't 
need that now.

> Solr should log WARN log when a query requests huge rows number
> ---------------------------------------------------------------
>
>                 Key: SOLR-15252
>                 URL: https://issues.apache.org/jira/browse/SOLR-15252
>             Project: Solr
>          Issue Type: Improvement
>          Components: query
>            Reporter: Jan Høydahl
>            Assignee: Jan Høydahl
>            Priority: Major
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> We have all seen it - clients that use Integer.MAX_VALUE or 10000000 as rows 
> parameter, to just make sure they get all possible results. And this of 
> course leads to high GC pauses since Lucene allocates an array up front to 
> hold results.
> Solr should either log WARN when it encounters a value above a certain 
> threshold, such as 100k (at that point you should use cursormark instead), 
> or simply respond with a 400 error and provide a system property or query 
> parameter that folks can use to override it if they know what they are doing.



