I tried using Jetty's QoS filter for rate limiting queries. It has a good option to apply different rates per URL pattern.
However, it is not being picked up by Solr, and the details are shared at
https://stackoverflow.com/questions/45536986/why-is-this-qos-jetty-filter-not-working
Has someone worked on this before and can help?

Thanks
SG

On Fri, Aug 4, 2017 at 5:51 PM, S G <sg.online.em...@gmail.com> wrote:

> The timeAllowed parameter is not a good choice for rate limiting and
> could crash the whole Solr cluster.
> In fact, the timeAllowed parameter should increase the chances of
> crashing the whole cluster:
>
> When the timeAllowed for a query is over, its client will get a failure,
> but the server handling the query will not kill the thread running that
> query. So Solr itself would still be working on that long-running query
> while the client has already timed out.
> These failure-receiving client threads are now free to process other
> requests: retry the failed ones or fire new queries at Solr.
> This should suffocate Solr even more, although the client application's
> threads will never get blocked.
>
> With a rate limiter, we win both ways: the clients' extra traffic gets
> rejection responses, and all the Solr nodes breathe easy too.
> IMO, the timeAllowed parameter will almost always kill the whole Solr
> cluster.
>
> -SG
>
> On Fri, Aug 4, 2017 at 3:30 PM, Varun Thacker <va...@vthacker.in> wrote:
>
>> Hi Hrishikesh,
>>
>> I think SOLR-7344 is probably an important addition to Solr. It could
>> help users isolate analytical queries (streaming), search queries, and
>> indexing requests, and throttle requests.
>>
>> Let's continue the discussion on the Jira.
>>
>> On Thu, Aug 3, 2017 at 2:03 AM, Rick Leir <rl...@leirtech.com> wrote:
>>
>> > On 2017-08-02 11:33 PM, Shawn Heisey wrote:
>> >
>> >> On 8/2/2017 8:41 PM, S G wrote:
>> >>
>> >>> Problem is that peak load estimates are just estimates.
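[For reference, the QoS filter mentioned at the top of the thread is normally wired into Jetty's web.xml (or webdefault.xml) roughly like this. This is only a sketch: the maxRequests/waitMs values and the /select/* URL pattern are illustrative, not recommendations, and as the linked Stack Overflow question notes, this configuration does not appear to take effect for Solr.]

```xml
<!-- Sketch of a Jetty QoSFilter entry; values are illustrative only. -->
<filter>
  <filter-name>QoSFilter</filter-name>
  <filter-class>org.eclipse.jetty.servlets.QoSFilter</filter-class>
  <init-param>
    <!-- maximum number of requests handled concurrently -->
    <param-name>maxRequests</param-name>
    <param-value>10</param-value>
  </init-param>
  <init-param>
    <!-- how long a request may wait for a slot before being suspended -->
    <param-name>waitMs</param-name>
    <param-value>50</param-value>
  </init-param>
</filter>
<filter-mapping>
  <filter-name>QoSFilter</filter-name>
  <url-pattern>/select/*</url-pattern>
</filter-mapping>
```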
>> >>> It would be nice to enforce them from the Solr side such that if a
>> >>> rate higher than that is seen at any core, the core will
>> >>> automatically begin to reject requests.
>> >>> Such a feature would contribute to cluster stability while making
>> >>> sure the customer gets an exception to remind them of the slower
>> >>> rate.
>> >>>
>> >> Solr doesn't have anything like this. This is primarily because there
>> >> is no network server code in Solr. The networking is provided by the
>> >> servlet container. The container in modern Solr versions is nearly
>> >> guaranteed to be Jetty. As long as I have been using Solr, it has
>> >> shipped with a Jetty container.
>> >>
>> >> https://wiki.apache.org/solr/WhyNoWar
>> >>
>> >> I have no idea whether Jetty is capable of the kind of rate limiting
>> >> you're after. If it is, it would be up to you to figure out the
>> >> configuration.
>> >>
>> >> You could always put a proxy server like haproxy in front of Solr.
>> >> I'm pretty sure that haproxy is capable of rejecting connections when
>> >> the request rate gets too high. Other proxy servers (nginx, Apache,
>> >> F5 BigIP, solutions from Microsoft, Cisco, etc.) are probably also
>> >> capable of this.
>> >>
>> >> IMHO, intentionally causing connections to fail when a limit is
>> >> exceeded would not be a very good idea. When the rate gets too high,
>> >> the first thing that happens is that all the requests slow down. The
>> >> slowdown could be dramatic. As the rate continues to increase, some
>> >> of the requests probably would begin to fail.
>> >>
>> >> What you're proposing would be guaranteed to cause requests to fail.
>> >> Failing requests are even more likely than slow requests to result in
>> >> users finding a new source for whatever service they are getting from
>> >> your organization.
>> >>
>> > Shawn,
>> > Agreed, a connection limit is not a good idea.
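[On Shawn's haproxy suggestion above: request-rate limiting there is typically done with a stick table tracking per-client request rates. A minimal sketch only; the frontend/backend names, addresses, and the 100-requests-per-10-seconds threshold are all made up for illustration, not taken from this thread.]

```
# Sketch: reject clients exceeding ~100 requests per 10 s (illustrative values)
frontend solr_front
    bind *:8983
    stick-table type ip size 100k expire 30s store http_req_rate(10s)
    http-request track-sc0 src
    http-request deny if { sc_http_req_rate(0) gt 100 }
    default_backend solr_back

backend solr_back
    server solr1 10.0.0.1:8983 check
```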
>> > But there is the timeAllowed parameter
>> > <https://cwiki.apache.org/confluence/display/solr/Common+Query+Parameters#CommonQueryParameters-ThetimeAllowedParameter>
>> > timeAllowed - This parameter specifies the amount of time, in
>> > milliseconds, allowed for a search to complete. If this time expires
>> > before the search is complete, any partial results will be returned.
>> >
>> > https://stackoverflow.com/questions/19557476/timing-out-a-query-in-solr
>> >
>> > With timeAllowed, you need not estimate what connection rate is
>> > unbearable. Rather, you would set a max response time. If some queries
>> > take much longer than other queries, then this would cause the long
>> > ones to fail, which might be a good strategy. However, if queries
>> > normally all take about the same time, then this would cause all
>> > queries to return partial results until the server recovers, which
>> > might be a bad strategy. In this case, Walter's post is sensible.
>> >
>> > A previous thread suggested that timeAllowed could cause bad
>> > performance on some cloud servers.
>> > cheers -- Rick
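[For completeness: timeAllowed is just a regular query parameter on a select request, so using it looks roughly like the sketch below. The host, collection name, and 500 ms budget are made-up illustrative values. Note the caveat from S G's mail above: when the budget expires, Solr returns partial results (flagging partialResults=true in the response header) but does not kill the server-side work.]

```python
from urllib.parse import urlencode

# Build a Solr select URL with a 500 ms time budget (values are illustrative).
params = {
    "q": "title:solr",
    "rows": 10,
    "timeAllowed": 500,  # milliseconds allowed for the search to complete
}
url = "http://localhost:8983/solr/mycollection/select?" + urlencode(params)
print(url)
```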