Re: How does Solr handle overloads so well?

2012-09-22 Thread Mike Gagnon
This is embarrassing. I just realized that in the experiments where I saw
Solr providing good service in the face of the overload requests, I was
actually sending requests at a rate of 30 requests per second, not 300
requests per second. Once I ratcheted up the rate a little bit, Solr
started to overload like the other applications I've tested.

Thanks for your time, and sorry for the mistake!

Mike
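
A mismatch like this between the intended and the actual request rate can
be caught by computing the achieved rate from the load generator's own send
timestamps. The following is a minimal sketch, assuming the generator
records one timestamp (in seconds) per request sent; `achieved_rate`,
`rate_matches` and the sample timestamps are hypothetical and are not part
of the harness used in these experiments.

```python
def achieved_rate(send_times):
    """Return the requests per second implied by a list of send timestamps."""
    if len(send_times) < 2:
        raise ValueError("need at least two timestamps")
    ordered = sorted(send_times)
    elapsed = ordered[-1] - ordered[0]
    if elapsed <= 0:
        raise ValueError("timestamps must span a positive interval")
    # n timestamps delimit n - 1 inter-request intervals.
    return (len(ordered) - 1) / elapsed


def rate_matches(send_times, target_rps, tolerance=0.1):
    """True if the achieved rate is within a fractional tolerance of the target."""
    actual = achieved_rate(send_times)
    return abs(actual - target_rps) <= tolerance * target_rps


# Hypothetical data: 31 requests sent evenly over one second (30 intervals).
timestamps = [i / 30 for i in range(31)]
print(achieved_rate(timestamps))
print(rate_matches(timestamps, 300))
```

Running a check of this kind against the intended rate before each
experiment makes a 30-versus-300 discrepancy visible immediately.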

On Fri, Sep 21, 2012 at 7:19 AM, Mike Gagnon  wrote:

> Thanks. If Solr doesn't have any special logic for dealing with
> algorithmic-complexity attack-like overloads, then it sounds like Jetty and
> Tomcat are responsible for Solr's unusually good performance in my
> experiments (unusual compared to other non-Java web applications).
>
> Cheers,
> Mike

Re: How does Solr handle overloads so well?

2012-09-21 Thread Mike Gagnon
Thanks. If Solr doesn't have any special logic for dealing with
algorithmic-complexity attack-like overloads, then it sounds like Jetty and
Tomcat are responsible for Solr's unusually good performance in my
experiments (unusual compared to other non-Java web applications).

Cheers,
Mike

On Wed, Sep 19, 2012 at 8:30 AM, Walter Underwood wrote:

> The front-end code protection that I mentioned was outside of Solr. At
> that time, requests with very large start values were slow, so we put code
> in the front end to never request those. Even if the user wanted page 5000
> of the results, they would get page 100.
>
> Now, those requests are fast, so that external protection is not needed.
>
> I was running overload tests this summer and could not get Solr to behave
> badly. The throughput would drop off with overload, but not too bad. This
> was all with simple queries on a 1.2M doc index.
>
> wunder
> Walter Underwood
> Search Guy, Chegg
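
The front-end protection described above amounts to clamping the requested
page before it is turned into a query. A minimal sketch of that idea
follows; the cap of 100 comes from the example in the message, while the
function names and the 10 rows per page are assumptions made for
illustration, not Chegg's actual front-end code.

```python
MAX_PAGE = 100  # cap taken from the "page 5000 becomes page 100" example


def clamp_page(requested_page, max_page=MAX_PAGE):
    """Clamp a 1-based page number to the range [1, max_page]."""
    return max(1, min(requested_page, max_page))


def start_offset(requested_page, rows_per_page=10, max_page=MAX_PAGE):
    """Translate a page number into a Solr `start` value after clamping."""
    page = clamp_page(requested_page, max_page)
    return (page - 1) * rows_per_page


print(clamp_page(5000))
print(start_offset(5000))
```

The front end would then pass the clamped value as the `start` parameter,
so Solr never sees a very deep paging request.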

Re: How does Solr handle overloads so well?

2012-09-19 Thread Mike Gagnon
Via this bug: https://issues.apache.org/jira/browse/SOLR-2631


> ... Solr can infinite loop, use 100% CPU and stack overflow, if you
> execute the following HTTP request:
> - http://localhost:8983/solr/select?qt=/admin/ping
> - http://localhost:8983/solr/admin/ping?qt=/admin/ping

I am running Solr 3.1, which has that bug.
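
For reference, the two requests quoted from the bug report can be built and
issued against a local test instance as sketched below. This is only an
illustration: the helper names and the 2-second client timeout are
assumptions, and the base URL is the localhost address from the bug report.
A client-side timeout matters here because an affected server may never
send a response.

```python
import urllib.request
from urllib.parse import urlencode


def overload_urls(base="http://localhost:8983/solr"):
    """Build the two request URLs quoted in SOLR-2631 for a given base URL."""
    query = urlencode({"qt": "/admin/ping"}, safe="/")
    return [f"{base}/select?{query}", f"{base}/admin/ping?{query}"]


def fetch_status(url, timeout_s=2.0):
    """Return the HTTP status, or None on an error response, failure or timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status
    except OSError:  # covers URLError, HTTPError and socket timeouts
        return None


for url in overload_urls():
    print(url)
```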

Thanks,
Mike

On Wed, Sep 19, 2012 at 8:20 AM, Erik Hatcher wrote:

> How are you triggering an infinite loop in your requests to Solr?
>
> Erik


Re: How does Solr handle overloads so well?

2012-09-19 Thread Mike Gagnon
[ I am sorry for breaking the thread, but my inbox has neither received my
original post to the mailing list, nor Otis's response (so I can't reply to
his response) ]

Thanks a bunch for your response Otis.  Let me more thoroughly explain my
experimental workload and why I am surprised Solr works so well.

The most important characteristic of my workload is that many of the
requests (60 per second) cause infinite loops within Solr. That is, each of
those requests causes a separate infinite loop within its request context.

This workload is similar to an algorithmic-complexity attack, a type of
DoS. In every web-app stack I've tested (except Solr/Jetty and
Solr/Tomcat) such workloads cause an immediate and complete denial of
service. What happens in these vulnerable applications is that the thread
pool fills up with requests stuck in infinite loops, and incoming requests
are rejected.

But Solr manages to survive such an attack. My best guess is that Solr has
an especially good overload strategy that quickly kicks out the
infinite-loop requests, which lowers CPU contention and allows other
requests to be admitted.
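
The difference between the two behaviours can be illustrated with a toy
model of a fixed-size worker pool. This is a sketch of the hypothesis
only: it does not describe how Solr, Jetty or Tomcat actually schedule
requests, and the function name, the tick-based timing and all parameter
values are assumptions chosen for illustration.

```python
def simulate(pool_size, ticks, normal_per_tick, overload_per_tick,
             timeout_ticks=None):
    """Toy model of a fixed-size worker pool under an infinite-loop workload.

    Normal requests finish within the tick in which they are admitted.
    Overload requests never finish on their own. If `timeout_ticks` is set,
    an overload request is evicted once it has held a worker that long.
    Returns (normal_completed, normal_rejected).
    """
    stuck = []  # start tick of every worker held by an overload request
    completed = rejected = 0
    for now in range(ticks):
        if timeout_ticks is not None:
            # Evict overload requests that have exceeded the timeout.
            stuck = [t for t in stuck if now - t < timeout_ticks]
        # Overload requests arrive first and take any free workers.
        for _ in range(overload_per_tick):
            if len(stuck) < pool_size:
                stuck.append(now)
        # Remaining workers serve normal requests; the rest are rejected.
        free = pool_size - len(stuck)
        served = min(free, normal_per_tick)
        completed += served
        rejected += normal_per_tick - served
    return completed, rejected


print(simulate(10, 100, 5, 1))                   # no eviction
print(simulate(10, 100, 5, 1, timeout_ticks=3))  # eviction after 3 ticks
```

Without eviction, every worker is eventually held by a looping request and
all later normal requests are rejected. With eviction, the number of stuck
workers stays bounded and normal requests keep being served.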

My first guess would be that Tomcat or Jetty is responsible for the good
response to overload. However, there was a good discussion in 2008 on this
mailing list about Solr security:
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200811.mbox/browser

In this discussion, Walter Underwood commented: "We have protected against
several different DoS problems in our front-end code."

Perhaps it is these front-end defenses that help Solr survive my workloads?

Thanks!
Mike Gagnon


> Hm, I'm not sure how to approach this. Solr is not alone here - there's
> a container like Jetty, Solr inside it and Lucene inside Solr.
> Next, that index is really small, so there is no disk IO. The request
> rate is also not super high and if you did this over a fast connection
> then there are also no issues with slow response writing or with having
> lots of concurrent connections or running out of threads ...
>
> ...so it's not really that surprising solr keeps working :)
>
> But...tell us more.
>
> Otis
> --
> Performance Monitoring - http://sematext.com/spm


How does Solr handle overloads so well?

2012-09-14 Thread Mike Gagnon
Hi,

I have been studying how server software responds to requests that cause
CPU overloads (such as infinite loops).

In my experiments I have observed that Solr performs unusually well when
subjected to such loads. Every other piece of web software I've
experimented with drops to zero service under such loads. Do you know how
Solr achieves such good performance?

I am guessing that when Solr is overloaded it sheds load to make room for
incoming requests, but I could not find any documentation that describes
Solr's overload strategy.

Experimental setup: I ran Solr 3.1 on a 12-core machine with 12 GB RAM,
using it to index and search about 10,000 pages on MediaWiki. I tested both
Solr+Jetty and Solr+Tomcat. I submitted a variety of Solr queries at a rate
of 300 requests per second. At the same time, I submitted "overload
requests" at a rate of 60 requests per second. Each overload request caused
an infinite loop in Solr via https://issues.apache.org/jira/browse/SOLR-2631.

With Jetty, about 70% of non-overload requests completed, with 95% of
requests completing within 0.6 seconds.
With Tomcat, about 34% of non-overload requests completed, with 95% of
requests completing within 0.6 seconds.

I also ran Solr+Jetty with non-overload requests coming in at 65 requests
per second (overload requests remained at 60 requests per second). In this
workload, the completion rate dropped to 15% and the 95th percentile latency
increased to 25.
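
The two metrics reported above, completion rate and 95th percentile
latency, can be computed from per-request measurements as sketched below.
This is a minimal illustration using the nearest-rank percentile
definition. The function names and the sample numbers are hypothetical and
are not the measurements from these experiments.

```python
import math


def completion_rate(sent, completed):
    """Fraction of sent requests that completed."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return completed / sent


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical data: 300 requests sent, 210 completed, and 20 sample
# latencies in milliseconds (10, 20, ..., 200).
latencies_ms = list(range(10, 210, 10))
print(completion_rate(300, 210))
print(percentile(latencies_ms, 95))
```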

Cheers,
Mike Gagnon


How does Solr handle overloads so well?

2012-09-12 Thread Mike Gagnon
Hi,

I have been studying how server software responds to requests that cause
CPU overloads (such as infinite loops).

In my experiments I have observed that Solr performs unusually well when
subjected to such loads. Every other piece of web software I've
experimented with drops to zero service under such loads. Do you know how
Solr achieves such good performance? I am guessing that when Solr is
overloaded it sheds load to make room for incoming requests, but I could not
find any documentation that describes Solr's overload strategy.

Experimental setup: I ran Solr 3.1 on a 12-core machine with 12 GB RAM,
using it to index and search about 10,000 pages on MediaWiki. I tested both
Solr+Jetty and Solr+Tomcat. I submitted a variety of Solr queries at a rate
of 300 requests per second. At the same time, I submitted "overload
requests" at a rate of 60 requests per second. Each overload request caused
an infinite loop in Solr via https://issues.apache.org/jira/browse/SOLR-2631.

With Jetty, about 70% of non-overload requests completed, with 95% of
requests completing within 0.6 seconds.
With Tomcat, about 34% of non-overload requests completed, with 95% of
requests completing within 0.6 seconds.

I also ran Solr+Jetty with non-overload requests coming in at 65 requests
per second (overload requests remained at 60 requests per second). In this
workload, the completion rate dropped to 15% and the 95th percentile latency
increased to 25.

Cheers,
Mike Gagnon