Hey Erick,

Firstly, thank you so much for your detailed response; it is really appreciated! Unfortunately, some of the context of my original message was lost because the screenshots weren't there. The additional latency spike absolutely does result in a poor user experience for us. Some of our legacy applications hit Solr quite a few times in order to render the client experience, so the compound effect can take a search result render from 500ms to 3-4 seconds for a chunk of our users every 10 minutes.
I know I'll never get this down to 0; I'm just striving to make what changes are feasible without going down too much of a rabbit hole. Please note I'm relatively new to Solr and have inherited a legacy stack.

The memory footprint is lower because I also reduced the size, not just the warming value. The warmup time is now under 1 second, which I'm good with.

I am working through the static warming queries today with one of the teams, so hopefully that will also have an impact. I will look at the docValues as well.

Thanks again
Karl

On 30/01/2020, 00:24, "Erick Erickson" <erickerick...@gmail.com> wrote:

Autowarming is significantly misunderstood. One of its purposes in “the bad old days” was to rebuild very expensive on-heap structures for searching, sorting, grouping, and function queries. These are exactly what docValues are designed to make much, much faster.

If you are still using spinning disks, the other benefit of warming queries is to read the index off disk and into MMapDirectory space. SSDs make this much faster too.

I often see two common mistakes:
1> no autowarming
2> excessive autowarming

I usually recommend people start with autowarm counts in the 10-20 range.

One implication of what you’ve said so far is that the additional 9 seconds your old autowarming took didn’t get you any benefit either, so putting it back isn’t indicated.

I’m not quite clear why you say your memory footprint is lower; it’s unrelated to autowarming unless you also decreased your size parameter. If you’re saying that your reduced cache size hasn’t changed your 95th percentile, I’d keep reducing it until it _did_ have a measurable effect.

The hit ratio is only loosely related to autowarming, so focusing on autowarming as a way to improve the hit ratio is probably the wrong focus.

So the first thing I’d do is make very, very sure that all the fields I used for grouping/sorting/faceting/function operations are docValues.
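
As a rough illustration of that first step, a schema field used for sorting or faceting might be declared with docValues enabled along these lines. The field name, field type, and the other attributes are hypothetical and not taken from this thread; note also that turning docValues on for an existing field requires reindexing the data.

```xml
<!-- Managed schema sketch: "price" and "pfloat" are assumed names for illustration only -->
<field name="price" type="pfloat" indexed="true" stored="true" docValues="true"/>
```
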
Second, a static warming query that ensured this, rather than relying on autowarming of the queryResultCache to happen to exercise those functions, would be another step.

NOTE: you don’t have to do all those operations on every field, just sorting on each field would suffice.

NOTE: as of Solr 7.6, you can add “uninvertible=false” to your field types to ensure that you have docValues set (operations that would otherwise uninvert the field on the heap fail instead), see: SOLR-12962

And then I’d ask: how much effort is smoothing out that kind of spike worth? You certainly see it with monitoring tools, but do users notice at all? If not, I wouldn’t spend all that much effort pursuing it…

Best,
Erick

> On Jan 29, 2020, at 4:48 PM, Karl Stoney <karl.sto...@autotrader.co.uk.INVALID> wrote:
>
> So interestingly, by tweaking my filter cache I've got the warming time down to 1s (from 10!) and also reduced my memory footprint due to the smaller cache size.
>
> However, I still get these latency spikes (these changes have made no difference to them).
>
> So the theory about them being due to the warming being too intensive is wrong.
>
> I know the images didn't load btw, so when I say spike I mean p95 response time going from 50ms to 100-120ms momentarily.
> ________________________________
> From: Walter Underwood <wun...@wunderwood.org>
> Sent: 29 January 2020 21:30
> To: solr-user@lucene.apache.org <solr-user@lucene.apache.org>
> Subject: Re: Solr Searcher 100% Latency Spike
>
> Looking at the log, that takes one or two seconds after a complete batch reload (master/slave). So that is loading a cold index, all new files. This is not a big index, about a half million book titles.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/ (my blog)
>
>> On Jan 29, 2020, at 1:21 PM, Karl Stoney <karl.sto...@autotrader.co.uk.INVALID> wrote:
>>
>> Out of curiosity, could you define "fast"?
>> I'm wondering what sort of figures people target their searcher warm time at
>> ________________________________
>> From: Walter Underwood <wun...@wunderwood.org>
>> Sent: 29 January 2020 21:13
>> To: solr-user@lucene.apache.org <solr-user@lucene.apache.org>
>> Subject: Re: Solr Searcher 100% Latency Spike
>>
>> I use a static set of warming queries, about 20 of them. That is fast and gets a decent amount of the index into file buffers. Your top queries won’t change much unless you have a news site or a seasonal business.
>>
>> Like this:
>>
>> <listener event="newSearcher" class="solr.QuerySenderListener">
>>   <arr name="queries">
>>     <!-- Top non-numeric query words from August 2011 rush -->
>>     <!-- One <lst> per warming query: each <lst> is sent as its own request -->
>>     <lst><str name="q">introduction</str></lst>
>>     <lst><str name="q">intermediate</str></lst>
>>     <lst><str name="q">fundamentals</str></lst>
>>     <lst><str name="q">understanding</str></lst>
>>     <lst><str name="q">introductory</str></lst>
>>     <lst><str name="q">precalculus</str></lst>
>>     <lst><str name="q">foundations</str></lst>
>>     <lst><str name="q">microeconomics</str></lst>
>>     <lst><str name="q">microbiology</str></lst>
>>     <lst><str name="q">macroeconomics</str></lst>
>>     <lst><str name="q">discovering</str></lst>
>>     <lst><str name="q">international</str></lst>
>>     <lst><str name="q">mathematics</str></lst>
>>     <lst><str name="q">organizational</str></lst>
>>     <lst><str name="q">criminology</str></lst>
>>     <lst><str name="q">developmental</str></lst>
>>     <lst><str name="q">engineering</str></lst>
>>   </arr>
>> </listener>
>>
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/ (my blog)
>>
>>> On Jan 29, 2020, at 1:01 PM, Shawn Heisey <apa...@elyograg.org> wrote:
>>>
>>> On 1/29/2020 12:44 PM, Karl Stoney wrote:
>>>> Looking for a bit of support here. When we soft commit (every 10 minutes), we get a latency spike that means response times for Solr roughly double, as you can see in this screenshot:
>>>
>>> Attachments almost never make it to the list. We cannot see any of your screenshots.
>>>
>>>> They do correlate to filterCache warmup, which seems to take between 10s and 30s:
>>>> We don't have any other caches enabled, due to the high level of cardinality of the queries.
>>>> The spikes are specifically on /select
>>>> We have the following autowarm configuration for the filterCache:
>>>> <filterCache class="solr.FastLRUCache"
>>>>              size="8192"
>>>>              initialSize="8192"
>>>>              cleanupThread="true"
>>>>              autowarmCount="900"/>
>>>
>>> Autowarm, especially on filterCache, can be an extremely lengthy process. What Solr must do in order to warm the cache here is execute up to 900 queries, sequentially, on the new index. That can take a lot of time and use a lot of resources like CPU and I/O.
>>>
>>> In order to reduce the impact of cache warming, I had to reduce my own autowarmCount on the filterCache to 4.
>>>
>>> Thanks,
>>> Shawn
>>
>> This e-mail is sent on behalf of Auto Trader Group Plc, Registered Office: 1 Tony Wilson Place, Manchester, Lancashire, M15 4FN (Registered in England No. 9439967). This email and any files transmitted with it are confidential and may be legally privileged, and intended solely for the use of the individual or entity to whom they are addressed. If you have received this email in error please notify the sender. This email message has been swept for the presence of computer viruses.
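
For reference, the suggestions made in this thread (a much smaller autowarmCount on the filterCache, plus static warming queries that sort on each field used for sorting or faceting) could be sketched inside the <query> section of solrconfig.xml roughly as follows. This is only a sketch under assumptions: the cache size of 512, the autowarmCount of 16 (picked from the suggested 10-20 range), and the field names "price" and "year" are illustrative and do not come from anyone's actual configuration.

```xml
<!-- solrconfig.xml sketch; every value below is an illustrative assumption -->
<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="16"/>

<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <!-- One <lst> per warming request; "price" and "year" are hypothetical field names -->
    <lst><str name="q">*:*</str><str name="sort">price asc</str><str name="rows">10</str></lst>
    <lst><str name="q">*:*</str><str name="sort">year desc</str><str name="rows">10</str></lst>
  </arr>
</listener>
```

Whether values like these help would need to be checked against the warmup time reported for the new searcher and the p95 response time around each soft commit.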