I would like to add one important consideration, which has caused me several headaches in the past: the firstSearcher queries are executed against the /select request handler, each one passed as the q parameter. This means you need to be careful with the defaults, appends and invariants configured on that handler, because they also apply to the warming queries. The last time I checked, it was not possible to specify which handler the firstSearcher/newSearcher queries hit. I may be wrong and the situation could have changed, so I recommend taking a look.
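
For reference, here is a minimal sketch of what a firstSearcher listener in solrconfig.xml typically looks like, together with the useColdSearcher flag Shawn mentions below (both live in the <query> section). The queries and the field names (category, price) are made-up placeholders, not taken from anyone's real setup. Each <lst> entry carries the parameters of one warming query, so it is worth checking how they interact with the defaults, appends and invariants of the handler on your Solr version.

```xml
<query>
  <!-- Runs once, when the very first searcher of the core is opened. -->
  <listener event="firstSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <!-- Hypothetical warming queries; keep this list short. -->
      <lst>
        <str name="q">*:*</str>
        <str name="sort">price asc</str>
      </lst>
      <lst>
        <str name="q">category:books</str>
      </lst>
    </arr>
  </listener>

  <!-- Serve requests from the new searcher without waiting for warming. -->
  <useColdSearcher>true</useColdSearcher>
</query>
```

A newSearcher listener is declared the same way with event="newSearcher", usually with a shorter list of queries.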
Cheers
--------------------------
Alessandro Benedetti
Apache Lucene/Solr Committer
Director, R&D Software Engineer, Search Consultant

www.sease.io


On Mon, 8 Nov 2021 at 22:45, Shawn Heisey <[email protected]> wrote:

> On 11/8/21 2:05 PM, Nick Vladiceanu wrote:
> > Ok, makes sense. However, when the core is initially created, the data
> > is not yet there. Running the firstSearcher queries against empty index
> > won’t have any beneficial effects when it comes to cache warming. Is there
> > any way to open the first searcher after the data is pulled from the
> > leader, and therefore, run the warmup queries? What’s the point of opening
> > the first searcher when initially the core is created, if there is no data?
>
> For the purposes of things like warming queries, the searcher isn't
> aware that the index is empty when it starts. It just knows when it is
> the first searcher, and when that is the case, it runs any configured
> firstSearcher queries. Making it aware of something like that for the
> purposes of avoiding such queries is possible, but that would add a lot
> of complexity. Bugs are more likely as the code gets more complex. And
> I would strongly argue that any benefits of added complexity in such an
> important piece of code do not outweigh the risks.
>
> If this really concerns you, just have your indexing software reload the
> core after the index is built, so firstSearcher queries are executed
> again. If the list of firstSearcher queries takes a long time to run,
> just set useColdSearcher to true, and the searcher will be made ready
> for queries before the warming queries are executed. I don't remember
> where in solrconfig.xml that config is.
>
> What I will generally recommend that people do is define a set of
> queries in firstSearcher that will do initial warming on a completely
> cold index, set useColdSearcher to true, and mostly rely on cache
> autowarming after that. If cache autowarming doesn't do a good job,
> then there are some possible remedies:
>
> 1) Add more memory so the OS can cache the index better.
> 2) Change the cache autowarming config.
> 3) Define some queries in newSearcher.
>    newSearcher is usually a smaller list than firstSearcher.
>
> A lot of performance issues are cured by adding more memory so the OS
> can cache the index better. Good index caching is critical to getting
> good performance out of any Lucene based software, which includes Solr.
> Note that I am not talking about heap size -- I am talking about memory
> that is not allocated to any program.
>
> When building the list of firstSearcher queries, the idea is to begin
> the process of populating the OS disk cache and Solr's caches, not to
> run every possible query variation users are likely to create. If you
> end up with more than a handful of queries in that list, it's probably
> too long.
>
> Thanks,
> Shawn
