Perhaps the tricky part here is that Solr builds its caches for *parts* of the query. In other words, a query that sorts on field A will populate the cache for field A, and any other query that sorts on field A will use the same cache. So you really need just enough queries to populate, in this case, the fields you'll sort by. You could even combine multiple sorts in a single query and populate the sort caches all at once if you wanted.
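
As a rough sketch of the "multiple sorts on a single query" idea, here is how one such warming URL could be put together. Everything specific in it is an assumption on my part: the base URL, the match-all query, and the field names (price, updated_at) are placeholders, so substitute whatever your schema actually sorts on.

```python
from urllib.parse import urlencode


def build_sort_warming_url(base_url, sort_fields, query="*:*", rows=1):
    """Build one select URL whose sort clause names every field in sort_fields."""
    # Solr's sort parameter takes a comma-separated list of "<field> <direction>".
    sort_clause = ",".join(f"{field} asc" for field in sort_fields)
    params = {"q": query, "sort": sort_clause, "rows": rows}
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical host and field names, for illustration only.
url = build_sort_warming_url("http://localhost:8983/solr", ["price", "updated_at"])
print(url)
```

I've left rows at 1 rather than 0 on the assumption that a request returning no rows may skip the sort altogether; that is worth checking against your version. The same parameters could just as well go into a firstSearcher/newSearcher entry instead of being fetched over HTTP.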
Similarly for faceting and filter queries. You might well be able to make just a few queries that fill up all the relevant caches rather than using 100s, but you know your schema way better than I do.

What I meant about replicating work is that trying to use your after hook to fire off the queries probably doesn't buy you anything over firstSearcher/newSearcher lists.

All that said, though, if you really don't want to put your queries in the config file, it would be relatively trivial to write a small Java app that uses SolrJ to query the server, reading the queries from anyplace you choose, and call it from the after hook. Personally, I think this is a high-cost option compared to having the list in the config file because of the added complexity, but that's your call.

Best
Erick

On Wed, Dec 8, 2010 at 12:25 PM, Mark <static.void....@gmail.com> wrote:

> We only replicate twice an hour so we are far from real-time indexing. Our
> application never writes to master; rather, we just pick up all changes using
> updated_at timestamps when delta-importing using DIH.
>
> We don't have any warming queries in firstSearcher/newSearcher event
> listeners. My initial post was asking how I would go about doing this with a
> large number of queries. Our queries themselves tend to have a lot of
> faceting and other restrictions on them so I would rather not list them all
> out using XML. I was hoping there was some sort of log replayer handler or
> class that would replay a bunch of queries while the node is offline. When
> it's done, it will bring the node back online ready to serve requests.
>
>
> On 12/8/10 6:15 AM, Jonathan Rochkind wrote:
>
>> How often do you replicate? Do you know how long your warming queries take
>> to complete?
>>
>> As others in this thread have mentioned, if your replications (or ordinary
>> commits, if you weren't using replication) happen quicker than warming takes
>> to complete, you can get overlapping indexes being warmed up, and run out of
>> RAM (causing garbage collection to take lots of CPU, if not an out-of-memory
>> error), or otherwise block on CPU with lots of new indexes being warmed at
>> once.
>>
>> Solr is not very good at providing 'real time indexing' for this reason,
>> although I believe there are some features in post-1.4 trunk meant to
>> support 'near real time search' better.
>> ________________________________________
>> From: Mark [static.void....@gmail.com]
>> Sent: Tuesday, December 07, 2010 10:24 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Warming searchers/Caching
>>
>> Maybe I should explain my problem in a little more detail.
>>
>> The problem we are experiencing is that after a delta-import we notice an
>> extremely high load time on the slave machines that just replicated. It
>> goes away after a minute or so of production traffic once everything is
>> cached.
>>
>> I already have a before/after hook that is in place before/after
>> replication takes place. The before hook removes the slave from the
>> cluster and then starts to replicate. When it's done it calls the after
>> hook and I would like to warm up the cache in this method so no users
>> experience extremely long wait times.
>>
>> On 12/7/10 4:22 PM, Markus Jelsma wrote:
>>
>>> XInclude works fine but that's not what you're looking for, I guess.
>>> Having the 100 top queries is overkill anyway and it can take too long
>>> for a new searcher to warm up.
>>>
>>> Depending on the type of requests, I usually tend to limit warming to
>>> popular filter queries only as they generate a very high hit ratio and
>>> make caching useful [1].
>>>
>>> If there are very popular user-entered queries having a high initial
>>> latency, I'd have them warmed up as well.
>>>
>>> [1]: http://wiki.apache.org/solr/SolrCaching#Tradeoffs
>>>
>>>> Warning: I haven't used this personally, but XInclude looks like what
>>>> you're after, see: http://wiki.apache.org/solr/SolrConfigXml#XInclude
>>>>
>>>> Best
>>>> Erick
>>>>
>>>> On Tue, Dec 7, 2010 at 6:33 PM, Mark <static.void....@gmail.com> wrote:
>>>>
>>>>> Is there any plugin or easy way to auto-warm/cache a new searcher with
>>>>> a bunch of searches read from a file? I know this can be accomplished
>>>>> using the EventListeners (newSearcher, firstSearcher) but I'd rather
>>>>> not add 100+ queries to my solrconfig.xml.
>>>>>
>>>>> If there is no hook/listener available, is there some sort of Handler
>>>>> that performs this sort of function? Thanks!