Martin Davidsson wrote:
I've tried to read up on how to decide, when writing a query, what criteria goes in the q parameter and what goes in the fq parameter, to achieve optimal performance. Is there [...] some kind of rule of thumb to help me decide how to split things up when querying against one or more fields.
This is a good question. I don't know whether there is any such rule. I'm going to sum up my understanding of filter queries, hoping that the pros will point out any flaws in my assumptions.

http://wiki.apache.org/solr/SolrCaching - filterCache

A filter query is cached, so the more often it is repeated, the more useful it is. We know how often certain queries arise, or at least have the means to collect that data, so we know which ones might be candidates for filtering.

The result of a filter query is cached and then used to filter the primary query result by set intersection. If my filter query result comprises more than 50% of the entire document collection, its selectivity is poor. I might need it despite this fact, but it might also be worthwhile to think about how to reframe the requirement to allow for more efficient filters.

Memory consumption is probably not a great concern here, as the cache stores only document IDs. (And if those are integers, it's just 4 bytes each.) So with 100 filters containing 100,000 items on average, memory consumption should increase by around 40 MB.

By the way, are these document IDs (used in filterCache, documentCache, queryResultCache) the ones I configure in schema.xml, or does Solr map my IDs to integers in order to ensure efficiency?

A filter query should probably be orthogonal to the primary query, which in plain English means unrelated to the primary query. To give an example, I have a field "category", which is a required field. In the class of searches where I use a filter on that field, the primary search is for something entirely different, so in most cases it will not, or not necessarily, bias the primary result toward any particular distribution of the category values. I then allow the application to apply filtering by category, incidentally using faceting, which is a typical usage pattern, I guess.

Michael Ludwig
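
To make the memory estimate and the q/fq split from the reply concrete, here is a minimal Python sketch. It rests on the post's own assumption that each cached document ID costs 4 bytes, which may not match how Solr actually stores filter results. The function names, the search term "ipod" and the category value "electronics" are hypothetical and chosen only for illustration; `q` and `fq` are the parameters discussed above.

```python
from typing import Sequence
from urllib.parse import urlencode

# Assumption taken from the post: each cached document ID is a 4-byte integer.
BYTES_PER_DOC_ID = 4


def filter_cache_bytes(num_filters: int, avg_docs_per_filter: int) -> int:
    """Rough estimate: one 4-byte ID per matching document per cached filter."""
    return num_filters * avg_docs_per_filter * BYTES_PER_DOC_ID


def build_query_string(main_query: str, filters: Sequence[str]) -> str:
    """Put the primary search in q and each repeated restriction in its own fq."""
    params = [("q", main_query)]
    params += [("fq", f) for f in filters]
    return urlencode(params)


# 100 filters with 100,000 documents each, as in the estimate above.
print(filter_cache_bytes(100, 100_000))  # 40000000 bytes, i.e. about 40 MB

# Hypothetical search: the primary query is unrelated to the category filter.
print(build_query_string("ipod", ["category:electronics"]))
```

Keeping each restriction in a separate fq entry means each one can be cached and reused on its own, which fits the idea that a filter pays off the more often it is repeated.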