Re: A few random questions about solr queries.
See below:

On Tue, May 29, 2012 at 6:18 AM, santamaria2 wrote:
> *1)* With faceting, how does facet.query perform in comparison to
> facet.field? I'm just wondering this as in my use case, I need to facet over
> a field -- which would get me the top n facets for that field, but I also
> need to show the count for a "selected filter" which might have a relatively
> low count so it doesn't appear in the top n returned facets. So the solution
> would be to 'ensure' its presence by adding a 'facet.query=cat:val' in
> addition to my facet.field=cat.

You have two choices here. Either specify that the response should contain the "top", say, 1,000,000 values (which would be a disaster in some cases) and facet by field, or facet by query. You really don't have any other choice than to add the facet.query here, so performance is moot.

>
> I want to do this to quite a few fields.
>
> Related/example-based question:
> When I facet over a field, and something gets returned, eg: John Smith (83),
> and I also 'ensure' this facet's presence by having it in
> facet.query=author:"John Smith", are two different calculations performed?
> Or is the facet returned by facet.field also used by facet.query to obtain
> the count?
>

I'm pretty sure that two different calculations are performed, but I don't know for certain. But again, it seems like your use case requires the addition of the query, so why does it matter?

>
>
> *2) *Is there a performance issue if I have around, say, 20 facet.query
> conditions along with 10 facet.fields? 3/10 of those fields have around
> 100,000 possible values. Remaining have a few hundred each.
>

It Depends (tm). You don't say, for instance, how big your index is, or how much memory you have, and so on. Really, the only good way to answer this question is to try it and _then_ worry about it. So far, you've really only described your requirements, so asking about low-level implementation details seems premature unless and until you see a performance problem.
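For reference, here is a minimal sketch of the request discussed above: facet on a field for its top n values and add a facet.query so that one selected value is always counted. The helper name build_facet_params and the defaults (q=*:*, rows=0, top_n=10) are made up for illustration; the field and value are taken from the question.

```python
from urllib.parse import urlencode


def build_facet_params(field, selected_value, top_n=10):
    """Return Solr query parameters that facet on `field` and also
    request an explicit count for one selected value.

    Hypothetical helper for illustration, not part of any Solr client.
    """
    # Escape backslashes and double quotes so the value stays one phrase.
    escaped = selected_value.replace("\\", "\\\\").replace('"', '\\"')
    return [
        ("q", "*:*"),                 # assumed main query: match everything
        ("rows", "0"),                # assumed: only facet counts are wanted
        ("facet", "true"),
        ("facet.field", field),       # top-n values for the field
        ("facet.limit", str(top_n)),
        # Pins the selected value even when it falls outside the top n.
        ("facet.query", '{}:"{}"'.format(field, escaped)),
    ]


params = build_facet_params("author", "John Smith")
query_string = urlencode(params)
print(query_string)
```

In the response, the pinned count should come back under facet_queries, keyed by the facet.query string, separately from the facet_fields list.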
>
>
> *3)* I've rummaged around a bit, looking for info on when to use q vs fq. I
> want to clear my doubts for a certain use case.
>
> Where should my date range queries go? In q or fq? The default settings in
> my site show results from the past 90 days with buttons to show stuff from
> the last month and week as well. But the user is allowed to use a slider to
> apply any date range... this is allowed, but it's not /that/ common.
> I definitely use fq for filtering various tags. Choosing a tag is a common
> activity.
>

In addition to Shawn's answer, using &fq clauses enables use of the filterCache, which can substantially increase performance, but see this blog post for some interesting considerations when using NOW:
http://www.lucidimagination.com/blog/2012/02/23/date-math-now-and-filter-queries/

Best
Erick

> Should the date range query go in fq? As I mentioned, the default view shows
> stuff from the past 90 days. So on each new day does this like invalidate
> stuff in the cache? Or is stuff stored in the filtered cache in some way
> that makes it easy to fetch stuff from the past 89 days when a query is
> performed the next day?
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/A-few-random-questions-about-solr-queries-tp3986562.html
> Sent from the Solr - User mailing list archive at Nabble.com.
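To make the NOW caveat above concrete, here is a small sketch of building the "past 90 days" filter with and without date-math rounding. The field name pub_date and the helper date_window_fq are assumptions for illustration. A bare NOW resolves to the current millisecond, so every request yields a different filter and a cached entry is unlikely to be reused; rounding with NOW/DAY keeps the resolved range stable for the whole day.

```python
def date_window_fq(field, days, rounded=True):
    """Build an fq string covering the last `days` days.

    Hypothetical helper; `field` is whatever date field the schema defines.
    """
    if rounded:
        # Rounded to day boundaries; +1DAY keeps today's documents in range.
        return "{}:[NOW/DAY-{}DAYS TO NOW/DAY+1DAY]".format(field, days)
    # Unrounded: the bounds shift with every request.
    return "{}:[NOW-{}DAYS TO NOW]".format(field, days)


cache_friendly = date_window_fq("pub_date", 90)
cache_hostile = date_window_fq("pub_date", 90, rounded=False)
print(cache_friendly)
print(cache_hostile)
```

As far as I understand it, the rounded filter is computed again once the day rolls over, as a new cache entry; the previous day's entry is not trimmed down to 89 days and reused.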
Re: A few random questions about solr queries.
On 5/29/2012 4:18 AM, santamaria2 wrote:
> *3)* I've rummaged around a bit, looking for info on when to use q vs fq. I
> want to clear my doubts for a certain use case.
>
> Where should my date range queries go? In q or fq? The default settings in
> my site show results from the past 90 days with buttons to show stuff from
> the last month and week as well. But the user is allowed to use a slider to
> apply any date range... this is allowed, but it's not /that/ common.
> I definitely use fq for filtering various tags. Choosing a tag is a common
> activity.

I can't answer your facet questions, but this one I can.

If you are using the default relevancy ranking and you do not want the values in a given part of your search to affect the score, put it in a filter query (fq). Also, if you are sorting all your search results in a deterministic way rather than using relevancy, use a filter query. If you do want those values to affect the score, which is normal for fulltext fields, put your search clause in the regular query (q).

Most of the time, a date range is not something that you want to affect the relevancy score, so it is a perfect candidate for filter queries.

Thanks,
Shawn
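As a rough sketch of the split described above: the user's text goes in q, where it is scored, and the tag and date restrictions go in separate fq parameters, where they only filter. The helper build_search_params and the field names tag and pub_date are assumptions for illustration.

```python
from urllib.parse import urlencode


def build_search_params(user_text, tags=(), date_fq=None):
    """Put scored text in q and non-scoring restrictions in fq.

    Hypothetical helper; the field names are placeholders.
    """
    params = [("q", user_text)]        # contributes to the relevancy score
    for tag in tags:
        # One fq per tag, so each filter is a separate clause.
        params.append(("fq", 'tag:"{}"'.format(tag)))
    if date_fq:
        params.append(("fq", date_fq))  # date range filters without scoring
    return params


search_params = build_search_params(
    "solr faceting",
    tags=["search"],
    date_fq="pub_date:[NOW/DAY-90DAYS TO NOW/DAY+1DAY]",
)
print(urlencode(search_params))
```

Keeping each restriction in its own fq, rather than ANDing them into one string, should let a common tag filter be reused across requests that differ only in the date range.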
Re: A few random questions about solr queries.
A wee bit of clarification on the 2nd question. I meant relative performance, i.e. would it be much slower to facet over 20 facet.queries & 10 facet.fields compared to, say, 4 facet.queries & facet.fields? I wonder if this makes sense...

So... is a bump improper etiquette here? >_>

--
View this message in context: http://lucene.472066.n3.nabble.com/A-few-random-questions-about-solr-queries-tp3986562p3986977.html
Sent from the Solr - User mailing list archive at Nabble.com.
A few random questions about solr queries.
*1)* With faceting, how does facet.query perform in comparison to facet.field? I'm just wondering this as in my use case, I need to facet over a field -- which would get me the top n facets for that field, but I also need to show the count for a "selected filter" which might have a relatively low count so it doesn't appear in the top n returned facets. So the solution would be to 'ensure' its presence by adding a 'facet.query=cat:val' in addition to my facet.field=cat.

I want to do this to quite a few fields.

Related/example-based question:
When I facet over a field, and something gets returned, eg: John Smith (83), and I also 'ensure' this facet's presence by having it in facet.query=author:"John Smith", are two different calculations performed? Or is the facet returned by facet.field also used by facet.query to obtain the count?

*2)* Is there a performance issue if I have around, say, 20 facet.query conditions along with 10 facet.fields? 3/10 of those fields have around 100,000 possible values. Remaining have a few hundred each.

*3)* I've rummaged around a bit, looking for info on when to use q vs fq. I want to clear my doubts for a certain use case.

Where should my date range queries go? In q or fq? The default settings in my site show results from the past 90 days with buttons to show stuff from the last month and week as well. But the user is allowed to use a slider to apply any date range... this is allowed, but it's not /that/ common.
I definitely use fq for filtering various tags. Choosing a tag is a common activity.

Should the date range query go in fq? As I mentioned, the default view shows stuff from the past 90 days. So on each new day does this like invalidate stuff in the cache? Or is stuff stored in the filtered cache in some way that makes it easy to fetch stuff from the past 89 days when a query is performed the next day?