Lucene's default scoring should give you much of what you want, ranking hits on low-frequency terms higher, without any special query syntax. Just list your terms and use "OR" as your default operator.
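
As a rough illustration of that suggestion, here is a minimal Python sketch that builds the request parameters for a plain term list with OR as the default operator. The helper name build_or_query is hypothetical, the field/value pairs are the placeholders from the question below, and the sketch assumes the values contain no characters that need escaping in Solr query syntax. It only builds the parameter string; it does not contact a server.

```python
from urllib.parse import urlencode


def build_or_query(terms, rows=100):
    """Return a URL-encoded Solr parameter string for a list of (field, value) pairs.

    Hypothetical helper for illustration only. Values are assumed to need
    no escaping for Solr query syntax.
    """
    # List the clauses separated by spaces, with no explicit operators.
    q = " ".join(f"{field}:{value}" for field, value in terms)
    # q.op=OR asks Solr to treat the space-separated clauses as optional,
    # so relevance scoring decides the ranking.
    return urlencode({"q": q, "q.op": "OR", "rows": rows, "fl": "*"})


terms = [
    ("field1", "val1"),
    ("field2", "val2"),
    ("field3", "val3"),
    ("field4", "val4"),
]
print(build_or_query(terms))
```

The resulting string can be appended to the select URL of whatever collection is being queried.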

-- Jack Krupansky

-----Original Message-----
From: svante karlsson
Sent: Thursday, January 23, 2014 6:42 AM
To: solr-user@lucene.apache.org
Subject: how to write an efficient query with a subquery to restrict the search space?

I have a Solr db containing 1 billion records that I'm trying to use in a
NoSQL fashion.

What I want to do is find the best matches using all search terms, but
restrict the search space to the most unique terms.

In this example I know that val2 and val4 are rare terms and val1 and val3
are more common. In my real scenario I'll have 20 fields that I want to
include or exclude in the inner query depending on the uniqueness of the
requested value.


My first approach was:
q=field1:val1 OR field2:val2 OR field3:val3 OR field4:val4 AND (field2:val2
OR field4:val4)&rows=100&fl=*

but what I think I get is
.....  field4:val4 AND (field2:val2 OR field4:val4)   and this result is then
OR'ed with the rest.

If I write
q=(field1:val1 OR field2:val2 OR field3:val3 OR field4:val4) AND
(field2:val2 OR field4:val4)&rows=100&fl=*

then what I think I get is two sub-queries that are evaluated separately and
then joined. Performance-wise, this is bad.

What's the best way to write these types of queries?


Are there any performance issues when running it on several SolrCloud nodes
vs. a single instance, or should it scale?



/svante
