Lucene's default scoring should give you much of what you want, ranking
hits on low-frequency terms higher, without any special query syntax. Just
list your terms and use "OR" as your default operator.
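
As a rough illustration of that advice, the request parameters could be built along these lines. This is a minimal Python sketch, not code from the thread: the helper name build_or_query is hypothetical, the field names and values are the placeholders from the question, and q.op is the Solr request parameter that sets the default operator.

```python
from urllib.parse import urlencode


def build_or_query(terms, rows=100):
    """Build Solr request parameters that list every term and rely on q.op=OR.

    `terms` maps field name -> value (placeholder names, assumed for illustration).
    No explicit OR appears in q; the default operator joins the clauses.
    """
    q = " ".join(f"{field}:{value}" for field, value in terms.items())
    return urlencode({"q": q, "q.op": "OR", "rows": rows, "fl": "*"})


params = build_or_query(
    {"field1": "val1", "field2": "val2", "field3": "val3", "field4": "val4"}
)
print(params)  # append this string to the collection's /select URL
```

With this form, relevance scoring alone decides the order: a match on a rare value such as val2 should contribute more to the score than a match on a common one.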
-- Jack Krupansky
-----Original Message-----
From: svante karlsson
Sent: Thursday, January 23, 2014 6:42 AM
To: solr-user@lucene.apache.org
Subject: how to write an efficient query with a subquery to restrict the
search space?
I have a Solr db containing 1 billion records that I'm trying to use in a
NoSQL fashion.
What I want to do is find the best matches using all search terms but
restrict the search space to the most unique terms.
In this example I know that val2 and val4 are rare terms and val1 and val3
are more common. In my real scenario I'll have 20 fields that I want to
include or exclude in the inner query depending on the uniqueness of the
requested value.
My first approach was:
q=field1:val1 OR field2:val2 OR field3:val3 OR field4:val4 AND (field2:val2
OR field4:val4)&rows=100&fl=*
but what I think I get is
..... field4:val4 AND (field2:val2 OR field4:val4), and this result is then
OR'ed with the rest.
If I write
q=(field1:val1 OR field2:val2 OR field3:val3 OR field4:val4) AND
(field2:val2 OR field4:val4)&rows=100&fl=*
then what I think I get is two sub-queries that are evaluated separately and
then joined. Performance-wise, this is bad.
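
For illustration only, the same two-part shape can also be sent as a main query plus a filter query. This is a sketch under assumptions, not something proposed in the thread: the helper name build_restricted_query and the rare_fields argument are hypothetical, while fq is Solr's standard filter-query parameter, which restricts the result set without contributing to the score.

```python
from urllib.parse import urlencode


def build_restricted_query(all_terms, rare_fields, rows=100):
    """Score on all terms (q) but restrict matches to the rare ones (fq).

    `all_terms` maps field name -> value; `rare_fields` lists the fields whose
    values are believed to be rare (both are assumed inputs for illustration).
    """
    q = " OR ".join(f"{field}:{value}" for field, value in all_terms.items())
    fq = " OR ".join(f"{field}:{all_terms[field]}" for field in rare_fields)
    return urlencode({"q": q, "fq": fq, "rows": rows, "fl": "*"})


restricted = build_restricted_query(
    {"field1": "val1", "field2": "val2", "field3": "val3", "field4": "val4"},
    rare_fields=["field2", "field4"],
)
print(restricted)
```

Whether this performs better than the single boolean query at this scale is not established here and would need to be measured.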
What's the best way to write these types of queries?
Are there any performance issues when running it on several SolrCloud nodes
vs. a single instance, or should it scale?
/svante