The rule you quoted is not relevant to your two example queries since they
both have at least one "MUST" ("+") term.
I'll restate the rule: If there are one or more MUST clauses, then none of
the SHOULD clauses need be present for a document to be a match, but if
there are no MUST clauses, then at least one of the SHOULD clause terms must
be present for a document to match.
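
To make the rule concrete, here is a minimal sketch in Python. It is a toy model of a single flat BooleanQuery, not Lucene's actual code, and the three documents are hypothetical (they are not the real Solr example data). The function name "matches" and the helper "hits" are made up for illustration.

```python
def matches(doc, must=(), should=(), must_not=()):
    """Toy model of one flat BooleanQuery; doc is a set of "field:term" strings."""
    if any(t in doc for t in must_not):
        return False
    if must:
        # With a MUST clause present, SHOULD clauses are purely optional.
        return all(t in doc for t in must)
    # No MUST clause: at least one SHOULD clause has to match.
    return any(t in doc for t in should)


# Hypothetical three-document index.
docs = {
    "d1": {"manu:apple", "name:video"},
    "d2": {"manu:other", "name:video"},
    "d3": {"manu:other", "name:cable"},
}


def hits(**clauses):
    return sorted(d for d, terms in docs.items() if matches(terms, **clauses))


print(hits(should=["manu:apple"]))                       # manu:apple             -> ['d1']
print(hits(must=["name:video"]))                         # +name:video            -> ['d1', 'd2']
print(hits(should=["manu:apple"], must=["name:video"]))  # manu:apple +name:video -> ['d1', 'd2']
```

In this model, as in the real thing, the optional manu:apple clause does not filter anything once +name:video is present. In Lucene it still contributes to the score, so it affects ranking rather than the hit count.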
Further, additional SHOULD clauses can be required by setting "minMatch"
(the "mm" query parameter) or by calling
BooleanQuery#setMinimumNumberShouldMatch(int min), which sets the minimum
number of optional clauses that must match.
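
As a rough illustration of the minimum-match setting, here is a toy Python model. The "min_should_match" argument is a hypothetical stand-in for "mm" / setMinimumNumberShouldMatch, the documents are invented, and this is a sketch of the behavior rather than Lucene's implementation.

```python
def matches(doc, must=(), should=(), must_not=(), min_should_match=0):
    """Toy flat BooleanQuery with a minimum number of SHOULD matches."""
    if any(t in doc for t in must_not):
        return False
    if not all(t in doc for t in must):
        return False
    if not must and not should:
        return False  # nothing positive to match on
    required = min_should_match
    if not must:
        # Without a MUST clause, at least one SHOULD clause is always required.
        required = max(required, 1)
    matched = sum(1 for t in should if t in doc)
    return matched >= required


# Hypothetical documents; every one of them satisfies +name:video.
docs = {
    "d1": {"name:video", "manu:apple", "cat:music"},
    "d2": {"name:video", "manu:apple"},
    "d3": {"name:video"},
}


def hits(mm):
    return sorted(
        d for d, terms in docs.items()
        if matches(terms, must=["name:video"],
                   should=["manu:apple", "cat:music"],
                   min_should_match=mm)
    )


print(hits(0))  # ['d1', 'd2', 'd3']
print(hits(1))  # ['d1', 'd2']
print(hits(2))  # ['d1']
```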
Parentheses can be added around any sequence of terms without changing the
match, provided there are no operators between those terms and no operator
in front of the left parenthesis. If there are operators inside the group
or in front of the left parenthesis, the match may change.
Yes, grouping CAN be used to eliminate confusion, but just as with any
algebra, one must pay attention to where the operators are placed - I can't
change "x*y+z" to "x*(y+z)".
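
Here is a small Python sketch of why your two queries differ. It is a toy recursive model, not the real query parser, and it assumes the default operator is OR, so that a parenthesized group with no operator in front of it becomes a SHOULD clause of the outer query. The query encoding (lists of (occur, subquery) pairs) and the two documents are invented for illustration.

```python
def matches(doc, query):
    """query is a "field:term" string or a list of (occur, subquery) pairs,
    where occur is "+" (MUST), "-" (MUST_NOT) or "" (SHOULD)."""
    if isinstance(query, str):
        return query in doc
    must = [q for occur, q in query if occur == "+"]
    should = [q for occur, q in query if occur == ""]
    must_not = [q for occur, q in query if occur == "-"]
    if any(matches(doc, q) for q in must_not):
        return False
    if must:
        # SHOULD clauses are optional once a MUST clause exists.
        return all(matches(doc, q) for q in must)
    return any(matches(doc, q) for q in should)


# +group:id +keyword:text
flat = [("+", "group:id"), ("+", "keyword:text")]
# (+group:id) +keyword:text : the group is only a SHOULD clause of the outer query
grouped = [("", [("+", "group:id")]), ("+", "keyword:text")]

# Hypothetical documents.
docs = {
    "a": {"group:id", "keyword:text"},
    "b": {"group:other", "keyword:text"},
}

for name, q in (("flat", flat), ("grouped", grouped)):
    print(name, sorted(d for d, terms in docs.items() if matches(terms, q)))
# flat ['a']
# grouped ['a', 'b']
```

The "+" inside the parentheses only makes group:id mandatory within the group. The group as a whole is optional because the outer query already has a MUST clause.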
-- Jack Krupansky
-----Original Message-----
From: Anders Melchiorsen
Sent: Thursday, January 24, 2013 4:28 AM
To: solr-user@lucene.apache.org
Subject: Re: Confused by queries
Hello.
That is indeed an excellent article, thanks for pointing me at it. With
a title like that, it is no wonder that I was unable to google it on my
own.
It is probably the exception in this rule that has been confusing me:
If a BooleanQuery contains no MUST BooleanClauses, then a
document is only considered a match against the BooleanQuery
if one or more of the SHOULD BooleanClauses is a match.
So "+group:id +keyword:text" and "(+group:id) +keyword:text" mean
completely different things.
I have mostly been using the reference at
http://lucene.apache.org/core/3_6_0/queryparsersyntax.html and it does
not mention this distinction. Quite the contrary, actually, as it says
that grouping can be used to eliminate confusion, thereby suggesting that
the usual rules of Boolean algebra apply.
Thanks again,
Anders.
On 23.01.2013 02:20, Erick Erickson wrote:
Solr/Lucene does not implement strict boolean logic. Here's an
excellent blog discussing this:
http://searchhub.org/dev/2011/12/28/why-not-and-or-and-not/
Best
Erick
On Tue, Jan 22, 2013 at 7:25 PM, Otis Gospodnetic
<otis.gospodne...@gmail.com> wrote:
Well, depends on what you indexed.
Otis
Solr & ElasticSearch Support
http://sematext.com/
On Jan 22, 2013 5:48 PM, "Anders Melchiorsen" <m...@spoon.kalibalik.dk>
wrote:
Thanks, though I am still confused.
How about this one:
manu:apple => 1 hit
+name:video => 2 hits
manu:apple +name:video => 2 hits
Solr ignores the manu:apple part completely?
Cheers,
Anders.
On 22/01/13 23.16, Jack Krupansky wrote:
The first query:
name:ipod OR -name:ipod => 0 hits
The "OR" and "-" are actually at the same level of the BooleanQuery, so
the "-" overrides the OR, making the query equivalent to:
name:ipod -name:ipod => 0 hits
For the second query:
(name:ipod) OR (-name:ipod) => 3 hits
Pure negative queries are supported only at the top level, so the
"(-name:ipod)" matches nothing, and the query is equivalent to:
(name:ipod) => 3 hits
You can simply insert a "*:*" to ensure that the query inside the
parentheses is not purely negative:
(name:ipod) OR (*:* -name:ipod)
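
The three variants can be compared with a toy Python model. This is a sketch of the matching rules, not Lucene or Solr code. The 32-document index is invented to mirror the hit counts reported in this thread, and the query encoding is hypothetical. The model deliberately leaves out Solr's special handling of a pure negative query at the top level.

```python
def matches(doc, query):
    """query is a "field:term" string, "*:*", or a list of (occur, subquery)
    pairs where occur is "+" (MUST), "-" (MUST_NOT) or "" (SHOULD)."""
    if isinstance(query, str):
        return query == "*:*" or query in doc
    must = [q for occur, q in query if occur == "+"]
    should = [q for occur, q in query if occur == ""]
    must_not = [q for occur, q in query if occur == "-"]
    if any(matches(doc, q) for q in must_not):
        return False
    if must:
        return all(matches(doc, q) for q in must)
    # No MUST clause: a pure negative group falls through to any([]) == False.
    return any(matches(doc, q) for q in should)


# Invented index: 32 documents, 3 of which contain name:ipod.
docs = [{"name:ipod"} if i < 3 else {"name:other"} for i in range(32)]


def count(query):
    return sum(1 for d in docs if matches(d, query))


# name:ipod OR -name:ipod
flat = [("", "name:ipod"), ("-", "name:ipod")]
# (name:ipod) OR (-name:ipod)
grouped = [("", [("", "name:ipod")]), ("", [("-", "name:ipod")])]
# (name:ipod) OR (*:* -name:ipod)
fixed = [("", [("", "name:ipod")]), ("", [("", "*:*"), ("-", "name:ipod")])]

print(count(flat))     # 0
print(count(grouped))  # 3
print(count(fixed))    # 32
```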
-- Jack Krupansky
-----Original Message-----
From: Anders Melchiorsen
Sent: Tuesday, January 22, 2013 4:59 PM
To: solr-user@lucene.apache.org
Subject: Confused by queries
Hello!
With the example server of Solr 4.0.0 (with *.xml indexed), I get these
results:
*:* => 32 hits
name:ipod => 3 hits
-name:ipod => 29 hits
That is all fine, but for these next queries, I would expect to get 32
hits (i.e. everything), or at least the same number of hits for both
queries:
name:ipod OR -name:ipod => 0 hits
(name:ipod) OR (-name:ipod) => 3 hits
As my expectations are not met, I must be missing something?
Thanks,
Anders.