The rule you quoted is not relevant to your two example queries since they both have at least one "MUST" ("+") term.

I'll restate the rule: If there are one or more MUST clauses, then none of the SHOULD clauses need be present for a document to be a match, but if there are no MUST clauses, then at least one of the SHOULD clause terms must be present for a document to match.

Further, additional SHOULD clauses may be required by setting "minMatch" (the "mm" query parameter) or by calling BooleanQuery#setMinimumNumberShouldMatch(int min), which sets the minimum number of optional clauses that must match.
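To make the rule concrete, here is a rough Python sketch of how a single flat BooleanQuery decides a match. It is only a toy model, not Lucene code: the function name `matches`, the `min_should_match` argument, and the three sample documents are made up for illustration, and each document is just a set of "field:term" strings.

```python
def matches(doc_terms, must=(), should=(), must_not=(), min_should_match=0):
    """Toy model of one flat BooleanQuery level (not the Lucene implementation)."""
    doc_terms = set(doc_terms)
    # Any matching MUST_NOT clause rejects the document outright.
    if any(t in doc_terms for t in must_not):
        return False
    # A query with only MUST_NOT clauses has nothing positive to match.
    if not must and not should:
        return False
    # Every MUST clause has to be present.
    if not all(t in doc_terms for t in must):
        return False
    should_hits = sum(1 for t in should if t in doc_terms)
    required = min_should_match
    # With no MUST clauses, at least one SHOULD clause is required.
    if not must and should:
        required = max(required, 1)
    return should_hits >= required


# Hypothetical documents for the query: manu:apple +name:video
both = {"manu:apple", "name:video"}
video_only = {"name:video"}
apple_only = {"manu:apple"}

print(matches(both, must=["name:video"], should=["manu:apple"]))        # True
print(matches(video_only, must=["name:video"], should=["manu:apple"]))  # True
print(matches(apple_only, must=["name:video"], should=["manu:apple"]))  # False
# Requiring one optional clause excludes the document without manu:apple.
print(matches(video_only, must=["name:video"], should=["manu:apple"],
              min_should_match=1))                                      # False
```

In this model the SHOULD clause never changes which documents match while a MUST clause is present and the minimum is left at zero; in Lucene it would still contribute to scoring, which the sketch ignores.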

Parentheses can be added around any sequence of terms without changing the result, provided there are no operators between the terms and no operator in front of the left parenthesis. If there are, the match may change.

Yes, grouping CAN be used to eliminate confusion, but just as with any algebra, one must pay attention to where the operators are placed - I can't change "x*y+z" to "x*(y+z)".
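The effect of moving parentheses can be sketched with a small nested model in Python. This is an illustration under assumptions, not the query parser itself: the `Term` and `Bool` classes and the `matches` function are invented here, and the clause trees are my reading of how "+group:id +keyword:text" and "(+group:id) +keyword:text" from the quoted message would be structured, with the unprefixed group treated as an optional (SHOULD) clause.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Term:
    text: str


@dataclass
class Bool:
    must: list = field(default_factory=list)
    should: list = field(default_factory=list)
    must_not: list = field(default_factory=list)


def matches(query, doc):
    """Toy recursive evaluation of nested boolean clauses (not Lucene code)."""
    if isinstance(query, Term):
        return query.text in doc
    if any(matches(c, doc) for c in query.must_not):
        return False
    if not all(matches(c, doc) for c in query.must):
        return False
    if query.must:
        return True  # SHOULD clauses are optional once a MUST clause exists
    return any(matches(c, doc) for c in query.should)


# +group:id +keyword:text : both terms are MUST at the same level.
flat = Bool(must=[Term("group:id"), Term("keyword:text")])

# (+group:id) +keyword:text : the group is a SHOULD clause next to one MUST.
grouped = Bool(must=[Term("keyword:text")],
               should=[Bool(must=[Term("group:id")])])

doc = {"keyword:text"}          # hypothetical document without group:id
print(matches(flat, doc))       # False
print(matches(grouped, doc))    # True
```

The "+" inside the parentheses only makes group:id required within the group; the group as a whole is still optional at the outer level, which is why the two queries differ.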

-- Jack Krupansky

-----Original Message----- From: Anders Melchiorsen
Sent: Thursday, January 24, 2013 4:28 AM
To: solr-user@lucene.apache.org
Subject: Re: Confused by queries

Hello.

That is indeed an excellent article, thanks for pointing me at it. With
a title like that, it is no wonder that I was unable to google it on my
own.

It is probably the exception in this rule that has been confusing me:

    If a BooleanQuery contains no MUST BooleanClauses, then a
    document is only considered a match against the BooleanQuery
    if one or more of the SHOULD BooleanClauses is a match.

So "+group:id +keyword:text" and "(+group:id) +keyword:text" mean
completely different things.

I have mostly been using the reference at
http://lucene.apache.org/core/3_6_0/queryparsersyntax.html and it does
not mention this distinction. Quite the contrary, actually, as it says
that grouping can be used to eliminate confusion, thereby suggesting
that the usual rules of Boolean algebra apply.


Thanks again,
Anders.


On 23.01.2013 02:20, Erick Erickson wrote:
Solr/Lucene does not implement strict boolean logic. Here's an
excellent blog discussing this:

http://searchhub.org/dev/2011/12/28/why-not-and-or-and-not/

Best
Erick

On Tue, Jan 22, 2013 at 7:25 PM, Otis Gospodnetic
<otis.gospodne...@gmail.com> wrote:
Well, depends on what you indexed.

Otis
Solr & ElasticSearch Support
http://sematext.com/
On Jan 22, 2013 5:48 PM, "Anders Melchiorsen" <m...@spoon.kalibalik.dk>
wrote:

Thanks, though I am still confused.

How about this one:

manu:apple => 1 hit
+name:video => 2 hits

manu:apple +name:video => 2 hits

Solr ignores the manu:apple part completely?


Cheers,
Anders.


Den 22/01/13 23.16, Jack Krupansky skrev:

The first query:

   name:ipod OR -name:ipod => 0 hits

The "OR" and "-" are actually at the same level of the BooleanQuery, and
the "-" overrides the "OR", so the query is equivalent to:

   name:ipod -name:ipod => 0 hits

For the second query:

   (name:ipod) OR (-name:ipod) => 3 hits

Pure negative queries are supported only at the top level, so the nested
"(-name:ipod)" matches nothing and the query is equivalent to:

   (name:ipod) => 3 hits

You can simply insert a "*:*" to ensure that it is not a pure negative
query inside the parentheses:

   (name:ipod) OR (*:* -name:ipod)
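The hit counts in this thread can be reproduced with a rough set-based sketch in Python. Everything here is an assumption made for illustration: the document IDs are invented, `hits` is a hypothetical helper, and the `top_level` flag simply encodes the statement above that pure negative queries are handled only at the top level.

```python
ALL_DOCS = frozenset(range(32))   # stands in for *:* (32 hits in the thread)
IPOD = frozenset({0, 1, 2})       # invented IDs for name:ipod (3 hits)


def hits(must=(), should=(), must_not=(), top_level=False):
    """Set of doc IDs matched by one boolean level (toy model, not Solr)."""
    must, should, must_not = list(must), list(should), list(must_not)
    if must:
        positive = frozenset.intersection(*must)
    elif should:
        positive = frozenset.union(*should)
    elif top_level and must_not:
        positive = ALL_DOCS       # top-level pure negative: start from everything
    else:
        positive = frozenset()    # nested pure negative: nothing to subtract from
    excluded = frozenset().union(*must_not)
    return positive - excluded


# -name:ipod
print(len(hits(must_not=[IPOD], top_level=True)))                  # 29
# name:ipod OR -name:ipod  (both clauses at the same level)
print(len(hits(should=[IPOD], must_not=[IPOD], top_level=True)))   # 0
# (name:ipod) OR (-name:ipod)
nested_negative = hits(must_not=[IPOD])
print(len(hits(should=[hits(should=[IPOD]), nested_negative],
               top_level=True)))                                   # 3
# (name:ipod) OR (*:* -name:ipod)
fixed = hits(should=[ALL_DOCS], must_not=[IPOD])
print(len(hits(should=[hits(should=[IPOD]), fixed],
               top_level=True)))                                   # 32
```

In this model the "*:*" gives the nested group something positive to subtract from, so the last query covers all 32 documents.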

-- Jack Krupansky

-----Original Message----- From: Anders Melchiorsen
Sent: Tuesday, January 22, 2013 4:59 PM
To: solr-user@lucene.apache.org
Subject: Confused by queries

Hello!

With the example server of Solr 4.0.0 (with *.xml indexed), I get these
results:

*:* => 32 hits
name:ipod => 3 hits
-name:ipod => 29 hits

That is all fine, but for these next queries, I would expect to get 32
hits (i.e. everything), or at least the same number of hits for both
queries:

name:ipod OR -name:ipod => 0 hits
(name:ipod) OR (-name:ipod) => 3 hits

As my expectations are not met, I must be missing something?


Thanks,
Anders.



