(Changed the subject for this topic.) Weird. I'm seeing wrong results myself, and have for a while -- I even wrote some custom pre-processor logic at the app level to work around it. Weird, I dunno.

Wait. "Queries with -one OR -two return fewer documents than either operand does on its own."

Wait, that's exactly what's wrong, isn't it? How can there be fewer documents that match "-one OR -two" than match "-one" alone? If there are X documents that do not have a "one" in them, there can't be fewer than X documents that EITHER do not have a "one" OR do not have a "two" (i.e., documents that do not contain BOTH one and two), can there? We didn't ask for "-one AND -two", we asked for "-one OR -two".
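
To make the set argument concrete, here is a small sketch in plain Python (no Solr involved). The four-document corpus is made up purely for illustration; the point is only that the union of two sets can never be smaller than either one.

```python
# Hypothetical toy corpus: document id -> terms it contains.
docs = {
    1: {"one", "two"},
    2: {"one"},
    3: {"two"},
    4: {"three"},
}

all_ids = set(docs)
not_one = {i for i, terms in docs.items() if "one" not in terms}  # "-one"
not_two = {i for i, terms in docs.items() if "two" not in terms}  # "-two"

# "-one OR -two" is the union of the two negated sets.
either = not_one | not_two

# By De Morgan, that is everything except documents containing BOTH terms.
has_both = {i for i, terms in docs.items() if {"one", "two"} <= terms}
assert either == all_ids - has_both

# A union is at least as large as each of its operands.
assert len(either) >= len(not_one)
assert len(either) >= len(not_two)
```

So whatever the parser is doing with "-one OR -two", a result smaller than "-one" alone is not the boolean OR of the two clauses.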

On 5/17/2011 6:42 PM, Markus Jelsma wrote:
Mmmm, that's not what I see while testing right now. Queries with -one OR -two
return fewer documents than either operand does on its own; this is with
LuceneQParser. I haven't done extensive testing since I rarely use boolean
algebra in Lucene or Solr.

Oops, you're right, I had misremembered --- Solr 1.4.1 "lucene" qp
handles pure negative fine, it's Solr 1.4.1 _dismax_ that does not.

Although, here's one, not actually related to this thread, that DOESN'T
work in the Solr 1.4.1 lucene query parser. Curious if it's been fixed in
Solr 3.1.

&defType=lucene&q=-one OR -two

That one does NOT work as expected in Solr 1.4.1. I can't explain
exactly what it's doing, but it's not right. (It returns FEWER
results than "-one" alone, which can't be right algebraically.) I think.
So there are still some kinds of negative queries that do weird things.

On 5/17/2011 6:29 PM, Markus Jelsma wrote:
Such a negation works just as one would expect.

q=*:*
<result name="response" numFound="158" start="0">

q=*:*&fq=-type:text/html
<result name="response" numFound="25" start="0">

q=*:*&fq=type:text/html
<result name="response" numFound="133" start="0">

Well, that adds up, doesn't it ;)
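
Spelled out as a quick check in Python, using only the three numFound values above: the filter and its negation should split the full result set between them.

```python
# numFound values copied from the three responses above.
total = 158      # q=*:*
not_html = 25    # q=*:*&fq=-type:text/html
html = 133       # q=*:*&fq=type:text/html

# A filter and its negation partition the full result set.
assert not_html + html == total
```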

1. I don't think Solr will re-use the filter cache in that situation,
although I'm not sure. But I'm commenting anyway because of something
you didn't ask about that will trip you up with your example:

2. In fact, a pure-negative query like that doesn't work _at all_ in the
default solr/lucene query parser used for 'fq', at least in Solr 1.4.1.
Not sure if it's been improved in 3.1, but I don't think so. It will
always return 0 hits, because the solr/lucene query parser can't
generate a proper lucene query from a pure negative query like that.

To get around this, you can find a variation of the query that means the
same thing but isn't in that form. Here's a really ugly one I use, with a
nested dismax -- dismax ALSO has trouble with pure negatives, although I
think maybe edismax can handle 'em? But this weird-as-heck combo works;
maybe there's a better way.

NOT _query_:"{!dismax qf=something}history"
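
For what it's worth, here is a rough Python sketch of how an app-level helper might build these filter strings. The helper names are made up for illustration, and "something" is just the placeholder qf from the line above. The second helper uses the commonly cited idiom of prefixing a match-all query (*:*) so the negation has something to subtract from; that is an assumption on my part and has not been checked against 1.4.1 here.

```python
from urllib.parse import urlencode

def nested_dismax_negation(term, qf="something"):
    """Hypothetical helper: the nested-dismax form shown above."""
    return 'NOT _query_:"{!dismax qf=%s}%s"' % (qf, term)

def match_all_negation(term):
    """Hypothetical helper: assumed alternative, subtracting from *:*."""
    return "*:* -%s" % term

# Build request parameters for q=art with a negated "history" filter.
params = {"q": "art", "fq": match_all_negation("history")}

# urlencode takes care of escaping '*', ':' and the space for the URL.
query_string = urlencode(params)
print(query_string)
print(nested_dismax_negation("history"))
```

Either string would go in the fq parameter; which one actually behaves depends on the parser and Solr version, so test against your own index.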

And to come full circle, I have NO idea what effect nested
queries have on the filter cache. I think that STILL won't re-use the
filter cache.... but I wonder if it'll re-use the _query_ cache for
"history"? I remember even less about how the query cache works, though.

On 5/17/2011 6:07 PM, Burton-West, Tom wrote:
If I have a query with a filter query such as "q=art&fq=history" and
then run a second query "q=art&fq=-history", will Solr realize that it
can use the cached results of the previous filter query "history" (in
the filter cache), or will it not realize this and have to actually do a
second filter query against the index for "not history"?

Tom
