> (changed subject for this topic). Weird. I'm seeing it wrong myself, and
> have for a while -- I even wrote some custom pre-processor logic at my
> app level to work around it.  Weird, I dunno.
> 
> Wait. "Queries with -one OR -two return fewer documents than either
> operand does on its own."
> 
> Wait, that's exactly what's wrong, isn't it?  How can there be fewer
> documents that have "-one OR -two" than have "-one" alone?  If there are
> X documents that do not have a "one" in them, there can't be fewer than X
> documents that EITHER do not have a "one" OR do not have a "two" (i.e.,
> documents that do not have BOTH one and two), can there? We didn't ask
> for "-one AND -two", we asked for "-one OR -two".

This is probably due to the contents (HTML bodies) of the documents I've 
queried. For this type of document it's not so strange to get fewer 
documents back when two negated operands are specified. In my case (I tested 
it) the conjunction (-one AND -two) returned the same documents as the 
disjunction (-one OR -two) did.

Again, I haven't done extensive testing on this subject.
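
To make the algebra in the quoted argument concrete, here is a minimal 
sketch with a toy corpus. The document ids and terms are made up and have 
nothing to do with my index; it only models the set logic, not how the query 
parser actually builds the query. The union ("-one OR -two") can never be 
smaller than "-one" alone, while the intersection ("-one AND -two") can.

```python
# Toy corpus: document id -> set of terms (made-up data, not from a real index).
docs = {
    1: {"one", "two"},
    2: {"one"},
    3: {"two"},
    4: {"three"},
}

# Documents that do not contain each term.
not_one = {d for d, terms in docs.items() if "one" not in terms}  # -one
not_two = {d for d, terms in docs.items() if "two" not in terms}  # -two

either = not_one | not_two  # what "-one OR -two" should mean (union)
both = not_one & not_two    # what "-one AND -two" means (intersection)

print(len(not_one), len(not_two), len(either), len(both))  # prints: 2 2 3 1
```

If the disjunction returns the same hits as the conjunction, as it did in my 
test, that would be consistent with both negated clauses being treated as 
"must not match" regardless of the OR, but I have not verified that in the 
parser itself.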

> 
> On 5/17/2011 6:42 PM, Markus Jelsma wrote:
> > mmmm, that's not what I see while testing right now. Queries with -one OR
> > -two return fewer documents than either operand does on its own, this
> > is with LuceneQParser. I haven't done extensive testing since I rarely
> > use boolean algebra in Lucene or Solr.
> > 
> >> Oops, you're right, I had misremembered --- Solr 1.4.1 "lucene" qp
> >> handles pure negative fine, it's Solr 1.4.1 _dismax_ that does not.
> >> 
> >> Although, here's one, not actually related to this thread,  that DOESN'T
> >> work in Solr 1.4.1 lucene query parser. Curious if it's been fixed in
> >> Solr 3.1.
> >> 
> >> &defType=lucene&q=-one OR -two
> >> 
> >> That one does NOT work as expected in solr 1.4.1, although I can't
> >> explain exactly what it's doing, it's not right. (It returns FEWER
> >> results than "-one" alone, which can't be right algebraically). I think.
> >> So there are still some kinds of negative queries that do weird things.
> >> 
> >> On 5/17/2011 6:29 PM, Markus Jelsma wrote:
> >>> Such a negation works just as one would expect.
> >>> 
> >>> q=*:*
> >>> <result name="response" numFound="158" start="0">
> >>> 
> >>> q=*:*&fq=-type:text/html
> >>> <result name="response" numFound="25" start="0">
> >>> 
> >>> q=*:*&fq=type:text/html
> >>> <result name="response" numFound="133" start="0">
> >>> 
> >>> Well, that adds up, doesn't it ;)
> >>> 
> >>>> 1. I don't think Solr will re-use the filter cache in that situation,
> >>>> although I'm not sure. But I comment anyway because, not what you
> >>>> asked but something else that will trip you up with your example:
> >>>> 
> >>>> 2. In fact, a pure-negative query like that doesn't work _at all_ in
> >>>> the default solr/lucene query parser used for 'fq', at least in Solr
> >>>> 1.4.1. Not sure if it's been improved in 3.1, but I don't think so. 
> >>>> It will always return 0 hits, the solr/lucene query parser can't
> >>>> generate a proper lucene query from a pure negative query like that.
> >>>> 
> >>>> To get around this, you can find a variation of the query that means the
> >>>> same thing but isn't that form. Here's a really ugly one I use, with a
> >>>> nested dismax -- dismax ALSO has trouble with pure negatives, although
> >>>> I think maybe edismax can handle em? But this weird as heck combo
> >>>> works, maybe there's a better way.
> >>>> 
> >>>> NOT _query_:"{!dismax qf=something}history"
> >>>> 
> >>>> And to come around full circle, I have NO idea what effect nested
> >>>> queries have on the filter cache. I think that STILL won't re-use the
> >>>> filter cache.... but I wonder if it'll re-use the _query_ cache for
> >>>> "history"?  I forget even more how the query cache works though.
> >>>> 
> >>>> On 5/17/2011 6:07 PM, Burton-West, Tom wrote:
> >>>>> If I have a query with a filter query such as : " q=art&fq=history"
> >>>>> and then run a second query  "q=art&fq=-history", will Solr realize
> >>>>> that it can use the cached results of the previous filter query
> >>>>> "history"  (in the filter cache) or will it not realize this and
> >>>>> have to actually do a second filter query against the index  for
> >>>>> "not history"?
> >>>>> 
> >>>>> Tom
