Lucene doesn't use a pure Boolean algebra, so things don't always do what one might expect and things like De Morgan's law don't hold.
The source of this comes from the combination of IR prefix notation (+/-) with standard Boolean AND/OR. If you look at the source, there a number of rules that disambiguate things. For example, comments in QueryParser.java indicate +a OR b is changed to a OR b as a way to combine + with OR. This isn't the only interpretation one could make, but it's the one chosen. As far as I know, this kind of disambiguation isn't documented anywhere. It's not in queryparsersyntax.html. -----Original Message----- From: Walt Stoneburner [mailto:[EMAIL PROTECTED] Sent: Monday, April 09, 2007 5:38 PM To: java-user@lucene.apache.org Subject: Re: Standard Parser Behavior Otis Gospodnetic <[EMAIL PROTECTED]> responded to Walt Stoneburner: > Purely negative queries don't work. Example: -A will not find all documents that do not have "A". What I'm trying to do is augment an existing query by appending qualifiers. If I search for +HORSE -RADISH, I should get only documents with horses but no radishes. So far so good. Now, if I do +HORSE -RADISH -SHOE, I should get only documents with horses, but none of those will have radish or shoe. Again, so far so good. What happens when I write +HORSE -(RADISH SHOE)? My first impression is that it is exactly like specifying the terms ... however, the boolean expression in me it telling me that's NOT( RADISH OR SHOE) ... which, from the programming example I sited before means a document with HORSE SHOE in it may actually get returned, because RADISH was not in it. Effectively this expression would only exclude documents if they mentioned both RADISH and SHOE in them. ...however, Lucene seems to be doing the expected, not the mathematic solution. That weirds me out. Why? Because if I change -(RADISH SHOE) to +(RADISH SHOE) than does that that say I must have either a radish or a shoe, ...both are not necessary? Put another way, if +(A B) means I must have at least an A or B (or both), but +A +B means I have to have both.... Then why doesn't -(X Y) mean I must not have an X or Y (or not both), where -X -Y means I can't have either. Did this simplify things or make them more muddy? -wls --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]