"OR allergies IS NULL" would be "OR (*:* -allergies:[* TO *])" in Lucene/Solr.

-- Jack Krupansky

-----Original Message-----
From: Vitaly Funstein
Sent: Thursday, October 25, 2012 8:25 PM
To: java-user@lucene.apache.org
Subject: Re: query for documents WITHOUT a field?

Sorry for resurrecting an old thread, but how would one go about writing a
Lucene query similar to this?

SELECT * FROM patient WHERE first_name = 'Zed' OR allergies IS NULL

An AND case would be easy since one would just use a simple TermQuery with
a FieldValueFilter added, but what about other boolean cases? Admittedly,
this is a contrived example, but the point here is that it seems that since
filters are always applied to results after they are returned, how would
one go about making the null-ness of a field part of the query logic?
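
One way to see why the "(*:* -allergies:[* TO *])" form works as an OR
operand: a purely negative clause matches nothing on its own in Lucene, so
the "missing" set has to be built as "all documents minus documents that have
the field". The sketch below models only that set logic with java.util.BitSet.
It is not Lucene API; the class and method names (OrMissingFieldSketch,
termOrMissing) and the sample doc ids are assumptions made up for
illustration.

```java
import java.util.BitSet;

// Toy model only: plain bitsets stand in for the doc-id sets a query produces.
class OrMissingFieldSketch {

    // Returns docs matching: termMatches OR (allDocs AND NOT hasField).
    static BitSet termOrMissing(BitSet termMatches, BitSet hasField, int maxDoc) {
        BitSet missing = new BitSet(maxDoc);
        missing.set(0, maxDoc);       // *:*  (every document)
        missing.andNot(hasField);     // -allergies:[* TO *]
        BitSet result = (BitSet) termMatches.clone();
        result.or(missing);           // first_name:Zed OR (...)
        return result;
    }

    public static void main(String[] args) {
        int maxDoc = 4;
        BitSet zed = new BitSet(maxDoc);
        zed.set(0);                   // doc 0 has first_name = Zed
        BitSet hasAllergies = new BitSet(maxDoc);
        hasAllergies.set(0);          // docs 0 and 1 have an allergies value
        hasAllergies.set(1);
        System.out.println(termOrMissing(zed, hasAllergies, maxDoc)); // {0, 2, 3}
    }
}
```

Doc 1 is the only one excluded: it is not Zed and it does have an allergies
value.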

On Thu, Feb 16, 2012 at 1:45 PM, Uwe Schindler <u...@thetaphi.de> wrote:

I already mentioned that pseudo NULL term, but the user asked for another
solution...
--
Uwe Schindler
H.-H.-Meier-Allee 63, 28213 Bremen
http://www.thetaphi.de



Jamie Johnson <jej2...@gmail.com> wrote:

Another possible solution is to insert a custom token at indexing time,
one that cannot otherwise show up in the index, and then filter on
that token.
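
A minimal sketch of that idea, using a toy in-memory inverted index rather
than Lucene itself. Everything here is hypothetical: the "__NULL__" sentinel,
the NullTokenSketch class, and its index/docsMissing methods. The sentinel
only works if it can never occur as a real value of the field, which is
exactly the constraint the original poster was worried about.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Toy inverted index (field -> term -> doc ids). Not Lucene API.
class NullTokenSketch {
    // Hypothetical sentinel; must never occur as a real field value.
    static final String NULL_TOKEN = "__NULL__";

    private final Map<String, Map<String, BitSet>> postings = new HashMap<>();

    // Index one field of one document; a null value is stored as the sentinel.
    void index(int docId, String field, String valueOrNull) {
        String term = (valueOrNull == null) ? NULL_TOKEN : valueOrNull;
        postings.computeIfAbsent(field, f -> new HashMap<>())
                .computeIfAbsent(term, t -> new BitSet())
                .set(docId);
    }

    // "field IS NULL" becomes an ordinary term lookup on the sentinel.
    BitSet docsMissing(String field) {
        Map<String, BitSet> terms = postings.get(field);
        BitSet hits = (terms == null) ? null : terms.get(NULL_TOKEN);
        return (hits == null) ? new BitSet() : (BitSet) hits.clone();
    }

    public static void main(String[] args) {
        NullTokenSketch idx = new NullTokenSketch();
        idx.index(0, "allergies", "peanuts");
        idx.index(1, "allergies", null);
        idx.index(2, "allergies", null);
        System.out.println(idx.docsMissing("allergies")); // {1, 2}
    }
}
```

The lookup costs the same as any other single-term query, with no scan over
the field's terms.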


On Thu, Feb 16, 2012 at 4:41 PM, Uwe Schindler <u...@thetaphi.de> wrote:
> As the documentation states:
> Lucene is an inverted index that does not have per-document fields. It only
> knows terms pointing to documents. The query you are looking for is one
> that returns all documents which have no term in the field. To execute this
> query, Lucene has to get the term index, iterate all terms of the field,
> mark their documents in a bitset, and negate that bitset. The filter/query I
> told you about uses the FieldCache to do this. Since 3.6 (also in 3.5, but
> there it is buggy and the API is different) there is another FieldCache
> entry that returns exactly that bitset. The filter mentioned only uses that
> bitset from this new FieldCache. The FieldCache is populated on first access
> and stays alive as long as the underlying index segment is open (that is, as
> long as the IndexReader is open and that part of the index has not been
> refreshed). If you are also sorting on your fields or running other queries
> that use the FieldCache, there is no overhead; otherwise the bitset is
> populated on first access to the filter.
>
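
To make the scan-and-negate step described above concrete, here is a small
sketch in plain Java. It is a toy model and not how Lucene is implemented
internally: the map of term to BitSet stands in for one field's term index,
and the names TermScanSketch and docsWithoutField are assumptions for
illustration only.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of "iterate all terms of a field, mark docs, negate".
class TermScanSketch {

    // termsOfField maps each term of ONE field to the docs containing it.
    static BitSet docsWithoutField(Map<String, BitSet> termsOfField, int maxDoc) {
        BitSet hasValue = new BitSet(maxDoc);
        for (BitSet docs : termsOfField.values()) {
            hasValue.or(docs);        // mark every doc that has any term
        }
        hasValue.flip(0, maxDoc);     // negate: docs with no term at all
        return hasValue;
    }

    public static void main(String[] args) {
        Map<String, BitSet> allergies = new LinkedHashMap<>();
        BitSet peanuts = new BitSet();
        peanuts.set(0);
        BitSet pollen = new BitSet();
        pollen.set(0);
        pollen.set(3);
        allergies.put("peanuts", peanuts);
        allergies.put("pollen", pollen);
        System.out.println(docsWithoutField(allergies, 5)); // {1, 2, 4}
    }
}
```

The loop touches every term of the field, which is why the first access is
expensive and why holding on to the resulting bitset pays off on later
accesses.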
> Lucene 3.5 has no easy way to implement that filter; a "NULL" pseudo term
> is the only solution (and it is also much faster than the first, uncached
> access in Lucene 3.6). Later accesses that hit the cache in 3.6 will be
> faster, of course.
>
> Another hacky way to achieve the same result (works with almost any
> Lucene version) is a BooleanQuery consisting of MatchAllDocsQuery() as a
> MUST clause and PrefixQuery(new Term(field, "")) as a MUST_NOT clause. But
> the PrefixQuery will do a full term index scan without caching :-). You may
> use CachingWrapperFilter with PrefixFilter instead.
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
>> -----Original Message-----
>> From: Tim Eck [mailto:tim...@gmail.com]
>> Sent: Thursday, February 16, 2012 10:14 PM
>> To: java-user@lucene.apache.org
>> Subject: RE: query for documents WITHOUT a field?
>>
>> Thanks for the fast response. I'll certainly have a look at the upcoming
>> 3.6.x release. What is the expected performance for using a negated filter?
>> In particular, does it defeat the index in any way and require a full index
>> scan? Is it different between regular fields and numeric fields?
>>
>> For 3.5 and earlier though, is there any suggestion other than magic
>> values?
>>
>> -----Original Message-----
>> From: Uwe Schindler [mailto:u...@thetaphi.de]
>> Sent: Thursday, February 16, 2012 1:07 PM
>> To: java-user@lucene.apache.org
>> Subject: RE: query for documents WITHOUT a field?
>>
>> Lucene 3.6 will have a FieldValueFilter that can be negated:
>>
>> Query q = new ConstantScoreQuery(new FieldValueFilter("field", true));
>>
>> (see http://goo.gl/wyjxn)
>>
>> Lucene 3.5 does not yet have it; you can download 3.6 snapshots from
>> Jenkins: http://goo.gl/Ka0gr
>>
>> -----
>> Uwe Schindler
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>> http://www.thetaphi.de
>> eMail: u...@thetaphi.de
>>
>>
>> > -----Original Message-----
>> > From: Tim Eck [mailto:t...@terracottatech.com]
>> > Sent: Thursday, February 16, 2012 9:59 PM
>> > To: java-user@lucene.apache.org
>> > Subject: query for documents WITHOUT a field?
>> >
>> > My apologies if this answer is readily available someplace; I've
>> > searched around and not found a definitive answer.
>> >
>> > I'd like to run a query for documents that _do not_ contain particular
>> > indexed fields, to implement something like a SQL query where a column
>> > is null.
>> >
>> > I understand I could possibly use a magic value to represent "null", but
>> > the data I'm searching doesn't lend itself to reserving a value for null.
>> > I also understand I could add an extra field to hold this boolean isNull
>> > state, but would love a better solution :-)
>> >
>> >
>> >
>> > TIA
>> >
>> >
>>
>>
>>
>>_____________________________________________

>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org