Hi,

In fact, FilteredQuery(MatchAllDocsQuery, filter) should already have been
rewritten to a ConstantScoreQuery, but for some unknown reason Mike McCandless
removed that rewrite in https://issues.apache.org/jira/browse/LUCENE-5418.
Because of this, it is better to do it as I said before (use
ConstantScoreQuery).
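
To illustrate why two MUST clauses are cheap to combine, here is a minimal
plain-Java sketch of leap-frog intersection. It is not Lucene code: the class
name LeapFrogSketch, the methods intersect and advance, and the sample doc IDs
are all made up for illustration, and sorted int arrays stand in for Lucene's
doc-ID iterators.

```java
import java.util.ArrayList;
import java.util.List;

public class LeapFrogSketch {

    /** Intersects two ascending doc-ID arrays; each side skips ahead to the other's candidate. */
    public static List<Integer> intersect(int[] a, int[] b) {
        List<Integer> hits = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < a.length && j < b.length) {
            if (a[i] == b[j]) {
                hits.add(a[i]);          // both clauses match this doc
                i++;
                j++;
            } else if (a[i] < b[j]) {
                i = advance(a, i, b[j]); // a is behind: jump to b's candidate
            } else {
                j = advance(b, j, a[i]); // b is behind: jump to a's candidate
            }
        }
        return hits;
    }

    /** Returns the first index >= from whose value is >= target, or docs.length if none. */
    public static int advance(int[] docs, int from, int target) {
        int lo = from;
        int hi = docs.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (docs[mid] < target) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        int[] termDocs = {2, 5, 9, 14, 21};            // hypothetical TermQuery matches
        int[] filterDocs = {0, 1, 2, 3, 9, 10, 21, 28}; // hypothetical filter matches
        System.out.println(intersect(termDocs, filterDocs));
    }
}
```

Neither side walks every document here; each one only moves forward to the
next candidate the other side proposes, which is the behavior referred to as
leap-frogging.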

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


> -----Original Message-----
> From: Uwe Schindler [mailto:u...@thetaphi.de]
> Sent: Wednesday, March 11, 2015 8:07 PM
> To: java-user@lucene.apache.org
> Subject: RE: Filtering question
> 
> Hi,
> 
> BooleanQuery:
> -- Clause 1: TermQuery
> -- Clause 2: FilteredQuery
> ----- Branch 1: MatchAllDocsQuery()
> ----- Branch 2: MyNDVFilter
> 
> 
> Why does it look like this? Clause 2 should simply be
> ConstantScoreQuery(MyNDVFilter). In that case the BooleanQuery will
> execute more efficiently: with 2 MUST clauses it will leap-frog.
> 
> The reason for this behavior is the way FilteredQuery executes: a filter
> is seen as cheap, so it is applied down low. If it supports bits() access
> (instead of an iterator), it will be passed as acceptDocs to the query (a
> MatchAllDocsQuery).
> 
> If you also apply the TermFilter on the top-level IndexSearcher (which
> internally rewrites to FilteredQuery(query, filter)), the documents matching
> the TermFilter will be passed as acceptDocs to your BooleanQuery, which
> will also pass them down to MyNDVFilter.
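
A filter only benefits from this if it actually consults acceptDocs. The
following is a minimal plain-Java sketch of that idea, not Lucene code:
java.util.BitSet stands in for Lucene's Bits/FixedBitSet, a String[] stands in
for the doc values, and the names AcceptDocsSketch and matchingDocs are
invented for illustration. It assumes that a null acceptDocs means "no
documents are excluded", which is also why the null case is checked before
calling get().

```java
import java.util.BitSet;

public class AcceptDocsSketch {

    /**
     * Marks docs whose tag equals matchTag, skipping docs that acceptDocs excludes.
     * A null acceptDocs is treated as "every doc is accepted".
     */
    public static BitSet matchingDocs(String[] tagsByDoc, String matchTag, BitSet acceptDocs) {
        BitSet result = new BitSet(tagsByDoc.length);
        for (int doc = 0; doc < tagsByDoc.length; doc++) {
            if (acceptDocs != null && !acceptDocs.get(doc)) {
                continue; // excluded upstream, so its value is never read
            }
            String tag = tagsByDoc[doc];
            if (tag != null && tag.equals(matchTag)) {
                result.set(doc);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] tags = {"red", "blue", "red", "red", "blue"}; // hypothetical per-doc values
        BitSet accept = new BitSet(tags.length);
        accept.set(0);
        accept.set(1);
        accept.set(3);
        System.out.println(matchingDocs(tags, "red", accept)); // doc 2 is skipped
        System.out.println(matchingDocs(tags, "red", null));   // no restriction applied
    }
}
```

The null check is the part that matters for the NPE reported earlier in the
thread: calling get() on a null acceptDocs fails, while treating null as
"accept everything" keeps the loop valid in both cases.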
> 
> Uwe
> 
> 
> 
> > -----Original Message-----
> > From: Chris Bamford [mailto:ch...@chrisbamford.plus.com]
> > Sent: Wednesday, March 11, 2015 6:39 PM
> > To: java-user@lucene.apache.org
> > Subject: Re: Filtering question
> >
> > Additional -
> > I'm on lucene 4.10.2
> >
> > If I use a BooleanFilter as per Ian's suggestion I still get a null
> > acceptDocs being passed to my NDV filter.
> >
> >
> > > On 11 Mar 2015, at 17:19, Chris Bamford <ch...@bammers.net> wrote:
> > >
> > > Hi Shai
> > >
> > > I thought that might be what acceptDocs was for, but in my case it
> > > is null and throws an NPE if I try your suggestion.
> > >
> > > What am I doing wrong? I'd like to really understand this stuff ..
> > >
> > > Thanks
> > >
> > > Chris
> > >
> > >
> > >> On 11 Mar 2015, at 13:05, Shai Erera <ser...@gmail.com> wrote:
> > >>
> > >> I don't see that you use acceptDocs in your MyNDVFilter. I think it
> > >> would return false for all userB docs, but you should confirm that.
> > >>
> > >> Anyway, because you use an NDV field, you can't automatically skip
> > >> unrelated documents, but rather your code would look something like:
> > >>
> > >> for (int i = 0; i < reader.maxDoc(); i++) {
> > >>   if (!acceptDocs.get(i)) {
> > >>     continue;
> > >>   }
> > >>   // document is accepted, read values ...
> > >> }
> > >>
> > >> Shai
> > >>
> > >>> On Wed, Mar 11, 2015 at 1:25 PM, Ian Lea <ian....@gmail.com> wrote:
> > >>>
> > >>> Can you use a BooleanFilter (or ChainedFilter in 4.x) alongside your
> > >>> BooleanQuery? Seems more logical and I suspect would solve the
> > >>> problem. Caching filters can be good too, depending on how often your
> > >>> data changes. See CachingWrapperFilter.
> > >>>
> > >>> --
> > >>> Ian.
> > >>>
> > >>>
> > >>> On Tue, Mar 10, 2015 at 12:45 PM, Chris Bamford <cbamf...@mimecast.com> wrote:
> > >>>
> > >>>>
> > >>>> Hi,
> > >>>>
> > >>>> I have an index of 30 docs, 20 of which have an owner field of "UserA"
> > >>>> and 10 of "UserB".
> > >>>> I also have a query which consists of:
> > >>>>
> > >>>> BooleanQuery:
> > >>>> -- Clause 1: TermQuery
> > >>>> -- Clause 2: FilteredQuery
> > >>>> ----- Branch 1: MatchAllDocsQuery()
> > >>>> ----- Branch 2: MyNDVFilter
> > >>>>
> > >>>> I execute my search as follows:
> > >>>>
> > >>>> searcher.search(booleanQuery,
> > >>>>                 new TermFilter(new Term("owner", "UserA")),
> > >>>>                 50);
> > >>>>
> > >>>> The TermFilter's job is to reduce the number of searchable
> > >>>> documents from 30 to 20, which it does for all clauses of the
> > >>>> BooleanQuery except for MyNDVFilter, which iterates through the full
> > >>>> 30 docs, 10 needlessly. How can I restrict it so it behaves the same
> > >>>> as the other query branches?
> > >>>>
> > >>>> MyNDVFilter source code:
> > >>>>
> > >>>> public class MyNDVFilter extends Filter {
> > >>>>
> > >>>>     private String fieldName;
> > >>>>     private String matchTag;
> > >>>>
> > >>>>     public MyNDVFilter(String ndvFieldName, String matchTag) {
> > >>>>         this.fieldName = ndvFieldName;
> > >>>>         this.matchTag = matchTag;
> > >>>>     }
> > >>>>
> > >>>>     @Override
> > >>>>     public DocIdSet getDocIdSet(AtomicReaderContext context, Bits acceptDocs)
> > >>>>             throws IOException {
> > >>>>
> > >>>>         AtomicReader reader = context.reader();
> > >>>>         int maxDoc = reader.maxDoc();
> > >>>>         final FixedBitSet bitSet = new FixedBitSet(maxDoc);
> > >>>>         BinaryDocValues ndv = reader.getBinaryDocValues(fieldName);
> > >>>>
> > >>>>         if (ndv != null) {
> > >>>>             // visits every doc in the segment; acceptDocs is not consulted
> > >>>>             for (int i = 0; i < maxDoc; i++) {
> > >>>>                 BytesRef br = ndv.get(i);
> > >>>>                 if (br.length > 0) {
> > >>>>                     String strval = br.utf8ToString();
> > >>>>                     if (strval.equals(matchTag)) {
> > >>>>                         bitSet.set(i);
> > >>>>                         System.out.println("MyNDVFilter >> " + matchTag
> > >>>>                                 + " matched " + i + " [" + strval + "]");
> > >>>>                     }
> > >>>>                 }
> > >>>>             }
> > >>>>         }
> > >>>>
> > >>>>         return new DVDocSetId(bitSet);    // just wraps a FixedBitSet
> > >>>>     }
> > >>>> }
> > >>>>
> > >>>>
> > >>>>
> > >
> > > --------------------------------------------------------------------
> > > - To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > > For additional commands, e-mail: java-user-h...@lucene.apache.org
> > >
> >


