Hi,

In fact, the FilteredQuery(MatchAllDocsQuery, ...) with the filter should already have been rewritten to a ConstantScoreQuery, but for some unknown reason Mike McCandless removed that rewrite in https://issues.apache.org/jira/browse/LUCENE-5418. Because of this, it is better to do it as I said before (use ConstantScoreQuery).
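For readers unfamiliar with the "leap-frog" behaviour mentioned in the quoted mail below for two MUST clauses, here is a rough plain-Java model of the idea. It is not Lucene code: the class LeapFrogSketch, its method names, and the use of sorted int arrays to stand in for the TermQuery and filter doc-id iterators are all assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java model of a leap-frog conjunction. Hypothetical names; not Lucene API. */
final class LeapFrogSketch {

    /** Returns the index of the first element >= target at or after 'from', or a.length if none. */
    static int advance(int[] a, int from, int target) {
        int i = from;
        while (i < a.length && a[i] < target) {
            i++;
        }
        return i;
    }

    /**
     * Intersects two ascending doc-id lists. Whichever side is behind
     * skips forward to the other side's current doc id, so neither list
     * is walked over the full doc-id range.
     */
    static List<Integer> intersect(int[] termDocs, int[] filterDocs) {
        List<Integer> hits = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < termDocs.length && j < filterDocs.length) {
            if (termDocs[i] == filterDocs[j]) {
                hits.add(termDocs[i]); // both clauses match this doc
                i++;
                j++;
            } else if (termDocs[i] < filterDocs[j]) {
                i = advance(termDocs, i, filterDocs[j]);
            } else {
                j = advance(filterDocs, j, termDocs[i]);
            }
        }
        return hits;
    }
}
```

For example, LeapFrogSketch.intersect(new int[]{1, 4, 7, 20, 25}, new int[]{4, 5, 20, 29}) returns [4, 20]. The linear scan in advance() is only there to keep the sketch short; a real iterator would typically skip more cleverly.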

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Uwe Schindler [mailto:u...@thetaphi.de]
> Sent: Wednesday, March 11, 2015 8:07 PM
> To: java-user@lucene.apache.org
> Subject: RE: Filtering question
>
> Hi,
>
> BooleanQuery:
> -- Clause 1: TermQuery
> -- Clause 2: FilteredQuery
> ----- Branch 1: MatchAllDocsQuery()
> ----- Branch 2: MyNDVFilter
>
>
> Why does it look like this? Clause 2 should simply be:
> ConstantScoreQuery(MyNDVFilter) In that case the BooleanQuery will
> execute more effectively, in case of 2 MUST clauses it will leap-frog.
>
> The reason for this behavior is the way how FilteredQuery executes: A filter
> is seen as cheap, so it is applied down low. If it supports Bits() access
> (instead of an iterator), it will be passed as acceptDocs to the query (a
> MatchAllDocsQuery).
>
> If you also apply the TermsFilter on the top level IndexSearcher (which
> internally rewrites to FilteredQuery(query, filter)), the documents matching
> the TermsFilter will be applied as acceptDocs by your BooleanQuery, which
> will pass it also down to the MyNDVFilter.
>
> Uwe
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
> > -----Original Message-----
> > From: Chris Bamford [mailto:ch...@chrisbamford.plus.com]
> > Sent: Wednesday, March 11, 2015 6:39 PM
> > To: java-user@lucene.apache.org
> > Subject: Re: Filtering question
> >
> > Additional -
> > I'm on lucene 4.10.2
> >
> > If I use a BooleanFilter as per Ian's suggestion I still get a null
> > acceptDocs being passed to my NDV filter.
> >
> >
> > Sent from my iPhone
> >
> > > On 11 Mar 2015, at 17:19, Chris Bamford <ch...@bammers.net> wrote:
> > >
> > > Hi Shai
> > >
> > > I thought that might be what acceptDocs was for, but in my case it
> > > is null and throws a NPE if I try your suggestion.
> > >
> > > What am I doing wrong?
> > > I'd like to really understand this stuff ..
> > >
> > > Thanks
> > >
> > > Chris
> > >
> > >
> > >> On 11 Mar 2015, at 13:05, Shai Erera <ser...@gmail.com> wrote:
> > >>
> > >> I don't see that you use acceptDocs in your MyNDVFilter. I think it
> > >> would return false for all userB docs, but you should confirm that.
> > >>
> > >> Anyway, because you use an NDV field, you can't automatically skip
> > >> unrelated documents, but rather your code would look something like:
> > >>
> > >> for (int i = 0; i < reader.maxDoc(); i++) {
> > >>   if (!acceptDocs.get(i)) {
> > >>     continue;
> > >>   }
> > >>   // document is accepted, read values ...
> > >> }
> > >>
> > >> Shai
> > >>
> > >>> On Wed, Mar 11, 2015 at 1:25 PM, Ian Lea <ian....@gmail.com> wrote:
> > >>>
> > >>> Can you use a BooleanFilter (or ChainedFilter in 4.x) alongside your
> > >>> BooleanQuery? Seems more logical and I suspect would solve the
> > >>> problem.
> > >>> Caching filters can be good too, depending on how often your data
> > >>> changes.
> > >>> See CachingWrapperFilter.
> > >>>
> > >>> --
> > >>> Ian.
> > >>>
> > >>>
> > >>> On Tue, Mar 10, 2015 at 12:45 PM, Chris Bamford
> > >>> <cbamf...@mimecast.com> wrote:
> > >>>
> > >>>>
> > >>>> Hi,
> > >>>>
> > >>>> I have an index of 30 docs, 20 of which have an owner field of "UserA"
> > >>>> and 10 of "UserB".
> > >>>> I also have a query which consists of:
> > >>>>
> > >>>> BooleanQuery:
> > >>>> -- Clause 1: TermQuery
> > >>>> -- Clause 2: FilteredQuery
> > >>>> ----- Branch 1: MatchAllDocsQuery()
> > >>>> ----- Branch 2: MyNDVFilter
> > >>>>
> > >>>> I execute my search as follows:
> > >>>>
> > >>>> searcher.search( booleanQuery,
> > >>>>                  new TermFilter(new Term("owner", "UserA")),
> > >>>>                  50);
> > >>>>
> > >>>> The TermFilter's job is to reduce the number of searchable
> > >>>> documents from 30 to 20, which it does for all clauses of the
> > >>>> BooleanQuery except for MyNDVFilter, which iterates through the
> > >>>> full 30 docs, 10 needlessly.
> > >>>> How can I restrict it so it behaves the same as the other query
> > >>>> branches?
> > >>>>
> > >>>> MyNDVFilter source code:
> > >>>>
> > >>>> public class MyNDVFilter extends Filter {
> > >>>>
> > >>>>     private String fieldName;
> > >>>>     private String matchTag;
> > >>>>
> > >>>>     public MyNDVFilter(String ndvFieldName, String matchTag) {
> > >>>>         this.fieldName = ndvFieldName;
> > >>>>         this.matchTag = matchTag;
> > >>>>     }
> > >>>>
> > >>>>     @Override
> > >>>>     public DocIdSet getDocIdSet(AtomicReaderContext context, Bits
> > >>>> acceptDocs) throws IOException {
> > >>>>
> > >>>>         AtomicReader reader = context.reader();
> > >>>>         int maxDoc = reader.maxDoc();
> > >>>>         final FixedBitSet bitSet = new FixedBitSet(maxDoc);
> > >>>>         BinaryDocValues ndv = reader.getBinaryDocValues(fieldName);
> > >>>>
> > >>>>         if (ndv != null) {
> > >>>>             for (int i = 0; i < maxDoc; i++) {
> > >>>>                 BytesRef br = ndv.get(i);
> > >>>>                 if (br.length > 0) {
> > >>>>                     String strval = br.utf8ToString();
> > >>>>                     if (strval.equals(matchTag)) {
> > >>>>                         bitSet.set(i);
> > >>>>                         System.out.println("MyNDVFilter >> " +
> > >>>> matchTag + " matched " + i + " [" + strval + "]");
> > >>>>                     }
> > >>>>                 }
> > >>>>             }
> > >>>>         }
> > >>>>
> > >>>>         return new DVDocSetId(bitSet); // just wraps a FixedBitSet
> > >>>>     }
> > >>>> }
> > >>>>
> > >>>>
> > >>>> Chris Bamford
> > >>>> Senior Developer

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org