BooleanQuery is subject to the 1024 limit on the number of clauses, so you can't use it in that case. You should use TermsQuery/TermsFilter instead.
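To illustrate why a terms-set approach copes with very many terms, here is a minimal sketch in plain Java. It is a toy model, not Lucene code: the class `TermSetSketch`, the method `matchAny`, and the map-based "index" are all hypothetical names made up for this example. It only shows the general idea of visiting the terms one at a time, in sorted order, and OR-ing each postings list into a single bit set, so only one postings list is open at any moment instead of one iterator per clause.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Toy model only: a hypothetical stand-in for an inverted index, not a Lucene API.
class TermSetSketch {

    // postings maps a term to the doc IDs that contain it (assumed layout).
    public static BitSet matchAny(Map<String, int[]> postings,
                                  Collection<String> terms,
                                  int maxDoc) {
        BitSet matches = new BitSet(maxDoc);
        // Sort and de-duplicate the terms, then process them one by one.
        for (String term : new TreeSet<>(terms)) {
            int[] docs = postings.get(term);
            if (docs == null) {
                continue; // term is absent from the index
            }
            // OR this term's postings into the shared bit set.
            for (int doc : docs) {
                matches.set(doc);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        Map<String, int[]> postings = new HashMap<>();
        postings.put("value1", new int[] {0, 3});
        postings.put("value2", new int[] {3, 5});
        postings.put("value3", new int[] {7});

        List<String> wanted = Arrays.asList("value2", "value1", "missing");
        System.out.println(matchAny(postings, wanted, 8)); // prints {0, 3, 5}
    }
}
```

The cost of this shape is that every matching document of every term is consumed, even when the result is later intersected with a much more selective query.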

On Wed, Jul 19, 2017 at 13:52, Kumaran Ramasubramanian <kums....@gmail.com> wrote:

> Hi Adrien,
>
> I have tried BooleanQuery with ConstantScoreQuery, based on the suggestion from this link:
>
> http://lucene.472066.n3.nabble.com/BooleanFilter-vs-BooleanQuery-performance-td4106920.html
>
> > If you want it fast, use BooleanQuery and wrap it with ConstantScoreQuery.
> > Then there is also no scoring done (in most cases, older BooleanQuery
> > sometimes still calculated the score).
>
> > 3. If I disable the scoring process using ConstantScoreQuery, is it
> > possible to give more than 1024 query clauses?
> > I tried this, but I am still getting java.lang.OutOfMemoryError. Why?
> >
> > java.lang.OutOfMemoryError: Java heap space
> >     at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsEnum.<init>(Lucene41PostingsReader.java:345)
> >     at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.docs(Lucene41PostingsReader.java:254)
> >     at org.apache.lucene.codecs.blocktree.SegmentTermsEnum.docs(SegmentTermsEnum.java:999)
> >     at org.apache.lucene.index.TermsEnum.docs(TermsEnum.java:149)
> >     at org.apache.lucene.search.TermQuery$TermWeight.scorer(TermQuery.java:84)
> >     at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:356)
> >     at org.apache.lucene.search.ConstantScoreQuery$ConstantWeight.scorer(ConstantScoreQuery.java:164)
> >     at org.apache.lucene.search.FilteredQuery$RandomAccessFilterStrategy.filteredScorer(FilteredQuery.java:542)
> >     at org.apache.lucene.search.FilteredQuery$FilterStrategy.filteredBulkScorer(FilteredQuery.java:504)
> >     at org.apache.lucene.search.FilteredQuery$1.bulkScorer(FilteredQuery.java:150)
>
> If I use BooleanQuery and wrap it with ConstantScoreQuery, can I use 1 lakh
> boolean clauses in the BooleanQuery?
>
> --
> Kumaran R
>
> On Wed, Jul 19, 2017 at 8:26 AM, Kumaran Ramasubramanian <kums....@gmail.com> wrote:
>
> > Thank you Adrien :-)
> >
> > On 18-Jul-2017 3:21 PM, "Adrien Grand" <jpou...@gmail.com> wrote:
> >
> > > Sorry for the confusion, I keep saying query in all cases because queries
> > > and filters got merged in Lucene 5.0. If you are using BooleanFilter rather
> > > than BooleanQuery with Lucene 4 then things should be mostly ok if you have
> > > many clauses. But like TermsQuery, BooleanFilter always consumes all
> > > matching documents from all its clauses. So if you intersect it with a
> > > selective query, it is wasteful.
> > >
> > > On Tue, Jul 18, 2017 at 11:42, Kumaran Ramasubramanian <kums....@gmail.com> wrote:
> > >
> > > > Hi Adrien,
> > > >
> > > > Thanks for your input...
> > > >
> > > > > 1. Using boolean filters works even for 1 lakh filter clauses in a
> > > > > BooleanFilter. Is there any consequence of using filters in this case?
> > > > > Shall I proceed with this?
> > > >
> > > > This is the code snippet I used for statement 1:
> > > >
> > > >     BooleanFilter boolFilter = new BooleanFilter();
> > > >     for (int i = 0; i < 100000; i++) {
> > > >         // One single-term filter per clause, all combined with SHOULD (OR).
> > > >         Term term = new Term("key" + i, "value" + i);
> > > >         TermsFilter filter = new TermsFilter(term);
> > > >         FilterClause filterClause = new FilterClause(filter, BooleanClause.Occur.SHOULD);
> > > >         boolFilter.add(filterClause);
> > > >     }
> > > >
> > > > Do you see any problem in using TermsFilter over TermsQuery?
> > > >
> > > > By the way, I will test with TermsQuery and let you know.
> > > >
> > > > --
> > > > Kumaran R
> > > >
> > > > On Tue, Jul 18, 2017 at 1:59 AM, Adrien Grand <jpou...@gmail.com> wrote:
> > > >
> > > > > Could you use TermInSetQuery (TermsQuery in older Lucene versions)? It is
> > > > > worse at skipping over matches than a BooleanQuery but keeps memory
> > > > > usage low and disk access sequential, unlike large boolean queries.
> > > > >
> > > > > Otherwise you would probably need to rethink how you design your documents
> > > > > in order to be able to run simpler queries.
> > > > >
> > > > > On Mon, Jul 17, 2017 at 16:28, Kumaran Ramasubramanian <kums....@gmail.com> wrote:
> > > > >
> > > > > > Hi All,
> > > > > >
> > > > > > I am using Lucene 4.10.4.
> > > > > >
> > > > > > In Lucene search, I know there is a limit of 1024 on the number of
> > > > > > boolean query clauses. I know we can increase this limit, but I want
> > > > > > to understand queries vs filters in Lucene 4.10.4.
> > > > > >
> > > > > > I want to make queries larger than 1024 clauses. Relevance is not needed
> > > > > > for me. What are the best possible options?
> > > > > >
> > > > > > 1. Using boolean filters works even for 1 lakh filter clauses in a
> > > > > > BooleanFilter. Is there any consequence of using filters in this case?
> > > > > > Shall I proceed with this?
> > > > > >
> > > > > > 2. If I give very little memory to filters, the search still manages to
> > > > > > complete after many GC cycles. Why can't we do the same for query
> > > > > > clauses too? What is the actual technical reason for the 1024 limit in
> > > > > > boolean queries?
> > > > > >
> > > > > > 3. If I disable the scoring process using ConstantScoreQuery, is it
> > > > > > possible to give more than 1024 query clauses?
> > > > > > I tried this, but I am still getting java.lang.OutOfMemoryError. Why?
> > > > > >
> > > > > > java.lang.OutOfMemoryError: Java heap space
> > > > > >     at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsEnum.<init>(Lucene41PostingsReader.java:345)
> > > > > >     at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.docs(Lucene41PostingsReader.java:254)
> > > > > >     at org.apache.lucene.codecs.blocktree.SegmentTermsEnum.docs(SegmentTermsEnum.java:999)
> > > > > >     at org.apache.lucene.index.TermsEnum.docs(TermsEnum.java:149)
> > > > > >     at org.apache.lucene.search.TermQuery$TermWeight.scorer(TermQuery.java:84)
> > > > > >     at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:356)
> > > > > >     at org.apache.lucene.search.ConstantScoreQuery$ConstantWeight.scorer(ConstantScoreQuery.java:164)
> > > > > >     at org.apache.lucene.search.FilteredQuery$RandomAccessFilterStrategy.filteredScorer(FilteredQuery.java:542)
> > > > > >     at org.apache.lucene.search.FilteredQuery$FilterStrategy.filteredBulkScorer(FilteredQuery.java:504)
> > > > > >     at org.apache.lucene.search.FilteredQuery$1.bulkScorer(FilteredQuery.java:150)
> > > > > >
> > > > > > Any pointers are much appreciated. Thank you.
> > > > > >
> > > > > > --
> > > > > > Kumaran R