I suggested TermsFilter, not TermFilter :) Note the sneaky extra s ....

Mike McCandless

http://blog.mikemccandless.com

On Wed, Oct 29, 2014 at 8:20 AM, Pawel Rog <pawelro...@gmail.com> wrote:

> Hi,
> I already tried transforming the queries into filters (TermQuery ->
> TermFilter) but didn't see much of a speedup. As I wrote, I wrapped this
> filter in a ConstantScoreQuery, and in another test I used FilteredQuery
> with MatchAllDocsQuery and BooleanFilter. Both cases seem to perform about
> the same as a simple BooleanQuery.
> But of course I'll also try TermsFilter. Maybe it will speed up the
> filters.
>
> Michael Sokolov, I haven't prepared any statistics about the number of
> BooleanClauses used or whether there are repeating sets of terms. I think
> I have to collect some stats to better understand what can be improved.
>
> --
> Paweł Róg
>
>
> On Wed, Oct 29, 2014 at 12:30 PM, Michael Sokolov <
> msoko...@safaribooksonline.com> wrote:
>
>> I'm curious to know more about your use case, because I have an idea for
>> something that addresses this, but haven't found the opportunity to develop
>> it yet - maybe somebody else wants to :). The basic idea is to reduce the
>> number of terms that need to be looked up by collapsing commonly occurring
>> collections of terms into synthetic "tiles". If your queries have a lot of
>> overlap, this could greatly reduce the number of terms in a query rewritten
>> to use tiles. It's sort of complex, requires indexing support or a filter
>> cache, and there's no working implementation as yet, so this is probably
>> not really going to be helpful for you in the short term, but if you can
>> share some information I'd love to know:
>>
>> what kind of things are you searching?
>> how many terms do your larger queries have?
>> do the query terms overlap among your queries?
>>
>> -Mike Sokolov
>>
>>
>> On 10/28/14 9:40 PM, Pawel Rog wrote:
>>
>>> Hi,
>>> I have to run a query with a lot of boolean should clauses. Queries like
>>> these were of course slow, so I decided to change the query to a filter
>>> wrapped in a ConstantScoreQuery, but that didn't help either.
>>>
>>> The profiler shows that most of the time is spent in seekExact in
>>> BlockTreeTermsReader$FieldReader$SegmentTermsEnum.
>>>
>>> When I go deeper into the trace I see that inside seekExact most of the
>>> time is spent in loadBlock and, even deeper, in
>>> ByteBufferIndexInput.clone.
>>>
>>> Do you have any ideas how I can make it work faster, or is it not
>>> possible and I have to live with it?
>>>
>>> --
>>> Paweł Róg
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>
>>
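
For anyone who wants to experiment with the "tiles" idea from Michael Sokolov's message: the thread says there is no working implementation, so what follows is only a rough sketch of the query-rewriting step, not his design and not a Lucene API. It assumes the tiles have already been chosen and indexed somehow (each tile standing for the OR of its member terms), and it uses plain strings instead of Lucene Term objects. The names TileRewriter, rewrite, tile_colors and tile_rg are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper (not part of Lucene): rewrites the terms of an OR query using precomputed tiles. */
final class TileRewriter {

    /**
     * Greedily replaces groups of query terms with the name of a tile that covers them.
     * A tile stands for the union (OR) of its member terms, so it is only substituted
     * when every one of its members is still among the not-yet-replaced query terms.
     */
    static Set<String> rewrite(Set<String> queryTerms, Map<String, Set<String>> tiles) {
        // Try larger tiles first so each substitution removes as many terms as possible.
        List<Map.Entry<String, Set<String>>> bySize = new ArrayList<>(tiles.entrySet());
        bySize.sort(Comparator
                .comparingInt((Map.Entry<String, Set<String>> e) -> e.getValue().size())
                .reversed());

        Set<String> remaining = new LinkedHashSet<>(queryTerms);
        Set<String> rewritten = new LinkedHashSet<>();
        for (Map.Entry<String, Set<String>> tile : bySize) {
            Set<String> members = tile.getValue();
            if (!members.isEmpty() && remaining.containsAll(members)) {
                remaining.removeAll(members);
                rewritten.add(tile.getKey());
            }
        }
        // Terms not covered by any tile are kept and would be looked up as before.
        rewritten.addAll(remaining);
        return rewritten;
    }

    public static void main(String[] args) {
        // Assumed, hand-picked tiles; a real system would derive them from query statistics.
        Map<String, Set<String>> tiles = new LinkedHashMap<>();
        tiles.put("tile_rg", new LinkedHashSet<>(Arrays.asList("red", "green")));
        tiles.put("tile_colors", new LinkedHashSet<>(Arrays.asList("red", "green", "blue")));

        Set<String> query = new LinkedHashSet<>(Arrays.asList("red", "green", "blue", "cat"));
        System.out.println(rewrite(query, tiles)); // prints [tile_colors, cat]
    }
}
```

In this toy example four term lookups become two. Whether that pays off in practice depends on exactly the statistics Michael asked about: how many terms the larger queries have and how much they overlap across queries.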