I suggested TermsFilter, not TermFilter :)  Note the sneaky extra s ....

Mike McCandless

http://blog.mikemccandless.com


On Wed, Oct 29, 2014 at 8:20 AM, Pawel Rog <pawelro...@gmail.com> wrote:
> Hi,
> I already tried converting the queries to filters (TermQuery -> TermFilter)
> but didn't see much of a speedup. As I wrote, I wrapped this filter in a
> ConstantScoreQuery, and in another test I used a FilteredQuery with
> MatchAllDocsQuery and a BooleanFilter. Both cases seem to perform about the
> same as a simple BooleanQuery.
> But of course I'll also try TermsFilter. Maybe it will speed up the
> filters.
>
> Michael Sokolov, I haven't prepared any statistics about the number of
> BooleanClauses used or whether there are repeating sets of terms. I think
> I have to collect some stats to better understand what can be improved.
>
> --
> Paweł Róg
>
>
> On Wed, Oct 29, 2014 at 12:30 PM, Michael Sokolov <
> msoko...@safaribooksonline.com> wrote:
>
>> I'm curious to know more about your use case, because I have an idea for
>> something that addresses this, but haven't found the opportunity to develop
>> it yet - maybe somebody else wants to :).  The basic idea is to reduce the
>> number of terms needed to be looked up by collapsing commonly-occurring
>> collections of terms into synthetic "tiles".  If your queries have a lot of
>> overlap, this could greatly reduce the number of terms in a query rewritten
>> to use tiles. It's sort of complex, requires indexing support, or a filter
>> cache, and there's no working implementation as yet, so this is probably
>> not really going to be helpful for you in the short term, but if you can
>> share some information I'd love to know:
>>
>> what kind of things are you searching?
>> how many terms do your larger queries have?
>> do the query terms overlap among your queries?
>>
>> -Mike Sokolov
>>
>>
>> On 10/28/14 9:40 PM, Pawel Rog wrote:
>>
>>> Hi,
>>> I have to run a query with a lot of boolean SHOULD clauses. Queries like
>>> these were of course slow, so I decided to change the query to a filter
>>> wrapped by ConstantScoreQuery, but that didn't help either.
>>>
>>> Profiler shows that most of the time is spent on seekExact in
>>> BlockTreeTermsReader$FieldReader$SegmentTermsEnum
>>>
>>> When I go deeper into the trace, I see that inside seekExact most of the
>>> time is spent in loadBlock and, even deeper, in
>>> ByteBufferIndexInput.clone.
>>>
>>> Do you have any ideas how I can make it faster, or is it not possible
>>> and I have to live with it?
>>>
>>> --
>>> Paweł Róg
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>
>>
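
Michael Sokolov's "tiles" idea quoted above has no working implementation, so
the following is only a minimal sketch of the query-rewriting step, assuming
each tile stands for the union (OR) of its member terms. The names
TileRewriter, addTile and rewrite are hypothetical and are not part of Lucene.
The indexing support or filter cache that would make a tile cheap to look up
is not modeled here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: none of these names exist in Lucene.
final class TileRewriter {

    // Maps a synthetic tile name to the set of real terms it stands for.
    private final Map<String, Set<String>> tiles = new LinkedHashMap<>();

    // Registers a tile; assumed to represent the OR of all its terms.
    void addTile(String tileName, Set<String> terms) {
        if (terms.isEmpty()) {
            throw new IllegalArgumentException("a tile needs at least one term");
        }
        tiles.put(tileName, new LinkedHashSet<>(terms));
    }

    // Greedily replaces each tile whose terms all occur in the query with
    // the tile name. Terms not covered by any tile are kept unchanged.
    List<String> rewrite(Set<String> queryTerms) {
        Set<String> remaining = new LinkedHashSet<>(queryTerms);
        List<String> rewritten = new ArrayList<>();
        for (Map.Entry<String, Set<String>> tile : tiles.entrySet()) {
            if (remaining.containsAll(tile.getValue())) {
                rewritten.add(tile.getKey());
                remaining.removeAll(tile.getValue());
            }
        }
        rewritten.addAll(remaining);
        return rewritten;
    }

    public static void main(String[] args) {
        TileRewriter rewriter = new TileRewriter();
        rewriter.addTile("tile_1", new LinkedHashSet<>(Arrays.asList("red", "green")));
        // A three-term SHOULD query shrinks to two lookups: one tile, one term.
        System.out.println(
            rewriter.rewrite(new LinkedHashSet<>(Arrays.asList("red", "green", "blue"))));
    }
}
```

A tile is only substituted when every one of its terms appears in the query,
so the rewritten disjunction matches the same documents as the original. How
much this helps depends on how much the queries overlap, which is what the
questions above are trying to find out.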
