RE: Increase search performance

Atul Bisaria Thu, 01 Feb 2018 03:31:50 -0800

Hi Adrien,

Thanks for your reply.


I have also tried testing with UsageTrackingQueryCachingPolicy, but did not 
observe a significant change in either latency or throughput.

Given that I have specific search requirements of no scoring and sorting the 
search results in a random order (reason for custom sort object), I have also 
explored writing a custom collector and could observe quite a difference in 
latency figures.

Let me know if this custom collector code has any loopholes which I could be 
missing:

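import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.search.SimpleCollector;

class RandomOrderCollector extends SimpleCollector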
{
        private int maxHitsRequired;
        private int docBase;

        private List<Integer> matches = new ArrayList<Integer>();

        public RandomOrderCollector(int maxHitsRequired)
        {
                this.maxHitsRequired = maxHitsRequired;
        }

        @Override
        public boolean needsScores()
        {
                return false;
        }

        @Override
        public void collect(int doc) throws IOException
        {
                matches.add(docBase + doc);
        }

        @Override
        protected void doSetNextReader(LeafReaderContext context) throws IOException
        {
                super.doSetNextReader(context);
                this.docBase = context.docBase;
        }

        public List<Integer> getHits()
        {
                Collections.shuffle(matches);
                // Use a local limit so repeated calls do not shrink maxHitsRequired.
                int limit = Math.min(matches.size(), maxHitsRequired);

                return matches.subList(0, limit);
        }
}
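
One possible concern with the collector above is that it stores every 
matching doc ID before shuffling, so memory grows with the number of matches. 
A memory-bounded alternative would be reservoir sampling, which keeps at most 
maxHitsRequired IDs at any time. The following is only a sketch in plain Java 
with no Lucene dependency: ReservoirSampler, offer() and sample() are 
hypothetical names, not Lucene APIs, and it assumes collect() would call 
offer(docBase + doc).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper (not part of Lucene): uniform random sample of at most k doc IDs. */
final class ReservoirSampler {
    private final int k;
    private final Random random;
    private final List<Integer> reservoir;
    private long seen = 0; // number of doc IDs offered so far

    ReservoirSampler(int k, Random random) {
        this.k = Math.max(0, k);
        this.random = random;
        this.reservoir = new ArrayList<>(this.k);
    }

    /** Offer one matching doc ID; memory use never exceeds k entries. */
    void offer(int docId) {
        seen++;
        if (reservoir.size() < k) {
            // Fill phase: the first k IDs are always kept.
            reservoir.add(docId);
        } else {
            // Replacement phase: keep the new ID with probability k / seen.
            long j = (long) (random.nextDouble() * seen);
            if (j < k) {
                reservoir.set((int) j, docId);
            }
        }
    }

    /** Returns a shuffled copy, so the sample is also in random order. */
    List<Integer> sample() {
        List<Integer> copy = new ArrayList<>(reservoir);
        Collections.shuffle(copy, random);
        return copy;
    }
}
```

The final shuffle is needed because reservoir sampling alone yields a uniform 
random subset, not a uniformly random ordering of that subset.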

Best Regards,
Atul Bisaria

-----Original Message-----
From: Adrien Grand [mailto:jpou...@gmail.com]
Sent: Wednesday, January 31, 2018 6:33 PM
To: java-user@lucene.apache.org
Subject: Re: Increase search performance

Hi Atul,


On Tue, 30 Jan 2018 at 16:24, Atul Bisaria <atul.bisa...@ericsson.com> wrote:

> 1.     Using ConstantScoreQuery so that scoring overhead is removed since
> scoring is not required in my search use case. I also use a custom
> Sort object which does not sort by score (see code below).
>

If you don't sort by score, then wrapping with a ConstantScoreQuery won't help 
as Lucene will figure out scores are not needed anyway.


> 2.     Using query cache
>
>
>
> My understanding is that query cache would cache query results and
> hence lead to significant increase in performance. Is this understanding 
> correct?
>

It depends on what you mean by performance. If you are optimizing for 
worst-case latency, then the query cache might make things worse, because 
caching a query requires visiting all of its matches, while query execution 
can sometimes just skip over non-interesting matches (e.g. in conjunctions).
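
To illustrate the skipping point with a toy model (this is not Lucene code, 
and all names here are hypothetical): when a rare clause is intersected with 
a dense one, a leapfrog-style intersection only touches a few positions of 
the dense list, whereas caching the dense clause would require visiting every 
one of its matches. The sketch uses binary search over sorted int arrays as a 
stand-in for whatever skipping structure the index actually provides.

```java
import java.util.Arrays;

/** Hypothetical illustration (not Lucene code): intersecting two sorted doc-ID lists. */
final class ConjunctionSketch {
    /** Number of loop iterations performed by the last intersect() call. */
    static int steps;

    /** Index of the first element >= target in postings[from..], found by binary search. */
    static int advance(int[] postings, int from, int target) {
        int idx = Arrays.binarySearch(postings, from, postings.length, target);
        return idx >= 0 ? idx : -idx - 1;
    }

    /** Leapfrog intersection: each side jumps directly to the other side's current doc ID. */
    static int[] intersect(int[] a, int[] b) {
        steps = 0;
        int[] out = new int[Math.min(a.length, b.length)];
        int n = 0, i = 0, j = 0;
        while (i < a.length && j < b.length) {
            steps++;
            if (a[i] == b[j]) {
                out[n++] = a[i];
                i++;
                j++;
            } else if (a[i] < b[j]) {
                i = advance(a, i + 1, b[j]); // skip entries of a below b[j]
            } else {
                j = advance(b, j + 1, a[i]); // skip entries of b below a[i]
            }
        }
        return Arrays.copyOf(out, n);
    }
}
```

With a rare list {5, 1000, 2000} and a dense list containing every ID from 0 
to 9999, the loop above runs 6 times, while building a cache entry for the 
dense clause would have to walk all 10000 of its entries.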

However, if you are looking to improve throughput, then the query cache's 
default policy of caching queries that appear to be reused usually helps.


> I am using Lucene version 5.4.1 where query cache seems to be enabled
> by default (https://issues.apache.org/jira/browse/LUCENE-6784), but I
> am not able to see any significant change in search performance.
>




> Here is the code I am testing with:
>
>
>
> DirectoryReader reader = DirectoryReader.open(directory);      //using
> MMapDirectory
>
> IndexSearcher searcher = new IndexSearcher(reader); //IndexReader and
> IndexSearcher are created only once
>
> searcher.setQueryCachingPolicy(QueryCachingPolicy.ALWAYS_CACHE);
>

Don't do that: it will cache every filter, which usually makes things slower 
for the reason mentioned above. I would advise using an instance of 
UsageTrackingQueryCachingPolicy instead.
