Improving PayloadTermQuery Performance

Neil Hooey Mon, 23 May 2011 15:58:55 -0700

What are some ways to improve the performance of PayloadTermQuery queries?


I'm currently getting a max of 22 QPS after 90k unique queries from a
payload-enhanced keyword field on a dataset of 18 million documents,
where a simple term search on the equivalent multivalue string field
gives a max of 700 QPS.

Here are the performance numbers for queries 89,000 - 90,000:
 Int #    Reqs    Secs  Reqs/s     Avg  Median    80th    95th    99th     Max
    89    1000   45.52    22.0   0.045   0.013   0.067   0.198   0.360   1.144

In terms of implementation, I wrote a bunch of custom classes that end
up overriding QueryParserBase.newTermQuery() to return a
PayloadTermQuery instead of a TermQuery. This implementation seems to
work fine, but it's very slow.

I'm using HTTPD::Bench::ApacheBench with anywhere between 1 and 40
concurrent requests, and it pegs one of four CPUs at 100% the whole
time, leaving the others idle.
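
A quick sanity check on the numbers above, as a rough sketch only: it
assumes the Avg column is the mean per-request latency in seconds, and
the class and method names are made up for illustration. Multiplying
throughput by average latency (Little's law) estimates how many
requests were in flight on average during that interval:

```java
import java.util.Locale;

public class ThroughputCheck {
    // Requests per second for one benchmark interval.
    static double throughput(int requests, double wallSeconds) {
        return requests / wallSeconds;
    }

    // Little's law: average requests in flight = throughput * average latency.
    static double effectiveConcurrency(double reqsPerSec, double avgLatencySeconds) {
        return reqsPerSec * avgLatencySeconds;
    }

    public static void main(String[] args) {
        // Values taken from interval 89 in the table above.
        double qps = throughput(1000, 45.52);
        // Assumption: 0.045 is the mean latency in seconds.
        double inFlight = effectiveConcurrency(qps, 0.045);
        System.out.println(String.format(Locale.ROOT,
                "%.1f req/s, ~%.2f requests in flight", qps, inFlight));
    }
}
```

If that assumption holds, the result is close to one request in flight
for that interval, which would be consistent with a single busy CPU.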

Specifically, are there ways to:
1. Use more than one CPU for PayloadTermQuery processing?
2. Take advantage of caching when calculating payloads?
   (I've heard that multivalue string fields benefit from caching,
   whereas payloads do not.)
3. Increase the query throughput for payloads in any other way?

Thanks,

- Neil
