What are some ways to increase the performance of PayloadTermQuery?
I'm currently getting a max of 22 QPS after 90k unique queries from a
payload-enhanced keyword field on a dataset of 18 million documents,
where a simple term search on the equivalent multivalue string field
gives a max of 700 QPS.
Here are the performance numbers for queries 89,000 - 90,000:
Int  # Reqs   Secs  Reqs/s    Avg  Median   80th   95th   99th    Max
 89    1000  45.52    22.0  0.045   0.013  0.067  0.198  0.360  1.144
In terms of implementation, I wrote a bunch of custom classes that end
up overriding QueryParserBase.newTermQuery() to return a
PayloadTermQuery instead of a TermQuery. This implementation seems to
work fine, but it's very slow.
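The override pattern described above can be sketched roughly as follows. This is only an illustration of its shape, not the actual custom classes: Term, Query, TermQuery, PayloadTermQuery and QueryParserBase here are simplified stand-ins so the sketch compiles without Lucene on the classpath, and PayloadQueryParser, parseSingleTerm() and the "keywords" field are hypothetical names. The real PayloadTermQuery also takes a PayloadFunction argument, which is omitted here.

```java
// Simplified stand-ins for the Lucene types (NOT the real classes).
class Term {
    final String field;
    final String text;
    Term(String field, String text) { this.field = field; this.text = text; }
}

class Query { }

class TermQuery extends Query {
    final Term term;
    TermQuery(Term term) { this.term = term; }
}

// Stand-in only: the real class also needs a PayloadFunction.
class PayloadTermQuery extends Query {
    final Term term;
    PayloadTermQuery(Term term) { this.term = term; }
}

class QueryParserBase {
    // Factory hook that subclasses may override.
    protected Query newTermQuery(Term term) {
        return new TermQuery(term);
    }

    // Hypothetical helper standing in for real parsing: it routes a
    // single term through the factory hook above.
    public Query parseSingleTerm(String field, String text) {
        return newTermQuery(new Term(field, text));
    }
}

// Hypothetical subclass: every term query becomes a PayloadTermQuery.
class PayloadQueryParser extends QueryParserBase {
    @Override
    protected Query newTermQuery(Term term) {
        return new PayloadTermQuery(term);
    }
}
```

With this shape, new PayloadQueryParser().parseSingleTerm("keywords", "lucene") yields a PayloadTermQuery, while the base parser still yields a plain TermQuery for the same input.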
I'm using HTTPD::Bench::ApacheBench with anywhere between 1 and 40
concurrent requests, and it pegs one of four CPUs at 100% the whole
time, leaving the others idle.
Specifically, are there ways to:
1. Use more than one CPU for PayloadTermQuery processing?
2. Take advantage of caching when calculating payloads?
(I've heard multivalue string fields take advantage of caching
where payloads do not)
3. Increase the query throughput for payloads in any other way?
Thanks,
- Neil