Re: How to improve the Solr "OR" query performance

2013-07-03 Thread Otis Gospodnetic
Hi,

Does that OR query need to be scored?
Does it repeat?
If the answers are no and yes, respectively, you should put the OR clause
in fq (a filter query), not q.
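
As a rough illustration of the q versus fq distinction, here is a minimal sketch that only builds the request parameters and sends nothing to Solr. The field name "text" and the helper names are hypothetical, not taken from this thread. Filter queries are not scored, and Solr can cache their results in the filterCache, which is why a repeated, unscored OR clause may fit better there.

```python
from urllib.parse import urlencode


def scored_or_params(field, terms, rows=10):
    """OR clause in q: every matching document gets a relevance score."""
    clause = " OR ".join(terms)
    return {"q": f"{field}:({clause})", "rows": rows}


def filtered_or_params(field, terms, rows=10):
    """Same OR clause in fq: matches are not scored and the filter is cacheable."""
    clause = " OR ".join(terms)
    return {"q": "*:*", "fq": f"{field}:({clause})", "rows": rows}


if __name__ == "__main__":
    # "text" is an assumed field name; substitute the field you actually search.
    terms = ["love", "hate", "beer"]
    print(urlencode(scored_or_params("text", terms)))
    print(urlencode(filtered_or_params("text", terms)))
```

With q=*:* every document that passes the filter gets the same score, so this only makes sense when result order does not depend on relevance.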

Otis
--
Solr & ElasticSearch Support -- http://sematext.com/
Performance Monitoring -- http://sematext.com/spm



On Wed, Jul 3, 2013 at 12:07 PM, Kevin Osborn  wrote:
> Also, what is the total document count for your result set? We have an
> application that is also very slow because it does a lot of OR queries. The
> problem is that the result set is very large because of the ORs. Profiling
> showed that Solr was spending the bulk of its time scoring the documents.
>
> Also, instead of OR, you may want to look at dismax or edismax. For search
> box type applications, OR is not really what you want. It just seems like
> what you want.
>
> -Kevin
>
>
> On Wed, Jul 3, 2013 at 5:10 AM, Toke Eskildsen wrote:
>
>> On Wed, 2013-07-03 at 05:48 +0200, huasanyelao wrote:
>> > The response time for an "OR" query is around 1-2 seconds (the "AND" query
>> > takes just about 30-40 ms).
>>
>> The number of hits will also be much lower for the AND-query. To check
>> whether it is the OR or the size of the result set that is the problem,
>> please try and construct an AND-based query that hits about as many
>> documents as your slow OR query.
>>
>> With an index size of just 9GB, I am surprised that you use sharding.
>> Have you tried using just a single instance to avoid the merge-overhead?
>>
>> - Toke Eskildsen, State and University Library, Denmark
>>
>>
>
>


Re: How to improve the Solr "OR" query performance

2013-07-03 Thread Kevin Osborn
Also, what is the total document count for your result set? We have an
application that is also very slow because it does a lot of OR queries. The
problem is that the result set is very large because of the ORs. Profiling
showed that Solr was spending the bulk of its time scoring the documents.

Also, instead of OR, you may want to look at dismax or edismax. For search
box type applications, OR is not really what you want. It just seems like
what you want.
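
For readers who have not used edismax, a minimal sketch of what such a request could look like follows. The field names "title" and "body", the helper name, and the mm value of "50%" are illustrative assumptions, not settings from this thread. The mm (minimum should match) parameter requires a share of the optional terms to match, which usually shrinks the result set compared with a pure OR and so leaves fewer documents to score.

```python
def edismax_params(user_input, fields, min_should_match="50%", rows=10):
    """Build request parameters for an edismax query over several fields.

    user_input is the raw search-box text, for example "love hate beer".
    """
    return {
        "defType": "edismax",       # select the extended dismax query parser
        "q": user_input,            # plain terms, no explicit OR operators
        "qf": " ".join(fields),     # fields to search, space separated
        "mm": min_should_match,     # minimum share of terms that must match
        "rows": rows,
    }


if __name__ == "__main__":
    # "title" and "body" are assumed field names.
    print(edismax_params("love hate beer", ["title", "body"]))
```

Whether 50% is a sensible threshold depends on the application; it is only a starting point for experiments.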

-Kevin


On Wed, Jul 3, 2013 at 5:10 AM, Toke Eskildsen wrote:

> On Wed, 2013-07-03 at 05:48 +0200, huasanyelao wrote:
> > The response time for an "OR" query is around 1-2 seconds (the "AND" query
> > takes just about 30-40 ms).
>
> The number of hits will also be much lower for the AND-query. To check
> whether it is the OR or the size of the result set that is the problem,
> please try and construct an AND-based query that hits about as many
> documents as your slow OR query.
>
> With an index size of just 9GB, I am surprised that you use sharding.
> Have you tried using just a single instance to avoid the merge-overhead?
>
> - Toke Eskildsen, State and University Library, Denmark
>
>


-- 
*KEVIN OSBORN*
LEAD SOFTWARE ENGINEER
CNET Content Solutions
OFFICE 949.399.8714
CELL 949.310.4677  SKYPE osbornk
5 Park Plaza, Suite 600, Irvine, CA 92614


Re: How to improve the Solr "OR" query performance

2013-07-03 Thread Toke Eskildsen
On Wed, 2013-07-03 at 05:48 +0200, huasanyelao wrote:
> The response time for an "OR" query is around 1-2 seconds (the "AND" query
> takes just about 30-40 ms).

The number of hits will also be much lower for the AND-query. To check
whether it is the OR or the size of the result set that is the problem,
please try and construct an AND-based query that hits about as many
documents as your slow OR query.
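
One way to run that comparison is sketched below, assuming the two queries were sent with wt=json and the raw response bodies were saved. The helper names, the 20% tolerance, and the sample numbers are made up for illustration. The sketch reads numFound and QTime from each response and checks whether the hit counts are close enough for the timings to be comparable.

```python
import json


def summarize(raw_json):
    """Extract hit count and server-side query time from a Solr JSON response."""
    body = json.loads(raw_json)
    return {
        "hits": body["response"]["numFound"],
        "qtime_ms": body["responseHeader"]["QTime"],
    }


def comparable(or_summary, and_summary, tolerance=0.2):
    """True when the AND hit count is within `tolerance` of the OR hit count."""
    if or_summary["hits"] == 0:
        return False
    gap = abs(or_summary["hits"] - and_summary["hits"]) / or_summary["hits"]
    return gap <= tolerance


if __name__ == "__main__":
    # Invented sample responses; real ones come from your own queries.
    or_raw = ('{"responseHeader": {"status": 0, "QTime": 1500}, '
              '"response": {"numFound": 1200000, "start": 0, "docs": []}}')
    and_raw = ('{"responseHeader": {"status": 0, "QTime": 900}, '
               '"response": {"numFound": 1100000, "start": 0, "docs": []}}')
    or_s, and_s = summarize(or_raw), summarize(and_raw)
    print(or_s, and_s, comparable(or_s, and_s))
```

If the hit counts are comparable and the AND query is still much faster, the OR itself is the likelier culprit; if both are slow, the size of the result set is.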

With an index size of just 9GB, I am surprised that you use sharding.
Have you tried using just a single instance to avoid the merge-overhead?

- Toke Eskildsen, State and University Library, Denmark



Re: How to improve the Solr "OR" query performance

2013-07-03 Thread Raymond Wiker
On Jul 3, 2013, at 05:48, huasanyelao wrote:
> I have an urgent task: improving the performance of "OR" queries in Solr.
> I have deployed 9 shards with SolrCloud on two servers (each server: 16
> cores, 32 GB RAM).
> Total document count: 60,000,000; total index size: 9 GB.
> The requirements oblige me to use "OR" queries to get results.
> The average number of query terms is about 15.
> The response time for an "OR" query is around 1-2 seconds (the "AND" query
> takes just about 30-40 ms).
> Our target: a 50% improvement, that is, at most 500 ms-1 s per query.
> The document count will grow to 80,000,000, but query time should stay
> within 500 ms-1 s.
> Any advice or approach is appreciated. Thanks in advance.

If you haven't already done so, you should take a look at which fields you need
to return from your queries, and specify that set of fields in the "fl"
parameter. I've seen some really bad response times in cases where large fields
were being returned (e.g., the entire text content of the source document).
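
A minimal sketch of restricting the returned fields, assuming the request parameters are kept in a dictionary; the helper name and the field name "id" are hypothetical. The "fl" parameter takes a comma-separated list of fields, and "score" can be included to return the relevance score.

```python
def with_field_list(params, fields):
    """Return a copy of params with fl set to a comma-separated field list."""
    limited = dict(params)          # copy, so the caller's dict is not changed
    limited["fl"] = ",".join(fields)
    return limited


if __name__ == "__main__":
    base = {"q": "text:(love OR hate)", "rows": 10}
    # Return only the document id and score instead of every stored field.
    print(with_field_list(base, ["id", "score"]))
```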

Re: How to improve the Solr "OR" query performance

2013-07-03 Thread Dotan Cohen
On Wed, Jul 3, 2013 at 6:48 AM, huasanyelao  wrote:
> I have an urgent task: improving the performance of "OR" queries in Solr.
> I have deployed 9 shards with SolrCloud on two servers (each server: 16
> cores, 32 GB RAM).
> Total document count: 60,000,000; total index size: 9 GB.
> The requirements oblige me to use "OR" queries to get results.
> The average number of query terms is about 15.
> The response time for an "OR" query is around 1-2 seconds (the "AND" query
> takes just about 30-40 ms).
> Our target: a 50% improvement, that is, at most 500 ms-1 s per query.
> The document count will grow to 80,000,000, but query time should stay
> within 500 ms-1 s.
> Any advice or approach is appreciated. Thanks in advance.
>

What size are your documents? I currently see numbers like these, with only
a few more documents than you have, but roughly 5 s per search on 15 ORs:
q=love%20OR%20hate%20OR%20beer%20OR%20sex%20OR%20peace%20OR%20war%20OR%20up%20OR%20down%20OR%20this%20OR%20that%20OR%20left%20OR%20right%20OR%20north%20OR%20south%20OR%20east%20OR%20west
status: 0
QTime: 5604
q: love OR hate OR beer OR sex OR peace OR war OR up OR down OR this OR that
   OR left OR right OR north OR south OR east OR west


My index currently has 77461952 documents, most under 1 KiB each but
upwards of ten fields.

--
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com


How to improve the Solr "OR" query performance

2013-07-03 Thread huasanyelao
I have an urgent task: improving the performance of "OR" queries in Solr.
I have deployed 9 shards with SolrCloud on two servers (each server: 16 cores,
32 GB RAM).
Total document count: 60,000,000; total index size: 9 GB.
The requirements oblige me to use "OR" queries to get results.
The average number of query terms is about 15.
The response time for an "OR" query is around 1-2 seconds (the "AND" query
takes just about 30-40 ms).
Our target: a 50% improvement, that is, at most 500 ms-1 s per query.
The document count will grow to 80,000,000, but query time should stay
within 500 ms-1 s.
Any advice or approach is appreciated. Thanks in advance.


Jack