Thank you Shawn and Erick for the quick response.

A follow up question.


Based on
https://cwiki.apache.org/confluence/display/solr/Common+Query+Parameters#CommonQueryParameters-Thefq%28FilterQuery%29Parameter
I see the "fl" (field list) parameter.  Does this mean I can build my Lucene
search syntax as follows:


    q=skyfall OR ian OR fleming&fl=title&fl=owner&fq=doc_type:DOC


And get the same result as (per Shawn's example, changed a bit to add OR):


    q=title:(skyfall OR ian OR fleming) owner:(skyfall OR ian OR fleming)&fq=doc_type:DOC


Btw, my default search operator is set to AND.  My need is to find whatever the
user types in both of those two fields (or maybe some other fields, which is
controlled by the UI).  For example, the user types "skyfall ian fleming",
selects 3 fields, and wants to narrow down to doc_type DOC.
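
To make the second form concrete, here is a rough sketch of how the UI layer
might assemble the request parameters.  It assumes Python on the client side;
build_params is a made-up helper name, and the field names are just the ones
from this thread.  It only builds the query string, does not talk to Solr, and
does not escape special characters in the user's input:

```python
from urllib.parse import urlencode


def build_params(user_text, fields, doc_type):
    """Build Solr request parameters (hypothetical helper, not a Solr API)."""
    terms = user_text.split()
    # OR the terms explicitly so the group does not depend on the default operator.
    group = "(" + " OR ".join(terms) + ")"
    # One field-qualified clause per field selected in the UI.  The clauses are
    # separated by a space, so how they combine still depends on the default operator.
    q = " ".join(f"{field}:{group}" for field in fields)
    return [
        ("q", q),
        # The doc_type restriction goes into fq, as Shawn suggested.
        ("fq", f"doc_type:{doc_type}"),
    ]


params = build_params("skyfall ian fleming", ["title", "owner"], "DOC")
query_string = urlencode(params)  # URL-encodes spaces, colons and parentheses
print(query_string)
```

Passing a third field name in the list would simply add a third clause to q.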


- MJ




-----Original Message-----
From: Erick Erickson <erickerick...@gmail.com>
To: solr-user <solr-user@lucene.apache.org>
Sent: Wed, Apr 30, 2014 5:33 pm
Subject: Re: Which Lucene search syntax is faster


I'd add that I think you're worrying about the wrong thing. 10M
documents is not very many by modern Solr standards. I rather suspect
that you won't notice much difference in performance due to how you
construct the query.

Shawn's suggestion to use fq clauses is spot on, though. fq clauses
are re-used (see filterCache in solrconfig.xml). My rule of thumb is
to use fq clauses for most everything that does NOT contribute to
scoring...
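
As a rough illustration of that rule of thumb (a sketch only, in Python;
split_clauses and the clause lists are invented for the example): clauses that
should influence relevance go into q, and each pure restriction gets its own
fq parameter so it can be cached and re-used on its own.

```python
from urllib.parse import urlencode


def split_clauses(scoring_clauses, filter_clauses):
    """Hypothetical helper: scoring clauses go to q, each filter to its own fq."""
    params = [("q", " ".join(scoring_clauses))]
    # One fq parameter per restriction; none of these affect the score.
    params += [("fq", clause) for clause in filter_clauses]
    return params


params = split_clauses(
    ["title:(skyfall ian fleming)", "owner:(skyfall ian fleming)"],
    ["doc_type:DOC"],
)
print(urlencode(params))
```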

Best,
Erick

On Wed, Apr 30, 2014 at 2:18 PM, Shawn Heisey <s...@elyograg.org> wrote:
> On 4/30/2014 2:29 PM, johnmu...@aol.com wrote:
>> My question is this: what Lucene search syntax will give me back results the
>> fastest?  If my user is interested in finding data within the “title” and
>> “owner” fields, only for “doc_type” “DOC”, should I build my Lucene search
>> syntax as:
>>
>> 1) skyfall ian fleming AND doc_type:DOC
>
> If your default field is text, I'm fairly sure this will become
> equivalent to the following which is probably NOT what you want.
> Parentheses can be very important.
>
> text:skyfall OR text:ian OR (text:fleming AND doc_type:DOC)
>
>> 2) title:(skyfall OR ian OR fleming) owner:(skyfall OR ian OR fleming) AND doc_type:DOC
>
> This kind of query syntax is probably what you should shoot for.  Not
> from a performance perspective -- just from the perspective of making
> your queries completely correct.  Note that the +/- syntax combined with
> parentheses is far more precise than using AND/OR/NOT.
>
>> 3) Something else I don't know about.
>
> The edismax query parser is very powerful.  That might be something
> you're interested in.
>
> https://cwiki.apache.org/confluence/display/solr/The+Extended+DisMax+Query+Parser
>
>
>> Of the 10 million documents I will be indexing, 80% will be of "doc_type"
>> PDF, and about 10% of type DOC, so please keep that in mind as a factor (if
>> that will mean anything in terms of which syntax I should use).
>
> For the most part, whatever general query format you choose to use will
> not matter very much.  There are exceptions, but mostly Solr (Lucene) is
> smart enough to convert your query to an efficient final parsed format.
> Turn on the debugQuery parameter to see what it does with each query.
>
> Regardless of whether you use the standard lucene query parser or
> edismax, incorporate filter queries into your query constructing logic.
> Your second example above would be better to express like this, with the
> default operator set to OR.  This uses both q and fq parameters:
>
> q=title:(skyfall ian fleming) owner:(skyfall ian fleming)&fq=doc_type:DOC
>
> https://cwiki.apache.org/confluence/display/solr/Common+Query+Parameters#CommonQueryParameters-Thefq%28FilterQuery%29Parameter
>
> Thanks,
> Shawn
>

 
