Hi Jörg,

This query 
{
   "query" : {
      "bool": {
          "must": [
               { "match" : { "body" : "big" } },
               { "match" : { "id" : 521 } }
           ],
          "must_not": {
               "match" : { "body" : "data" }
           }
      }
   }
}

and this query perform exactly the same:
{
   "query" : {
      "bool": {
          "must": {
               "match" : { "body" : "big" }
           },
          "must_not": {
               "match" : { "body" : "data" }       
           }
     }
   },
   "filter" : {
       "term" : { "id" : "521" }
   }
}

I am not able to understand what makes a filtered query fast. Is there any
place where I can find documentation on the internals of how different
queries are processed by Elasticsearch?
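
For comparison, here is a minimal sketch of the same search written as a
`filtered` query, assuming the Elasticsearch 1.x query DSL that was current
when this thread was written (later releases deprecated `filtered` in favour
of a `filter` clause inside `bool`). One possible explanation for the
identical timings, worth verifying against the documentation, is that the
top-level `filter` in the second request acts as a post filter (1.x also
calls it `post_filter`) that is applied to the hits of a query that has
already run, whereas the filter inside a `filtered` query restricts which
documents the query has to score. The helper name `filtered_query` is made
up for illustration; the Python only builds the JSON body and does not talk
to a cluster.

```python
import json


def filtered_query(query, filter_clause):
    """Wrap a query and a filter in a 1.x-style `filtered` query body.

    Hypothetical helper for illustration only; it just builds a dict.
    """
    return {"query": {"filtered": {"query": query, "filter": filter_clause}}}


# The scoring part: the same bool query used in both requests above.
bool_query = {
    "bool": {
        "must": {"match": {"body": "big"}},
        "must_not": {"match": {"body": "data"}},
    }
}

# The non-scoring part: the exact-match restriction on id goes into the filter.
body = filtered_query(bool_query, {"term": {"id": 521}})

# Print the request body that would be sent to the _search endpoint.
print(json.dumps(body, indent=2))
```

The printed JSON has no top-level `filter` key; the `term` filter sits next
to the `bool` query inside `filtered`.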

On Saturday, 23 August 2014 18:20:23 UTC+5:30, Jörg Prante wrote:
>
> Before firing queries, you should consider whether the index design and query
> choice are optimal.
>
> Numeric range queries are not straightforward. They were a major issue on 
> inverted index engines like Lucene/Elasticsearch and it has taken some time 
> to introduce efficient implementations. See e.g. 
> https://issues.apache.org/jira/browse/LUCENE-1673
>
> ES tries to compensate for the downsides of massive numeric range queries by
> loading all the field values into memory. To achieve effective queries, you
> have to carefully discretize the values you index.
>
> For example, a few hundred million distinct timestamps with
> millisecond resolution are a real burden for searching on inverted
> indices. A good discretization strategy for indexing is to reduce the total
> number of distinct values in such a field to a few hundred or a few thousand.
> For timestamps, this means that indexing time-series data in discrete
> intervals of days, hours, minutes, or maybe seconds is much more efficient
> than indexing at, e.g., millisecond resolution.
>
> Another topic is to use filters for boolean queries. They are much faster.
>
> Jörg
>
>
>
> On Sat, Aug 23, 2014 at 2:19 PM, Narendra Yadala <narendr...@gmail.com> wrote:
>
>> Hi Ivan,
>>
>> Thanks for the input about aggregating on strings. I do that, and those
>> queries take time, but they do not crash the node.
>>
>> The queries that caused problems were pretty straightforward
>> (such as a boolean query with two musts, one an exact match and the other
>> a range match on a long), but the real problem was the size. When I set
>> size to Integer.MAX_VALUE, it caused all the problems. When I removed it,
>> the queries started working fine. I think this behavior is worth mentioning
>> somewhere (it is probably expected, but strange).
>>
>> I did double the RAM, though, and have now allocated 5*10 GB of RAM to
>> the cluster. Things look OK as of now, except that the aggregations
>> (on strings) are quite slow. Maybe I will run these aggregations as a batch
>> job, cache the outputs in a different type, and move on for now.
>>
>> Thanks
>> NY
>>
>>
>> On Fri, Aug 22, 2014 at 10:34 PM, Ivan Brusic <iv...@brusic.com> wrote:
>>
>>> How expensive are your queries? Are you using aggregations or sorting on 
>>> string fields that could use up your field data cache? Are you using the 
>>> defaults for the cache? Post the current usage.
>>>
>>> If you post an example query and mapping, perhaps the community can help 
>>> optimize it.
>>>
>>> Cheers,
>>>
>>> Ivan
>>>
>>>
>>> On Fri, Aug 22, 2014 at 12:28 AM, Narendra Yadala <narendr...@gmail.com> wrote:
>>>
>>>> I have a cluster of size 240 GB, including replicas, with 5 nodes
>>>> in it. I allocated 5 GB of RAM to each node (5*5 GB in total) and started
>>>> the cluster. When I start continuously firing queries at the cluster, GC
>>>> starts kicking in and eventually a node goes down because of an
>>>> OutOfMemory exception. I add up to 200k documents every day. The indexing
>>>> part works fine, but the querying part is causing trouble. I have the
>>>> cluster on EC2 and I use EC2 discovery mode.
>>>>
>>>> What is the ideal RAM size, and are there any other parameters I need to
>>>> tune to get this cluster going?
>>>>
>>>> -- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "elasticsearch" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to elasticsearc...@googlegroups.com.
>>>>
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/elasticsearch/5b659d11-d757-4f8e-b347-60b3807c2dfe%40googlegroups.com.
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>
>>>
>>
>>
>
>
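
The discretization advice quoted above (index timestamps at second, minute,
hour, or day resolution rather than in milliseconds) could be sketched
roughly as follows. This is only an illustration of rounding values before
they are indexed, assuming epoch-millisecond timestamps; the names
`RESOLUTION_MS` and `discretize` are made up here and are not part of any
Elasticsearch API.

```python
# Bucket sizes in milliseconds. Illustrative names, not an Elasticsearch API.
RESOLUTION_MS = {
    "second": 1_000,
    "minute": 60_000,
    "hour": 3_600_000,
    "day": 86_400_000,
}


def discretize(timestamp_ms, resolution="minute"):
    """Round an epoch-millisecond timestamp down to the start of its bucket."""
    step = RESOLUTION_MS[resolution]
    return timestamp_ms - (timestamp_ms % step)


# Three events that fall inside the same minute (assumed sample values).
events = [1408798823123, 1408798823999, 1408798859001]

# After rounding, the three raw values collapse into one distinct value.
buckets = {discretize(ts, "minute") for ts in events}
print(len(events), "raw values ->", len(buckets), "distinct indexed value(s)")
```

The coarser the resolution, the fewer distinct terms a range query has to
visit, at the cost of losing precision below the chosen bucket size.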
