But I can run an aggregation on the 'banner' field on both clusters. Is that because
the values of 'banner' are far less unique than those of the 'ip' field?
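
For reference, a minimal sketch of the doc values option suggested in the reply quoted below. It assumes Elasticsearch 1.x mapping syntax and that 'ip' stays a not_analyzed string field; it is untested against this cluster. It would be the request body used when creating a new index (the index name is left open here) and it covers only the port-80 type, so every other port type would need the same 'ip' mapping. The existing data would then have to be reindexed into the new index.

```json
{
  "mappings": {
    "port-80": {
      "properties": {
        "ip": {
          "type": "string",
          "index": "not_analyzed",
          "fielddata": { "format": "doc_values" }
        }
      }
    }
  }
}
```

With this mapping the values of 'ip' should be read from disk-based doc values rather than loaded into heap as field data, which is the part that runs out of memory.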


2014-04-02 16:27 GMT+08:00 Adrien Grand <adrien.gr...@elasticsearch.com>:

> Given your description of the problem, I think the issue is that your
> Elasticsearch cluster doesn't have enough memory to load field data for the
> ip field (which needs to be done for all documents, not only those that
> match your query). So you either need to give more nodes to your cluster,
> more memory to your nodes, or use doc values for your ip field[1] (the
> latter option requires reindexing).
>
> [1]
> http://www.elasticsearch.org/blog/disk-based-field-data-a-k-a-doc-values/
>
>
> On Wed, Apr 2, 2014 at 10:09 AM, <vir.ca...@gmail.com> wrote:
>
>> The smaller index has 1 million lines of data. They are the lines
>> matched by "prefix":{"ip":"100.1"} in the bigger one.
>>
>> On Wednesday, April 2, 2014 at 4:04:27 PM UTC+8, vir....@gmail.com wrote:
>>
>>> I am running an aggregation search on my index (6 nodes). There are about 200
>>> million lines of port-scanning data. Each line looks like this:
>>> {"ip":"85.18.68.5", "banner":"cisco-IOS", "country":"IT", "_type":"port-80"}
>>> As you can imagine, the data is sorted into different types by the port
>>> being scanned. Now I want to know which hosts have many ports open at the
>>> same time, so I chose to aggregate on the ip field, and I got an OOM error.
>>> That may be reasonable: most hosts have only one port open, so I guess there
>>> are too many buckets.
>>>
>>>
>>> Then I used a filter aggregation:
>>>
>>> {
>>>     "aggs":{
>>>         "just_name1":{
>>>             "filter":{
>>>                 "prefix":{
>>>                     "ip":"100.1"
>>>                 }
>>>             },
>>>             "aggs":{
>>>                 "just_name2":{
>>>                     "terms":{
>>>                         "field":"ip",
>>>                         "execution_hint":"map"
>>>                     }
>>>                 }
>>>             }
>>>         }
>>>     }
>>> }
>>> (yes, my ip field is set as string)
>>>
>>> I thought this would make ES narrow down the set of documents to aggregate.
>>> But I still get an OOM error, while the same query works on a smaller index
>>> (another cluster, one node). Why does this happen? After filtering, the two
>>> clusters should have sets of equal size. Why does the bigger one fail?
>>>
>>>  --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to elasticsearch+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/d384bea8-4a60-4521-aa0e-34bb2fd61ec5%40googlegroups.com.
>>
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
>
> --
> Adrien Grand
>
