Need help, multiple aggregations with filters extremely slow, where to look for optimizations?

Thomas Fri, 13 Jun 2014 00:10:19 -0700

Hi,

I'm facing a performance issue with some aggregations I run, and I would 
appreciate your help if possible.


I have two document types, *request* and *event*. The request is the 
parent of the event. Below is a (sample) mapping:

"event" : {
    "dynamic" : "strict",
    "_parent" : {
        "type" : "request"
    },
    "properties" : {
        "event_time" : {
            "format" : "dateOptionalTime",
            "type" : "date"
        },
        "count" : {
            "type" : "integer"
        },
        "event" : {
            "index" : "not_analyzed",
            "type" : "string"
        }
    }
}

"request" : {
    "dynamic" : "strict",
    "_id" : {
        "path" : "uniqueId"
    },
    "properties" : {
        "uniqueId" : {
            "index" : "not_analyzed",
            "type" : "string"
        },
        "user" : {
            "index" : "not_analyzed",
            "type" : "string"
        },
        "code" : {
            "type" : "integer"
        },
        "country" : {
            "index" : "not_analyzed",
            "type" : "string"
        },
        "city" : {
            "index" : "not_analyzed",
            "type" : "string"
        }
        ....
    }
}

My cluster is becoming really big (almost 2 TB of data with billions of 
documents). I maintain one index per day and occasionally delete old 
indices. Each daily index is about 20 GB. I use Elasticsearch version 
1.1.1.

My problems start when I want to aggregate events using criteria that 
apply to the parent request document. For example: count the events of 
type *click* for country = US and code = 12. Initially I generated a 
script filter (in Groovy) for the request document and added multiple 
aggregations to one search request. This ended up being very slow, so I 
removed the scripting and implemented the logic in Java code instead.
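
For illustration, a request of this shape could look roughly like the 
sketch below. It is not the actual query from my application: the helper 
name build_event_count_request and the aggregation names (events_click, 
...) are made up for the example, and the query DSL (a has_parent filter 
inside a filtered query, plus one filter aggregation per event type) is 
written as I understand it for Elasticsearch 1.x, so please treat it as an 
assumption rather than a verified request.

```python
import json


def build_event_count_request(event_types, country, code):
    """Build a search body that counts child events per event type,
    restricted to events whose parent request matches country and code.

    Hypothetical helper for illustration only; field names come from the
    sample mapping, everything else is an assumption.
    """
    # Restrict the child "event" documents by fields of the parent "request".
    parent_filter = {
        "has_parent": {
            "parent_type": "request",
            "query": {
                "bool": {
                    "must": [
                        {"term": {"country": country}},
                        {"term": {"code": code}},
                    ]
                }
            },
        }
    }

    # One filter aggregation per event type; each bucket's doc_count is the
    # number of matching events of that type.
    aggs = {
        "events_" + event_type: {"filter": {"term": {"event": event_type}}}
        for event_type in event_types
    }

    return {
        "size": 0,  # only the aggregation results are needed, not the hits
        "query": {"filtered": {"filter": parent_filter}},
        "aggs": aggs,
    }


if __name__ == "__main__":
    body = build_event_count_request(["click"], "US", 12)
    print(json.dumps(body, indent=2))
```

With this layout the parent criteria are evaluated once for the whole 
request, and each additional event type only adds one more filter 
aggregation on the child documents.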

This seemed to solve the problem on my local machine, but when I went back 
to the cluster nothing had changed: my app still performs very poorly. 
A search with ~10 sub-aggregations takes more than 10 seconds.

What seems strange is that the cluster looks fine in terms of load 
average, CPU, etc.

Any hints on where to look in order to identify the bottleneck and solve 
this?

*Please ask for any additional information you need*; I didn't want to 
make this post too long to read.
Thank you
