Over a sample dataset of ~2.5M documents, where each record holds a geopoint 
and some other data, I wanted Elasticsearch 1.4.1 to provide the following 
data:

For all results in a given geo_bounding_box:
  Group results by: 
    (geohash of length 8, a term, day)
  For each group provide:
    2 sums of fields and 2 distinct-value counts (cardinality) over the
    documents in the group

The nested aggregation looked like:

geohash_grid
  terms
    date_histogram
      sum
      sum
      cardinality
      cardinality
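
A minimal sketch of a request body with this shape, written as a Python dict so it can be serialized to JSON. The field names (location, category, timestamp, value_a, value_b, user_id, device_id) and the aggregation names are hypothetical and do not come from the real mapping; the query uses the "filtered" form from the 1.x query DSL, and all bucket "size" parameters are left at their defaults.

```python
import json


def build_request(top_left, bottom_right, precision=8):
    """Build a search body: bounding-box filter plus the nested aggregations.

    All field names below are placeholders for illustration only.
    """
    return {
        "query": {
            "filtered": {
                "filter": {
                    "geo_bounding_box": {
                        "location": {
                            "top_left": top_left,
                            "bottom_right": bottom_right,
                        }
                    }
                }
            }
        },
        "aggs": {
            # Level 1: one bucket per geohash cell of the given precision.
            "cells": {
                "geohash_grid": {"field": "location", "precision": precision},
                "aggs": {
                    # Level 2: one bucket per term value inside each cell.
                    "by_term": {
                        "terms": {"field": "category"},
                        "aggs": {
                            # Level 3: one bucket per day inside each term.
                            "by_day": {
                                "date_histogram": {
                                    "field": "timestamp",
                                    "interval": "day",
                                },
                                # Leaf metrics: two sums, two distinct counts.
                                "aggs": {
                                    "sum_a": {"sum": {"field": "value_a"}},
                                    "sum_b": {"sum": {"field": "value_b"}},
                                    "distinct_a": {
                                        "cardinality": {"field": "user_id"}
                                    },
                                    "distinct_b": {
                                        "cardinality": {"field": "device_id"}
                                    },
                                },
                            }
                        },
                    }
                },
            }
        },
    }


body = build_request({"lat": 40.8, "lon": -74.1}, {"lat": 40.6, "lon": -73.9})
print(json.dumps(body, indent=2))
```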

I had two issues:

   1. I seem to have received only part of the response. The response's 
   "hits.total" was 174054, yet when I summed the doc_count values of the 
   geohash_grid (first aggregation) buckets, I got only ~13K. I tried 
   passing a large "size", but this had no effect. Is there a way to get 
   the full response?
   2. The next logical step was to try pagination, but when I added 
   &scroll=60s to the URL, I received an ElasticsearchIllegalStateException 
   and a 503 status. From the logs, the stack trace was:
   
[DEBUG][action.search.type       ] [zoidberg] [listener][4]: Failed to execute [org.elasticsearch.action.search.SearchRequest@77526f86] while moving to second phase
org.elasticsearch.ElasticsearchIllegalStateException
        at org.elasticsearch.action.search.type.TransportSearchHelper.buildScrollId(TransportSearchHelper.java:65)
        at org.elasticsearch.action.search.type.TransportSearchCountAction$AsyncAction.moveToSecondPhase(TransportSearchCountAction.java:80)
        at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.innerMoveToSecondPhase(TransportSearchTypeAction.java:397)
        at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.onFirstPhaseResult(TransportSearchTypeAction.java:198)
        at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$1.onResult(TransportSearchTypeAction.java:174)
        at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$1.onResult(TransportSearchTypeAction.java:171)
        at org.elasticsearch.search.action.SearchServiceTransportAction$6.handleResponse(SearchServiceTransportAction.java:244)
        at org.elasticsearch.search.action.SearchServiceTransportAction$6.handleResponse(SearchServiceTransportAction.java:235)
        at org.elasticsearch.transport.netty.MessageChannelHandler.handleResponse(MessageChannelHandler.java:158)
        at org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:127)
        at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
        at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
        at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
        at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
        at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:310)
        at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
        at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
        at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
        at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
        at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
        at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
        at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
        at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
        at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

This occurs regardless of the geohash precision. 
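
For reference, the doc_count comparison from issue 1 can be reproduced with a few lines of Python. This is only a sketch: the aggregation name "cells" is a hypothetical placeholder, the response is assumed to be the parsed JSON of a 1.x search response (where "hits.total" is a plain integer), and the sample numbers below are made up for illustration, not taken from the real cluster.

```python
def grid_doc_count(response, agg_name="cells"):
    """Sum doc_count over the top-level geohash_grid buckets that came back."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return sum(bucket["doc_count"] for bucket in buckets)


def missing_docs(response, agg_name="cells"):
    """Number of matching documents not covered by any returned bucket."""
    return response["hits"]["total"] - grid_doc_count(response, agg_name)


# Toy response with invented numbers, shaped like a parsed search response.
sample = {
    "hits": {"total": 10},
    "aggregations": {
        "cells": {
            "buckets": [
                {"key": "dr5regw3", "doc_count": 4},
                {"key": "dr5regw2", "doc_count": 3},
            ]
        }
    },
}

print(grid_doc_count(sample), missing_docs(sample))
```

A non-zero result from missing_docs means some matching documents fall in buckets that were not returned.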

Questions:

   1. To get the data I need, is the aggregation I have built the 
   correct/optimal way?
   2. Why can't I see all results in a non-paginated aggregation with a 
   large response? Is there a hard limit?
   3. What is the cause of the exception? 

Thanks
Eran
