On a sample dataset of ~2.5M documents, where each record holds a geopoint
and some other data, I wanted ElasticSearch 1.4.1 to return the following:
For all results in a given geo_bounding_box:
  Group results by:
    (geohash of length 8, a term, day)
  For each group provide:
    two sums and two distinct counts over fields of the documents in the group
The nested aggregation looked like:
geohash_grid
  terms
    date_histogram
      sum
      sum
      cardinality
      cardinality
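For readers who want something concrete, here is a minimal sketch of a request body with that nesting, built as a Python dict. The field names (location, category, timestamp, value_a, value_b, user_id, device_id), the aggregation names and the build_request helper are all hypothetical and only illustrate the shape; they are not the names used in the actual index.

```python
import json


def build_request(top, left, bottom, right, bucket_size=10000):
    """Return a search body: a bounding-box filter plus the nested aggregations.

    All field and aggregation names below are hypothetical placeholders.
    """
    metrics = {
        "sum_a": {"sum": {"field": "value_a"}},
        "sum_b": {"sum": {"field": "value_b"}},
        "distinct_users": {"cardinality": {"field": "user_id"}},
        "distinct_devices": {"cardinality": {"field": "device_id"}},
    }
    by_day = {
        "date_histogram": {"field": "timestamp", "interval": "day"},
        "aggs": metrics,
    }
    by_term = {"terms": {"field": "category"}, "aggs": {"by_day": by_day}}
    by_cell = {
        # precision 8 corresponds to a geohash of length 8
        "geohash_grid": {"field": "location", "precision": 8, "size": bucket_size},
        "aggs": {"by_term": by_term},
    }
    return {
        "query": {
            "filtered": {
                "filter": {
                    "geo_bounding_box": {
                        "location": {
                            "top_left": {"lat": top, "lon": left},
                            "bottom_right": {"lat": bottom, "lon": right},
                        }
                    }
                }
            }
        },
        "aggs": {"by_cell": by_cell},
    }


if __name__ == "__main__":
    # Example coordinates are arbitrary.
    print(json.dumps(build_request(40.8, -74.1, 40.6, -73.9), indent=2))
```

The metrics are siblings under the date_histogram, so each (geohash, term, day) bucket carries all four values.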
I had two issues:
1. I seem to have received only part of the response. The response
"hits.total" was 174054, yet when I summed the doc_count values of the
geohash_grid (first aggregation) buckets, I got only ~13K. I tried passing a
large "size", but this had no effect. Is there a way to get all of the response?
2. The next logical step was to try pagination, but when I added
&scroll=60s to the URL, I received an ElasticsearchIllegalStateException
and a 503 status. From the logs, the stack trace was:
[DEBUG][action.search.type ] [zoidberg] [listener][4]: Failed to execute [org.elasticsearch.action.search.SearchRequest@77526f86] while moving to second phase
org.elasticsearch.ElasticsearchIllegalStateException
    at org.elasticsearch.action.search.type.TransportSearchHelper.buildScrollId(TransportSearchHelper.java:65)
    at org.elasticsearch.action.search.type.TransportSearchCountAction$AsyncAction.moveToSecondPhase(TransportSearchCountAction.java:80)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.innerMoveToSecondPhase(TransportSearchTypeAction.java:397)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.onFirstPhaseResult(TransportSearchTypeAction.java:198)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$1.onResult(TransportSearchTypeAction.java:174)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$1.onResult(TransportSearchTypeAction.java:171)
    at org.elasticsearch.search.action.SearchServiceTransportAction$6.handleResponse(SearchServiceTransportAction.java:244)
    at org.elasticsearch.search.action.SearchServiceTransportAction$6.handleResponse(SearchServiceTransportAction.java:235)
    at org.elasticsearch.transport.netty.MessageChannelHandler.handleResponse(MessageChannelHandler.java:158)
    at org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:127)
    at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:310)
    at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
    at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
    at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
This occurs regardless of the geohash precision.
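The check from issue 1 (comparing "hits.total" against the sum of the first-level doc_count values) can be sketched roughly as below. The aggregation name "by_cell", the helper name and the toy response are made up for illustration; only the general response layout (hits.total, aggregations.<name>.buckets[].doc_count) is assumed.

```python
def bucket_doc_total(response, agg_name):
    """Sum doc_count over the top-level buckets of one aggregation."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return sum(bucket["doc_count"] for bucket in buckets)


def missing_docs(response, agg_name):
    """Return how many hits are not accounted for by the returned buckets."""
    return response["hits"]["total"] - bucket_doc_total(response, agg_name)


if __name__ == "__main__":
    # Toy response with invented numbers, not real output.
    toy = {
        "hits": {"total": 10},
        "aggregations": {
            "by_cell": {
                "buckets": [
                    {"key": "dr5regw3", "doc_count": 4},
                    {"key": "dr5regw2", "doc_count": 3},
                ]
            }
        },
    }
    print(missing_docs(toy, "by_cell"))
```

A non-zero result means some matching documents fall in buckets that were not returned.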
Questions:
1. To get the data I need, is the aggregation I have built the
correct/optimal way?
2. Why can't I see all results in a non-paginated aggregation with a
large response? Is there a hard limit?
3. What is the cause of the exception?
Thanks
Eran