We've narrowed the problem down to a multi_match clause in our query: {"multi_match":{"fields":["attachments.*.bodies"], "query":"foobar"}}
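To make the shape of that clause concrete: a wildcard in `fields` is resolved against the fields that exist in the mapping, so a single clause can end up covering one field per attachment slot. Below is a rough Python sketch of that fan-out. It uses `fnmatch` only as a stand-in for Elasticsearch's own pattern matching, and the list of mapped field names (including the count of 500) is made up for illustration.

```python
from fnmatch import fnmatchcase


def expand_fields(pattern, mapped_fields):
    """Return the mapped field names that a wildcard field pattern selects."""
    return [name for name in mapped_fields if fnmatchcase(name, pattern)]


# Hypothetical mapping: every attachment slot is its own field.
mapped_fields = ["subject", "body"] + [
    "attachments.%d.bodies" % i for i in range(1, 501)
]

selected = expand_fields("attachments.*.bodies", mapped_fields)

# How many distinct fields the single multi_match clause would have to cover.
print(len(selected))
```

The count grows with the largest number of attachments seen on any one email, not with the typical email.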
This has to do with the way we've structured our index. We are searching an index that contains emails, and we index attachments in the attachments.*.bodies fields. For example, attachments.1.bodies would contain the text body of an attachment. This structure is clearly sub-optimal for multi_match queries, but I need to structure our index so that we can search the contents of an email and the parsed contents of its attachments, and get back the email as a result.

From reading the docs, it seems like the better way to solve this is with nested types?
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-nested-type.html

--paul

On Wednesday, May 28, 2014 7:11:05 PM UTC-4, Paul Sanwald wrote:
>
> Sorry, it's Java 7:
>
> jvm: {
>   pid: 20424
>   version: 1.7.0_09-icedtea
>   vm_name: OpenJDK 64-Bit Server VM
>   vm_version: 23.7-b01
>   vm_vendor: Oracle Corporation
>   start_time: 1401309063644
>   mem: {
>     heap_init_in_bytes: 1073741824
>     heap_max_in_bytes: 10498867200
>     non_heap_init_in_bytes: 24313856
>     non_heap_max_in_bytes: 318767104
>     direct_max_in_bytes: 10498867200
>   }
>   gc_collectors: [
>     PS Scavenge
>     PS MarkSweep
>   ]
>   memory_pools: [
>     Code Cache
>     PS Eden Space
>     PS Survivor Space
>     PS Old Gen
>     PS Perm Gen
>   ]
>
> On Wednesday, May 28, 2014 6:58:26 PM UTC-4, Mark Walkom wrote:
>>
>> What java version are you running, it's not in the stats gist.
>>
>> Regards,
>> Mark Walkom
>>
>> Infrastructure Engineer
>> Campaign Monitor
>> email: ma...@campaignmonitor.com
>> web: www.campaignmonitor.com
>>
>>
>> On 29 May 2014 08:33, Paul Sanwald <pa...@redowlanalytics.com> wrote:
>>
>>> I apologize about the signature, it's automatic. I've created a gist
>>> with the cluster node stats:
>>> https://gist.github.com/pcsanwald/e11ba02ac591757c8d92
>>>
>>> We are using 1.1.0, using aggregations a lot but nothing crazy. We run
>>> our app on much much larger indices successfully.
>>> But the problem seems to present itself even in basic search cases. The
>>> one thing that's different about this dataset is that a lot of it is in
>>> Spanish.
>>>
>>> thanks for your help!
>>>
>>> On Wednesday, May 28, 2014 6:22:59 PM UTC-4, Mark Walkom wrote:
>>>>
>>>> Can you provide some specs on your cluster, OS, RAM, heap, disk, java
>>>> and ES versions?
>>>> Are you using parent/child relationships, TTLs, large facet or other
>>>> queries?
>>>>
>>>>
>>>> (Also, your elaborate legalese signature is kind of moot given you're
>>>> posting to a public mailing list :p)
>>>>
>>>> Regards,
>>>> Mark Walkom
>>>>
>>>> Infrastructure Engineer
>>>> Campaign Monitor
>>>> email: ma...@campaignmonitor.com
>>>> web: www.campaignmonitor.com
>>>>
>>>>
>>>> On 29 May 2014 07:27, Paul Sanwald <pa...@redowlanalytics.com> wrote:
>>>>
>>>>> Hi Everyone,
>>>>> We are seeing continual OOM exceptions on one of our 1.1.0
>>>>> elasticsearch clusters. The index is ~30GB, quite small. I'm trying to
>>>>> work out the root cause via heap dump analysis, but not having a lot of
>>>>> luck. I don't want to include a bunch of unnecessary info, but the
>>>>> stacktrace we're seeing is pasted below. Has anyone seen this before?
>>>>> I've been using the cluster stats and node stats APIs to try and find a
>>>>> smoking gun, but I'm not seeing anything that looks out of the ordinary.
>>>>>
>>>>> Any ideas?
>>>>>
>>>>> 14/05/27 20:37:08 WARN transport.netty: [Strongarm] Failed to send error message back to client for action [search/phase/query]
>>>>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>>>>> 14/05/27 20:37:08 WARN transport.netty: [Strongarm] Actual Exception
>>>>> org.elasticsearch.search.query.QueryPhaseExecutionException: [eventdata][2]: query[ConstantScore(*:*)],from[0],size[0]: Query Failed [Failed to execute main query]
>>>>>     at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:127)
>>>>>     at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:257)
>>>>>     at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:623)
>>>>>     at org.elasticsearch.search.action.SearchServiceTransportAction$SearchQueryTransportHandler.messageReceived(SearchServiceTransportAction.java:612)
>>>>>     at org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run(MessageChannelHandler.java:270)
>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>     at java.lang.Thread.run(Thread.java:722)
>>>>> Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
>>>>>
>>>>> *Important Notice:* The information contained in or attached to this
>>>>> email message is confidential and proprietary information of RedOwl
>>>>> Analytics, Inc., and by opening this email or any attachment the
>>>>> recipient agrees to keep such information strictly confidential and not
>>>>> to use or disclose the information other than as expressly authorized by
>>>>> RedOwl Analytics, Inc.
>>>>> If you are not the intended recipient, please be aware
>>>>> that any use, printing, copying, disclosure, dissemination, or the taking
>>>>> of any act in reliance on this communication or the information contained
>>>>> herein is strictly prohibited. If you think that you have received this
>>>>> email message in error, please delete it and notify the sender.
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "elasticsearch" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to elasticsearc...@googlegroups.com.
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/elasticsearch/e6634ad4-619f-4f24-8287-d3bc97722a88%40googlegroups.com
>>>>> For more options, visit https://groups.google.com/d/optout.
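Coming back to the nested-types question at the top of this message, here is a rough Python sketch of what that restructuring might look like. It is only a sketch under assumptions: the type name `email`, the top-level `body` field, and the helper names `to_nested` and `search_body` are hypothetical, and it assumes the attachment keys are integer strings like "1", "2". The `nested` mapping type and the `nested` query with `path` and `query` are the pieces described on the docs page linked above.

```python
import json


def to_nested(email):
    """Turn {"attachments": {"1": {...}, "2": {...}}} into a list of objects."""
    slots = email.get("attachments", {})
    # Assumes keys are integer strings; keeps attachments in numeric order.
    ordered = sorted(slots.items(), key=lambda kv: int(kv[0]))
    converted = dict(email)
    converted["attachments"] = [attachment for _, attachment in ordered]
    return converted


# Hypothetical mapping for a type called "email" (ES 1.x text fields are "string").
mapping = {
    "email": {
        "properties": {
            "body": {"type": "string"},
            "attachments": {
                "type": "nested",
                "properties": {"bodies": {"type": "string"}},
            },
        }
    }
}


def search_body(text):
    """Match the email body or any attachment body; hits are whole emails."""
    return {
        "query": {
            "bool": {
                "should": [
                    {"match": {"body": text}},
                    {
                        "nested": {
                            "path": "attachments",
                            "query": {"match": {"attachments.bodies": text}},
                        }
                    },
                ]
            }
        }
    }


old_doc = {
    "body": "see attached",
    "attachments": {"2": {"bodies": "second"}, "1": {"bodies": "first"}},
}
print(json.dumps(to_nested(old_doc)))
print(json.dumps(search_body("foobar")))
```

With this shape there is one `attachments.bodies` field in the mapping no matter how many attachments an email has, so the query no longer needs a wildcard over field names.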