Hi,

We also have this issue in 1.0.2. Per index we have over 1000 segments, and optimize doesn't reduce them. Would love to see a fix for this.

Thanks,
Thibaut
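For reference, per-shard segment counts like the ones quoted in this thread can be read straight from the segments API. A minimal check (a sketch; "index" is a placeholder name, as in the commands later in the thread):

  curl -XGET 'http://localhost:9200/index/_segments?pretty'

The response lists every segment for every shard, so counting occurrences of "num_docs" in the output gives a rough total segment count.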
On Fri, Apr 11, 2014 at 2:22 PM, Nikolas Everett <nik9...@gmail.com> wrote:

For reference, I'm also on 1.1.0 but I'm not seeing more segments than I expect. I see an average of ~28 per shard on an index I write to constantly. I don't write all that quickly, < 50 updates a second.

Nik

On Fri, Apr 11, 2014 at 5:46 AM, Adrien Grand <adrien.gr...@elasticsearch.com> wrote:

I managed to reproduce the issue locally, I'm looking into it.

On Fri, Apr 11, 2014 at 9:45 AM, Robin Wallin <walro...@gmail.com> wrote:

Hello,

We are experiencing a related problem with 1.1.0. Segments do not seem to merge as they should during indexing, and the optimize API does practically nothing to lower the segment count either. The problem persists through a cluster restart. The vast number of segments seems to greatly impact the performance of the cluster, in a very negative way.

We currently have 414 million documents across 3 nodes; each shard has on average 1200 segments(!).

With 1.0.1 we had even more documents, ~650 million, without any segment problems. Looking in Marvel, we were hovering at around 30-40 segments per shard back then.

Best Regards,
Robin

On Friday, April 11, 2014 1:35:42 AM UTC+2, Adrien Grand wrote:

Thanks for reporting this, the behavior is definitely unexpected. I'll test _optimize on very large numbers of shards to see if I can reproduce the issue.

On Thu, Apr 10, 2014 at 2:10 PM, Elliott Bradshaw <ebrad...@gmail.com> wrote:

Adrien,

Just an FYI, after resetting the cluster, things seem to have improved. Optimize calls now lead to CPU/IO activity over their duration. max_num_segments=1 does not seem to be honored on any given call, as each call only reduces the segment count by about 600-700, but I ran 10 calls in sequence overnight and actually got down to 4 segments (1 per shard)!

I'm glad I got the index optimized; searches are literally 10-20 times faster without 1500 segments per shard to deal with. It's awesome.

That said, any thoughts on why the index wasn't merging on its own, or why optimize was returning prematurely?
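For anyone reproducing the numbers above: the optimize calls under discussion take the form below (the index name is a placeholder). With max_num_segments=1 each shard should be merged down to a single segment, and in 1.x the call is expected to block until the merges finish:

  curl -XPOST 'http://localhost:9200/index/_optimize?max_num_segments=1'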
On Wednesday, April 9, 2014 11:10:56 AM UTC-4, Elliott Bradshaw wrote:

Hi Adrien,

I kept the logs up over the last optimize call, and I did see an exception. I Ctrl-C'd a curl optimize call before making another one, but I don't think that caused this exception. The error is essentially as follows:

netty - Caught exception while handling client http traffic, closing connection [id: 0x4d8f1a90, /127.0.0.1:33480 :> /127.0.0.1:9200]

java.nio.channels.ClosedChannelException
    at AbstractNioWorker.cleanUpWriteBuffer(AbstractNioWorker.java:433)
    at AbstractNioWorker.writeFromUserCode
    at NioServerSocketPipelineSink.handleAcceptedSocket
    at NioServerSocketPipelineSink.eventSunk
    at DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream
    at Channels.write
    at OneToOneEncoder.doEncode
    at OneToOneEncoder.handleDownstream
    at DefaultChannelPipeline.sendDownstream
    at DefaultChannelPipeline.sendDownstream
    at Channels.write
    at AbstractChannel.write
    at NettyHttpChannel.sendResponse
    at RestOptimizeAction$1.onResponse(95)
    at RestOptimizeAction$1.onResponse(85)
    at TransportBroadcastOperationAction$AsyncBroadcastAction.finishHim
    at TransportBroadcastOperationAction$AsyncBroadcastAction.onOperation
    at TransportBroadcastOperationAction$AsyncBroadcastAction$2.run

Sorry about the crappy stack trace. Still, it looks like this might point to a problem! The exception fired about an hour after I kicked off the optimize. Any thoughts?

On Wednesday, April 9, 2014 10:06:57 AM UTC-4, Elliott Bradshaw wrote:

Hi Adrien,

I did customize my merge policy, although I did so only because I was so surprised by the number of segments left over after the load. I'm pretty sure the optimize problem was happening before I made this change, but either way here are my settings:

  "index" : {
    "merge" : {
      "policy" : {
        "max_merged_segment" : "20gb",
        "segments_per_tier" : 5,
        "floor_segment" : "10mb"
      },
      "scheduler" : "concurrentmergescheduler"
    }
  }

Not sure whether this setup could be a contributing factor or not. Nothing really jumps out at me in the logs. In fact, when I kick off the optimize, I don't see any logging at all. Should I?

I'm running the following command: curl -XPOST http://localhost:9200/index/_optimize

Thanks!

On Wednesday, April 9, 2014 8:56:35 AM UTC-4, Adrien Grand wrote:

Hi Elliott,

1500 segments per shard is certainly way too much, and it is not normal that optimize doesn't manage to reduce the number of segments.
- Is there anything suspicious in the logs?
- Have you customized the merge policy or scheduler? [1]
- Does the issue still reproduce if you restart your cluster?

[1] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-merge.html

On Wed, Apr 9, 2014 at 2:38 PM, Elliott Bradshaw <ebrad...@gmail.com> wrote:

Any other thoughts on this? Would 1500 segments per shard be significantly impacting performance? Have you guys noticed this behavior elsewhere?

Thanks.
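A side note on the merge settings Elliott posted above: in 1.x the merge policy values are dynamic index settings, so an experiment like lowering segments_per_tier can be applied, or reverted, on a live index without a restart. A sketch, restoring the default value of 10:

  curl -XPUT 'http://localhost:9200/index/_settings' -d '{
    "index.merge.policy.segments_per_tier" : 10
  }'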
On Monday, April 7, 2014 8:56:38 AM UTC-4, Elliott Bradshaw wrote:

Adrien,

I ran the following command:

  curl -XPUT http://localhost:9200/_settings -d '{"indices.store.throttle.max_bytes_per_sec" : "10gb"}'

and received a { "acknowledged" : "true" } response. The logs showed "cluster state updated". I did have to close my index prior to changing the setting and reopen it afterward.

I've since begun another optimize, but again it doesn't look like much is happening. The optimize isn't returning, and the total CPU usage on every node is holding at about 2% of a single core. I would copy a hot_threads stack trace, but I'm unfortunately on a closed network and this isn't possible. I can tell you that refreshes of hot_threads show very little happening. The occasional [merge] thread (always in a LinkedTransferQueue.awaitMatch() state) or [optimize] thread (doing nothing on a waitForMerge() call) shows up, but it's always consuming 0-1% CPU. It sure feels like something isn't right. Any thoughts?

On Fri, Apr 4, 2014 at 3:24 PM, Adrien Grand <adrien...@elasticsearch.com> wrote:

Did you see a message in the logs confirming that the setting has been updated? It would be interesting to see the output of hot threads [1] to see what your node is doing.

[1] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-nodes-hot-threads.html

On Fri, Apr 4, 2014 at 7:18 PM, Elliott Bradshaw <ebrad...@gmail.com> wrote:

Yes. I have run max_num_segments=1 every time.

On Fri, Apr 4, 2014 at 12:26 PM, Michael Sick <michae...@serenesoftware.com> wrote:

Have you tried max_num_segments=1 on your optimize?

On Fri, Apr 4, 2014 at 11:27 AM, Elliott Bradshaw <ebrad...@gmail.com> wrote:

Any thoughts on this? I've run optimize several more times, and the number of segments falls each time, but I'm still over 1000 segments per shard. Has anyone else run into something similar?
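On the throttle change in Elliott's April 7 message: indices.store.throttle.max_bytes_per_sec is also exposed as a dynamic cluster-level setting, which would avoid having to close and reopen the index. A sketch of that form, reusing the 10gb value from above:

  curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
    "transient" : {
      "indices.store.throttle.max_bytes_per_sec" : "10gb"
    }
  }'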
On Thursday, April 3, 2014 11:21:29 AM UTC-4, Elliott Bradshaw wrote:

OK. Optimize finally returned, so I suppose something was happening in the background, but I'm still seeing over 6500 segments, even after setting max_num_segments=5. Does this seem right? Queries are a little faster (350-400ms) but still not great. Bigdesk is still showing a fair amount of file IO.

On Thursday, April 3, 2014 8:47:32 AM UTC-4, Elliott Bradshaw wrote:

Hi All,

I've recently upgraded to Elasticsearch 1.1.0. I've got a 4 node cluster, each with 64G of RAM, with 24G allocated to Elasticsearch on each. I've batch loaded approximately 86 million documents into a single index (4 shards) and have started benchmarking cross_field/multi_match queries on them. The index has one replica and takes up a total of 111G. I've run several batches of warming queries, but queries are not as fast as I had hoped, approximately 400-500ms each. Given that top (on CentOS) shows 5-8 GB of free memory on each server, I would assume that the entire index has been paged into memory (I had worried about disk performance previously, as we are working in a virtualized environment).

A stats query on the index in question shows that the index is composed of > 7000 segments. This seemed high to me, but maybe it's appropriate. Regardless, I dispatched an optimize command, but I am not seeing any progress and the command has not returned. Current merges remains at zero, and the segment count is not changing. Checking out hot threads in ElasticHQ, I initially saw an optimize call in the stack that was blocked on a waitForMerge call. This however has disappeared, and I'm seeing no evidence that the optimize is occurring.

Does any of this seem out of the norm or unusual? Has anyone else had similar issues? This is the second time I have tried to optimize an index since upgrading, and I've gotten the same result both times.

Thanks in advance for any help/tips!

- Elliott
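The hot threads output referenced throughout the thread comes from the nodes hot threads API that Adrien linked earlier; the basic invocation is:

  curl -XGET 'http://localhost:9200/_nodes/hot_threads'

It snapshots the busiest threads on each node, which is how the mostly idle [merge] and [optimize] threads described above were spotted.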