Thanks for reporting this; the behavior is definitely unexpected. I'll test
_optimize on very large numbers of shards to see if I can reproduce the
issue.


On Thu, Apr 10, 2014 at 2:10 PM, Elliott Bradshaw <ebradsh...@gmail.com> wrote:

> Adrien,
>
> Just an FYI, after resetting the cluster, things seem to have improved.
> Optimize calls now lead to CPU/IO activity over their duration.
> max_num_segments=1 does not seem to take full effect on any single call, as
> each call only reduced the segment count by about 600-700.  I ran 10
> calls in sequence overnight, and actually got down to 4 segments (1/shard)!
>
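> For reference, each of those calls looked roughly like this (same form as
> the earlier optimize command, just with the max_num_segments parameter):
>
> curl -XPOST 'http://localhost:9200/index/_optimize?max_num_segments=1'
>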
> I'm glad I got the index optimized; searches are literally 10-20 times
> faster without 1500 segments per shard to deal with.  It's awesome.
>
> That said, any thoughts on why the index wasn't merging on its own, or why
> optimize was returning prematurely?
>
>
> On Wednesday, April 9, 2014 11:10:56 AM UTC-4, Elliott Bradshaw wrote:
>>
>> Hi Adrien,
>>
>> I kept the logs up over the last optimize call, and I did see an
>> exception.  I Ctrl-C'd a curl optimize call before making another one, but
>> I don't think that caused this exception.  The error is essentially as
>> follows:
>>
>> netty - Caught exception while handling client http traffic, closing
>> connection [id: 0x4d8f1a90, /127.0.0.1:33480 :> /127.0.0.1:9200]
>>
>> java.nio.channels.ClosedChannelException at AbstractNioWorker.
>> cleanUpWriteBuffer(AbstractNioWorker.java:433)
>> at AbstractNioWorker.writeFromUserCode
>> at NioServerSocketPipelineSink.handleAcceptedSocket
>> at NioServerSocketPipelineSink.eventSunk
>> at DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream
>> at Channels.write
>> at OneToOneEncoder.doEncode
>> at OneToOneEncoder.handleDownstream
>> at DefaultChannelPipeline.sendDownstream
>> at DefaultChannelPipeline.sendDownstream
>> at Channels.write
>> at AbstractChannel.write
>> at NettyHttpChannel.sendResponse
>> at RestOptimizeAction$1.onResponse(95)
>> at RestOptimizeAction$1.onResponse(85)
>> at TransportBroadcastOperationAction$AsyncBroadcastAction.finishHim
>> at TransportBroadcastOperationAction$AsyncBroadcastAction.onOperation
>> at TransportBroadcastOperationAction$AsyncBroadcastAction$2.run
>>
>> Sorry about the crappy stack trace.  Still, looks like this might point
>> to a problem!  The exception fired about an hour after I kicked off the
>> optimize.  Any thoughts?
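>>
>> One rough way to avoid holding the HTTP connection open for the whole
>> optimize (just a sketch, not something I've verified at this scale) would
>> be to kick the call off in the background and poll the segment count
>> separately, e.g.:
>>
>> nohup curl -XPOST 'http://localhost:9200/index/_optimize?max_num_segments=1' > optimize.log 2>&1 &
>> curl 'http://localhost:9200/index/_segments?pretty'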
>>
>> On Wednesday, April 9, 2014 10:06:57 AM UTC-4, Elliott Bradshaw wrote:
>>>
>>> Hi Adrien,
>>>
>>> I did customize my merge policy, although I did so only because I was so
>>> surprised by the number of segments left over after the load.  I'm pretty
>>> sure the optimize problem was happening before I made this change, but
>>> either way here are my settings:
>>>
>>> "index" : {
>>> "merge" : {
>>> "policy" : {
>>> "max_merged_segment" : "20gb",
>>> "segments_per_tier" : 5,
>>> "floor_segment" : "10mb"
>>> },
>>> "scheduler" : "concurrentmergescheduler"
>>> }
>>> }
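>>>
>>> If it matters, the equivalent index update settings call would be
>>> something along these lines (assuming the settings were applied
>>> dynamically; I believe the tiered merge policy settings can be changed at
>>> runtime, but double-check for your version):
>>>
>>> curl -XPUT 'http://localhost:9200/index/_settings' -d '{
>>>   "index.merge.policy.max_merged_segment" : "20gb",
>>>   "index.merge.policy.segments_per_tier" : 5,
>>>   "index.merge.policy.floor_segment" : "10mb"
>>> }'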
>>>
>>> Not sure whether this setup could be a contributing factor or not.
>>> Nothing really jumps out at me in the logs.  In fact, when I kick off the
>>> optimize, I don't see any logging at all.  Should I?
>>>
>>> I'm running the following command: curl -XPOST
>>> http://localhost:9200/index/_optimize
>>>
>>> Thanks!
>>>
>>>
>>> On Wednesday, April 9, 2014 8:56:35 AM UTC-4, Adrien Grand wrote:
>>>>
>>>> Hi Elliott,
>>>>
>>>> 1500 segments per shard is certainly way too much, and it is not normal
>>>> that optimize doesn't manage to reduce the number of segments.
>>>>  - Is there anything suspicious in the logs?
>>>>  - Have you customized the merge policy or scheduler?[1]
>>>>  - Does the issue still reproduce if you restart your cluster?
>>>>
>>>> [1] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-merge.html
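>>>>
>>>> If you're not sure whether anything was overridden, dumping the index
>>>> settings should show any merge policy changes, e.g. something like:
>>>>
>>>> curl 'http://localhost:9200/index/_settings?pretty'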
>>>>
>>>>
>>>>
>>>> On Wed, Apr 9, 2014 at 2:38 PM, Elliott Bradshaw <ebrad...@gmail.com> wrote:
>>>>
>>>>> Any other thoughts on this?  Would 1500 segments per shard be
>>>>> significantly impacting performance?  Have you guys noticed this behavior
>>>>> elsewhere?
>>>>>
>>>>> Thanks.
>>>>>
>>>>>
>>>>> On Monday, April 7, 2014 8:56:38 AM UTC-4, Elliott Bradshaw wrote:
>>>>>>
>>>>>> Adrien,
>>>>>>
>>>>>> I ran the following command:
>>>>>>
>>>>>> curl -XPUT http://localhost:9200/_settings -d
>>>>>> '{"indices.store.throttle.max_bytes_per_sec" : "10gb"}'
>>>>>>
>>>>>> and received a { "acknowledged" : "true" } response.  The logs showed
>>>>>> "cluster state updated".
>>>>>>
>>>>>> I did have to close my index prior to changing the setting and reopen
>>>>>> afterward.
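>>>>>>
>>>>>> (Side note: I believe the same throttle value can also be applied as a
>>>>>> transient cluster setting, which shouldn't require closing the index,
>>>>>> along the lines of:
>>>>>>
>>>>>> curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
>>>>>>   "transient" : { "indices.store.throttle.max_bytes_per_sec" : "10gb" }
>>>>>> }'
>>>>>>
>>>>>> though I haven't confirmed that on this cluster.)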
>>>>>>
>>>>>>
>>>>>> I've since begun another optimize, but again it doesn't look like
>>>>>> much is happening.  The optimize isn't returning, and the total CPU usage
>>>>>> on every node is holding at about 2% of a single core.  I would copy a
>>>>>> hot_threads stack trace, but I'm unfortunately on a closed network and this
>>>>>> isn't possible.  I can tell you that refreshes of hot_threads show very
>>>>>> little happening.  The occasional [merge] thread (always in a
>>>>>> LinkedTransferQueue.awaitMatch() state) or [optimize] thread (doing nothing
>>>>>> on a waitForMerge() call) shows up, but it's always consuming 0-1%
>>>>>> CPU.  It sure feels like something isn't right.  Any thoughts?
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Apr 4, 2014 at 3:24 PM, Adrien Grand <
>>>>>> adrien...@elasticsearch.com> wrote:
>>>>>>
>>>>>>> Did you see a message in the logs confirming that the setting has
>>>>>>> been updated? It would be interesting to see the output of hot 
>>>>>>> threads[1]
>>>>>>> to see what your node is doing.
>>>>>>>
>>>>>>> [1] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-nodes-hot-threads.html
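>>>>>>>
>>>>>>> If it helps, something along these lines should dump hot threads for
>>>>>>> all nodes:
>>>>>>>
>>>>>>> curl 'http://localhost:9200/_nodes/hot_threads'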
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Apr 4, 2014 at 7:18 PM, Elliott Bradshaw <ebrad...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Yes. I have run max_num_segments=1 every time.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Apr 4, 2014 at 12:26 PM, Michael Sick <
>>>>>>>> michae...@serenesoftware.com> wrote:
>>>>>>>>
>>>>>>>>> Have you tried max_num_segments=1 on your optimize?
>>>>>>>>>
>>>>>>>>> On Fri, Apr 4, 2014 at 11:27 AM, Elliott Bradshaw <
>>>>>>>>> ebrad...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Any thoughts on this?  I've run optimize several more times, and
>>>>>>>>>> the number of segments falls each time, but I'm still over 1000 
>>>>>>>>>> segments
>>>>>>>>>> per shard.  Has anyone else run into something similar?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thursday, April 3, 2014 11:21:29 AM UTC-4, Elliott Bradshaw
>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> OK.  Optimize finally returned, so I suppose something was
>>>>>>>>>>> happening in the background, but I'm still seeing over 6500 segments,
>>>>>>>>>>> even after setting max_num_segments=5.  Does this seem right?  Queries
>>>>>>>>>>> are a little faster (350-400ms) but still not great.  Bigdesk is still
>>>>>>>>>>> showing a fair amount of file IO.
>>>>>>>>>>>
>>>>>>>>>>> On Thursday, April 3, 2014 8:47:32 AM UTC-4, Elliott Bradshaw
>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>
>>>>>>>>>>>> I've recently upgraded to Elasticsearch 1.1.0.  I've got a 4
>>>>>>>>>>>> node cluster, each with 64G of ram, with 24G allocated to 
>>>>>>>>>>>> Elasticsearch on
>>>>>>>>>>>> each.  I've batch loaded approximately 86 million documents into a 
>>>>>>>>>>>> single
>>>>>>>>>>>> index (4 shards) and have started benchmarking 
>>>>>>>>>>>> cross_field/multi_match
>>>>>>>>>>>> queries on them.  The index has one replica and takes up a total 
>>>>>>>>>>>> of 111G.
>>>>>>>>>>>> I've run several batches of warming queries, but queries are not 
>>>>>>>>>>>> as fast as
>>>>>>>>>>>> I had hoped, approximately 400-500ms each.  Given that top (on
>>>>>>>>>>>> CentOS) shows 5-8 GB of free memory on each server, I would assume that the
>>>>>>>>>>>> that the
>>>>>>>>>>>> entire index has been paged into memory (I had worried about disk
>>>>>>>>>>>> performance previously, as we are working in a virtualized 
>>>>>>>>>>>> environment).
>>>>>>>>>>>>
>>>>>>>>>>>> A stats query on the index in question shows that the index is
>>>>>>>>>>>> composed of > 7000 segments.  This seemed high to me, but maybe 
>>>>>>>>>>>> it's
>>>>>>>>>>>> appropriate.  Regardless, I dispatched an optimize command, but I 
>>>>>>>>>>>> am not
>>>>>>>>>>>> seeing any progress and the command has not returned.  Current 
>>>>>>>>>>>> merges
>>>>>>>>>>>> remains at zero, and the segment count is not changing.  Checking 
>>>>>>>>>>>> out hot
>>>>>>>>>>>> threads in ElasticHQ, I initially saw an optimize call in the 
>>>>>>>>>>>> stack that
>>>>>>>>>>>> was blocked on a waitForMerge call.  This however has disappeared, 
>>>>>>>>>>>> and I'm
>>>>>>>>>>>> seeing no evidence that the optimize is occurring.
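>>>>>>>>>>>>
>>>>>>>>>>>> In case it's useful for reproducing this, the segment count and
>>>>>>>>>>>> current merge activity I'm describing were checked with commands
>>>>>>>>>>>> roughly like these:
>>>>>>>>>>>>
>>>>>>>>>>>> curl 'http://localhost:9200/index/_stats?pretty'
>>>>>>>>>>>> curl 'http://localhost:9200/index/_segments?pretty'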
>>>>>>>>>>>>
>>>>>>>>>>>> Does any of this seem out of the norm or unusual?  Has anyone
>>>>>>>>>>>> else had similar issues?  This is the second time I have tried to
>>>>>>>>>>>> optimize an index since upgrading.  I've gotten the same result both times.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks in advance for any help/tips!
>>>>>>>>>>>>
>>>>>>>>>>>> - Elliott
>>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Adrien Grand
>>>>>>>
>>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Adrien Grand
>>>>



-- 
Adrien Grand
