Re: Failed start of 2nd instance on same host with mlockall=true

2014-08-27 Thread R. Toma
Hi Jörg,

Running just 1 JVM with a 32GB heap on a 24-core, 256GB machine is a waste; our 
CPU, I/O and memory metrics substantiate this. And of course we need to explore 
multi-instance before asking management for more money.

Regarding memlock: if no contiguous RAM is available I'd expect a fast error, 
not a completely hung process and call traces in dmesg mentioning 120-second 
timeouts. Do you think this might be a JVM or Elasticsearch bug? If so, I'll 
file it.
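
For reference, these are the kind of checks I mean (a sketch; <pid> is a 
placeholder for the Elasticsearch process id):

  # memlock limit for the current shell/user
  ulimit -l

  # limits the running JVM actually inherited
  grep 'Max locked memory' /proc/<pid>/limits

  # how much memory the process has locked so far
  grep VmLck /proc/<pid>/status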


Regards,
Renzo


On Tuesday, August 26, 2014 17:10:56 UTC+2, Jörg Prante wrote:
>
> You should run one node per host. 
>
> Two nodes add overhead and suffer from the effects you described.
>
> For mlockall, the user needs the privilege to allocate the specified locked 
> memory, and the OS needs contiguous RAM per mlockall call. If the user's 
> memlock limit is exhausted, or if RAM allocation gets fragmented, 
> memlocking is no longer possible and fails.
>
> Jörg
>
>
> On Tue, Aug 26, 2014 at 2:54 PM, R. Toma wrote:
>
>> Hi all,
>>
>> In an attempt to squeeze more power out of our physical servers we want 
>> to run multiple ES JVMs per server.
>>
>> Some specs:
>> - servers have 24 cores, 256GB RAM
>> - each instance binds on different (alias) ip
>> - each instance has 32GB heap
>> - both instances run under user 'elastic'
>> - limits for 'elastic' user: memlock=unlimited
>> - es config for both instances: bootstrap.mlockall=true
>>
>> The 1st instance has been running for weeks.
>>
>> When starting the 2nd instance the following things happen:
>> - increase of overall cpu load
>> - lots of I/O to disks
>> - no logging for the 2nd instance
>> - the 2nd instance hangs
>> - the 1st instance keeps running, but gets slowish
>> - cd /proc/ causes the cd process to hang (until the 2nd instance is killed)
>> - exec 'ps axuw' causes the ps process to hang (until the 2nd instance is killed)
>>
>> Maybe (un)related: I have never been able to run Elasticsearch in a 
>> VirtualBox VM with memlock=unlimited and mlockall=true.
>>
>>
>> After an hour of trial & error I found that removing the setting 
>> 'bootstrap.mlockall' (i.e. setting it to false) from the 2nd instance's 
>> configuration fixes things.
>>
>> I am confused, but acknowledge I do not know anything about memlocking.
>>
>> Any ideas?
>>
>> Regards,
>> Renzo
>>
>>
>>
>>
>>
>
>



Re: Failed start of 2nd instance on same host with mlockall=true

2014-08-27 Thread R. Toma
Found the following in the dmesg output. Maybe I've hit a bug?



INFO: task java:18056 blocked for more than 120 seconds.
  Not tainted 2.6.32-431.3.1.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
java  D 0002 0 18056  1 0x0080
 883fe016fdc8 0082  883fe016fde8
 883fe016fd88 8111f3f0 881a89bc25d8 883fe016fde8
 883edb4025f8 883fe016ffd8 fbc8 883edb4025f8
Call Trace:
 [] ? find_get_pages_tag+0x40/0x130
 [] ? prepare_to_wait+0x4e/0x80
 [] jbd2_log_wait_commit+0xc5/0x140 [jbd2]
 [] ? autoremove_wake_function+0x0/0x40
 [] ? do_writepages+0x21/0x40
 [] jbd2_complete_transaction+0x68/0xb0 [jbd2]
 [] ext4_sync_file+0x121/0x1d0 [ext4]
 [] vfs_fsync_range+0xa1/0x100
 [] vfs_fsync+0x1d/0x20
 [] do_fsync+0x3e/0x60
 [] sys_fsync+0x10/0x20
 [] system_call_fastpath+0x16/0x1b




On Tuesday, August 26, 2014 14:54:50 UTC+2, R. Toma wrote:
>
> Hi all,
>
> In an attempt to squeeze more power out of our physical servers we want to 
> run multiple ES JVMs per server.
>
> Some specs:
> - servers have 24 cores, 256GB RAM
> - each instance binds on different (alias) ip
> - each instance has 32GB heap
> - both instances run under user 'elastic'
> - limits for 'elastic' user: memlock=unlimited
> - es config for both instances: bootstrap.mlockall=true
>
> The 1st instance has been running for weeks.
>
> When starting the 2nd instance the following things happen:
> - increase of overall cpu load
> - lots of I/O to disks
> - no logging for the 2nd instance
> - the 2nd instance hangs
> - the 1st instance keeps running, but gets slowish
> - cd /proc/ causes the cd process to hang (until the 2nd instance is killed)
> - exec 'ps axuw' causes the ps process to hang (until the 2nd instance is killed)
>
> Maybe (un)related: I have never been able to run Elasticsearch in a 
> VirtualBox VM with memlock=unlimited and mlockall=true.
>
>
> After an hour of trial & error I found that removing the setting 
> 'bootstrap.mlockall' (i.e. setting it to false) from the 2nd instance's 
> configuration fixes things.
>
> I am confused, but acknowledge I do not know anything about memlocking.
>
> Any ideas?
>
> Regards,
> Renzo
>
>
>
>
>
>



Failed start of 2nd instance on same host with mlockall=true

2014-08-26 Thread R. Toma
Hi all,

In an attempt to squeeze more power out of our physical servers we want to 
run multiple ES JVMs per server.

Some specs (a rough config sketch follows the list):
- servers have 24 cores, 256GB RAM
- each instance binds to a different (alias) ip
- each instance has a 32GB heap
- both instances run under user 'elastic'
- limits for the 'elastic' user: memlock=unlimited
- es config for both instances: bootstrap.mlockall=true
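
Roughly what that looks like per instance (a sketch, not our literal config; 
names, paths and the ip are examples):

  # elasticsearch.yml for instance 2 (instance 1 is analogous)
  node.name: es-node2
  network.host: 10.0.0.2        # alias ip
  path.data: /data/es2
  bootstrap.mlockall: true
  # environment: ES_HEAP_SIZE=32g

  # /etc/security/limits.conf
  elastic soft memlock unlimited
  elastic hard memlock unlimited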

The 1st instance has been running for weeks.

When starting the 2nd instance the following things happen:
- increase of overall cpu load
- lots of I/O to disks
- no logging for the 2nd instance
- the 2nd instance hangs
- the 1st instance keeps running, but gets slowish
- cd /proc/ causes the cd process to hang (until the 2nd instance is killed)
- exec 'ps axuw' causes the ps process to hang (until the 2nd instance is killed)

Maybe (un)related: I have never been able to run Elasticsearch in a 
VirtualBox VM with memlock=unlimited and mlockall=true.


After an hour of trial & error I found that removing the setting 
'bootstrap.mlockall' (i.e. setting it to false) from the 2nd instance's 
configuration fixes things.

I am confused, but acknowledge I do not know anything about memlocking.

Any ideas?

Regards,
Renzo







what is my actual index_buffer_size?

2014-05-04 Thread R. Toma
Hi group,

Does anyone know how to query the actual index_buffer_size?
I have searched through the _cluster/stats output, but cannot find it.

Using a 3-node ES 1.0.1 cluster for a logsearch platform (lots of logstash 
bulking, less kibana querying) with fresh indices every day, I have 
indices.memory.index_buffer_size=50% for a total heap of 32GB.

I believe 50% is far too high, but cannot find a metric to prove it.
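
To make the numbers concrete, a back-of-the-envelope calculation (assuming 
indices.memory.min_index_buffer_size / max_index_buffer_size are left at 
their defaults):

  index_buffer_size = heap * indices.memory.index_buffer_size
                    = 32 GB * 0.50
                    = 16 GB per node, shared by all actively indexing shards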

Any help is appreciated!

Regards,
Renzo 



List of all indices using the ElasticSeach Java API with Jest

2014-04-25 Thread R. Toma
Hi,
I do not know Jest, but if you can call the 'aliases' API you get a list of 
indices and their aliases.
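
For example, over plain HTTP (a sketch; adjust host and port to your setup):

  curl -s 'http://localhost:9200/_aliases?pretty'

  # or, for a terse list of indices only:
  curl -s 'http://localhost:9200/_cat/indices?v'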
Regards,
Renzo



PERL Search::Elasticsearch - need to filter by time

2014-04-25 Thread R. Toma
Hi Eugedan,
You may want to check the type of the 'time' field in the index mapping.
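
For example (a sketch; substitute your own index name):

  curl -s 'http://localhost:9200/your_index/_mapping?pretty'

If 'time' turns out to be mapped as a string instead of a date, range filters 
will compare it lexicographically, which is usually not what you want.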
Regards,
Renzo



Re: Recommended way to reduce overload on ES

2014-04-25 Thread R. Toma
We have successfully upgraded from ES 0.20.x via 0.90 to 1.0.1. No 
reindexing was needed.


On Friday, April 25, 2014 02:09:36 UTC+2, Itamar Syn-Hershko wrote:
>
> Lucene is and has always been backwards-compatible. On the first merge of 
> an index created with an earlier version it will get upgraded to the latest.
>
> And you can't always reindex (think huge installations), so I believe it 
> is actually the intent of the ES core team not to require that.
>
> You may want to upgrade gradually (0.90 and then 1.x) just to be safe, but 
> you don't have to reindex.
>
> --
>
> Itamar Syn-Hershko
> http://code972.com | @synhershko 
> Freelance Developer & Consultant
> Author of RavenDB in Action 
>
>
> On Fri, Apr 25, 2014 at 3:02 AM, Mark Walkom wrote:
>
>> Really?
>> I've seen most recommend this, especially given such a large increase in 
>> the ES + lucene versions.
>>
>> Regards,
>> Mark Walkom
>>
>> Infrastructure Engineer
>> Campaign Monitor
>> email: ma...@campaignmonitor.com 
>> web: www.campaignmonitor.com
>>  
>>
>> On 25 April 2014 09:48, Itamar Syn-Hershko wrote:
>>
>>> There's no need to reindex; it is enough to do a full cluster restart 
>>> after upgrading the binaries and ES/Lucene will take care of the rest
>>>
>>> --
>>>
>>> Itamar Syn-Hershko
>>> http://code972.com | @synhershko 
>>> Freelance Developer & Consultant
>>> Author of RavenDB in Action 
>>>
>>>
>>> On Fri, Apr 25, 2014 at 2:42 AM, Mark Walkom wrote:
>>>
 Upgrade ES! That is a very, very old version and there are numerous 
 performance improvements in the later versions.
 You will need to reindex your data though; the underlying Lucene 
 version has changed as well. You can leverage that process by building a 
 new cluster with the latest version, then migrating data over, and making 
 tweaks to suit your structure.

 Regards,
 Mark Walkom

 Infrastructure Engineer
 Campaign Monitor
 email: ma...@campaignmonitor.com 
 web: www.campaignmonitor.com


 On 24 April 2014 23:45, Itamar Syn-Hershko wrote:

> Elasticsearch can handle many open indexes on a cluster, but the 
> advice is to keep their number low per machine. I would try to aim for 
> indexes of around 10GB; if this means a weekly index, so be it.
>
> My advice is about using time-sliced indexes, not about the time span on 
> which to slice them. There's no rule of thumb for that one, I'm afraid.
>
>  --
>
> Itamar Syn-Hershko
> http://code972.com | @synhershko 
> Freelance Developer & Consultant
> Author of RavenDB in Action 
>
>
> On Thu, Apr 24, 2014 at 4:37 PM, >wrote:
>
>> Hi Itamar , thanks for the quick reply.
>>
>> Let me verify that I understood you correctly: you recommend using a 
>> daily index for my scenario? Can ES handle that number of indexes (365, 
>> for instance), or would that procedure create too much overhead?
>>
>> On Thursday, April 24, 2014 3:55:22 PM UTC+3, Itamar Syn-Hershko 
>> wrote:
>>
>>> You are currently letting ES handle sharding for you, but using the 
>>> rolling-indexes approach (aka time-sliced indexes), where indexes contain 
>>> all data for a given period of time and are named after that period, makes 
>>> much more sense. Read: perform the sharding yourself on the index level, 
>>> and use aliases or multi-index queries to maintain that.
>>>
>>> This will help with retiring old indexes. The high CPU scenario is 
>>> probably due to a lot of deletes and segment merges that happen under the 
>>> hood (and possibly a wrong setting for the Java heap). Using the 
>>> aforementioned approach means you can just archive or delete an entire 
>>> index and not use TTLs or delete-by-query processes.
>>>
>>> Deciding on the optimal size of an index in that scenario highly 
>>> depends on your data, usage patterns and a lot of experimenting.
>>>
>>> That's to answer 1 & 2
>>>
>>>  3. Definitely, 0.20 is a very old version
>>>
>>> --
>>>
>>> Itamar Syn-Hershko
>>> http://code972.com | @synhershko 
>>> Freelance Developer & Consultant
>>> Author of RavenDB in Action 
>>>
>>>
>>> On Thu, Apr 24, 2014 at 3:48 PM,  wrote:
>>>
  Hi all,

 I'm looking for the recommended solution for my situation.

 We have a time based data. 
 Each day we index around 43,000,000 documents. Each 
 document weighs  around 0.6k.
 Our cluster contains 8 nodes (Amazon - m1.xlarge and m3.xlarge 
 machi

how to avoid/lighten shard recovery after restart?

2014-04-25 Thread R. Toma
Hi group,

Restarting an ES cluster triggers recovery, which is long-lasting and 
expensive in terms of load. I am searching for a way to reduce the runtime 
and load of a restart. I read that someone executes daily rolling restarts of 
his large ES cluster to ensure the primary and replica shards are 100% 
identical, meaning they will recover fast. But that sounds like a hack and 
not something you should be happy with as an SRE. And its impact on ES 
performance may be acceptable on a large cluster, but not on our 3-node 
cluster.

How I believe shard recovery works: if ES spots differences between a 
primary and its replica shard(s), it will rebuild the replica shard(s) as an 
exact copy of the primary shard. Rebuilding results in lots of network 
traffic and disk I/O.
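
For context, two related knobs (a sketch for ES 1.x; in newer versions the 
allocation setting is cluster.routing.allocation.enable):

  # watch ongoing recoveries
  curl -s 'http://localhost:9200/_cat/recovery?v'

  # temporarily stop shard reallocation while a node is restarted
  curl -XPUT 'http://localhost:9200/_cluster/settings' -d '
  { "transient": { "cluster.routing.allocation.disable_allocation": true } }'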

We have a 3-node ES 1.0.1 cluster with 3k primary shards and 3k replica 
shards. During a recent restart (to reduce the heap size to 31G to get 
CompressedOops back) the recovery of the 1st node took the longest (~6 
hours), the 2nd took less (~2 hours) and the 3rd was quick (<1 hour). I 
believe recovery becomes faster after each node, because each recovery ends 
with more replica shards being exact copies of their primary.

I tried force-merging with an expensive max_num_segments=1, but the metrics 
segments.count and segments.memory for the same shards still differ between 
primary and replica. No luck. For the curious few I have included the before 
and after results below.
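
For reference, a force-merge like that goes through the optimize API, roughly 
like this (a sketch; the index name is taken from the listing below):

  curl -XPOST 'http://localhost:9200/logstash-pro-oracle-2014.04.24/_optimize?max_num_segments=1'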

Any ideas?

Regards,
Renzo


BEFORE:
idx                            shard prirep docs store  segments.count segments.memory
logstash-pro-oracle-2014.04.24 0     p      1072 485592 8              14615
logstash-pro-oracle-2014.04.24 0     r      1072 449022 1              11958
logstash-pro-oracle-2014.04.24 1     p      1095 493774 7              14336
logstash-pro-oracle-2014.04.24 1     r      1095 459966 1              11988
logstash-pro-oracle-2014.04.24 2     p      1039 452078 5              13158
logstash-pro-oracle-2014.04.24 2     r      1039 458513 6              13480
logstash-pro-oracle-2014.04.24 3     p      1094 492753 8              14574
logstash-pro-oracle-2014.04.24 3     r      1094 483347 6              13850
logstash-pro-oracle-2014.04.24 4     p      1099 494740 8              14645
logstash-pro-oracle-2014.04.24 4     r      1099 488953 7              14251

AFTER:
idx                            shard prirep docs store  segments.count segments.memory
logstash-pro-oracle-2014.04.24 0     p      1072 449358 1              11958
logstash-pro-oracle-2014.04.24 0     r      1072 448884 1              11958
logstash-pro-oracle-2014.04.24 1     p      1095 460391 1              11980
logstash-pro-oracle-2014.04.24 1     r      1095 459918 1              11988  <-- rep is 8 bigger than its pri
logstash-pro-oracle-2014.04.24 2     p      1039 431341 1              11580
logstash-pro-oracle-2014.04.24 2     r      1039 431695 1              11572  <-- rep is 8 smaller than its pri
logstash-pro-oracle-2014.04.24 3     p      1094 457135 1              11907
logstash-pro-oracle-2014.04.24 3     r      1094 457970 1              11907
logstash-pro-oracle-2014.04.24 4     p      1099 457640 1              11957
logstash-pro-oracle-2014.04.24 4     r      1099 457165 1              11957



Re: ELK stack needs tuning

2014-04-18 Thread R. Toma
Hi Jörg,

Thank you for pointing me to this article. I needed to read it twice, but I 
think I understand it now.

I believe shard overallocation works for use-cases where you want to store & 
search 'users' or 'products'. Such data allows you to divide all documents 
into groups to be stored in different shards using routing. All shards get 
indexed & searched.

But how does this work for logstash indices? I could create 1 index with 365 
shards (if I want 1 year of retention) and use alias routing (an alias per 
date, routing to one shard) to index into a different shard every day, but 
after 1 year I would need to purge a shard. And purging a shard is not easy; 
it would require a delete of every document in the shard.

Or am I missing something?
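
For clarity, the alias routing I mean would look roughly like this (a sketch; 
the index name and the '@date' field are made up):

  curl -XPOST 'http://localhost:9200/_aliases' -d '
  {
    "actions": [
      { "add": {
          "index":   "logstash-2014",
          "alias":   "logstash-2014.04.18",
          "routing": "2014.04.18",
          "filter":  { "term": { "@date": "2014.04.18" } }
      } }
    ]
  }'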

Regards,
Renzo


On Thursday, April 17, 2014 16:15:43 UTC+2, Jörg Prante wrote:
>
> "17 new indices every day" - whew. Why don't you use shard overallocating?
>
>
> https://groups.google.com/forum/#!msg/elasticsearch/49q-_AgQCp8/MRol0t9asEcJ
>
> Jörg
>
>



Re: ELK stack needs tuning

2014-04-17 Thread R. Toma
Hi Mark,

Thank you for your comments.

Regarding the monitoring: we use the Diamond ES collector, which saves 
metrics every 30 seconds in Graphite. ElasticHQ is nice, but it does its 
diagnostics calculations over the whole runtime of the cluster instead of the 
last X minutes. It does have nice diagnostics rules, so I created Graphite 
dashboards for them. Marvel is surely nice, but with the exception of Sense 
it does not offer me anything I do not already have with Graphite.

New finds (examples after the list):
* Setting index.codec.bloom.load=false on yesterday's/older indices frees up 
memory from the fielddata pool. This stays released even when searching.
* Closing older indices speeds up indexing & refreshing.
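
Roughly how both are applied (a sketch; the index name is just an example):

  # stop loading bloom filters for an index we no longer write to
  curl -XPUT 'http://localhost:9200/logstash-2014.04.16/_settings' -d '
  { "index.codec.bloom.load": false }'

  # close an old index entirely
  curl -XPOST 'http://localhost:9200/logstash-2014.04.16/_close'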

Regarding the closing benefit: the impact on refreshing is great! But from a 
functional point of view it's bad. I know about the 'overhead per index', but 
cannot find a solution to this.

Does anyone know how to get an ELK stack with "unlimited" retention?

Regards,
Renzo



On Wednesday, April 16, 2014 11:15:32 UTC+2, Mark Walkom wrote:
>
> Well, once you go over 31-32GB of heap you lose pointer compression, which 
> can actually slow you down. You might be better off reducing that and 
> running multiple instances per physical machine.
>
> From 0.90.4 or so compression is on by default, so no need to specify that. 
> You might also want to change shards to a factor of your nodes, e.g. 3, 6, 
> 9, for more even allocation.
> Also try moving to Java 1.7u25 as that is the generally agreed version to 
> run. We run u51 with no issues though, so that might be worth trialling if 
> you can.
>
> Finally, what are you using to monitor the actual cluster? Something like 
> ElasticHQ or Marvel will probably provide greater insights into what is 
> happening and what you can do to improve performance.
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com 
> web: www.campaignmonitor.com
>
>
> On 16 April 2014 19:06, R. Toma wrote:
>
>> Hi all,
>>
>> At bol.com we use ELK for a logsearch platform, using 3 machines.
>>
>> We need fast indexing (to not lose events) and want fast & near-realtime 
>> search. The search is currently not fast enough. Simple "give me the last 
>> 50 events from the last 15 minutes, from any type, from today's indices, 
>> without any terms" search queries may take 1.0 sec. Sometimes even passing 
>> 30 seconds.
>>
>> It currently does 3k docs added per second, but we expect 8k/sec by the 
>> end of this year.
>>
>> I have included lots of specs/config at bottom of this e-mail.
>>
>>  
>> We found 2 reliable knobs to turn:
>>
>>    1. index.refresh_interval. At 1 sec fast search seems impossible. When 
>>    upping the refresh to 5 sec, search gets faster. At 10 sec it's even 
>>    faster. But when you search during the refresh (wouldn't a splay be 
>>    nice?) it's slow again. And a refresh every 10 seconds is not near 
>>    realtime anymore. No obvious bottlenecks present: cpu, network, memory, 
>>    disk I/O all OK.
>>    2. deleting old indices. No clue why this improves things. And we 
>>    really do not want to delete old data, since we want to keep at least 60 
>>    days of data online. But after deleting old data the search speed slowly 
>>    crawls back up again...
>>
>>
>> We have zillions of metrics ("measure everything") of OS, ES and JVM 
>> using Diamond and Graphite. Too much to include here.
>> We use a nagios check that simulates Kibana queries to monitor the search 
>> speed every 5 minutes.
>>
>>
>> When comparing behaviour at refresh_interval 1s vs 5s we see:
>>
>>    - system% cpu load: depends per server: 150 vs 80, 100 vs 50, 40 vs 
>>    25 == lower 
>>    - ParNew GC run frequency: 1 vs 0.6 (per second) == less
>>    - CMS GC run frequency: 1 vs 4 (per hour) == more
>>- avg index time: 8 vs 2.5 (ms) == lower 
>>- refresh frequency: 22 vs 12 (per second) -- still high numbers at 5 
>>sec because we have 17 active indices every day == less
>>- merge frequency: 12 vs 7 (per second) == less 
>>- flush frequency: no difference
>>- search speed: at 1s way too slow, at 5s (at tests timed between the 
>>refresh bursts) search calls ~50ms. 
>>
>>
>> We already looked at the threadpools:
>>
>>    - we increased the bulk pool
>>    - we currently do not have any rejects in any pools
>>    - the only pool that has queueing (a spike per 1 or 2 hours) is the 
>>    'management' pool (but that's probably Diamond) 
>>
>>
>> We have 

ELK stack needs tuning

2014-04-16 Thread R. Toma
Hi all,

At bol.com we use ELK for a logsearch platform, using 3 machines.

We need fast indexing (to not lose events) and want fast & near-realtime 
search. The search is currently not fast enough. Simple "give me the last 
50 events from the last 15 minutes, from any type, from today's indices, 
without any terms" search queries may take 1.0 sec. Sometimes even passing 
30 seconds.

It currently does 3k docs added per second, but we expect 8k/sec by the end 
of this year.

I have included lots of specs/config at bottom of this e-mail.


We found 2 reliable knobs to turn (an example follows the list):

   1. index.refresh_interval. At 1 sec fast search seems impossible. When 
   upping the refresh to 5 sec, search gets faster. At 10 sec it's even faster. 
   But when you search during the refresh (wouldn't a splay be nice?) it's slow 
   again. And a refresh every 10 seconds is not near realtime anymore. No 
   obvious bottlenecks present: cpu, network, memory, disk I/O all OK.
   2. deleting old indices. No clue why this improves things. And we really 
   do not want to delete old data, since we want to keep at least 60 days of 
   data online. But after deleting old data the search speed slowly crawls back 
   up again...
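
For reference, changing the refresh interval on an existing index is a 
one-liner (a sketch; the index name is just an example):

  curl -XPUT 'http://localhost:9200/logstash-2014.04.16/_settings' -d '
  { "index": { "refresh_interval": "5s" } }'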


We have zillions of metrics ("measure everything") for OS, ES and JVM, using 
Diamond and Graphite. Too much to include here.
We use a nagios check that simulates Kibana queries to monitor the search 
speed every 5 minutes.


When comparing behaviour at refresh_interval 1s vs 5s we see:

   - system% cpu load: depends per server: 150 vs 80, 100 vs 50, 40 vs 25 
   == lower
   - ParNew GC run frequency: 1 vs 0.6 (per second) == less
   - CMS GC run frequency: 1 vs 4 (per hour) == more
   - avg index time: 8 vs 2.5 (ms) == lower
   - refresh frequency: 22 vs 12 (per second) -- still high numbers at 5 
   sec because we have 17 active indices every day == less
   - merge frequency: 12 vs 7 (per second) == less
   - flush frequency: no difference
   - search speed: at 1s way too slow; at 5s (in tests timed between the 
   refresh bursts) search calls take ~50ms.


We already looked at the threadpools:

   - we increased the bulk pool
   - we currently do not have any rejects in any pools
   - the only pool that has queueing (a spike per 1 or 2 hours) is the 
   'management' pool (but that's probably Diamond)


We have a feeling something blocks/locks under high indexing and high search 
frequency. But what? I have looked at nearly all metrics and _cat output.


Our current list of untested/wild ideas:

   - Is the index.codec.bloom.load=false on yesterday's indices really the 
   magic bullet? We haven't tried it.
   - Adding a 2nd JVM per machine is an option, but as long as we do not 
   know the real cause it's not a real option (yet).
   - Lowering the heap from 48GB to 30GB, to avoid the 64-bit overhead.


What knobs do you suggest we start turning?

Any help is much appreciated!


A little present from me in return: I suggest you read 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-scripting.html 
and decide if you need dynamic scripting enabled (the default), as it allows 
remote code execution via the REST API. Credits go to Byron at Trifork!
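
Disabling it is a single line in elasticsearch.yml (this applies to the 1.x 
releases of that era; newer versions use different script.* settings):

  script.disable_dynamic: true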



More details:

Versions:

   - ES 1.0.1 on: java version "1.7.0_17", Java(TM) SE Runtime Environment 
   (build 1.7.0_17-b02), Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, 
   mixed mode)
   - Logstash 1.1.13 (with a backported elasticsearch_http plugin, for 
   idle_flush_time support)
   - Kibana 2
   

Setup:

   - we use several types of shippers/feeders, all sending logging to a set 
   of redis servers (the log4j and accesslog shippers/feeders use the logstash 
   json format to avoid grokking at logstash side)
   - several logstash instances consume the redis list, process and store 
   in ES using the bulk API (we use bulk because we dislike the version lockin 
   using the native transport)
   - we use bulk async (we thought it would speed up indexing, which it 
   doesn't)
   - we use bulk batch size of 1000 and idle flush of 1.0 second
   

Hardware for ES:

   - 3x HP 360G8 24x core
   - each machine has 256GB RAM (1 ES jvm running per machine with 48GB 
   heap, so lots of free RAM for caching)
   - each machine has 8x 1TB SAS (1 for OS and 7 as separate disks for use 
   in ES' -Des.path.data=)
   

Logstash integration:

   - using Bulk API, to avoid the version lockin (maybe slower, which we 
   can fix by scaling out / adding more logstash instances)
   - 17 new indices every day (e.g. syslog, accesslogging, log4j + 
   stacktraces)
   

ES configuration:

   - ES_HEAP_SIZE: 48gb
   - index.number_of_shards: 5
   - index.number_of_replicas: 1
   - index.refresh_interval: 1s
   - index.store.compress.stored: true
   - index.translog.flush_threshold_ops: 5
   - indices.memory.index_buffer_size: 50%
   - default index mapping


Regards,
Renzo Toma
Bol.com


p.s. we are hiring! :-)

