Re: Limit the amount of data generated by Marvel with marvel.agent.interval ?

2014-05-22 Thread Logan Hardy
Just a follow-up on my results with this. I was able to get the daily marvel 
index down from hundreds of GBs per day replicated to about 22GB per day by 
setting replicas to 0 on the marvel index and marvel.agent.interval to 
30s. 
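
For anyone wanting to try the replica change, here is a minimal sketch of what the request could look like. It only builds the request for the index Update Settings API and does not send anything. The index pattern ".marvel-*" and the helper name replica_update are assumptions for illustration, not something taken from this thread. Newly created daily indices may take their replica count from an index template instead, so this likely needs repeating or a template change.

```python
import json


def replica_update(index_pattern, replicas):
    """Return (method, path, body) for an index settings update.

    Hypothetical helper: it only describes the HTTP call, it does not send it.
    """
    if replicas < 0:
        raise ValueError("replicas must be >= 0")
    # number_of_replicas is a dynamic index setting, so it can be changed
    # on an existing index without reindexing.
    body = json.dumps({"index": {"number_of_replicas": replicas}})
    return "PUT", "/" + index_pattern + "/_settings", body


# ".marvel-*" is an assumed pattern for the daily Marvel indices.
method, path, body = replica_update(".marvel-*", 0)
print(method, path, body)
```

The printed method, path and body can be sent to the monitoring cluster with any HTTP client.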

I also noticed that after upgrading my production Elasticsearch cluster 
from 0.90.12 to 1.1.1 (now the same version as the monitoring cluster) the 
daily marvel index went down to around 2GB per day. Hooray!

-Logan-


-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/d320e16b-e6ee-44ec-b084-96ec51e637d2%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Limit the amount of data generated by Marvel with marvel.agent.interval ?

2014-05-03 Thread Logan Hardy
Thanks Mark. I'm aware of the licensing and have been in contact with 
Elasticsearch about this. I just need to make sure we can use Marvel with 
our two monitoring nodes before we commit to buying licenses. I'm 
disinclined to use Marvel if I have to buy licenses and buy a bunch of 
additional monitoring machines. I really like Marvel but if I can't make it 
work for a reasonable price I think I'll go back to using SPM. 

On Wednesday, April 30, 2014 5:39:47 PM UTC-6, Mark Walkom wrote:
>
> That's pretty sane. I believe the newest version of marvel increased the 
> default from 5s to 10s.
>
> But be aware, you are breaking the license for Marvel with that number of 
> nodes - http://www.elasticsearch.org/overview/marvel/
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com 
> web: www.campaignmonitor.com



Limit the amount of data generated by Marvel with marvel.agent.interval ?

2014-04-30 Thread Logan Hardy
I'm managing a pretty badass 11-node Elasticsearch cluster that is powering 
a customer-facing dashboard reporting platform. 20 cores per node, 64GB 
RAM, SSDs, dual 10 GbE of awesome. I evaluated Marvel while we were still 
in development on the new platform and I found it to be a very valuable 
tool. At first Marvel was indexing to the same cluster we were monitoring 
and this was okay while we were in development as there were plenty of 
extra cycles in the cluster to handle the load, but now that we are in 
production it doesn't make sense to burden the cluster with this. The 
nature of our reporting system requires us to have an index for each 
customer so we're currently at 328 indices and over 10,000 shards total. 
The amount of data indexed by Marvel increases dramatically as the number 
of indices increases, so once we got over 300 indices in the system the 
daily marvel index ended up at around 400 GB replicated and was indexing 
around 2,000 documents a second by itself. 

What I want to do is have Marvel index to a not-as-awesome 2-node 
Elasticsearch monitoring cluster: 12 cores, 64 GB RAM and spinning disks. 
But in practice these 2 nodes are unable to keep up with the load and get 
completely bogged down. I'm thinking I can sacrifice redundancy and buy 
myself some cycles by not using any replicas on the Marvel index. My other 
idea is to set marvel.agent.interval from the default 10s to something like 
30s on the assumption that this will cut the amount of data generated to 
about a third. Does this sound sane, or does anyone have other ideas on what 
I can try to limit the load?  
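
The assumption that data volume scales with the sampling rate and with the number of shard copies can be written down as a back-of-envelope estimate. This is only a sketch: the linear scaling is the assumption stated above, not a measured property of Marvel, and treating "replicated" as one replica (the Elasticsearch default) is also an assumption. The function name estimated_daily_gb is hypothetical.

```python
def estimated_daily_gb(current_gb, old_interval_s, new_interval_s,
                       old_replicas, new_replicas):
    """Rough daily index size after changing interval and replica count.

    Assumes size is proportional to samples per day and to the number
    of shard copies (1 primary + replicas). Real results may differ.
    """
    if old_interval_s <= 0 or new_interval_s <= 0:
        raise ValueError("intervals must be positive")
    if old_replicas < 0 or new_replicas < 0:
        raise ValueError("replica counts must be >= 0")
    sample_factor = old_interval_s / new_interval_s      # 10s -> 30s gives 1/3
    copy_factor = (1 + new_replicas) / (1 + old_replicas)  # 2 copies -> 1 gives 1/2
    return current_gb * sample_factor * copy_factor


# 400 GB/day at 10s with one replica, moved to 30s with no replicas.
print(round(estimated_daily_gb(400, 10, 30, 1, 0), 1))  # prints 66.7
```

Under these assumptions the interval change alone would bring 400 GB down to roughly 133 GB, and dropping the replica would halve that again.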

marvel.agent.interval

Controls the interval between data samples. Defaults to 10s. Set to -1 to 
temporarily disable exporting.

This setting is update-able via the Cluster Update Settings API.
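
Since the setting is updatable through the Cluster Update Settings API, a change to 30s could be expressed roughly as below. This is a sketch that only builds the request object. The host "http://localhost:9200", the choice of "transient" over "persistent", and the helper name interval_update_request are assumptions for illustration. The request would presumably go to the cluster being monitored, since that is where the agent runs.

```python
import json
import urllib.request


def interval_update_request(base_url, interval):
    """Build (but do not send) a PUT /_cluster/settings request.

    Hypothetical helper. "transient" means the value is lost on a full
    cluster restart; use "persistent" to keep it.
    """
    body = json.dumps({"transient": {"marvel.agent.interval": interval}})
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_cluster/settings",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# The host is a placeholder; point it at your own cluster.
req = interval_update_request("http://localhost:9200", "30s")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(req)
```

Passing "-1" as the interval would build the request that temporarily disables exporting, per the description above.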


Thanks -Logan-
