Re: Storage capacity (planning)?

2014-02-07 Thread Binh Ly
Jeroen,

If your objective is to keep the ES storage as minimal as possible, you'd 
probably want to understand first what your search requirements are and 
then optimize the ES indexes accordingly. For example, if you don't need 
replicas, you can set the number of replicas to 0. If you don't need the 
_all field, you can disable it (using index templates, for example). If you 
don't need every single field from your log events indexed, you can direct 
your LS filters to output only the specific fields you are interested in, 
and so on.
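
As a rough illustration of the replica and _all suggestions above, the sketch 
below builds an index template body of the kind Elasticsearch accepted at the 
time of this thread (the _all field no longer exists in later releases). The 
function name `build_template` and the index pattern "logstash-*" are 
assumptions made for the example, not something given in the thread.

```python
import json


def build_template(pattern="logstash-*", replicas=0, enable_all=False):
    """Return an index template body (hypothetical helper for illustration).

    pattern    -- index name pattern the template applies to (assumed here)
    replicas   -- number of replica shards per primary shard
    enable_all -- whether the catch-all _all field is indexed
    """
    return {
        "template": pattern,
        "settings": {"number_of_replicas": replicas},
        # "_default_" applies the mapping to every document type.
        "mappings": {"_default_": {"_all": {"enabled": enable_all}}},
    }


if __name__ == "__main__":
    # Print the JSON that would be sent to the index template endpoint.
    print(json.dumps(build_template(), indent=2))
```

A template only affects indexes created after it is installed, so existing 
daily indexes would keep their previous settings.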

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/105b2dac-33aa-44ad-8961-229b3aad4905%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: Storage capacity (planning)?

2014-02-07 Thread Mark Walkom
If you're not already using it, you should put Kibana into the mix. This
will give you a better understanding of what is going into ES (in fact,
this is what it was designed for).
Also install elastichq, kopf and bigdesk for some cluster monitoring. There
is also elasticsearch-monitoring, which is pretty good for longer-term stats.

Once you have those, you will better understand your cluster and throughput.
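
For a quick number without any of those tools, the indices stats API reports
document counts and store size. The sketch below assumes a response already
fetched from the `_stats` endpoint and parsed into a dict; the helper name
`bytes_per_doc` and the sample figures are made up for illustration.

```python
def bytes_per_doc(stats):
    """Average on-disk bytes per document across all primary shards.

    `stats` is a parsed indices stats response; only the
    _all.primaries.docs.count and _all.primaries.store.size_in_bytes
    fields are read.
    """
    primaries = stats["_all"]["primaries"]
    doc_count = primaries["docs"]["count"]
    size_in_bytes = primaries["store"]["size_in_bytes"]
    # Avoid dividing by zero on an empty cluster.
    return size_in_bytes / doc_count if doc_count else 0.0


if __name__ == "__main__":
    # Hypothetical sample values, not measurements from this thread.
    sample = {
        "_all": {
            "primaries": {
                "docs": {"count": 1000},
                "store": {"size_in_bytes": 2048000},
            }
        }
    }
    print(bytes_per_doc(sample))
```

Comparing this figure with the size of a raw syslog line gives a first
indication of how much overhead the index adds per event.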

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com


On 7 February 2014 20:05, Jeroen van Meeuwen (Kolab Systems) <
vanmeeu...@kolabsys.com> wrote:

> Hi there,
>
> Although the reason I'm looking at Elasticsearch is a totally different one
> ^1, I set up a development environment with about 20 servers that use
> rsyslog to send their logs to a logstash server (input, you guessed it,
> syslog), which passes the syslog entries through Redis so that they
> ultimately end up in Elasticsearch. I suppose this is the next-next-finish
> setup documented on [1].
>
> To my surprise, it only takes a day or so to get up to a storage volume of
> ~25 GB in /var/lib/elasticsearch/.
>
> It is particularly surprising to me, because the environment is largely
> idle, other than some monitoring and some cron jobs -- there are not many
> syslog messages compared to a production environment, not at all.
>
> Furthermore, using this rsyslog -> logstash collector -> redis -> logstash
> indexer -> elasticsearch setup, I'm seeing the throughput on the logical
> volume for the root filesystem rise continuously -- it's now at about 4
> MB/s. `iotop` merely suggests this is all Elasticsearch doing the I/O, but
> its payload is on the aforementioned logical volume mounted on
> /var/lib/elasticsearch/.
>
> I'm fairly certain I can tweak the number of log entries being sent off to
> the centralized log server, and it's not unlikely I'm doing something
> wrong, but I was wondering whether anybody out there had gone through such
> an exercise before, and whether my expectations are correct.
>
> Thanks, in advance,
>
> Kind regards,
>
> Jeroen van Meeuwen
>
> ^1: Kolab Groupware is looking into developing a single application suite
> for the topics of Archival, Backup/Restore and e-Discovery. This is very
> much a work in progress; we're putting down some notes [2] and doing the
> initial probing of potential storage backend solutions.
>
> [1] http://logstash.net/docs/1.3.3/tutorials/getting-started-centralized
> [2] http://docs.kolab.org/architecture-and-design/bonnie.html
>
> --
> Systems Architect, Kolab Systems AG
>
> e: vanmeeuwen at kolabsys.com
> m: +44 74 2516 3817
> w: http://www.kolabsys.com
>
> pgp: 9342 BF08



Storage capacity (planning)?

2014-02-07 Thread Jeroen van Meeuwen (Kolab Systems)

Hi there,

Although the reason I'm looking at Elasticsearch is a totally different 
one ^1, I set up a development environment with about 20 servers that use 
rsyslog to send their logs to a logstash server (input, you guessed it, 
syslog), which passes the syslog entries through Redis so that they 
ultimately end up in Elasticsearch. I suppose this is the 
next-next-finish setup documented on [1].


To my surprise, it only takes a day or so to get up to a storage volume 
of ~25 GB in /var/lib/elasticsearch/.
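
For planning purposes, the ~25 GB/day figure can be turned into a rough 
capacity projection. The sketch below assumes linear growth and ignores 
compression and segment merging; the 30-day retention period and the 
function name `projected_storage_gb` are hypothetical, chosen only to 
illustrate the arithmetic.

```python
def projected_storage_gb(daily_gb, retention_days, replicas=0):
    """Estimate total storage for a fixed retention window.

    Each replica holds a full extra copy of the primary data, so the
    total is the primary volume multiplied by (1 + replicas).
    """
    return daily_gb * retention_days * (1 + replicas)


if __name__ == "__main__":
    # 25 GB/day is the figure reported above; 30 days is an assumption.
    print(projected_storage_gb(25, 30))              # no replicas
    print(projected_storage_gb(25, 30, replicas=1))  # one replica
```

With these assumptions, 30 days of retention comes to 750 GB without 
replicas and 1500 GB with one replica.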


It is particularly surprising to me, because the environment is largely 
idle, other than some monitoring and some cron jobs -- there are not many 
syslog messages compared to a production environment, not at all.


Furthermore, using this rsyslog -> logstash collector -> redis -> 
logstash indexer -> elasticsearch setup, I'm seeing the throughput on 
the logical volume for the root filesystem rise continuously -- it's now 
at about 4 MB/s. `iotop` merely suggests this is all Elasticsearch doing 
the I/O, but its payload is on the aforementioned logical volume mounted 
on /var/lib/elasticsearch/.


I'm fairly certain I can tweak the number of log entries being sent off 
to the centralized log server, and it's not unlikely I'm doing something 
wrong, but I was wondering whether anybody out there had gone through 
such an exercise before, and whether my expectations are correct.


Thanks, in advance,

Kind regards,

Jeroen van Meeuwen

^1: Kolab Groupware is looking into developing a single application 
suite for the topics of Archival, Backup/Restore and e-Discovery. This is 
very much a work in progress; we're putting down some notes [2] and doing 
the initial probing of potential storage backend solutions.


[1] http://logstash.net/docs/1.3.3/tutorials/getting-started-centralized
[2] http://docs.kolab.org/architecture-and-design/bonnie.html

--
Systems Architect, Kolab Systems AG

e: vanmeeuwen at kolabsys.com
m: +44 74 2516 3817
w: http://www.kolabsys.com

pgp: 9342 BF08
