Operating on large cluster

2014-10-23 Thread Alain RODRIGUEZ
Hi,

I was wondering how you guys handle a large cluster (50+ machines).

I mean that sometimes you need to change configuration (cassandra.yaml)
or send a command to one, some or all nodes (cleanup, upgradesstables,
setstreamthroughput or whatever).

So far we have been using custom scripts for repairs and other
routine maintenance, and cssh for specific, one-shot actions on the
cluster. But I guess this doesn't really scale; we could use pssh
instead. For configuration changes we use Capistrano, which might scale
properly.
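
As an illustration of the pssh-style approach, here is a minimal Python sketch that runs one command on many nodes with a cap on how many ssh sessions are open at once. It is not taken from any tool named in this thread: the names `ssh_run` and `fan_out`, the default of 5 parallel sessions and the host names in the usage comment are all assumptions, and it presumes key-based ssh access to every node.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ssh_run(host, command, timeout=600):
    """Run one command on one host through the local ssh client."""
    proc = subprocess.run(
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        ["ssh", "-o", "BatchMode=yes", host, command],
        capture_output=True, text=True, timeout=timeout,
    )
    return proc.returncode, proc.stdout.strip()


def fan_out(hosts, command, run=ssh_run, max_parallel=5):
    """Run `command` on every host, at most `max_parallel` at a time.

    Returns a dict {host: (exit_code, output)}. `run` is injectable so the
    fan-out logic can be exercised without real ssh access.
    """
    hosts = list(hosts)

    def call(host):
        try:
            return run(host, command)
        except Exception as exc:  # unreachable host, timeout, missing ssh...
            return 255, str(exc)

    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # pool.map preserves input order, so zip pairs hosts with results.
        return dict(zip(hosts, pool.map(call, hosts)))


# Hypothetical usage (host names are placeholders):
# results = fan_out(["cass-01.example.com", "cass-02.example.com"],
#                   "nodetool setstreamthroughput 200")
# failed = [h for h, (code, _) in results.items() if code != 0]
```

Keeping `max_parallel` small matters for heavy operations such as cleanup or upgradesstables, where running on every node at once could hurt the whole cluster.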

So I would like to know: what methods do operators use on large
clusters out there? Have any of you built open source "cluster
management" interfaces or scripts that could make things easier when
operating large Cassandra clusters?

Alain


Re: Operating on large cluster

2014-10-23 Thread Jens Rantil
Hi,


While I am nowhere close to 50+ machines, I've been using Saltstack for both
configuration management and remote execution. It has worked great for me
and supposedly scales to 1000+ machines.




Cheers,

Jens



Re: Operating on large cluster

2014-10-23 Thread Ranjib Dey
We use Chef for configuration management and Blender for on-demand jobs:

https://github.com/opscode/chef
https://github.com/PagerDuty/blender


Re: Operating on large cluster

2014-10-23 Thread Michael Shuler

On 10/23/2014 04:18 AM, Alain RODRIGUEZ wrote:

> I was wondering about how do you guys handle a large cluster (50+ machines).


Configuration management tools are awesome, until they aren't. Having
used or played with all the popular ones, and having been bitten by
failures of those tools on large clusters, my long-time preference has
been to use a VCS to check configs and scripts in and out, plus parallel
ssh (whichever one you like). Simple is good. If you don't deeply
understand the config management system you have chosen, the unexpected
may (will?) eventually happen. To all the servers at once.


Even when we are careful, we are human. No tool can prevent *all*
mistakes. Test everything in a staging environment first!
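
One way to limit the "to all the servers at once" failure mode is to roll a change out in small batches and stop at the first failure. The sketch below is only an illustration of that idea, not something described in this thread: `rolling_apply`, its `canaries` and `batch_size` parameters, and the `action` callback (which would wrap whatever ssh or config push you actually use and return True on success) are all hypothetical names.

```python
def rolling_apply(hosts, action, batch_size=2, canaries=1):
    """Apply action(host) -> bool to hosts in small batches.

    The first `canaries` hosts are handled one at a time, the rest in
    groups of `batch_size`. The rollout stops after the first batch that
    contains a failure. Hosts inside a batch are processed sequentially
    here to keep the sketch short.

    Returns (done, failed, untouched) as three lists of hosts.
    """
    hosts = list(hosts)
    batches = [[h] for h in hosts[:canaries]]
    rest = hosts[canaries:]
    batches += [rest[i:i + batch_size] for i in range(0, len(rest), batch_size)]

    done, failed = [], []
    for index, batch in enumerate(batches):
        for host in batch:
            (done if action(host) else failed).append(host)
        if failed:
            # Everything in later batches was never touched.
            untouched = [h for b in batches[index + 1:] for h in b]
            return done, failed, untouched
    return done, failed, []
```

With a single canary, a bad config stops after one node instead of reaching the whole cluster, and the `untouched` list tells you exactly which nodes still run the old state.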


--
Kind regards,
Michael

PS. Even staging doesn't prevent fallibility... :)
https://twitter.com/mshuler/status/520667739615395840


Re: Operating on large cluster

2014-10-23 Thread Eric Plowe
I am a big fan of perl-ssh-tools (https://github.com/tobert/perl-ssh-tools)
for managing my nodes, and SVN for storing configs.

~Eric Plowe




Re: Operating on large cluster

2014-10-23 Thread Roni Balthazar
Hi,

We use Puppet to manage our Cassandra configuration. (http://puppetlabs.com)

You can use Cluster SSH to send commands to the servers as well.

Another good choice is Saltstack.

Regards,

Roni



Re: Operating on large cluster

2014-10-23 Thread Otis Gospodnetic
Hi Alain,

We use Puppet and are introducing Ansible at Sematext. Not for Cassandra,
but for other similar tech.

Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/

