Operating on large cluster
Hi,

I was wondering how you handle a large cluster (50+ machines).

Sometimes you need to change the configuration (cassandra.yaml) or send a command to one, some, or all nodes (cleanup, upgradesstables, setstreamthroughput, or whatever).

So far we have been using custom scripts for repairs and other routine maintenance, and cssh for specific, one-shot actions on the cluster. I guess this doesn't really scale, though; we could use pssh instead. For configuration changes we use Capistrano, which might scale properly.

So I would like to know: what methods do operators use on large clusters out there? Have any of you built open-source "cluster management" interfaces or scripts that make it easier to operate large Cassandra clusters?

Alain
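To make the "send a command to one, some, or all nodes" case concrete, here is a minimal Python sketch of what a pssh-style tool does: run one command on many hosts over ssh, with a cap on how many run at once. It is only an illustration, not anyone's production tooling. The function names, the `max_parallel` default, and the example host names are made up for this sketch, and it assumes key-based ssh access to every node.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ssh_runner(host, command):
    """Run `command` on `host` over ssh; return (exit code, combined output)."""
    proc = subprocess.run(
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        ["ssh", "-o", "BatchMode=yes", host, command],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout + proc.stderr


def run_on_nodes(hosts, command, max_parallel=5, runner=ssh_runner):
    """Run `command` on every host in the list, at most `max_parallel` at a time.

    Returns a dict mapping host -> (exit code, output), in the order given.
    `runner` is injectable so the fan-out logic can be tried without ssh.
    """
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        results = pool.map(lambda host: runner(host, command), hosts)
        return dict(zip(hosts, results))


def failed_hosts(results):
    """Return the sorted list of hosts whose command exited non-zero."""
    return sorted(host for host, (code, _) in results.items() if code != 0)
```

Usage would look something like `results = run_on_nodes(["cass-01.example.com", "cass-02.example.com"], "nodetool cleanup", max_parallel=2)` followed by `failed_hosts(results)`. Keeping `max_parallel` small matters for heavy operations such as cleanup or upgradesstables, where running on every node at once could hurt the whole cluster.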
Re: Operating on large cluster
Hi,

While I am nowhere close to 50+ machines, I've been using Saltstack for both configuration management and remote execution. It has worked great for me and supposedly scales to 1000+ machines.

Cheers,
Jens

On Thu, Oct 23, 2014 at 11:18 AM, Alain RODRIGUEZ wrote:
> I was wondering how you handle a large cluster (50+ machines).
Re: Operating on large cluster
We use Chef for configuration management and Blender for on-demand jobs:

https://github.com/opscode/chef
https://github.com/PagerDuty/blender
Re: Operating on large cluster
On 10/23/2014 04:18 AM, Alain RODRIGUEZ wrote:
> I was wondering how you handle a large cluster (50+ machines).

Configuration management tools are awesome, until they aren't. Having used or played with all the popular ones, and having been bitten by failures of those tools on large clusters, my long-time preference has been using a VCS to check configs and scripts in/out and parallel ssh (whichever one you like). Simple is good. If you don't deeply understand the config management system you have chosen, the unexpected may (will?) eventually happen. To all the servers at once.

Even when we are careful, we are human. No tool can prevent *all* mistakes. Test everything in a staging environment first!

--
Kind regards,
Michael

PS. Even staging doesn't prevent fallibility. :)
https://twitter.com/mshuler/status/520667739615395840
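One way to limit the "to all the servers at once" failure mode, whatever tool pushes the change, is to roll it out to a single node first and then in small batches, stopping as soon as a node looks unhealthy. The Python sketch below is only an illustration of that idea and is not Michael's workflow. `apply` and `healthy` are hypothetical callables you would supply (for example, an scp plus restart, and a check of `nodetool status`), and the batch size is an arbitrary default.

```python
def batches(hosts, size):
    """Split hosts into consecutive groups of at most `size`."""
    return [hosts[i:i + size] for i in range(0, len(hosts), size)]


def rolling_apply(hosts, apply, healthy, batch_size=5):
    """Apply a change to one canary host, then to the rest in batches.

    `apply(host)` pushes the change; `healthy(host)` returns True when the
    node looks fine afterwards. The rollout stops after the first batch that
    contains an unhealthy host.

    Returns (touched, unhealthy): every host the change was applied to, and
    the unhealthy hosts from the batch that stopped the rollout (empty list
    if the rollout completed).
    """
    touched = []
    if not hosts:
        return touched, []
    # The first group is a single canary node; the rest go in batches.
    groups = [hosts[:1]] + batches(hosts[1:], batch_size)
    for group in groups:
        for host in group:
            apply(host)
            touched.append(host)
        unhealthy = [host for host in group if not healthy(host)]
        if unhealthy:
            return touched, unhealthy
    return touched, []
```

With seven nodes and `batch_size=3`, a bad change that breaks the third node would touch at most four nodes instead of all seven. It does not replace testing in staging; it only bounds the damage when staging missed something.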
Re: Operating on large cluster
I am a big fan of perl-ssh-tools (https://github.com/tobert/perl-ssh-tools) for managing my nodes, and SVN for storing configs.

~Eric Plowe

On Thu, Oct 23, 2014 at 3:07 PM, Michael Shuler wrote:
> Configuration management tools are awesome, until they aren't. Having used or played with all the popular ones, and having been bitten by failures of those tools on large clusters, my long-time preference has been using a VCS to check configs and scripts in/out and parallel ssh (whichever one you like). Simple is good.
Re: Operating on large cluster
Hi,

We use Puppet (http://puppetlabs.com) to manage our Cassandra configuration. You can also use Cluster SSH to send commands to the servers. Another good choice is Saltstack.

Regards,
Roni
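For readers who have not used a configuration management tool, the core idea for a file like cassandra.yaml is roughly: render each node's file from one shared template plus per-node values, and only rewrite the file when the content actually changes. The Python sketch below illustrates that idea only; it is not Puppet's API or how Puppet is implemented. The template is a tiny fragment (a real cassandra.yaml has many more settings), and the function names and example values are made up.

```python
from pathlib import Path
from string import Template

# A deliberately tiny fragment of cassandra.yaml, for illustration only.
TEMPLATE = Template(
    "cluster_name: '$cluster_name'\n"
    "listen_address: $listen_address\n"
    "rpc_address: $rpc_address\n"
)


def render_config(shared, node):
    """Merge cluster-wide and per-node values into the template.

    Per-node values override shared ones. A missing value raises KeyError,
    so an incomplete node definition fails loudly instead of silently.
    """
    return TEMPLATE.substitute({**shared, **node})


def write_if_changed(path, content):
    """Write `content` to `path` only if it differs; return True when written."""
    path = Path(path)
    if path.exists() and path.read_text() == content:
        return False
    path.write_text(content)
    return True
```

The return value of `write_if_changed` is what lets a tool decide whether a node needs a restart at all: if nothing changed, nothing should be restarted.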
Re: Operating on large cluster
Hi Alain,

We use Puppet and are introducing Ansible at Sematext. Not for Cassandra, but for other similar tech.

Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/