Re: Automating nodetool repair
Staggering the repairs also gives the DynamicSnitch a chance to route around nodes which may be running slow.

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 29/08/2012, at 11:19 AM, Omid Aladini omidalad...@gmail.com wrote:
>>> Secondly, what's the need for sleep 120?
>>
>> Just to give the cluster a chance to settle down between repairs... there's no real need for it, it's just there because.
>
> Actually, repair could cause unreplicated data to be streamed and new sstables to be created. New sstables could cause pending compactions and increase the potential number of sstables a row could be spread across. Therefore you might need more disk seeks to read a row and see slower read response times. If the read response time is critical, it's a good idea to wait for pending compactions to settle before repairing other neighbouring ranges that overlap replicas.
>
> -- Omid
>
>> --
>> Aaron Turner
>> http://synfin.net/ Twitter: @synfinatic
>> http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows
>> Those who would give up essential Liberty, to purchase a little temporary
>> Safety, deserve neither Liberty nor Safety.
>> -- Benjamin Franklin
>> carpe diem quam minimum credula postero
Re: Automating nodetool repair
You can consider adding -pr when iterating through all your hosts like this. -pr means primary range, and will do less duplicated work.

On Mon, Aug 27, 2012 at 8:05 PM, Aaron Turner synfina...@gmail.com wrote:
> I use cron. On one box I just do:
>
>     for n in node1 node2 node3 node4 ; do
>         nodetool -h $n repair
>         sleep 120
>     done
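A minimal sketch of the quoted loop with -pr added, for illustration only. The function name `repair_primary_ranges` and the `NODETOOL` and `SETTLE_SECONDS` variables are hypothetical conveniences, not something from the thread; the host names are the placeholders used in the original loop.

```shell
#!/bin/sh
# Sketch only. NODETOOL and SETTLE_SECONDS are hypothetical knobs so the
# loop can be tried without touching a real cluster.
NODETOOL="${NODETOOL:-nodetool}"
SETTLE_SECONDS="${SETTLE_SECONDS:-120}"

# Run a primary-range repair on each host given as an argument, one at a time.
repair_primary_ranges() {
    for n in "$@"; do
        # -pr limits the repair to the node's primary range, so running it
        # on every node covers the ring without repeating work.
        "$NODETOOL" -h "$n" repair -pr || echo "repair failed on $n" >&2
        sleep "$SETTLE_SECONDS"
    done
}

# Example: repair_primary_ranges node1 node2 node3 node4
```

Because each node only repairs its primary range here, the loop has to visit every node in the cluster for the whole ring to be covered.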
Re: Automating nodetool repair
Funny you mention that... I was just hearing on #cassandra this morning that nodetool repair repairs the replica set by default. I was thinking of repairing every 3rd node (RF=3), but running -pr seems cleaner.

Do you know if this (repairing a replica vs node) was introduced in 1.0 or 1.1?

On Tue, Aug 28, 2012 at 7:03 AM, Edward Capriolo edlinuxg...@gmail.com wrote:
> You can consider adding -pr when iterating through all your hosts like this. -pr means primary range, and will do less duplicated work.

--
Aaron Turner
http://synfin.net/ Twitter: @synfinatic
Re: Automating nodetool repair
Is there any reason why Cassandra doesn't run nodetool repair out of the box at some fixed interval?

On Tue, Aug 28, 2012 at 9:08 PM, Aaron Turner synfina...@gmail.com wrote:
> Funny you mention that... I was just hearing on #cassandra this morning that nodetool repair repairs the replica set by default. I was thinking of repairing every 3rd node (RF=3), but running -pr seems cleaner.
>
> Do you know if this (repairing a replica vs node) was introduced in 1.0 or 1.1?
Re: Automating nodetool repair
Thanks, a very nice approach.

If every nodetool repair uses -pr, does that satisfy the requirement to run a repair before GCGraceSeconds expires? In other words, will we get a correct result using -pr everywhere?

Secondly, what's the need for sleep 120?

Cheers,
Edward

On 12-08-28 07:03 AM, Edward Capriolo wrote:
> You can consider adding -pr when iterating through all your hosts like this. -pr means primary range, and will do less duplicated work.

--
Edward Sargisson
senior java developer
Global Relay
edward.sargis...@globalrelay.net
Re: Automating nodetool repair
On Tue, Aug 28, 2012 at 1:42 PM, Edward Sargisson edward.sargis...@globalrelay.net wrote:
> Thanks, a very nice approach.
>
> If every nodetool repair uses -pr, does that satisfy the requirement to run a repair before GCGraceSeconds expires? In other words, will we get a correct result using -pr everywhere?

Yep.

> Secondly, what's the need for sleep 120?

Just to give the cluster a chance to settle down between repairs... there's no real need for it, it's just there because.

--
Aaron Turner
http://synfin.net/ Twitter: @synfinatic
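As a rough illustration of the GCGraceSeconds constraint discussed above, the check below reports whether the last completed repair cycle is already older than the grace period. It is a sketch under assumptions: the function name `repair_overdue` and the idea of recording the last cycle as an epoch timestamp are hypothetical, and 864000 seconds (10 days) is Cassandra's default gc_grace_seconds, which may differ per table.

```shell
#!/bin/sh
# Sketch only. 864000 s (10 days) is the Cassandra default for
# gc_grace_seconds; set GC_GRACE_SECONDS to match your own schema.
GC_GRACE_SECONDS="${GC_GRACE_SECONDS:-864000}"

# Succeeds (exit 0) if the last completed repair cycle is at least
# gc_grace_seconds old, i.e. the next cycle is overdue.
# Usage: repair_overdue LAST_CYCLE_EPOCH [NOW_EPOCH]
repair_overdue() {
    last=$1
    # Hypothetical convention: timestamps are seconds since the epoch.
    now=${2:-$(date +%s)}
    [ $(( $now - $last )) -ge "$GC_GRACE_SECONDS" ]
}

# Example: repair_overdue 1346112000 && echo "repair cycle overdue" >&2
```

In practice one would probably alert well before the limit, since a full pass over every node takes time to finish.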
Re: Automating nodetool repair
>> Secondly, what's the need for sleep 120?
>
> Just to give the cluster a chance to settle down between repairs... there's no real need for it, it's just there because.

Actually, repair could cause unreplicated data to be streamed and new sstables to be created. New sstables could cause pending compactions and increase the potential number of sstables a row could be spread across. Therefore you might need more disk seeks to read a row and see slower read response times. If the read response time is critical, it's a good idea to wait for pending compactions to settle before repairing other neighbouring ranges that overlap replicas.

-- Omid
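One way to wait for pending compactions to settle, rather than sleeping for a fixed time, is to poll `nodetool compactionstats`. The sketch below assumes that command prints a line of the form `pending tasks: N`; the function names and the `NODETOOL` and `POLL_SECONDS` variables are hypothetical.

```shell
#!/bin/sh
# Sketch only. Assumes `nodetool compactionstats` prints a line such as
# "pending tasks: 3"; check the output of your Cassandra version first.
NODETOOL="${NODETOOL:-nodetool}"
POLL_SECONDS="${POLL_SECONDS:-30}"

# Print the pending compaction count reported by the given host.
pending_compactions() {
    "$NODETOOL" -h "$1" compactionstats |
        awk -F': *' '/^pending tasks/ { print $2; exit }'
}

# Block until the host reports zero pending compactions.
# Returns 1 if the count cannot be read, instead of looping forever.
wait_for_compactions() {
    while :; do
        pending=$(pending_compactions "$1")
        case "$pending" in
            ''|*[!0-9]*)
                echo "could not read pending compactions on $1" >&2
                return 1
                ;;
        esac
        if [ "$pending" -eq 0 ]; then
            return 0
        fi
        sleep "$POLL_SECONDS"
    done
}

# Example: nodetool -h node1 repair -pr && wait_for_compactions node1
```

Since a repair streams data to the other replicas of a range too, a stricter version would probably poll those neighbouring nodes as well before moving on.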
Automating nodetool repair
Hi all,

So nodetool repair has to be run regularly on all nodes. Does anybody have any interesting strategies or tools for doing this, or is everybody just setting up cron to do it?

For example, one could write some Puppet code to splay the cron times around so that only one should be running at once. Or, perhaps, a central orchestrator that is given some known quiet time and works its way through the list, running nodetool repair one at a time (using RPC?) until it runs out of time.

Cheers,
Edward

--
Edward Sargisson
senior java developer
Global Relay
edward.sargis...@globalrelay.net
866.484.6630
New York | Chicago | Vancouver | London (+44.0800.032.9829) | Singapore (+65.3158.1301)

Global Relay Archive supports email, instant messaging, BlackBerry, Bloomberg, Thomson Reuters, Pivot, YellowJacket, LinkedIn, Twitter, Facebook and more.

Ask about Global Relay Message (http://www.globalrelay.com/services/message) - The Future of Collaboration in the Financial Services World

All email sent to or from this address will be retained by Global Relay's email archiving system. This message is intended only for the use of the individual or entity to which it is addressed, and may contain information that is privileged, confidential, and exempt from disclosure under applicable law. Global Relay will not be liable for any compliance or technical information provided herein. All trademarks are the property of their respective owners.
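The "central orchestrator" idea above could be sketched as a loop that stops starting new repairs once a time budget for the quiet period is used up. This is only an illustration: `repair_within_budget` and `NODETOOL` are hypothetical names, and it relies on `date +%s`, which is widely available but not strictly POSIX.

```shell
#!/bin/sh
# Sketch only. NODETOOL is a hypothetical override for dry runs.
NODETOOL="${NODETOOL:-nodetool}"

# Repair hosts one at a time until the time budget (in seconds) runs out.
# A repair that has already started is not interrupted at the deadline;
# the budget only decides whether the next one may begin.
# Usage: repair_within_budget BUDGET_SECONDS host...
repair_within_budget() {
    budget=$1
    shift
    deadline=$(( $(date +%s) + $budget ))
    while [ "$#" -gt 0 ]; do
        if [ "$(date +%s)" -ge "$deadline" ]; then
            # Report what is left so a later run can pick it up.
            echo "out of time, not repaired: $*" >&2
            return 1
        fi
        "$NODETOOL" -h "$1" repair || echo "repair failed on $1" >&2
        shift
    done
}

# Example (four-hour quiet window):
#   repair_within_budget 14400 node1 node2 node3 node4
```

Running this from a single box also avoids the overlap problem that splayed per-node crontabs have to manage.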
Re: Automating nodetool repair
I use cron. On one box I just do:

    for n in node1 node2 node3 node4 ; do
        nodetool -h $n repair
        sleep 120
    done

A lot easier than managing a bunch of individual crontabs IMHO, although I suppose I could have done it with Puppet, but then you always have to keep an eye out that your repairs don't overlap over time.

On Mon, Aug 27, 2012 at 4:52 PM, Edward Sargisson edward.sargis...@globalrelay.net wrote:
> So nodetool repair has to be run regularly on all nodes. Does anybody have any interesting strategies or tools for doing this, or is everybody just setting up cron to do it?

--
Aaron Turner
http://synfin.net/ Twitter: @synfinatic
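If the loop runs from cron, a slow pass could still be going when the next one is due. A minimal sketch of one way to guard against that, using `mkdir` as an atomic lock; `run_exclusive`, `repair_all` and `LOCK_DIR` are hypothetical names, not part of the original script.

```shell
#!/bin/sh
# Sketch only. LOCK_DIR is a hypothetical path; mkdir either creates it
# or fails, which makes it usable as a simple lock.
LOCK_DIR="${LOCK_DIR:-/tmp/nodetool-repair.lock}"

# Run the given command unless a previous run still holds the lock.
# Note: if the script is killed, the lock directory has to be removed by hand.
run_exclusive() {
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "previous repair run still active, skipping" >&2
        return 1
    fi
    status=0
    "$@" || status=$?
    rmdir "$LOCK_DIR"
    return "$status"
}

# The loop from the message above, wrapped in a function.
repair_all() {
    for n in "$@"; do
        nodetool -h "$n" repair
        sleep 120
    done
}

# Example, as the command a cron job would run:
#   run_exclusive repair_all node1 node2 node3 node4
```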