Re: Cassandra backup and restore procedures
Oh interesting, it uses IP for hinted handoff? That brings up another interesting question: if the node that went down never comes up again, how long will the hinted handoff keep going? Indefinitely?

On the first topic of backup/restore, you suggested copying the sstable files over from the two neighboring nodes. But what if the file names you copy over from those nodes are identical? Will that cause a problem? Can we rename the sstable files when we copy them over?

On Wed, Nov 18, 2009 at 11:03 AM, Jonathan Ellis jbel...@gmail.com wrote:
> Tokens can change, so IP is used for node identification, e.g. for
> hinted handoff.
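For what it's worth, here is a hypothetical sketch of the filename collision described above, and one way to avoid overwrites by renumbering on copy. It assumes sstables follow the `<ColumnFamily>-<generation>-<Component>.db` naming of this era, and that Cassandra picks up the renamed generations on restart; the Standard1 column family and the local stand-in directories are invented for illustration, where a real restore would fetch the files from the neighbor nodes.

```shell
#!/bin/sh
# Sketch: merge sstables from two source nodes into one data directory,
# renumbering the generation on collision. Assumes the
# <ColumnFamily>-<generation>-<Component>.db naming convention.
set -e

dest=$(mktemp -d)                      # stand-in for the new node's data dir
src1=$(mktemp -d); src2=$(mktemp -d)   # stand-ins for the two neighbors

# Simulate two neighbors whose sstable generations collide.
touch "$src1/Standard1-1-Data.db" "$src1/Standard1-1-Index.db"
touch "$src2/Standard1-1-Data.db" "$src2/Standard1-1-Index.db"

# next_gen: highest generation already present in $dest for a CF, plus one
next_gen() {
  cf=$1
  max=0
  for f in "$dest/$cf"-*-Data.db; do
    [ -e "$f" ] || continue
    g=$(basename "$f" | sed "s/^$cf-\([0-9]*\)-Data\.db$/\1/")
    [ "$g" -gt "$max" ] && max=$g
  done
  echo $((max + 1))
}

for src in "$src1" "$src2"; do
  gen=$(next_gen Standard1)
  for f in "$src"/Standard1-*; do
    comp=$(basename "$f" | sed 's/^Standard1-[0-9]*-//')   # e.g. Data.db
    cp "$f" "$dest/Standard1-$gen-$comp"
  done
done

ls "$dest"   # two distinct generations, nothing overwritten
```

The key point of the renumbering is that all components of one source sstable (Data, Index, Filter) move together under a single new generation number, so they still match up.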
Re: Cassandra backup and restore procedures
So, is there any way to recover if you can't guarantee the same IP address? We are running on EC2 (as I'm sure are others on the list), and there is no way to make this guarantee there. Is this sort of recoverability on the roadmap anywhere?

Thanks,
-Anthony

On Wed, Nov 18, 2009 at 01:50:20PM -0600, Jonathan Ellis wrote:
> No, bootstrap is currently only for adding new nodes, not replacing
> dead ones.

--
Anthony Molinaro antho...@alumni.caltech.edu
Re: Cassandra backup and restore procedures
Currently, no. Feel free to open a ticket. It would be fairly easy to make the decommission code in trunk handle this.

On Thu, Nov 19, 2009 at 1:13 PM, Anthony Molinaro antho...@alumni.caltech.edu wrote:
> So, is there any way to recover if you can't guarantee the same IP
> address? We are running on EC2 (as I'm sure are others on the list),
> and there is no way to make this guarantee. Is this sort of
> recoverability on the roadmap anywhere?
RE: Cassandra backup and restore procedures
I'm not going to be on Amazon, but I'm planning to use hostnames instead of IPs and a dynamically generated /etc/hosts file, and I think that would deal with this problem. I'm sure a private DNS server would be just as good. My real motive in saying this is so someone will scream at me if I'm wrong and save me the time of exploring a bad solution. :-)

Tim Freeman
Email: tim.free...@hp.com
Desk in Palo Alto: (650) 857-2581
Home: (408) 774-1298
Cell: (408) 348-7536 (No reception business hours Monday, Tuesday, and Thursday; call my desk instead.)

-----Original Message-----
From: Anthony Molinaro [mailto:antho...@alumni.caltech.edu]
Sent: Thursday, November 19, 2009 11:14 AM
To: cassandra-user@incubator.apache.org
Subject: Re: Cassandra backup and restore procedures

So, is there any way to recover if you can't guarantee the same IP address? Since we are running on EC2 (as I'm sure are others on the list), and there is no way to make this guarantee. Is this sort of recoverability on the roadmap anywhere?

--
Anthony Molinaro antho...@alumni.caltech.edu
Re: Cassandra backup and restore procedures
On Thu, Nov 19, 2009 at 1:18 PM, Freeman, Tim tim.free...@hp.com wrote:
> I'm not going to be on Amazon, but I'm planning to use hostnames
> instead of IPs and a dynamically generated /etc/hosts file, and I
> think that would deal with this problem. I'm sure a private DNS
> server would be just as good.

This is exactly what I do, and it has worked great for me.

-Brandon
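A minimal sketch of the dynamically generated /etc/hosts idea, assuming the name-to-IP inventory is available as a simple two-column list (on EC2 it might instead come from instance metadata or tags). The node names cass1..cass3, the addresses, and the inventory file are all invented for illustration, and the script writes to a temp file rather than /etc/hosts.

```shell
#!/bin/sh
# Sketch: regenerate a hosts file from the current name -> IP mapping,
# so each Cassandra node keeps a stable name even when its IP changes.
set -e

hosts_out=$(mktemp)   # stand-in for /etc/hosts

# Invented name/IP inventory; a real script would pull this from
# whatever inventory you maintain.
cat > /tmp/cluster.$$ <<'EOF'
cass1 10.0.0.11
cass2 10.0.0.12
cass3 10.0.0.37
EOF

{
  printf '127.0.0.1 localhost\n'
  while read -r name ip; do
    printf '%s %s\n' "$ip" "$name"
  done < /tmp/cluster.$$
} > "$hosts_out"

cat "$hosts_out"
rm -f /tmp/cluster.$$
```

Rerunning this whenever an instance is replaced keeps the hostname stable while the IP behind it changes, which is the property the identity discussion above depends on.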
Re: Cassandra backup and restore procedures
Done, https://issues.apache.org/jira/browse/CASSANDRA-564

On Thu, Nov 19, 2009 at 01:17:16PM -0600, Jonathan Ellis wrote:
> Currently, no. Feel free to open a ticket. It would be fairly easy to
> make the decommission code in trunk handle this.

--
Anthony Molinaro antho...@alumni.caltech.edu
Cassandra backup and restore procedures
Hello Everyone,

Is there a recommended backup/restore procedure to be able to recover a failed node?

How does Cassandra keep track of a node's identity? Should a replacement node keep the same IP address/DNS name as the original node?

Does a node still receive data while a nodeprobe snapshot command runs?

Thanks,
Jon
Re: Cassandra backup and restore procedures
On Wed, Nov 18, 2009 at 12:05 PM, Jon Graham sjclou...@gmail.com wrote:
> Is there a recommended backup/restore procedure to be able to recover
> a failed node?

Until tickets 193 and 520 are done, the easiest thing is to copy all the sstables from the other nodes that have replicas for the ranges it is responsible for (e.g. for a replication factor of 3 on the rack-unaware partitioner, the node before it and the node after it on the ring would suffice), and then run nodeprobe cleanup to clear out the excess.

> How does Cassandra keep track of a node's identity?

It stores it in the system table.

> Should a replacement node keep the same IP address/DNS name as the
> original node?

Yes.

> Does a node still receive data while a nodeprobe snapshot command runs?

Yes.

-Jonathan
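A rough sketch of that restore procedure, for a dead node with replication factor 3 on the rack-unaware partitioner: pull the sstables from the node before and the node after it on the ring, then run nodeprobe cleanup. The Standard1 files and the local stand-in directories are invented; a real restore would fetch the files with scp or rsync from the live neighbors, and the hostname replacement-node is made up, so the cleanup command is echoed here rather than executed.

```shell
#!/bin/sh
# Sketch: restore a dead node's sstables from its two ring neighbors,
# then clean out the ranges the node is not responsible for.
set -e

data_dir=$(mktemp -d)          # stand-in for the new node's data directory

# Stand-ins for sstables fetched from the neighbor before and after
# the dead node on the ring.
before=$(mktemp -d); after=$(mktemp -d)
touch "$before/Standard1-3-Data.db" "$after/Standard1-7-Data.db"

for src in "$before" "$after"; do
  cp "$src"/*-Data.db "$data_dir"/
done

# Once the replacement is up with the dead node's token, clear out the
# excess. (Echoed, not run: there is no live cluster in this sketch.)
echo nodeprobe -host replacement-node cleanup
```

The cleanup step matters because the neighbors each hold ranges the replacement node is not responsible for; without it the copied excess stays on disk.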
Re: Cassandra backup and restore procedures
Hey Jonathan, why should a replacement node keep the same IP address/DNS name as the original node? Wouldn't having the same token as the node that went down be sufficient (provided that you did the steps above of copying the data from the two neighboring nodes)?

On Wed, Nov 18, 2009 at 10:30 AM, Jonathan Ellis jbel...@gmail.com wrote:
> Should a replacement node keep the same IP address/DNS name as the
> original node?
> Yes.
Re: Cassandra backup and restore procedures
Tokens can change, so IP is used for node identification, e.g. for hinted handoff.

On Wed, Nov 18, 2009 at 1:00 PM, Ramzi Rabah rra...@playdom.com wrote:
> Hey Jonathan, why should a replacement node keep the same IP
> address/DNS name as the original node? Wouldn't having the same token
> as the node that went down be sufficient (provided that you did the
> steps above of copying the data from the 2 neighboring nodes)?
Re: Cassandra backup and restore procedures
Hello Jonathan,

Is the system table information contained in the system/Location* files? How do I know which nodes hold the replicated copies when using order-preserving partitioning? Are replicas always stored on neighboring nodes? Are the left (before) and right (after) nodes you mentioned determined by position in the cluster ring?

Thanks,
Jon

On Wed, Nov 18, 2009 at 10:30 AM, Jonathan Ellis jbel...@gmail.com wrote:
> Until tickets 193 and 520 are done, the easiest thing is to copy all
> the sstables from the other nodes that have replicas for the ranges
> it is responsible for, and then run nodeprobe cleanup to clear out
> the excess.
Re: Cassandra backup and restore procedures
I'm sorry if this was covered before, but if you lose a node and cannot bring it (or a replacement) back with the same IP address or DNS name, is your only option to restart the entire cluster? E.g. if I have nodes 1, 2, and 3 with replication factor 3, and then I lose node 3, is it possible to bring up a new node 3 with a new IP (and a Seed of either node 1 or node 2) and bootstrap it?

Thanks,
Simon

On Wed, Nov 18, 2009 at 2:03 PM, Jonathan Ellis jbel...@gmail.com wrote:
> Tokens can change, so IP is used for node identification, e.g. for
> hinted handoff.
Re: Cassandra backup and restore procedures
No, bootstrap is currently only for adding new nodes, not replacing dead ones.

On Wed, Nov 18, 2009 at 1:47 PM, Simon Smith simongsm...@gmail.com wrote:
> If you lose a node and cannot bring it (or a replacement) back with
> the same IP address or DNS name, is your only option to restart the
> entire cluster? E.g. if I have nodes 1, 2, and 3 with replication
> factor 3, and then I lose node 3, is it possible to bring up a new
> node 3 with a new IP (and a Seed of either node 1 or node 2) and
> bootstrap it?