Re: auto_bootstrap=false broken?
I think reading the relevant documentation might have helped: http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_add_dc_to_cluster_t.html

On Fri, Aug 7, 2015 at 9:04 AM, horschi hors...@gmail.com wrote:
> Hi Cyril,
>
> thanks for backing me up. I'm under siege from all sides here ;-)
>
>> That's something we're trying to do too. However, disabling client connections (closing the thrift and native ports) does not prevent other nodes (acting as coordinators) from sending it requests ... Honestly, we'd like to restart a node that needs to deploy HH and have it serve reads only when it's done. To be more precise, we know when it's done and don't need it to work by itself (automatically).
>
> If you use auto_bootstrap=false in the same DC, then I think you are screwed. Afaik auto_bootstrap=false can only work in a new DC, where you can control reads via the LOCAL_* consistency levels.
>
> kind regards,
> Christian
Re: auto_bootstrap=false broken?
Jeff,

That's something we're trying to do too. However, disabling client connections (closing the thrift and native ports) does not prevent other nodes (acting as coordinators) from sending it requests ... Honestly, we'd like to restart a node that needs to deploy HH and have it serve reads only when it's done. To be more precise, we know when it's done and don't need it to work by itself (automatically).

On Aug 6, 2015, at 17:49, Jeff Jirsa jeff.ji...@crowdstrike.com wrote:
> Don’t want to serve reads? Disable thrift and native proto, start the node with auto-bootstrap set to whatever you want but thrift and native proto disabled, then enable thrift and native proto again to enable reads from clients when ready. Until then, make sure you’re using a consistency level appropriate for your requirements.
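The workaround Jeff describes can be sketched as a small dry-run helper. This is only an illustration under assumptions: the function names are made up for this sketch, the `cassandra.yaml` keys (`start_rpc`, `start_native_transport`) and the `nodetool` subcommands (`enablethrift`, `enablebinary`) are the ones believed to apply to Cassandra 2.0, and nothing is executed. As Cyril points out, closing the client ports does not stop other nodes, acting as coordinators, from reading from this node.

```python
# Dry-run sketch: builds config lines and commands, executes nothing.
# Key names and nodetool subcommands are assumed for Cassandra 2.0;
# check them against the version you actually run.

STARTUP_OVERRIDES = {
    "start_rpc": False,               # keep the thrift port closed at startup
    "start_native_transport": False,  # keep the native (CQL) port closed at startup
}


def yaml_lines(overrides):
    """Render the overrides as cassandra.yaml style 'key: value' lines."""
    return [f"{key}: {str(value).lower()}" for key, value in overrides.items()]


def enable_client_commands(host="127.0.0.1"):
    """Return the nodetool calls that re-open the client ports on `host`."""
    return [
        ["nodetool", "-h", host, "enablethrift"],
        ["nodetool", "-h", host, "enablebinary"],
    ]


if __name__ == "__main__":
    for line in yaml_lines(STARTUP_OVERRIDES):
        print(line)
    for command in enable_client_commands():
        print(" ".join(command))
```

The operator would start the node with the two overrides in place and run the two commands once the node is considered ready.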
Re: auto_bootstrap=false broken?
Hi Jeff,

> You’re trying to force your view onto an established ecosystem.

It is not my intent to force anyone to do anything. I apologize if my title was too provocative. I just wanted to clickbait ;-)

> It’s not “wrong only because it's currently bootstrapping”, it’s not bootstrapping at all, you told it not to bootstrap.

Let me correct myself. It should be: it's wrong because it isn't bootstrapped. But that does not change what I am proposing: it still should not serve reads.

> ‘auto_bootstrap’ is the knob that tells cassandra whether or not you want to stream data from other replicas when you join the ring. Period. That’s all it does. If you set it to false, you’re telling cassandra it already has the data. The switch implies nothing else. There is no option to “join the ring but don’t serve reads until I tell you it’s ready”, and changing auto-bootstrap to be that is unlikely to ever happen.

I know that it does only that. But I would have made a different design decision (to not serve reads in such a state).

> Don’t want to serve reads? Disable thrift and native proto, start the node with auto-bootstrap set to whatever you want but thrift and native proto disabled, then enable thrift and native proto again to enable reads from clients when ready. Until then, make sure you’re using a consistency level appropriate for your requirements.

Of course it can be worked around. I just think it's error-prone to do that manually. That is why I was proposing a change.

> You’re mis-using a knob that doesn’t do what you think it does, and is unlikely to ever be changed to do what you think it should.

I wanted to change the definition of what auto_bootstrap=false is. I don't know if that makes it better or worse ;-)

I hope I did not consume too much of your time. Thanks for all the responses. I will experiment a bit with write_survey and see if it already does what I need.

kind regards,
Christian
Re: auto_bootstrap=false broken?
Hi Cyril,

thanks for backing me up. I'm under siege from all sides here ;-)

> That's something we're trying to do too. However, disabling client connections (closing the thrift and native ports) does not prevent other nodes (acting as coordinators) from sending it requests ... Honestly, we'd like to restart a node that needs to deploy HH and have it serve reads only when it's done. To be more precise, we know when it's done and don't need it to work by itself (automatically).

If you use auto_bootstrap=false in the same DC, then I think you are screwed. Afaik auto_bootstrap=false can only work in a new DC, where you can control reads via the LOCAL_* consistency levels.

kind regards,
Christian
Re: auto_bootstrap=false broken?
Hi Rob,

> Your asking the wrong nodes for data in the rebuild-a-new-DC case does not indicate a problem with the auto_bootstrap false + rebuild paradigm.

The node is wrong only because it's currently bootstrapping. So imho Cassandra should not serve any reads in such a case.

> What makes you think you should be using it in any case other than rebuild-new-DC or "I restored my node's data with tablesnap"?

These two cases are my example use-cases. I don't know if there are any other cases; perhaps there are.

> As the bug I pasted you indicates, if you want to repair a node before having it join, use join_ring=false, repair it, and then join. https://issues.apache.org/jira/browse/CASSANDRA-6961 In what way does this functionality not meet your needs?

I see two reasons why join_ring=false does not help with my issue (or I misunderstand):
- When I set up a new node and start it with join_ring=false, it does not get any tokens, right? (I tried it, and that was what I got.) How can I run nodetool repair when the node doesn't have any tokens?
- A node that is not joining the ring will not receive any writes, right? So if I run repair in an unjoined state for X hours, then I will miss X hours' worth of data afterwards.

The only solution I see seems to be write_survey. I will do some tests with it once 2.0.17 is out. I will post my results :-)

kind regards,
Christian
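For reference, the join_ring=false procedure Rob points to (start unjoined, repair, then join) can be written down as an ordered plan. This is a sketch, not a tested runbook: `join_ring_false_plan` is a hypothetical helper, it only returns the commands, and whether repair is useful on a node without tokens is exactly the open question raised in the message above.

```python
# Sketch of the "join_ring=false, repair, then join" sequence.
# Nothing is executed; the helper name is hypothetical.

def join_ring_false_plan(host="127.0.0.1"):
    """Return (step, argv) pairs in the order an operator would run them."""
    return [
        # Start the node without joining the ring.
        ("start", ["cassandra", "-Dcassandra.join_ring=false"]),
        # Repair while unjoined (the step questioned in the thread).
        ("repair", ["nodetool", "-h", host, "repair"]),
        # Join the ring once the operator is satisfied.
        ("join", ["nodetool", "-h", host, "join"]),
    ]


if __name__ == "__main__":
    for step, argv in join_ring_false_plan():
        print(f"{step}: {' '.join(argv)}")
```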
Re: auto_bootstrap=false broken?
You’re trying to force your view onto an established ecosystem.

It’s not “wrong only because it's currently bootstrapping”, it’s not bootstrapping at all, you told it not to bootstrap.

‘auto_bootstrap’ is the knob that tells cassandra whether or not you want to stream data from other replicas when you join the ring. Period. That’s all it does. If you set it to false, you’re telling cassandra it already has the data. The switch implies nothing else. There is no option to “join the ring but don’t serve reads until I tell you it’s ready”, and changing auto-bootstrap to be that is unlikely to ever happen.

Don’t want to serve reads? Disable thrift and native proto, start the node with auto-bootstrap set to whatever you want but thrift and native proto disabled, then enable thrift and native proto again to enable reads from clients when ready. Until then, make sure you’re using a consistency level appropriate for your requirements.

You’re mis-using a knob that doesn’t do what you think it does, and is unlikely to ever be changed to do what you think it should.

From: horschi
Reply-To: user@cassandra.apache.org
Date: Thursday, August 6, 2015 at 3:58 AM
To: user@cassandra.apache.org
Subject: Re: auto_bootstrap=false broken?

> Hi Rob,
>
>> Your asking the wrong nodes for data in the rebuild-a-new-DC case does not indicate a problem with the auto_bootstrap false + rebuild paradigm.
>
> The node is wrong only because it's currently bootstrapping. So imho Cassandra should not serve any reads in such a case.
>
>> What makes you think you should be using it in any case other than rebuild-new-DC or "I restored my node's data with tablesnap"?
>
> These two cases are my example use-cases. I don't know if there are any other cases; perhaps there are.
>
>> As the bug I pasted you indicates, if you want to repair a node before having it join, use join_ring=false, repair it, and then join. https://issues.apache.org/jira/browse/CASSANDRA-6961 In what way does this functionality not meet your needs?
>
> I see two reasons why join_ring=false does not help with my issue (or I misunderstand):
> - When I set up a new node and start it with join_ring=false, it does not get any tokens, right? (I tried it, and that was what I got.) How can I run nodetool repair when the node doesn't have any tokens?
> - A node that is not joining the ring will not receive any writes, right? So if I run repair in an unjoined state for X hours, then I will miss X hours' worth of data afterwards.
>
> The only solution I see seems to be write_survey. I will do some tests with it once 2.0.17 is out. I will post my results :-)
>
> kind regards,
> Christian
Re: auto_bootstrap=false broken?
Hi Rob,

let me try to give examples of why auto_bootstrap=false is dangerous:

Just yesterday we had an issue when we wanted to set up a new DC. Unfortunately, we had one application that used CL.ONE (because it only queries static data and is read-heavy). That application stopped working after we brought up the new DC, because it was querying the new nodes. We are now changing it to LOCAL_ONE; then it should be OK. But nevertheless, I think it would have been cleaner if the new nodes had not served reads in the first place. Instead, the operations people have to worry about the applications using the correct CL.

Another, more general, issue with auto_bootstrap=false: when adding a new node to an existing cluster, you are basically lowering your CL by one. RF=3 with QUORUM will read from two nodes. One might be the new node, which has no data. Then you are relying on a single node to be 100% consistent.

So what I am trying to say is: every time you use auto_bootstrap=false, you are entering a dangerous path. And I think this could be fixed if auto_bootstrap=false left the node in a write-only state. Then the operator could still decide to override it with nodetool.

Disclaimer: I am using C* 2.0.

kind regards,
Christian

On Tue, Aug 4, 2015 at 10:02 PM, Robert Coli rc...@eventbrite.com wrote:
> On Tue, Aug 4, 2015 at 11:40 AM, horschi hors...@gmail.com wrote:
>> unless you specify auto_bootstrap=false :)
>
> ... so why are you doing that? Two experts are confused as to what you're trying to do; why do you think you need to do it?
>
> =Rob
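The quorum arithmetic in Christian's second example can be checked with a few lines of Python. This is a simplified worst-case model, not Cassandra code: the helper names are invented here, and it assumes the coordinator may contact the empty replica before any populated one.

```python
# Worst-case model of a quorum read when some replicas hold no data.
# Helper names are hypothetical; this is arithmetic, not Cassandra internals.

def quorum(rf):
    """Number of replicas a QUORUM operation needs for replication factor rf."""
    return rf // 2 + 1


def worst_case_populated(rf, empty_replicas, contacted):
    """Fewest data-bearing replicas among `contacted` replicas, assuming the
    empty ones are contacted first."""
    populated = rf - empty_replicas
    return min(populated, max(0, contacted - empty_replicas))


if __name__ == "__main__":
    rf = 3
    reads = quorum(rf)
    print("replicas read at QUORUM:", reads)
    print("with one empty replica, populated replicas read (worst case):",
          worst_case_populated(rf, 1, reads))
```

With RF=3 a quorum read contacts two replicas; if one of the three is the empty new node, the worst case leaves a single populated replica answering, which is the situation described above.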
Re: auto_bootstrap=false broken?
On Wed, Aug 5, 2015 at 2:08 AM, horschi hors...@gmail.com wrote:
> So what I am trying to say is: every time you use auto_bootstrap=false, you are entering a dangerous path. And I think this could be fixed if auto_bootstrap=false left the node in a write-only state. Then the operator could still decide to override it with nodetool.

That's why you only use auto_bootstrap=false in cases where you have already restored the data on the node, or when you are rebuilding a new DC and know not to ask the new nodes at CL.ONE.

Your asking the wrong nodes for data in the rebuild-a-new-DC case does not indicate a problem with the auto_bootstrap false + rebuild paradigm.

What makes you think you should be using it in any case other than rebuild-new-DC or "I restored my node's data with tablesnap"?

As the bug I pasted you indicates, if you want to repair a node before having it join, use join_ring=false, repair it, and then join.

https://issues.apache.org/jira/browse/CASSANDRA-6961

In what way does this functionality not meet your needs?

I continue to believe you are trying to use auto_bootstrap:false in the wrong cases and that the solution to your problem is to stop doing that.

=Rob
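The CL.ONE versus LOCAL_ONE point can be illustrated with a toy model. It is a deliberate simplification: the `Replica` class and both functions are hypothetical, and real replica selection also depends on the snitch and on which node coordinates the request. The model only shows which replicas each consistency level allows to answer.

```python
# Toy model: which replicas may answer a read at ONE vs LOCAL_ONE.
# All names are illustrative; real replica selection is more involved.
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    name: str
    dc: str
    has_data: bool


def candidates(replicas, consistency, local_dc):
    """Replicas that are allowed to satisfy the read."""
    if consistency == "LOCAL_ONE":
        return [r for r in replicas if r.dc == local_dc]
    if consistency == "ONE":
        return list(replicas)
    raise ValueError(f"unsupported consistency level: {consistency}")


def may_hit_empty_replica(replicas, consistency, local_dc):
    """True if at least one eligible replica holds no data yet."""
    return any(not r.has_data for r in candidates(replicas, consistency, local_dc))


if __name__ == "__main__":
    ring = [
        Replica("old-1", "dc1", True),
        Replica("old-2", "dc1", True),
        Replica("new-1", "dc2", False),  # new DC, not rebuilt yet
    ]
    for cl in ("ONE", "LOCAL_ONE"):
        print(cl, may_hit_empty_replica(ring, cl, "dc1"))
```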
RE: auto_bootstrap=false broken?
I had problems with write_survey. I opened a bug: https://issues.apache.org/jira/browse/CASSANDRA-9934

From: horschi [mailto:hors...@gmail.com]
Sent: Tuesday, August 4, 2015 15:20
To: user@cassandra.apache.org
Subject: Re: auto_bootstrap=false broken?

> Hi Paulo,
>
> thanks for your feedback, but I think this is not what I am looking for.
>
> Starting with join_ring=false does not take any tokens in the ring. And the nodetool join afterwards will again do token selection and data loading in one step.
>
> I would like to separate these steps:
> 1. assign tokens
> 2. have the node in a joining state, so that I can copy in data
> 3. mark the node as ready
>
> I just saw that perhaps write_survey could be misused for that. Did anyone ever use write_survey for such a partial bootstrapping? Do I have to worry about data loss when using multiple write_survey nodes in one cluster?
>
> kind regards,
> Christian
Re: auto_bootstrap=false broken?
Hi Paulo,

thanks for your feedback, but I think this is not what I am looking for.

Starting with join_ring=false does not take any tokens in the ring. And the nodetool join afterwards will again do token selection and data loading in one step.

I would like to separate these steps:
1. assign tokens
2. have the node in a joining state, so that I can copy in data
3. mark the node as ready

I just saw that perhaps write_survey could be misused for that. Did anyone ever use write_survey for such a partial bootstrapping? Do I have to worry about data loss when using multiple write_survey nodes in one cluster?

kind regards,
Christian

On Tue, Aug 4, 2015 at 2:24 PM, Paulo Motta pauloricard...@gmail.com wrote:
> Hello Christian,
>
> You may use the start-up parameter -Dcassandra.join_ring=false if you don't want the node to join the ring on startup. More about this parameter here: http://docs.datastax.com/en/cassandra/2.0/cassandra/tools/toolsCUtility_t.html
>
> You can later join the ring via the nodetool join command: http://docs.datastax.com/en/cassandra/2.0/cassandra/tools/toolsJoin.html
>
> auto_bootstrap=false is typically used to bootstrap new datacenters or clusters, or nodes that already have data on them before starting the process.
>
> Cheers,
> Paulo
Re: auto_bootstrap=false broken?
Hello Christian,

You may use the start-up parameter -Dcassandra.join_ring=false if you don't want the node to join the ring on startup. More about this parameter here: http://docs.datastax.com/en/cassandra/2.0/cassandra/tools/toolsCUtility_t.html

You can later join the ring via the nodetool join command: http://docs.datastax.com/en/cassandra/2.0/cassandra/tools/toolsJoin.html

auto_bootstrap=false is typically used to bootstrap new datacenters or clusters, or nodes that already have data on them before starting the process.

Cheers,
Paulo

2015-08-04 8:50 GMT-03:00 horschi hors...@gmail.com:
> Hi everyone,
>
> I'll just ask my question as provocatively as possible ;-)
>
> Isn't auto_bootstrap=false broken the way it is currently implemented?
>
> What currently happens: a new node starts with auto_bootstrap=false and starts serving reads immediately without having any data.
>
> Would the following be more correct:
> - New node should stay in a joining state
> - Operator loads data (e.g. using nodetool rebuild or putting in backed-up files or whatever)
> - Operator has to manually switch from joining into normal state using nodetool (only then will it start serving reads)
>
> Wouldn't this behaviour be more consistent?
>
> kind regards,
> Christian
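The behaviour proposed in the original question can be modelled as a tiny state machine. To be clear, this describes the proposal only, not how Cassandra behaves: `ProposedNode`, its methods, and the two states are all hypothetical names chosen for the sketch.

```python
# Model of the PROPOSED behaviour (not actual Cassandra behaviour):
# the node accepts writes while joining, but refuses reads until an
# operator explicitly marks it ready. All names are hypothetical.
from enum import Enum


class NodeState(Enum):
    JOINING = "joining"
    NORMAL = "normal"


class ProposedNode:
    def __init__(self):
        self.state = NodeState.JOINING
        self.rows = {}

    def write(self, key, value):
        """Writes are accepted in both states."""
        self.rows[key] = value

    def load_data(self, rows):
        """Operator copies in existing data; newer live writes are kept."""
        for key, value in rows.items():
            self.rows.setdefault(key, value)

    def mark_ready(self):
        """Manual switch from joining to normal."""
        self.state = NodeState.NORMAL

    def read(self, key):
        if self.state is not NodeState.NORMAL:
            raise RuntimeError("node is still joining; reads are refused")
        return self.rows.get(key)


if __name__ == "__main__":
    node = ProposedNode()
    node.write("k", "live write")
    node.load_data({"k": "stale backup", "old": "restored"})
    node.mark_ready()
    print(node.read("k"), node.read("old"))
```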
Re: auto_bootstrap=false broken?
Hi Robert,

sorry for the confusion. Perhaps write_survey is not my solution (unfortunately I can't get it to work, so I don't really know). I just thought that it *could* be my solution.

What I actually want: I want to be able to start a new node without it starting to serve reads prematurely. I want Cassandra to wait for me to confirm "everything is OK, now serve reads".

Possible solutions so far:

A) When starting a new node with auto_bootstrap=false, I get a node that has no data but serves reads. In my opinion it would be cleaner if it stayed in a joining state where it only receives writes.

B) Disabling join_ring on my new node does not help. The new node will not have a token. I can't see it in nodetool status. Therefore I assume it will not receive any writes.

C) write_survey unfortunately does not seem to work for me: my new node, which I start in survey mode, gets writes from other nodes and shows as joining in the ring. Which is good! But it does not get a schema, so it throws exceptions when receiving these writes. I assume it's just a bug in 2.0.

Disclaimer: I am using C* 2.0, with which I can't get the desired behaviour (or at least I don't know how).

kind regards,
Christian

On Tue, Aug 4, 2015 at 7:12 PM, Robert Coli rc...@eventbrite.com wrote:
> On Tue, Aug 4, 2015 at 6:19 AM, horschi hors...@gmail.com wrote:
>> I would like to separate these steps:
>> 1. assign tokens
>> 2. have the node in a joining state, so that I can copy in data
>> 3. mark the node as ready
>>
>> Did anyone ever use write_survey for such a partial bootstrapping?
>
> What you're asking doesn't make sense to me. What does partial bootstrap mean? Where are you getting the data from? How are you copying in data and why do you need the node to be in a joining state to do that?
>
> https://issues.apache.org/jira/browse/CASSANDRA-6961 explains a method by which you can repair a partially joined node. In what way does this differ from what you want?
>
> =Rob
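Options A to C can be summarised as data and filtered for the behaviour Christian is after (receives writes, serves no reads). The table records only what is reported or assumed in this thread for C* 2.0, not verified behaviour; the class, field, and function names are hypothetical.

```python
# Summary of the three options as reported or assumed in this thread
# (C* 2.0). These are observations, not verified facts; names are
# hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class StartupOption:
    name: str
    in_ring: bool
    receives_writes: bool
    serves_reads: bool
    note: str


OBSERVED = [
    StartupOption("auto_bootstrap=false", True, True, True,
                  "no data streamed, reads served immediately"),
    StartupOption("join_ring=false", False, False, False,
                  "no token, not in nodetool status, writes assumed absent"),
    StartupOption("write_survey", True, True, False,
                  "shows as joining; schema was missing in the reported test"),
]


def write_only_options(options):
    """Options that take writes without serving reads."""
    return [o.name for o in options if o.receives_writes and not o.serves_reads]


if __name__ == "__main__":
    print(write_only_options(OBSERVED))
```

On these reported observations only survey mode has the wanted shape, which matches the conclusion drawn in the thread, subject to the schema problem described under C.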
Re: auto_bootstrap=false broken?
You're trying to solve a problem that doesn't exist. Cassandra only starts serving reads when it's ready.

On Tue, Aug 4, 2015 at 10:51 AM horschi hors...@gmail.com wrote:
> Hi Robert,
>
> sorry for the confusion. Perhaps write_survey is not my solution (unfortunately I can't get it to work, so I don't really know). I just thought that it *could* be my solution.
>
> What I actually want: I want to be able to start a new node without it starting to serve reads prematurely. I want Cassandra to wait for me to confirm "everything is OK, now serve reads".
>
> Possible solutions so far:
>
> A) When starting a new node with auto_bootstrap=false, I get a node that has no data but serves reads. In my opinion it would be cleaner if it stayed in a joining state where it only receives writes.
>
> B) Disabling join_ring on my new node does not help. The new node will not have a token. I can't see it in nodetool status. Therefore I assume it will not receive any writes.
>
> C) write_survey unfortunately does not seem to work for me: my new node, which I start in survey mode, gets writes from other nodes and shows as joining in the ring. Which is good! But it does not get a schema, so it throws exceptions when receiving these writes. I assume it's just a bug in 2.0.
>
> Disclaimer: I am using C* 2.0, with which I can't get the desired behaviour (or at least I don't know how).
>
> kind regards,
> Christian
Re: auto_bootstrap=false broken?
On Tue, Aug 4, 2015 at 6:19 AM, horschi hors...@gmail.com wrote:
> I would like to separate these steps:
> 1. assign tokens
> 2. have the node in a joining state, so that I can copy in data
> 3. mark the node as ready
>
> Did anyone ever use write_survey for such a partial bootstrapping?

What you're asking doesn't make sense to me. What does partial bootstrap mean? Where are you getting the data from? How are you copying in data and why do you need the node to be in a joining state to do that?

https://issues.apache.org/jira/browse/CASSANDRA-6961 explains a method by which you can repair a partially joined node. In what way does this differ from what you want?

=Rob
Re: auto_bootstrap=false broken?
Hi Aeljami,

thanks for the ticket. I'll keep an eye on it.

I can't get the survey to work at all on 2.0 (I am not getting any schema on the survey node). So I guess the survey is not going to be a solution for now.

kind regards,
Christian

On Tue, Aug 4, 2015 at 3:29 PM, aeljami@orange.com wrote:
> I had problems with write_survey. I opened a bug: https://issues.apache.org/jira/browse/CASSANDRA-9934
Re: auto_bootstrap=false broken?
Hi Jonathan,

unless you specify auto_bootstrap=false :)

kind regards,
Christian

On Tue, Aug 4, 2015 at 7:54 PM, Jonathan Haddad j...@jonhaddad.com wrote:
> You're trying to solve a problem that doesn't exist. Cassandra only starts serving reads when it's ready.
Re: auto_bootstrap=false broken?
On Tue, Aug 4, 2015 at 11:40 AM, horschi hors...@gmail.com wrote:
> unless you specify auto_bootstrap=false :)

... so why are you doing that? Two experts are confused as to what you're trying to do; why do you think you need to do it?

=Rob