Re: Downtime-Limit for a node in Network-Topology-Replication-Cluster?
On Wed, Oct 28, 2015 at 2:20 PM, Vasileios Vlachos <vasileiosvlac...@gmail.com> wrote:

> Thanks for clarifying this Rob.
>
>> However no step other than 2) provides a *guarantee* of consistency. And
>> it only provides that guarantee for data that exists when the repair starts.
>
> I read the above as "consistency is a strong term..." :) But that's
> understandable.

Yes, the Coli Conjecture is that if your app is suited for a distributed database, consistency probably matters to you less than you think it does.

The foundation for this conjecture is that there are many circumstances and modes of operation in which distributed database operators have unexpectedly lost consistency, and almost none of them (or their customers) noticed.

=Rob
Re: Downtime-Limit for a node in Network-Topology-Replication-Cluster?
Rob,

Would you mind elaborating further on this? I am a little concerned that my understanding (that nodetool repair is *not* the only way one can achieve consistency) is not correct. I understand that if people use CL < QUORUM, nodetool repair is the only way to go, but I just cannot see how that can be the only way irrespective of everything else.

Thanks in advance for your input!

On Sat, Oct 24, 2015 at 10:02 PM, Vasileios Vlachos <vasileiosvlac...@gmail.com> wrote:

>> All other means of repair are optimizations which require a certain
>> amount of luck to happen to result in consistency.
>
> Is that true regardless of the CL one uses? So, for example if writing
> QUORUM and reading QUORUM, wouldn't an increased read_repair_chance
> probability be sufficient? If not, is there a case where nodetool repair
> wouldn't be required (given consistency is a requirement)?
>
> Thanks
Re: Downtime-Limit for a node in Network-Topology-Replication-Cluster?
On Sat, Oct 24, 2015 at 2:02 PM, Vasileios Vlachos <vasileiosvlac...@gmail.com> wrote:

>> All other means of repair are optimizations which require a certain
>> amount of luck to happen to result in consistency.
>
> Is that true regardless of the CL one uses? So, for example if writing
> QUORUM and reading QUORUM, wouldn't an increased read_repair_chance
> probability be sufficient? If not, is there a case where nodetool repair
> wouldn't be required (given consistency is a requirement)?

The only thing which guarantees consistency[1] is repair.

It's likely true that you have a system with the property of consistency if the following conditions are met:

1) you read and write with QUORUM or ALL
2) you repair periodically
3) you have not dropped any mutations or had a crashed node since you last repaired
4) you have not discarded any hints which happened to be stored for whatever reason
5) you have not failed to store any hints due to hint backpressure

However, no step other than 2) provides a *guarantee* of consistency. And it only provides that guarantee for data that exists when the repair starts.

Relatedly, no step other than 2) *guarantees* that all data is repaired within gc_grace_seconds, which is essential to consistency.

=Rob

[1] Durability and consistency are commingled in Cassandra; you are more likely to fully achieve the former than the latter in the typical RF=3 case.
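Condition 1) in the list above rests on read and write replica sets overlapping. Here is a minimal sketch of that arithmetic, assuming the usual quorum size of floor(RF / 2) + 1. The function names are hypothetical and this is an illustration, not Cassandra code:

```python
def quorum(replication_factor: int) -> int:
    """Replicas needed for QUORUM, assuming the usual floor(RF / 2) + 1 rule."""
    return replication_factor // 2 + 1


def replicas_overlap(read_replicas: int, write_replicas: int,
                     replication_factor: int) -> bool:
    """True if any read set must share at least one replica with any write set."""
    return read_replicas + write_replicas > replication_factor


if __name__ == "__main__":
    rf = 3  # the "typical RF=3 case" from the footnote
    q = quorum(rf)
    print("QUORUM size:", q)
    print("QUORUM/QUORUM overlaps:", replicas_overlap(q, q, rf))
    print("ONE/ONE overlaps:", replicas_overlap(1, 1, rf))
```

Overlap only says that a QUORUM read contacts at least one replica that acknowledged a QUORUM write. As the message points out, conditions 3) to 5) can still fail, so this arithmetic is not by itself a guarantee of consistency.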
Re: Downtime-Limit for a node in Network-Topology-Replication-Cluster?
I am not sure I fully understand the question, because nodetool repair is one of the three ways for Cassandra to ensure consistency. If by "affect" you mean "make your data consistent and ensure all replicas are up-to-date", then yes, that's what I think it does.

And yes, I would expect nodetool repair (especially depending on the options appended to it) to have a performance impact, but how big that impact is going to be depends on many things.

We currently perform no scheduled repairs because of our workload and the consistency level that we use. So, as you can understand, I am certainly not the best person to analyse that bit...

Regards,
Vasilis

On Sat, Oct 24, 2015 at 5:09 PM, Ajay Garg wrote:

> Thanks a ton Vasileios !!
>
> Just one last question ::
> Does running "nodetool repair" affect the functionality of cluster for
> current-live data?
>
> It's ok if the insertions/deletions of current-live data become a little
> slow during the process, but data-consistency must be maintained. If that
> is the case, I think we are good.
>
> Thanks and Regards,
> Ajay
Re: Downtime-Limit for a node in Network-Topology-Replication-Cluster?
Max hint window is only part of the equation. If a node is down longer than the max hint window, a repair will still fix up the node for you. The maximum time a node can be down before it must be rebuilt is determined by the lowest gc grace setting on your various tables. By default gc grace is 10 days, but it is configurable on a per-table basis. If a node is down longer than gc grace, you risk reviving deleted data. This of course depends on your tables and data mutation practices, but it is a good rule to follow in general.

Clint

On Oct 24, 2015 6:44 AM, "Vasileios Vlachos" wrote:

> Hello Ajay,
>
> Have a look in the *max_hint_window_in_ms* :
>
> http://docs.datastax.com/en/cassandra/2.0/cassandra/configuration/configCassandra_yaml_r.html
>
> My understanding is that if a node remains down for more than
> *max_hint_window_in_ms*, then you will need to repair that node.
>
> Thanks,
> Vasilis
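The rule of thumb in the message above can be sketched as a small decision function. This is only an illustration: `recovery_action` and its return strings are hypothetical names, and the two constants assume the stock defaults (a 3-hour hint window and a 10-day gc grace), which may differ on a given cluster:

```python
from datetime import timedelta

# Assumed stock defaults; check cassandra.yaml and the table definitions
# for the values actually in effect.
DEFAULT_MAX_HINT_WINDOW = timedelta(hours=3)  # max_hint_window_in_ms = 10800000
DEFAULT_GC_GRACE = timedelta(days=10)         # gc_grace_seconds = 864000


def recovery_action(downtime, gc_grace_per_table,
                    max_hint_window=DEFAULT_MAX_HINT_WINDOW):
    """Suggest how to bring a node back, following the thread's rule of thumb."""
    lowest_gc_grace = min(gc_grace_per_table)
    if downtime > lowest_gc_grace:
        # Deleted data may be revived, so the node should be rebuilt.
        return "rebuild"
    if downtime > max_hint_window:
        # Hints were no longer stored for the whole outage; repair is needed.
        return "repair"
    # Hints may cover the outage, but they are an optimization, not a guarantee.
    return "hints"


if __name__ == "__main__":
    tables = [DEFAULT_GC_GRACE, timedelta(days=2)]  # hypothetical per-table settings
    for down in (timedelta(hours=1), timedelta(days=1), timedelta(days=5)):
        print(down, "->", recovery_action(down, tables))
```

The lowest gc grace across all tables is used because, as the message says, that is the setting that bounds how long a node can safely stay down.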
Re: Downtime-Limit for a node in Network-Topology-Replication-Cluster?
Never mind Vasileios, you have been a great help !!

Thanks a ton again !!!

Thanks and Regards,
Ajay

On Sat, Oct 24, 2015 at 10:17 PM, Vasileios Vlachos <vasileiosvlac...@gmail.com> wrote:

> I am not sure I fully understand the question, because nodetool repair is
> one of the three ways for Cassandra to ensure consistency. If by "affect"
> you mean "make your data consistent and ensure all replicas are
> up-to-date", then yes, that's what I think it does.
>
> And yes, I would expect nodetool repair (especially depending on the
> options appended to it) to have a performance impact, but how big that
> impact is going to be depends on many things.
>
> We currently perform no scheduled repairs because of our workload and the
> consistency level that we use. So, as you can understand I am certainly not
> the best person to analyse that bit...
>
> Regards,
> Vasilis
Re: Downtime-Limit for a node in Network-Topology-Replication-Cluster?
Thanks a ton Vasileios !!

Just one last question ::
Does running "nodetool repair" affect the functionality of the cluster for current-live data?

It's ok if the insertions/deletions of current-live data become a little slow during the process, but data-consistency must be maintained. If that is the case, I think we are good.

Thanks and Regards,
Ajay

On Sat, Oct 24, 2015 at 6:03 PM, Vasileios Vlachos <vasileiosvlac...@gmail.com> wrote:

> Hello Ajay,
>
> Here is a good link:
>
> http://docs.datastax.com/en/cassandra/2.1/cassandra/operations/opsRepairNodesManualRepair.html
>
> Generally, I find the DataStax docs to be OK. You could consult them for
> all usual operations etc. Ofc there are occasions where a given concept is
> not as clear, but you can always ask this list for clarification.
>
> If you find that something is wrong in the docs just email them (more info
> and contact email here: http://docs.datastax.com/en/ ).
>
> Regards,
> Vasilis
Re: Downtime-Limit for a node in Network-Topology-Replication-Cluster?
On Sat, Oct 24, 2015 at 9:47 AM, Vasileios Vlachos <vasileiosvlac...@gmail.com> wrote:

> I am not sure I fully understand the question, because nodetool repair is
> one of the three ways for Cassandra to ensure consistency. If by "affect"
> you mean "make your data consistent and ensure all replicas are
> up-to-date", then yes, that's what I think it does.

nodetool repair is the *only* way that Cassandra *ensures* consistency.

All other means of repair are optimizations which require a certain amount of luck to happen to result in consistency.

=Rob
Re: Downtime-Limit for a node in Network-Topology-Replication-Cluster?
> All other means of repair are optimizations which require a certain amount
> of luck to happen to result in consistency.

Is that true regardless of the CL one uses? So, for example, if writing QUORUM and reading QUORUM, wouldn't an increased read_repair_chance probability be sufficient? If not, is there a case where nodetool repair wouldn't be required (given consistency is a requirement)?

Thanks
Re: Downtime-Limit for a node in Network-Topology-Replication-Cluster?
Hello Ajay,

Here is a good link:

http://docs.datastax.com/en/cassandra/2.1/cassandra/operations/opsRepairNodesManualRepair.html

Generally, I find the DataStax docs to be OK. You could consult them for all usual operations etc. Of course there are occasions where a given concept is not as clear, but you can always ask this list for clarification.

If you find that something is wrong in the docs, just email them (more info and contact email here: http://docs.datastax.com/en/ ).

Regards,
Vasilis

On Sat, Oct 24, 2015 at 1:04 PM, Ajay Garg wrote:

> Thanks Vasileios for the reply !!!
> That makes sense !!!
>
> I will be grateful if you could point me to the node-repair command for
> Cassandra-2.1.10.
> I don't want to get stuck in a wrong-versioned documentation (already
> bitten once hard when setting up replication).
>
> Thanks again...
>
> Thanks and Regards,
> Ajay
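For the manual repair command asked about above, here is a hedged sketch of how one might wrap `nodetool repair` from a script. The helper names and the keyspace `my_keyspace` are hypothetical; only `nodetool repair` and its `-pr` (primary range) option are taken from the tool itself, and the linked documentation should be checked for the options valid on a given version:

```python
import shutil
import subprocess


def build_repair_command(keyspace=None, primary_range_only=False):
    """Build the argv for a manual repair on the local node; runs nothing."""
    cmd = ["nodetool", "repair"]
    if primary_range_only:
        # -pr repairs only the token ranges this node is primary for.
        cmd.append("-pr")
    if keyspace:
        cmd.append(keyspace)
    return cmd


def run_repair(keyspace=None, primary_range_only=False):
    """Run the repair if nodetool is on PATH, raising if it is missing or fails."""
    cmd = build_repair_command(keyspace, primary_range_only)
    if shutil.which(cmd[0]) is None:
        raise RuntimeError("nodetool not found on PATH")
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Only prints the command; call run_repair() on a real node to execute it.
    print(" ".join(build_repair_command("my_keyspace")))
```

Building the command separately from running it keeps the sketch safe to try on a machine without Cassandra installed. With `-pr`, the command covers only part of the data on each run, so it is normally issued on every node in turn.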
Downtime-Limit for a node in Network-Topology-Replication-Cluster?
If a node in the cluster goes down and comes up, the data gets synced up on this downed node.
Is there a limit on the interval for which the node can remain down? Or will the data be synced up even if the node remains down for weeks/months/years?

--
Regards,
Ajay
Re: Downtime-Limit for a node in Network-Topology-Replication-Cluster?
Thanks Vasileios for the reply !!!
That makes sense !!!

I will be grateful if you could point me to the node-repair command for Cassandra-2.1.10.
I don't want to get stuck in wrong-versioned documentation (already bitten once hard when setting up replication).

Thanks again...

Thanks and Regards,
Ajay

On Sat, Oct 24, 2015 at 4:14 PM, Vasileios Vlachos <vasileiosvlac...@gmail.com> wrote:

> Hello Ajay,
>
> Have a look in the *max_hint_window_in_ms* :
>
> http://docs.datastax.com/en/cassandra/2.0/cassandra/configuration/configCassandra_yaml_r.html
>
> My understanding is that if a node remains down for more than
> *max_hint_window_in_ms*, then you will need to repair that node.
>
> Thanks,
> Vasilis
Re: Downtime-Limit for a node in Network-Topology-Replication-Cluster?
Hello Ajay,

Have a look at *max_hint_window_in_ms*:

http://docs.datastax.com/en/cassandra/2.0/cassandra/configuration/configCassandra_yaml_r.html

My understanding is that if a node remains down for more than *max_hint_window_in_ms*, then you will need to repair that node.

Thanks,
Vasilis

On Sat, Oct 24, 2015 at 7:48 AM, Ajay Garg wrote:

> If a node in the cluster goes down and comes up, the data gets synced up
> on this downed node.
> Is there a limit on the interval for which the node can remain down? Or
> the data will be synced up even if the node remains down for
> weeks/months/years?
>
> --
> Regards,
> Ajay