Re: (unofficial) Community Poll for Production Operators : Repair
On Fri, May 10, 2013 at 11:24 AM, Robert Coli wrote:
> I have been wondering how Repair is actually used by operators. If
> people operating Cassandra in production could answer the following
> questions, I would greatly appreciate it.

https://issues.apache.org/jira/browse/CASSANDRA-5850

Filed, based in part on feedback from this thread. Thanks to all participants! :D

=Rob
Re: (unofficial) Community Poll for Production Operators : Repair
I have indeed seen some of those in the past. But my point is not so much to understand how I can get different counts depending on the node (I consider that a weakness of counters and I am aware of it); what puzzles me is why those inconsistent, divergent counters never converge, even after a repair. Your last comment on this JIRA summarizes our problem quite well. I hope the committers will figure something out.

2013/5/16 Janne Jalkanen:
> Might you be experiencing this?
> https://issues.apache.org/jira/browse/CASSANDRA-4417
Re: (unofficial) Community Poll for Production Operators : Repair
Might you be experiencing this?
https://issues.apache.org/jira/browse/CASSANDRA-4417

/Janne

On May 16, 2013, at 14:49, Alain RODRIGUEZ wrote:
> Are counters supposed to be "repaired" too? While reading at CL.ONE I can
> get different values depending on which node answers, even after a read
> repair or a full repair. Shouldn't a repair fix these discrepancies?
Re: (unofficial) Community Poll for Production Operators : Repair
@Rob: Thanks for the feedback.

I still see one odd, unexplained behavior around repair. Are counters supposed to be "repaired" too? While reading at CL.ONE I can get different values depending on which node answers, even after a read repair or a full repair. Shouldn't a repair fix these discrepancies?

The only way I have found to always get the same count is to read the data at CL.QUORUM, but that is a workaround, since the data itself remains wrong on some nodes.

Any clue on it?

Alain

2013/5/15 Edward Capriolo:
> http://basho.com/introducing-riak-1-3/
> Riak 1.3 introduces automatic, self-healing properties that repair
> entropy on an ongoing basis.
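To make the symptom concrete, here is a sketch of reproducing it from a recent cqlsh (one that supports the CONSISTENCY session command); the keyspace, table, and key are made up for illustration:

    $ cqlsh
    cqlsh> CONSISTENCY ONE;
    cqlsh> SELECT hits FROM ks.page_counters WHERE page_id = 42;
    -- may return a different value on each run, depending on which replica answers
    cqlsh> CONSISTENCY QUORUM;
    cqlsh> SELECT hits FROM ks.page_counters WHERE page_id = 42;
    -- stable answer, but the stale replicas still hold a wrong value on disk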
Re: (unofficial) Community Poll for Production Operators : Repair
http://basho.com/introducing-riak-1-3/

Introduced Active Anti-Entropy. Riak now has active anti-entropy. In distributed systems, inconsistencies can arise between replicas due to failure modes, concurrent updates, and physical data loss or corruption. Pre-1.3 Riak already had several features for repairing this “entropy”, but they all required some form of user intervention. Riak 1.3 introduces automatic, self-healing properties that repair entropy on an ongoing basis.
Re: (unofficial) Community Poll for Production Operators : Repair
On Wed, May 15, 2013 at 1:27 AM, Alain RODRIGUEZ wrote:
> Rob, I was wondering something. Are you a committer working on improving
> repair, or something similar?

I am not a committer [1], but I have an active interest in potential improvements to the best practices for repair. The specific change I am considering is a modification to the default gc_grace_seconds value, which seems picked out of a hat at 10 days. My view is that the current implementation of repair has such negative performance consequences that holding onto tombstones for longer than 10 days could not possibly be as bad as the fixed cost of running repair once every 10 days. I believe this value is too low for a default (it also does not map cleanly onto the work week!) and should likely be increased to 14, 21 or 28 days.

> Anyway, if a committer (or any other expert) could give us some feedback
> on our comments (Are we doing well or not? Are the things we observe
> normal or unexplained? What is going to be improved about repair in the
> future?)

1) You are doing things according to best practice.
2) Unfortunately, your experience of significantly degraded performance, including a blocked go-live due to repair bloat, is pretty typical.
3) The things you are experiencing are part of the current implementation of repair and are also typical; however, I do not believe they are fully "explained" [2].
4) As has been mentioned further down the thread, there are discussions regarding (and some already-committed) improvements to both the current repair paradigm and an evolution to a new paradigm.

Thanks to all for the responses so far, please keep them coming! :D

=Rob
[1] Hence the (unofficial) tag for this thread. I do have minor patches accepted to the codebase, but always merged by an actual committer. :)
[2] driftx@#cassandra feels that these things are explained/understood by the core team, and points to https://issues.apache.org/jira/browse/CASSANDRA-5280 as a useful approach to minimize same.
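For concreteness, a sketch of what raising the grace period to 21 days would look like on a 1.x-era cluster; the keyspace and column family names here are hypothetical:

    # 21 days = 21 * 86400 = 1814400 seconds
    echo "use my_keyspace; update column family my_cf with gc_grace = 1814400;" \
        | cassandra-cli -h localhost

    # On CQL3 tables, the equivalent cqlsh statement would be:
    #   ALTER TABLE my_keyspace.my_cf WITH gc_grace_seconds = 1814400;

The trade-off under discussion: every replica must still complete a repair at least once per grace period, because tombstones collected after it expires can let deleted data reappear.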
Re: (unofficial) Community Poll for Production Operators : Repair
I have actually tested repair in many interesting scenarios. Once I joined a node and forgot auto_bootstrap, so the data looked like this in the ring:

left node: 8 GB
new node: 0 GB
right node: 8 GB

After repair:

left node: 10 GB
new node: 13 GB
right node: 12 GB

We do not run repair at all. It is better than in the 0.6 and 0.7 days, but a missed delete does not mean much to us, and the difference between an 8 GB SSTable and a 12 GB one could make a major performance difference for us, since we do thousands of reads/sec.

On Wed, May 15, 2013 at 5:37 AM, André Cruz wrote:
> It works as we expect, it has some impact on performance, but it takes a
> long time. We used to run daily repairs, but they started overlapping.
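A sketch of avoiding that scenario, assuming the stock cassandra.yaml option and nodetool commands:

    # In cassandra.yaml on the joining node, before its first start:
    #   auto_bootstrap: true

    # If repair has already over-streamed, reclaim the space held by data
    # outside each node's token ranges once the ring has settled:
    nodetool -h <host> cleanup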
Re: (unofficial) Community Poll for Production Operators : Repair
On May 10, 2013, at 7:24 PM, Robert Coli wrote:

> 1) What version of Cassandra do you run, on what hardware?

1.1.5 - 6 nodes, 32GB RAM, 300GB data per node, 900GB 10k RAID1, Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz.

> 2) What consistency level do you write at? Do you do DELETEs?

QUORUM. Yes, we do deletes.

> 3) Do you run a regularly scheduled repair?

Yes.

> 4) If you answered "yes" to 3, what is the frequency of the repair?

Every 2 days.

> 5) What has been your subjective experience with the performance of
> repair? (Does it work as you would expect? Does its overhead have a
> significant impact on the performance of your cluster?)

It works as we expect and it has some impact on performance, but it takes a long time. We used to run daily repairs, but they started overlapping. The reason for the frequent repairs is that we do a lot of deletes, so we lowered gc_grace_seconds; otherwise the dataset would grow too large.

André
Re: (unofficial) Community Poll for Production Operators : Repair
Hi Alain, have you had a look at the following tickets?

CASSANDRA-4905 - Repair should exclude gcable tombstones from merkle-tree computation
CASSANDRA-4932 - Agree on a gcbefore/expirebefore value for all replica during validation compaction
CASSANDRA-4917 - Optimize tombstone creation for ExpiringColumns
CASSANDRA-5398 - Remove localTimestamp from merkle-tree calculation (for tombstones)

IMHO these should reduce the over-repair to some degree, especially when using TTLs. Some of them are already fixed in 1.2; the rest will (hopefully) follow :-)

cheers,
Christian

On Wed, May 15, 2013 at 10:27 AM, Alain RODRIGUEZ wrote:
> Anyway, if a committer (or any other expert) could give us some feedback
> on our comments...
Re: (unofficial) Community Poll for Production Operators : Repair
Rob, I was wondering something. Are you a committer working on improving repair, or something similar?

Anyway, I would appreciate it if a committer (or any other expert) could give us some feedback on our comments: are we doing well or not, are the things we observe normal or unexplained, and what is going to be improved about repair in the future? I am always interested in hearing how things work and whether or not I am doing well.

Alain

2013/5/14 Wei Zhu:
> It definitely has a negative impact on performance. [...] We had to halt
> a go-live for another cluster due to the unanticipated "doubling" of
> space during the repair.
Re: (unofficial) Community Poll for Production Operators : Repair
1) 1.1.6 on 5 nodes, 24 CPUs, 72 GB RAM.
2) LOCAL_QUORUM (we only have one DC though). We only delete through TTL.
3) Yes.
4) Once a week, rolling repairs with -pr, via a cron job.
5) It definitely has a negative impact on performance. Our data size is around 100 GB per node, and during repair a node brings in an additional 60-80 GB of data and creates about 7K compactions (we use LCS with an SSTable size of 10 MB, which was a mistake we made at the beginning). It takes more than a day for the compaction tasks to clear, and by then the next compaction starts. We had to set a client-side (Hector) timeout to deal with it, and the SLA is still under control for now. But we had to halt a go-live for another cluster due to the unanticipated "doubling" of space during the repair.

Regarding Dean's question about simulating a slow response: someone on IRC mentioned the trick of starting Cassandra with -f and then pressing Ctrl-Z, and it works for our test.

-Wei
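For anyone who wants to try that trick, a minimal sketch; the only assumption is a node you can afford to freeze:

    # Run Cassandra in the foreground on the target node:
    cassandra -f

    # Press Ctrl-Z in that terminal (SIGTSTP): the JVM is suspended, so the
    # node stops responding without the process dying or its ports closing.

    # Resume the node when the test is done:
    fg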
RE: (unofficial) Community Poll for Production Operators : Repair
> 1) What version of Cassandra do you run, on what hardware?

1.0.12 (an upgrade to 1.2.x is planned). Blade servers with 1x6 CPU cores with HT (12 vcores, upgradable to 2x CPUs), 96GB RAM (an upgrade to 128GB is planned; 256GB max), 1x300GB 15k SAS HDD for data and 1x300GB 10k SAS HDD for commit log/system.

> 2) What consistency level do you write at? Do you do DELETEs?

Write/delete failover policy (where needed): try QUORUM, then ONE, finally ANY.

> 3) Do you run a regularly scheduled repair?

No, read repair is enough (where needed).

> 4) If you answered "yes" to 3, what is the frequency of the repair?

If we did it, we would do it once a day.

> 5) What has been your subjective experience with the performance of
> repair? (Does it work as you would expect? Does its overhead have a
> significant impact on the performance of your cluster?)

For our use case it has too significant an impact on the performance of the cluster, without delivering real value.

Best regards / Pagarbiai,
Viktor Jevdokimov, Senior Developer, Adform
Re: (unofficial) Community Poll for Production Operators : Repair
We had to roll out a fix in Cassandra because, for some reason, a slow node was slowing down our Cassandra clients in 1.2.2. Every time we had a slow node, we found out fast, as performance degraded. We tested this in QA and saw the same issue: a repair made that node slow, which made our clients slow. With this fix, which I think someone on our team is going to try to get back into Cassandra, a slow node no longer affects our clients.

I am curious, though: if someone else were to use the "tc" program to simulate Linux packet delay on a single node, would your client's response time get much slower? We simulated a 500ms delay on one node to simulate the slow node... it seems the coordinator node was incorrectly waiting for BOTH responses at CL_QUORUM instead of just one (as it was itself one of them), or something like that. (I don't know too much, as my colleague was the one who debugged this issue.)

Dean
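For anyone wanting to reproduce Dean's experiment, a sketch using tc's netem qdisc; the interface name is an assumption, and it must run as root on the node being slowed:

    # Add 500ms of delay to all egress traffic on eth0:
    tc qdisc add dev eth0 root netem delay 500ms

    # ... drive client load and observe coordinator latencies ...

    # Remove the artificial delay afterwards:
    tc qdisc del dev eth0 root netem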
Re: (unofficial) Community Poll for Production Operators : Repair
Hi Rob,

1) 1.2.2 on 6 to 12 EC2 m1.xlarge.
2) QUORUM reads and writes. Almost no deletes (just some TTLs).
3) Yes.
4) On each node once a week (rolling repairs using crontab).
5) The only behavior that is quite odd or unexplained to me is why a repair doesn't fix a counter mismatch between 2 nodes. When I read my counters at CL.ONE I see inconsistency (the counter value may change any time I read it, depending, I guess, on which node I read from). Reading at CL.QUORUM works around this bug, but the data is still wrong on some nodes.

About performance: it's quite expensive to run a repair, but doing it in a low-traffic period and in a rolling fashion works quite well and has no impact on the service.

Hope this helps somehow. Let me know if you need more information.

Alain
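As an illustration of the rolling schedule Alain and Wei describe, a hypothetical /etc/cron.d entry; each node would get a different day-of-week so that only one node repairs at a time:

    # Node 3 of the ring: repair its primary ranges every Wednesday at 02:00.
    # (cron.d format: minute hour dom month dow user command)
    0 2 * * 3 cassandra nodetool repair -pr >> /var/log/cassandra/repair.log 2>&1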
(unofficial) Community Poll for Production Operators : Repair
Hi!

I have been wondering how Repair is actually used by operators. If people operating Cassandra in production could answer the following questions, I would greatly appreciate it.

1) What version of Cassandra do you run, on what hardware?
2) What consistency level do you write at? Do you do DELETEs?
3) Do you run a regularly scheduled repair?
4) If you answered "yes" to 3, what is the frequency of the repair?
5) What has been your subjective experience with the performance of repair? (Does it work as you would expect? Does its overhead have a significant impact on the performance of your cluster?)

Thanks!

=Rob