Re: Upgrade to a different version?
I'm at a crossroads right now. We built an application around .7 and the features in .7, so going back to .6 wasn't an option for us. We are now in the middle of setting up dual MySQL and Cassandra support so that we can fall back to MySQL if Cassandra can't handle the workload properly. It's a stupid amount of extra work, but I think it's unavoidable for us given the state of things with .7. It also lets us see the true benefit of Cassandra over MySQL in our particular application and make a decision from there.

Paul

On 3/16/2011 9:03 PM, Joshua Partogi wrote: So did you downgrade back to the 0.6.x series?
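The dual-backend setup described in this message could look roughly like the sketch below. It is only an illustration under stated assumptions: `DictStore`, `FallbackStore`, and `BackendUnavailable` are hypothetical names, and the in-memory stores stand in for real Cassandra and MySQL clients, which the thread does not describe.

```python
class BackendUnavailable(Exception):
    """Raised by a store when it cannot serve a request (hypothetical)."""


class DictStore:
    """In-memory stand-in for a real Cassandra or MySQL client (hypothetical)."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.rows = {}

    def put(self, key, value):
        if not self.healthy:
            raise BackendUnavailable(self.name)
        self.rows[key] = value

    def get(self, key):
        if not self.healthy:
            raise BackendUnavailable(self.name)
        return self.rows.get(key)


class FallbackStore:
    """Writes to both stores; reads from the primary, then the fallback."""

    def __init__(self, primary, fallback):
        self.primary = primary
        self.fallback = fallback

    def put(self, key, value):
        # Write the fallback first so it always holds a complete copy.
        self.fallback.put(key, value)
        try:
            self.primary.put(key, value)
        except BackendUnavailable:
            pass  # a real system would log this and repair the primary later

    def get(self, key):
        try:
            return self.primary.get(key)
        except BackendUnavailable:
            return self.fallback.get(key)
```

Writing to both stores is what makes the extra work "stupid" in practice: every write path is doubled, but it also allows a side-by-side comparison of the two backends on the same data.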
Re: Upgrade to a different version?
If it helps you to sleep better, we use Cassandra (0.7.2 with the flush fix) in production on 100 servers.

Thibaut
Re: Upgrade to a different version?
On 3/17/2011 1:06 PM, Thibaut Britz wrote: If it helps you to sleep better, we use cassandra (0.7.2 with the flush fix) in production on 100 servers. Thibaut

Thanks Thibaut, believe it or not, it does. :)

Is your use case a typical web app or something like a scientific/data mining app? I ask because I'm wondering how you have managed to deal with the stop-the-world garbage collection issues that seem to hit most clusters under significant load and cause application timeouts. Have you found that Cassandra scales reasonably well in read/write capacity as you add nodes?

You may also want to backport at least these fixes:

* reduce memory use during streaming of multiple sstables (CASSANDRA-2301)
* update memtable_throughput to be a long (CASSANDRA-2158)
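The title of CASSANDRA-2158 suggests that a value was widened from a 32-bit int to a 64-bit long. The ticket details are not quoted in this thread, so the sketch below only illustrates the general failure mode, assuming a size given in MB is converted to bytes using 32-bit signed arithmetic. The helper names `to_int32` and `mb_to_bytes_int32` are hypothetical and not part of Cassandra.

```python
def to_int32(n):
    """Wrap an integer the way 32-bit signed (two's complement) arithmetic would."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n


def mb_to_bytes_int32(mb):
    """Convert MB to bytes, simulating a 32-bit signed result (assumed scenario)."""
    return to_int32(mb * 1024 * 1024)


def mb_to_bytes_long(mb):
    """Same conversion with enough width; Python ints do not overflow."""
    return mb * 1024 * 1024


for mb in (2047, 2048, 4096):
    print(mb, mb_to_bytes_int32(mb), mb_to_bytes_long(mb))
```

Anything at or above 2048 MB no longer fits in a signed 32-bit value, which would be consistent with problems appearing only above 2 gigs.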
Re: Upgrade to a different version?
As for the version, we will wait a few more days, and if nothing really bad shows up, move to 0.7.4.
Re: Upgrade to a different version?
Hi Paul,

It's more of a scientific mining app. We crawl websites and extract information from them for our clients. For us, it doesn't really matter whether one Cassandra node replies after 1 second or after a few ms, as long as the throughput over time stays high. So far, this seems to be the case.

If you are using Hector, be sure to use the latest version. Earlier versions had a few bugs related to error handling (e.g. threads hanging forever waiting for an answer). I occasionally see timeouts, but we then just move to another node and retry.

Thibaut
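The "move to another node and retry" approach can be sketched as follows. This is a minimal illustration, not Hector's API: `NodeTimeout`, `call_with_failover`, and the `request` callable are hypothetical names, and a real client would raise its own timeout exception.

```python
class NodeTimeout(Exception):
    """Stand-in for a client-side timeout error (hypothetical)."""


def call_with_failover(nodes, request, max_attempts=3):
    """Call request(node) on successive nodes until one of them answers."""
    if not nodes or max_attempts < 1:
        raise ValueError("need at least one node and one attempt")
    last_error = None
    for attempt in range(max_attempts):
        # Rotate through the node list so each retry hits a different node.
        node = nodes[attempt % len(nodes)]
        try:
            return request(node)
        except NodeTimeout as err:
            last_error = err  # remember the failure and try the next node
    raise last_error
```

This fits a throughput-oriented workload like the crawler described above, where an occasional slow request is acceptable as long as another node can pick it up.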
Re: Upgrade to a different version?
Do people have success stories with 0.7.4? It seems like the list only hears about a release when there's a major problem with it, which means that if you're trying to judge the stability of a release, you're looking for silence. But silence may also mean that not many people have tried it yet. Is there a record of this anywhere?
Upgrade to a different version?
We are running 0.6.6 and are considering upgrading to either 0.6.8 or one of the 0.7.x releases. What is the recommended version and procedure? What are the issues we face? Are there any specific storage gotchas we need to be aware of? Are there any docs around this process for review? Thanks, jake -- Jake Maizel Soundcloud Mail GTalk: j...@soundcloud.com Skype: jakecloud Rosenthaler strasse 13, 101 19, Berlin, DE
Re: Upgrade to a different version?
Hi Jake,

I'm sending this privately, because I wanted to tell you my opinion frankly. I don't know about the .6 series or .74, but so far, all of the .7 series of Cassandra has been a disaster. I would think twice about putting anything in the .7 series into production until things stabilize and at least one reasonably large site starts using Cassandra .7. Jonathan claims reddit is using Cassandra, but it can't be a good experience with the type of bugs that have been found.

* .70 had data corruption issues.
* .71 also had data corruption issues, and had major issues with anything over 2 gigs in memory.
* .72 had issues with reading properly.
* .73 had major issues with anything over 2 gigs in memory, had performance issues due to flushing rules being broken, gave many people huge issues with large amounts of insertions, and gave a few people startup issues.
* .74 is too new to say.

In any case, do a lot of testing for your use case before switching, as things in the .7 series are still very much in development. I've talked to Jonathan about putting it into beta status because of the severity of the bugs, but so far there has been no decision to do so.

Good luck.

Paul
Re: Upgrade to a different version?
Sorry guys, that was meant to be private. My opinion stands, but I didn't want to hurt any of the devs' feelings by being too frank. I think the progress on new features has been good, but I feel we have taken a step back in reliability and scalability since so many features were added without adequate testing. Hopefully, at some point soon, it will get better, and a data import job won't bring a Cassandra cluster to its knees, and we won't experience stop-the-world GC issues and out-of-memory errors from routine usage.

Paul
Re: Upgrade to a different version?
So did you downgrade back to the 0.6.x series?

-- http://twitter.com/jpartogi
Re: Upgrade to a different version?
Paul,

Don't feel like you have to hold back when it comes to feedback. There is a place to vote on releases. If you have something potentially critical that you can isolate, by all means chime in. Your vote isn't binding if you are not a committer, but votes with something credible behind them get taken seriously. Votes happen on the dev@cassandra mailing list. Alternatively, feel free to create Jira tickets any time.

Also, there are unit tests, integration tests, and distributed tests. If you feel like you can add to any of these, please get involved. It sounds like you already do internal testing, so it might be fairly simple to add to some of these tests. As for the distributed tests, some devs at Twitter along with others have contributed a distributed test harness for Cassandra, which has been in 0.7 since 0.7.1. See CASSANDRA-1859 for the beginning and http://svn.apache.org/repos/asf/cassandra/branches/cassandra-0.7/test/ for the latest. It uses Apache Whirr to spin up some nodes and runs tests over them.

In any case, we all want to make a solid release, and if you have specifics on what can make it better, it would benefit the whole community.

Jeremy