Re: performance is drastically degraded after 0.7.8 -- 1.0.11 upgrade
Sorry, but you will need to provide details of a specific query or workload that goes slower in 1.0.11. As I said, tests have shown improvements in performance in every new release.

If you are seeing a significant decrease in performance, it may be a workload that has not been considered or a known edge case. Whatever the cause, we would need more details to help you.

Cheers
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/09/2012, at 4:19 PM, Илья Шипицин chipits...@gmail.com wrote:
> all tests use similar data access patterns, so every test on 1.0.11 is slower than 0.7.8
> recent micros confirms that.
Re: performance is drastically degraded after 0.7.8 -- 1.0.11 upgrade
It was a good idea to have a look at StorageProxy :-)

1.0.10 Performance Tests (StorageProxy)

RangeOperations:          546
ReadOperations:           694563
TotalHints:               0
TotalRangeLatencyMicros:  4469484
TotalReadLatencyMicros:   245669679
TotalWriteLatencyMicros:  57819722
WriteOperations:          208741

0.7.10 Performance Tests (StorageProxy)

RangeOperations:          520
ReadOperations:           671476
TotalRangeLatencyMicros:  2208902
TotalReadLatencyMicros:   162186009
TotalWriteLatencyMicros:  33911222
WriteOperations:          204806

2012/9/3 aaron morton aa...@thelastpickle.com wrote:
> The whole test run is taking longer ? So it could be slower queries or slower test setup / tear down?
> [...]
> There are full request metrics available on the StorageProxy JMX object.
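The totals in the message above are cumulative, so they are easier to compare once divided by the matching operation counts. The following is a minimal sketch of that arithmetic only; the dictionary layout and the function names `mean_latency_micros` and `summarize` are illustrative assumptions, and the numbers are copied from the message as posted.

```python
# StorageProxy counters copied from the message above (per version).
STATS = {
    "1.0.10": {
        "RangeOperations": 546, "TotalRangeLatencyMicros": 4469484,
        "ReadOperations": 694563, "TotalReadLatencyMicros": 245669679,
        "WriteOperations": 208741, "TotalWriteLatencyMicros": 57819722,
    },
    "0.7.10": {
        "RangeOperations": 520, "TotalRangeLatencyMicros": 2208902,
        "ReadOperations": 671476, "TotalReadLatencyMicros": 162186009,
        "WriteOperations": 204806, "TotalWriteLatencyMicros": 33911222,
    },
}


def mean_latency_micros(total_micros, operations):
    """Average latency per operation; 0.0 when nothing was recorded."""
    return total_micros / operations if operations else 0.0


def summarize(counters):
    """Return the mean latency in microseconds for each operation kind."""
    return {
        kind: mean_latency_micros(
            counters[f"Total{kind}LatencyMicros"], counters[f"{kind}Operations"]
        )
        for kind in ("Read", "Write", "Range")
    }


summary = {version: summarize(counters) for version, counters in STATS.items()}

if __name__ == "__main__":
    for version, means in summary.items():
        for kind, micros in means.items():
            print(f"{version} {kind}: {micros:.1f} us/op")
```

On these figures the mean read latency comes out at roughly 354 microseconds on 1.0.10 against roughly 242 on 0.7.10, and the mean write latency at roughly 277 against roughly 166. That is a slowdown, but well short of a factor of three at the StorageProxy level.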
Re: performance is drastically degraded after 0.7.8 -- 1.0.11 upgrade
That's slower.

The Recent* metrics are the best to look at. They reset each time you look at them, so read them, then run the test, then read them again.

You'll still need to narrow it down, e.g. is there a single test taking a very long time, or are all tests running slower? The Histogram stats can help with that, as they provide a spread of latencies.

Cheers
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/09/2012, at 12:27 AM, Илья Шипицин chipits...@gmail.com wrote:
> it was good idea to have a look at StorageProxy :-)
> [...]
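The read, run, read procedure described above can be sketched as follows. This is only an illustration of the pattern: `FakeRecentMetric` is a hypothetical stand-in that mimics a reset-on-read value, not Cassandra's JMX interface, and `measure` is an assumed helper name.

```python
class FakeRecentMetric:
    """Hypothetical stand-in for a metric that resets whenever it is read."""

    def __init__(self):
        self._total_micros = 0
        self._operations = 0

    def record(self, micros):
        """Account for one operation that took `micros` microseconds."""
        self._total_micros += micros
        self._operations += 1

    def read(self):
        """Return the mean latency since the previous read, then reset."""
        mean = self._total_micros / self._operations if self._operations else 0.0
        self._total_micros = 0
        self._operations = 0
        return mean


def measure(metric, workload):
    """Read once to discard earlier activity, run the workload, read again."""
    metric.read()      # first read: throw away whatever accumulated before
    workload()         # run the single test of interest
    return metric.read()  # second read: covers only the workload


if __name__ == "__main__":
    metric = FakeRecentMetric()
    metric.record(5000)  # noise from before the test starts
    result = measure(metric, lambda: [metric.record(m) for m in (100, 300)])
    print(f"mean latency during the test: {result} us")
```

The first read matters: without it, the value returned after the test would also include whatever happened since the metric was last inspected.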
Re: performance is drastically degraded after 0.7.8 -- 1.0.11 upgrade
All tests use similar data access patterns, so every test on 1.0.11 is slower than on 0.7.8. The recent micros metrics confirm that.

2012/9/5 aaron morton aa...@thelastpickle.com wrote:
> That's slower.
> the Recent* metrics are the best to look at. They reset each time you look at them. So read them, then run the test, then read them again.
> You'll need to narrow it down still. e.g. Is there a single test taking a very long time or are all tests running slower ?
> The Histogram stats can help with that as they provide a spread of latencies.
Re: performance is drastically degraded after 0.7.8 -- 1.0.11 upgrade
The whole test run is taking longer? So it could be slower queries, or slower test setup / tear down?

If you are creating and truncating the KS for each of the 500 tests, is that taking longer? (Schema code has changed a lot between 0.7 and 1.0.)

Can you log the execution time for the tests and find the ones that are taking longer?

There are full request metrics available on the StorageProxy JMX object.

Cheers
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 31/08/2012, at 4:45 PM, Илья Шипицин chipits...@gmail.com wrote:
> we are using functional tests ( ~500 tests in time).
> it is hard to tell which query is slower, it is slower in general.
> same hardware. 1 node, 32Gb RAM, 8Gb heap. default cassandra settings.
> as we are talking about functional tests, so we recreate KS just before tests are run.
> I do not know how to record queries (there are a lot of them), if you are interested, I can set up a special stand for you.
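Logging the execution time per test, as suggested above, could look like the sketch below. It assumes the tests can be invoked as plain callables; the function name `time_tests` and the sample test names are hypothetical and would need adapting to whatever harness actually runs the suite.

```python
import time


def time_tests(tests):
    """Run each named test once and return (name, seconds), slowest first."""
    timings = []
    for name, test in tests.items():
        start = time.perf_counter()
        test()
        timings.append((name, time.perf_counter() - start))
    return sorted(timings, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Placeholder tests; a real suite would call into the functional tests.
    sample = {
        "test_enqueue": lambda: time.sleep(0.01),
        "test_recreate_keyspace": lambda: time.sleep(0.05),
    }
    for name, seconds in time_tests(sample):
        print(f"{name}: {seconds * 1000:.1f} ms")
```

Running the same listing against both versions and comparing per-test times would show whether a few tests dominate the slowdown or every test is slower. Timing setup and tear down separately would answer the keyspace question as well.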
Re: performance is drastically degraded after 0.7.8 -- 1.0.11 upgrade
We are running a somewhat queue-like workload with aggressive write-read patterns.

I was looking for a way to script queries from a live Cassandra installation, but I didn't find any. Is there something like a thrift-proxy or another query logging/scripting engine?

2012/8/30 aaron morton aa...@thelastpickle.com wrote:
> We've not had any reports of a performance drop off. All tests so far have show improvements in both read and write performance.
> [...]
> If you can provide some more information on your use case we may be able to help.
Re: performance is drastically degraded after 0.7.8 -- 1.0.11 upgrade
If you move from 0.7.x to 0.8.x or 1.0.x, you have to rebuild sstables as soon as possible. If you have large bloom filters, you can hit a bug where the bloom filters will not work properly.

On Thu, Aug 30, 2012 at 9:44 AM, Илья Шипицин chipits...@gmail.com wrote:
> we are running somewhat queue-like with aggressive write-read patterns.
> I was looking for scripting queries from live Cassandra installation, but I didn't find any.
> is there something like thrift-proxy or other query logging/scripting engine ?
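For readers following the Bloom filter part of this thread: a Bloom filter answers "might this key be present?" and can return false positives but never false negatives. The sketch below is a generic textbook version for illustration, not Cassandra's implementation; the class name, sizes, and hashing scheme are all assumptions.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter; one byte per bit position to keep the code short."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)

    def _positions(self, key):
        """Derive num_hashes bit positions by salting one hash function."""
        for salt in range(self.num_hashes):
            digest = hashlib.sha1(bytes([salt]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for position in self._positions(key):
            self.bits[position] = 1

    def might_contain(self, key):
        """False means definitely absent; True means possibly present."""
        return all(self.bits[position] for position in self._positions(key))


if __name__ == "__main__":
    row_keys = BloomFilter()
    row_keys.add(b"row-1")
    print(row_keys.might_contain(b"row-1"))  # always True once added
    print(row_keys.might_contain(b"row-2"))  # usually False, may be True
```

A filter that stops working properly shows up as far more "possibly present" answers than expected, so reads end up checking SSTables that do not hold the row.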
Re: performance is drastically degraded after 0.7.8 -- 1.0.11 upgrade
> we are running somewhat queue-like with aggressive write-read patterns.

We'll need some more details...

How much data?
How many machines?
What is the machine spec?
How many clients?
Is there an example of a slow request?
How are you measuring that it's slow?
Is there anything unusual in the log?

Cheers
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 31/08/2012, at 3:30 AM, Edward Capriolo edlinuxg...@gmail.com wrote:
> If you move from 7.X to 0.8X or 1.0X you have to rebuild sstables as soon as possible. If you have large bloomfilters you can hit a bug where the bloom filters will not work properly.
Re: performance is drastically degraded after 0.7.8 -- 1.0.11 upgrade
We are using functional tests (~500 tests).

It is hard to tell which query is slower; it is slower in general.

Same hardware: 1 node, 32 GB RAM, 8 GB heap, default Cassandra settings.

Since these are functional tests, we recreate the KS just before the tests are run.

I do not know how to record the queries (there are a lot of them). If you are interested, I can set up a dedicated test stand for you.

2012/8/31 aaron morton aa...@thelastpickle.com wrote:
> We'll need some more details...
> How much data ? How many machines ? What is the machine spec ? How many clients ?
> Is there an example of a slow request ? How are you measuring that it's slow ?
> Is there anything unusual in the log ?
performance is drastically degraded after 0.7.8 -- 1.0.11 upgrade
In terms of our high-rate write load, cassandra-1.0.11 is about 3 (three!!) times slower than cassandra-0.7.8.

After some investigation I noticed files with a sha1 extension (which are missing in Cassandra 0.7.8). In the maybeWriteDigest() function I see no option for switching the sha1 digests off.

I agree, such digests save some network IO, but they seem to be very bad in terms of CPU and disk IO. Why use one more digest (which has to be calculated)? There's already a relatively small Bloom filter file, which can be used for saving network traffic instead of the sha1 digest.

Any explanation?

Ilya Shipitsin
Re: performance is drastically degraded after 0.7.8 -- 1.0.11 upgrade
On 30/08/2012, at 5:18 AM, Илья Шипицин chipits...@gmail.com wrote:

> in terms of our high-rate write load cassandra1.0.11 is about 3 (three!!) times slower than cassandra-0.7.8

We've not had any reports of a performance drop off. All tests so far have shown improvements in both read and write performance.

> I agree, such digests save some network IO, but they seem to be very bad in terms of CPU and disk IO.

The sha1 is created so we can diagnose corruptions in the -Data component of the SSTables. It is not used to save network IO. It is calculated while streaming the Memtable to disk, so it has no impact on disk IO. While it is not the fastest algorithm, I would assume its CPU overhead in this case is minimal.

> there's already relatively small Bloom filter file, which can be used for saving network traffic instead of sha1 digest.

Bloom filters are used to test if a row key may exist in an SSTable.

> any explanation ?

If you can provide some more information on your use case we may be able to help.

Cheers
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
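The point about the digest being calculated while the data is streamed to disk can be illustrated with a short sketch. It is a generic example using Python's `hashlib`, not Cassandra's code; the function names and the chunked input are assumptions made for illustration.

```python
import hashlib
import os
import tempfile


def write_with_digest(chunks, path):
    """Write chunks to path, hashing each one as it passes through."""
    sha1 = hashlib.sha1()
    with open(path, "wb") as out:
        for chunk in chunks:
            out.write(chunk)    # the only pass over the data on disk
            sha1.update(chunk)  # CPU work on bytes already in memory
    return sha1.hexdigest()


def digest_of_file(path, block_size=65536):
    """Re-read a file and hash it, e.g. to check for corruption later."""
    sha1 = hashlib.sha1()
    with open(path, "rb") as source:
        for block in iter(lambda: source.read(block_size), b""):
            sha1.update(block)
    return sha1.hexdigest()


if __name__ == "__main__":
    data = [b"row-1;", b"row-2;", b"row-3;"]
    with tempfile.TemporaryDirectory() as directory:
        data_path = os.path.join(directory, "example-Data.db")
        written = write_with_digest(data, data_path)
        print(written == digest_of_file(data_path))
```

Because the hash is updated from the same buffers that are being written, the data file is not read back to produce the digest. The extra cost is the hashing itself plus writing one small digest file.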