Re: Can I create a counter column family with many rows in 1.1.10?
What would be the exact CQL3 syntax to create a counter CF with a composite row key and no predefined column names? Is the following supposed to work?

CREATE TABLE composite_counter (
    aid text,
    key1 text,
    key2 text,
    key3 text,
    value counter,
    PRIMARY KEY (aid, key1, key2, key3)
)

First, when I do so no error is shown, but I *can't* see this CF appear in my OpsCenter. This works as expected too:

update composite_counter set value = value + 5 where aid = '1' and key1 = 'test1' and key2 = 'test2' and key3 = 'test3';

But how can I have multiple counter columns using the schemaless property of Cassandra? Before, when I created counter CFs with the cli, things like this used to work:

update composite_counter set 'value2' = 'value2' + 5 where aid = '1' and key1 = 'test1' and key2 = 'test2' and key3 = 'test3';
=> Bad Request: line 1:29 no viable alternative at input 'value2'

I also tried:

update composite_counter set value2 = value2 + 5 where aid = '1' and key1 = 'test1' and key2 = 'test2' and key3 = 'test3';
=> Bad Request: Unknown identifier value2 (as expected, I guess)

I want to make a counter CF with composite keys and a lot of counters named using the pattern 20130306#event or (20130306, event); I am not sure if I should use composite columns there. Is it mandatory to create the CF with at least one column of the counter type? I will probably never use a column named 'value'; I defined it just to be sure the CF is created as a counter CF.

2013/3/6 Abhijit Chanda abhijit.chan...@gmail.com
Thanks @aaron for the rectification

On Wed, Mar 6, 2013 at 1:17 PM, aaron morton aa...@thelastpickle.com wrote:
Note that CQL 3 in 1.1 is compatible with CQL 3 in 1.2. Also you do not have to use CQL 3; you can still use the cassandra-cli to create CFs. The syntax you use to populate it depends on the client you are using.
Cheers
- Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 5/03/2013, at 9:16 PM, Abhijit Chanda abhijit.chan...@gmail.com wrote:
Yes you can, you just have to use CQL3; Cassandra supports CQL3 from 1.1.10 onward. Just be aware that a column family that contains a counter column can only contain counters. In other words, either all the columns of the column family excluding KEY have the counter type, or none of them can have it.

Best Regards,
-- Abhijit Chanda
+91-974395
RE: Can I create a counter column family with many rows in 1.1.10?
Ah, it's with many columns, not rows. I use this in cql 2-3:

create table cnt (key text PRIMARY KEY, y2003 counter, y2004 counter);

It says this is not a counter column family, and if I try to use default_validation_class=CounterType, it says this is not a valid keyword. What am I supposed to type in order to create it?

From: aa...@thelastpickle.com
Subject: Re: Can I create a counter column family with many rows in 1.1.10?
Date: Tue, 5 Mar 2013 23:47:38 -0800
To: user@cassandra.apache.org
Note that CQL 3 in 1.1 is compatible with CQL 3 in 1.2.
RE: Can I create a counter column family with many rows in 1.1.10?
I got it now.

From: mateus.ffrei...@hotmail.com
To: user@cassandra.apache.org
Subject: RE: Can I create a counter column family with many rows in 1.1.10?
Date: Wed, 6 Mar 2013 08:42:37 -0300
Re: anyone see this user-cassandra thread get answered...
Wow, that's quite new... Threadjacking to ask how to unsubscribe, amazing. Help yourself: https://www.google.com/search?q=unsubscribe+cassandra Any of the first results should help you. Goodbye!

2013/3/6 deepansh jain deepanshcri...@gmail.com
how to unsubscribe from mailing list

On Wed, Mar 6, 2013 at 1:06 PM, aaron morton aa...@thelastpickle.com wrote:
bah, think I got confused by looking at the version in the email you linked to. If the update CF call is not working, and this is QA, run it with DEBUG logging and file a bug here: https://issues.apache.org/jira/browse/CASSANDRA

Thanks
- Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 5/03/2013, at 8:29 AM, Hiller, Dean dean.hil...@nrel.gov wrote:
That ticket says it was fixed in 1.1.5 and we are on 1.2.2. We upgraded from 1.1.4 to 1.2.2, ran upgrade tables and watched filenames change from *-he-*.db to *-id-*.db, then changed compaction strategies and still had this issue. Is it the fact we came from 1.1.4? Ours was a very simple 4 node QA test where we set up a 1.1.4 cluster, put data in, upgraded, then upgraded tables, then switched to LCS and ran upgrade tables again hoping it would use LCS.

Thanks,
Dean

From: aaron morton aa...@thelastpickle.com
Reply-To: user@cassandra.apache.org
Date: Tuesday, March 5, 2013 9:13 AM
To: user@cassandra.apache.org
Subject: Re: anyone see this user-cassandra thread get answered...
Was probably this: https://issues.apache.org/jira/browse/CASSANDRA-4597

Cheers
- Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 4/03/2013, at 2:05 PM, Hiller, Dean dean.hil...@nrel.gov wrote:
I was reading http://mail-archives.apache.org/mod_mbox/cassandra-user/201208.mbox/%3CCAGZm5drRh3VXNpHefR9UjH8H=dhad2y18s0xmam5cs4yfl5...@mail.gmail.com%3E as we are having the same issue in 1.2.2. We modify to LCS and cassandra-cli shows us at LCS on any node we run cassandra-cli on, but then looking at cqlsh, it shows us at SizeTieredCompactionStrategy :(

Thanks,
Dean
should I file a bug report on this or is this normal?
I ran a pretty solid QA test (cleaned data from scratch) on version 1.2.2. My test was as follows:

1. Start up a 4 node cassandra cluster
2. Populate with initial test data (no other data is added to the system after this point!!!)
3. Run nodetool drain on every node (move stuff from commit log to sstables)
4. Stop and start the cassandra cluster to have it running again
5. Size of the nreldata CF folder is 128kB
6. Go to node 3, run snapshot and mv the snapshots directory OUT of nreldata
7. Size of the nreldata CF folder is 128kB
8. On node 3, run nodetool drain
9. Size of the nreldata CF folder is still 128kB
10. Stop the cassandra node
11. rm keyspace/nreldata/*.db
12. Size of the nreldata CF is 8kB (odd for an empty folder, but ok)
13. Start cassandra
14. nodetool repair databus5 nreldata
15. Size of nreldata is now 220K ….it has exploded in size!!

I ran this QA test as we see data size explosion in production as well (I can't be 100% sure it is the same thing though, as the above is such a small data set). Would leveled compaction be a bit more stable in terms of size ratios and such?

QUESTIONS
1. Why is the bloomfilter for level 5 a total of 3856 bytes for 29118 (large to small) bytes of data, while in the initial data it was 2192 bytes for 43038 (small to large) bytes of data?
2. Why are there 3 levels? With such a small set of data, I would think it would flush one data file like the original data, but instead there are 3 files. My files after repair have levels 5, 6, and 7. My files before deletion of the CF have just level 1.

After repair the files are:
-rw-rw-r--. 1 cassandra cassandra    54 Mar  6 07:18 databus5-nreldata-ib-5-CompressionInfo.db
-rw-rw-r--. 1 cassandra cassandra 29118 Mar  6 07:18 databus5-nreldata-ib-5-Data.db
-rw-rw-r--. 1 cassandra cassandra  3856 Mar  6 07:18 databus5-nreldata-ib-5-Filter.db
-rw-rw-r--. 1 cassandra cassandra 37000 Mar  6 07:18 databus5-nreldata-ib-5-Index.db
-rw-rw-r--. 1 cassandra cassandra  4772 Mar  6 07:18 databus5-nreldata-ib-5-Statistics.db
-rw-rw-r--. 1 cassandra cassandra   383 Mar  6 07:18 databus5-nreldata-ib-5-Summary.db
-rw-rw-r--. 1 cassandra cassandra    79 Mar  6 07:18 databus5-nreldata-ib-5-TOC.txt
-rw-rw-r--. 1 cassandra cassandra    46 Mar  6 07:18 databus5-nreldata-ib-6-CompressionInfo.db
-rw-rw-r--. 1 cassandra cassandra 14271 Mar  6 07:18 databus5-nreldata-ib-6-Data.db
-rw-rw-r--. 1 cassandra cassandra   816 Mar  6 07:18 databus5-nreldata-ib-6-Filter.db
-rw-rw-r--. 1 cassandra cassandra 18248 Mar  6 07:18 databus5-nreldata-ib-6-Index.db
-rw-rw-r--. 1 cassandra cassandra  4756 Mar  6 07:18 databus5-nreldata-ib-6-Statistics.db
-rw-rw-r--. 1 cassandra cassandra   230 Mar  6 07:18 databus5-nreldata-ib-6-Summary.db
-rw-rw-r--. 1 cassandra cassandra    79 Mar  6 07:18 databus5-nreldata-ib-6-TOC.txt
-rw-rw-r--. 1 cassandra cassandra    46 Mar  6 07:18 databus5-nreldata-ib-7-CompressionInfo.db
-rw-rw-r--. 1 cassandra cassandra 14271 Mar  6 07:18 databus5-nreldata-ib-7-Data.db
-rw-rw-r--. 1 cassandra cassandra   816 Mar  6 07:18 databus5-nreldata-ib-7-Filter.db
-rw-rw-r--. 1 cassandra cassandra 18248 Mar  6 07:18 databus5-nreldata-ib-7-Index.db
-rw-rw-r--. 1 cassandra cassandra  4756 Mar  6 07:18 databus5-nreldata-ib-7-Statistics.db
-rw-rw-r--. 1 cassandra cassandra   230 Mar  6 07:18 databus5-nreldata-ib-7-Summary.db
-rw-rw-r--. 1 cassandra cassandra    79 Mar  6 07:18 databus5-nreldata-ib-7-TOC.txt

Before repair the files (from my moved snapshot, as I moved it out of the directory so cassandra no longer had it) were:
-rw-rw-r--. 1 cassandra cassandra    62 Mar  6 07:11 databus5-nreldata-ib-1-CompressionInfo.db
-rw-rw-r--. 1 cassandra cassandra 43038 Mar  6 07:11 databus5-nreldata-ib-1-Data.db
-rw-rw-r--. 1 cassandra cassandra  2192 Mar  6 07:11 databus5-nreldata-ib-1-Filter.db
-rw-rw-r--. 1 cassandra cassandra 55248 Mar  6 07:11 databus5-nreldata-ib-1-Index.db
-rw-rw-r--. 1 cassandra cassandra  4756 Mar  6 07:11 databus5-nreldata-ib-1-Statistics.db
-rw-rw-r--. 1 cassandra cassandra   499 Mar  6 07:11 databus5-nreldata-ib-1-Summary.db
-rw-rw-r--. 1 cassandra cassandra    79 Mar  6 07:11 databus5-nreldata-ib-1-TOC.txt

Thanks,
Dean
Re: Cassandra instead of memcached
http://www.slideshare.net/edwardcapriolo/cassandra-as-memcache

Read at ONE. READ_REPAIR_CHANCE as low as possible. Use a short TTL and short GC_GRACE. Make the in-memory memtable size as high as possible to avoid flushing and compacting. Optionally turn off the commit log. You can use cassandra like memcache, but it is not a memcache replacement: Cassandra persists writes and compacts SSTables, while memcache only has to keep data in memory. If you want to try a crazy idea, try putting your persistent data on a ram disk! Not data/system however!

On Wed, Mar 6, 2013 at 2:45 AM, aaron morton aa...@thelastpickle.com wrote:
Consider disabling durable_writes in the KS config to remove writing to the commit log. That will speed things up for you. Note that you risk losing data if cassandra crashes or is not shut down with nodetool drain. Even if you set the gc_grace to 0, deletes will still need to be committed to disk.

Cheers
- Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 5/03/2013, at 9:51 AM, Drew Kutcharian d...@venarc.com wrote:
Thanks Ben, that article was actually the reason I started thinking about removing memcached. I wanted to see what would be the optimum config to use C* as an in-memory store.
-- Drew

On Mar 5, 2013, at 2:39 AM, Ben Bromhead b...@instaclustr.com wrote:
Check out http://techblog.netflix.com/2012/07/benchmarking-high-performance-io-with.html Netflix used Cassandra with SSDs and were able to drop their memcache layer. Mind you, they were not using it purely as an in-memory KV store.

Ben
Instaclustr | www.instaclustr.com | @instaclustr (http://twitter.com/instaclustr)

On 05/03/2013, at 4:33 PM, Drew Kutcharian d...@venarc.com wrote:
Hi Guys, I'm thinking about using Cassandra as an in-memory key/value store instead of memcached for a new project (just to get rid of a dependency if possible). I was thinking about setting the replication factor to 1, enabling the off-heap row cache and setting gc_grace_period to zero for the CF that will be used for the key/value store. Has anyone tried this? Any comments?

Thanks,
Drew
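[Editor's note] Edward's checklist above could be sketched as CQL3 table options. This is a hedged sketch only: the table and column names are invented for illustration, and the option names are per the Cassandra 1.1/1.2 CQL3 syntax, not a configuration tested in this thread.

```cql
-- Cache-like table per the checklist above (illustrative names).
CREATE TABLE kv_cache (
    key text PRIMARY KEY,
    value blob
) WITH gc_grace_seconds = 60        -- short GC grace
   AND read_repair_chance = 0.0;    -- read repair as low as possible

-- Writes carry a short TTL; reads use consistency ONE from the client side.
INSERT INTO kv_cache (key, value) VALUES ('k1', 0x00) USING TTL 300;
```

Disabling the commit log, as Aaron notes, is a keyspace-level setting (durable_writes = false), not a table option.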
Re: Can I create a counter column family with many rows in 1.1.10?
If you have one column in the table that is not part of the primary key and is a counter, then all columns that are not part of the primary key must also be a counter.

Cheers
- Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 6/03/2013, at 2:56 AM, Alain RODRIGUEZ arodr...@gmail.com wrote:
What would be the exact CQL3 syntax to create a counter CF with composite row key and not predefined column names? Is the following supposed to work?
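[Editor's note] Given Aaron's rule (every non-key column must be a counter), the usual CQL3 answer to Alain's "schemaless counters" question is to move the counter name into the primary key as a clustering column, so new counters can be created per row without ALTER TABLE. A hedged sketch; the counter_name column is an illustration, not something proposed in the thread:

```cql
CREATE TABLE composite_counter (
    aid text,
    key1 text,
    key2 text,
    key3 text,
    counter_name text,   -- e.g. '20130306#event'
    value counter,
    PRIMARY KEY (aid, key1, key2, key3, counter_name)
);

-- Each distinct counter_name now behaves like a separate counter column:
UPDATE composite_counter SET value = value + 5
 WHERE aid = '1' AND key1 = 'test1' AND key2 = 'test2'
   AND key3 = 'test3' AND counter_name = '20130306#event';
```

This keeps the whole table counters-only while still allowing an open-ended set of counters per composite key, which matches the old Thrift/cli dynamic-column pattern.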
Re: should I file a bug report on this or is this normal?
> 15. Size of nreldata is now 220K ….it has exploded in size!!

This may be explained by fragmentation in the sstables, which compaction would eventually resolve. During repair the data came from multiple nodes and created multiple sstables for each CF. Streaming copies part of an SSTable on the source and creates an SSTable on the destination. This pattern is different from all writes for a CF going to the same sstable when flushed. To compare apples to apples, run a major compaction after the initial data load and after the repair.

> 1. Why is the bloomfilter for level 5 a total of 3856 bytes for 29118 (large to small) bytes of data while in the initial data it was 2192 bytes for 43038 (small to large) bytes of data?

The size of the BF depends on the number of rows and the false positive rate, not the size of the -Data.db component on disk.

> 2. Why is there 3 levels? With such a small set of data, I would think it would flush one data file like the original data but instead there is 3 files.

See above.

Cheers
- Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 6/03/2013, at 6:40 AM, Hiller, Dean dean.hil...@nrel.gov wrote:
I ran a pretty solid QA test (cleaned data from scratch) on version 1.2.2.
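[Editor's note] Aaron's point that the Bloom filter size tracks row count and false positive rate, not data size, can be made concrete with the classic sizing formula m = -n·ln(p) / (ln 2)² bits and k = (m/n)·ln 2 hash functions. This is the textbook formula, not Cassandra's exact on-disk -Filter.db layout, so treat it as an approximation:

```python
import math

def bloom_filter_size_bytes(num_keys: int, fp_rate: float) -> int:
    """Optimal Bloom filter size in bytes for num_keys entries at fp_rate.

    Classic formula m = -n * ln(p) / (ln 2)^2 bits, rounded up to bytes.
    Cassandra's actual filter layout differs in detail.
    """
    bits = -num_keys * math.log(fp_rate) / (math.log(2) ** 2)
    return math.ceil(bits / 8)

def optimal_hash_count(num_keys: int, fp_rate: float) -> int:
    """Optimal number of hash functions k = (m/n) * ln 2."""
    bits = bloom_filter_size_bytes(num_keys, fp_rate) * 8
    return max(1, round(bits / num_keys * math.log(2)))
```

The size grows with the number of rows and shrinks with a looser false positive rate, so a smaller -Data.db can legitimately carry a bigger -Filter.db if it simply holds more (smaller) rows, which is consistent with Dean's 29118-byte file having a 3856-byte filter while the 43038-byte file had a 2192-byte one.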
Re: should I file a bug report on this or is this normal?
Thanks for the great info, I will give it a go. One question though: my false positive rate and number of rows are not changing, so why is the bloomfilter bigger? Or do you mean the bloomfilter is not based on the number of rows in the table but on how the rows are spread through the sstable files? I.e. I have the same number of rows before and after in that specific column family.

Thanks,
Dean

From: aaron morton aa...@thelastpickle.com
Reply-To: user@cassandra.apache.org
Date: Wednesday, March 6, 2013 9:29 AM
To: user@cassandra.apache.org
Subject: Re: should I file a bug report on this or is this normal?

The size of the BF depends on the number of rows and the false positive rate, not the size of the -Data.db component on disk.
Re: Cassandra instead of memcached
Thanks guys, this is what I was looking for.

@Edward I definitely like crazy ideas ;) I think the only issue here is that C* is a disk space hog, so I am not sure that would be feasible since free RAM is not as abundant as disk. BTW, I watched your presentation; are you guys still using C* as an in-memory store?

On Mar 6, 2013, at 7:44 AM, Edward Capriolo edlinuxg...@gmail.com wrote:
http://www.slideshare.net/edwardcapriolo/cassandra-as-memcache
Read at ONE. READ_REPAIR_CHANCE as low as possible. Use short TTL and short GC_GRACE. Make the in memory memtable size as high as possible to avoid flushing and compacting. Optionally turn off commit log.
Re: Cassandra instead of memcached
If you're writing much more data than RAM, cassandra will not work as fast as memcache. Cassandra is not magical: if all of your data fits in memory it is going to be fast, and if most of your data fits in memory it can still be fast. However, if you plan on having much more data than RAM, you need to think about more RAM and/or SSD disks.

We do not use C* as an in-memory store. However, for many of our datasets we do not have a separate caching tier. In those cases cassandra is both our database and our in-memory store, if you want to use those terms :)

On Wed, Mar 6, 2013 at 12:02 PM, Drew Kutcharian d...@venarc.com wrote:
Thanks guys, this is what I was looking for. @Edward I definitely like crazy ideas ;) I think the only issue here is that C* is a disk space hog, so I am not sure that would be feasible since free RAM is not as abundant as disk. BTW, I watched your presentation; are you guys still using C* as an in-memory store?
Re: Cassandra instead of memcached
I think the dataset should fit in memory easily. The main purpose of this would be as a store for an API rate limiting/accounting system. I think the eBay guys are using C* for the same reason too. Initially we were thinking of using Hazelcast or memcached, but Hazelcast (at least the community edition) has Java GC issues with big heaps, and the problem with memcached is the lack of reliable distribution (you lose a node, you need to rehash everything), so I figured why not just use C*.

On Mar 6, 2013, at 9:08 AM, Edward Capriolo edlinuxg...@gmail.com wrote:
[...]
Re: Cassandra instead of memcached
It also depends on your SLA; it should work for 99% of the time, but one GC/flush/compaction could screw things up big time if you have a tight SLA.

-Wei

From: Drew Kutcharian d...@venarc.com
To: user@cassandra.apache.org
Sent: Wednesday, March 6, 2013 9:32 AM
Subject: Re: Cassandra instead of memcached
[...]
Hinted handoff
Hi - Is there a way to increase the hinted handoff throughput ? I am seeing around 8Mb/s (bits). Thanks, Kanwar
RE: Hinted handoff
Got the param. thanks From: Kanwar Sangha [mailto:kan...@mavenir.com] Sent: 06 March 2013 13:50 To: user@cassandra.apache.org Subject: Hinted handoff Hi - Is there a way to increase the hinted handoff throughput ? I am seeing around 8Mb/s (bits). Thanks, Kanwar
RE: Hinted handoff
After bumping hinted_handoff_throttle_in_kb up to 1 Gb per sec, it still does not go above 25 Mb/s. Is there a limitation?

From: Kanwar Sangha [mailto:kan...@mavenir.com]
Sent: 06 March 2013 14:41
To: user@cassandra.apache.org
Subject: RE: Hinted handoff
Got the param. Thanks.

From: Kanwar Sangha [mailto:kan...@mavenir.com]
Sent: 06 March 2013 13:50
To: user@cassandra.apache.org
Subject: Hinted handoff
Hi - Is there a way to increase the hinted handoff throughput? I am seeing around 8 Mb/s (bits). Thanks, Kanwar
RE: Hinted handoff
Is this correct? I have RAID 0 set up for 16 TB across 8 disks. Each disk is 7.2k RPM with ~80 IOPS per disk. Data is ~9.5 TB. So 4K * 80 * 9.5 = 3040 KB/s ≈ 23.75 Mb/s. So basically I am limited by the disk rather than the n/w.

From: Kanwar Sangha [mailto:kan...@mavenir.com]
Sent: 06 March 2013 15:11
To: user@cassandra.apache.org
Subject: RE: Hinted handoff
After bumping hinted_handoff_throttle_in_kb up to 1 Gb per sec, it still does not go above 25 Mb/s. Is there a limitation?

From: Kanwar Sangha [mailto:kan...@mavenir.com]
Sent: 06 March 2013 14:41
To: user@cassandra.apache.org
Subject: RE: Hinted handoff
Got the param. Thanks.

From: Kanwar Sangha [mailto:kan...@mavenir.com]
Sent: 06 March 2013 13:50
To: user@cassandra.apache.org
Subject: Hinted handoff
Hi - Is there a way to increase the hinted handoff throughput? I am seeing around 8 Mb/s (bits). Thanks, Kanwar
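The arithmetic above can be reproduced directly. This is only a sketch of the estimate as written; note the 9.5 factor is the data size in TB from the original message, where one might instead have expected the disk count of 8:

```python
def kb_per_s_to_mbit(kb_per_s):
    """Convert KB/s to Mbit/s using 1024-based units, as in the estimate above."""
    return kb_per_s * 8 / 1024

# 4 KB IO size * 80 IOPS per disk * 9.5 (the factor used in the original estimate)
kb_per_s = 4 * 80 * 9.5                       # = 3040 KB/s
print(round(kb_per_s_to_mbit(kb_per_s), 2))   # → 23.75
```

With the disk count of 8 instead of 9.5, the same formula gives 4 * 80 * 8 = 2560 KB/s ≈ 20 Mb/s, still close to the observed 25 Mb/s ceiling, so the "limited by disk, not network" conclusion holds either way.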
Re: Consistent problem when solve Digest mismatch
Actually I didn't concurrently update the same records, because I first create it, then search it, then delete it. The version conflict was resolved incorrectly because the delete's local timestamp is earlier than the create's local timestamp.

2013/3/6 aaron morton aa...@thelastpickle.com
> Otherwise, it means the version conflict resolution strongly depends on a global sequence id (timestamp) which needs to be provided by the client?
Yes. If you have an area of your data model that has a high degree of concurrency, C* may not be the right match. In 1.1 we have atomic updates, so clients see either the entire write or none of it. And sometimes you can design a data model that does not mutate shared values, but writes ledger entries instead. See Matt Denis's talk here http://www.datastax.com/events/cassandrasummit2012/presentations or this post http://thelastpickle.com/2012/08/18/Sorting-Lists-For-Humans/
Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com

On 4/03/2013, at 4:30 PM, Jason Tang ares.t...@gmail.com wrote:
Hi, the timestamp provided by my client is a unix timestamp (with NTP), and as I said, due to NTP drift, the local unix timestamps are not accurately synchronized (compared to my case). So in short, the client cannot provide a global sequence number to indicate the event order. But I wonder: I configured the Cassandra consistency level as write QUORUM, so for one record, I suppose Cassandra has the ability to decide the final update result. Otherwise, it means the version conflict resolution strongly depends on a global sequence id (timestamp) which needs to be provided by the client?
//Tang

2013/3/4 Sylvain Lebresne sylv...@datastax.com
The problem is, what exactly is the sequence number you are talking about? Or let me put it another way: if you do have a sequence number that provides a total ordering of your operations, then that is exactly what you should use as your timestamp.
What Cassandra calls the timestamp is exactly what you call seqID: it's the number Cassandra uses to decide the order of operations. Except that in real life, provided you have more than one client talking to Cassandra, providing a total ordering of operations is hard, and in fact not doable efficiently. So in practice people use unix timestamps (with NTP), which provide a very good yet cheap approximation of the real-life order of operations. But again, if you do know how to assign a more precise timestamp, Cassandra lets you use it: you can provide your own timestamp (using the unix timestamp is just the default). The point being, the unix timestamp is the best approximation we have in practice. -- Sylvain

On Mon, Mar 4, 2013 at 9:26 AM, Jason Tang ares.t...@gmail.com wrote:
Hi, previously I met a consistency problem; you can refer to the link below for the whole story. http://mail-archives.apache.org/mod_mbox/cassandra-user/201206.mbox/%3CCAFb+LUxna0jiY0V=AvXKzUdxSjApYm4zWk=ka9ljm-txc04...@mail.gmail.com%3E
After checking the code, it seems I found some clue to the problem. Maybe someone can check this. In short, I have a Cassandra cluster (1.0.3), the consistency level is read/write quorum, and replication_factor is 3. Here is the event sequence:

seqID  NodeA   NodeB   NodeC
1.     New     New     New
2.     Update  Update  Update
3.     Delete  Delete

When trying to read from NodeB and NodeC, a digest mismatch exception is triggered, so Cassandra tries to resolve this version conflict. But the result is the value Update. Here is the suspected root cause: the version conflict is resolved based on timestamp, and node C's local time is a bit earlier than node A's. The Update request was sent from node C with timestamp 00:00:00.050, the Delete from node A with timestamp 00:00:00.020, which is not the same order as the event sequence. So the version conflict is resolved incorrectly. Is this true? If yes, then it means the consistency level can ensure the conflict is found, but resolving it correctly depends on the accuracy of time synchronization, e.g.
NTP?
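The failure mode described in this thread boils down to last-write-wins reconciliation on client-supplied timestamps. A minimal Python sketch of that rule (not Cassandra's actual code; the timestamps mirror the 00:00:00.050 / 00:00:00.020 values from the thread, in milliseconds):

```python
# Each operation carries a client-supplied timestamp; reconciliation keeps
# the operation with the highest timestamp (last-write-wins).
def resolve(*versions):
    return max(versions, key=lambda v: v["timestamp"])

# Node C's clock runs ahead of node A's, so the Update carries a higher
# timestamp even though the Delete happened later in real time.
update = {"op": "update", "value": "Update", "timestamp": 50}  # from node C, 00:00:00.050
delete = {"op": "delete", "value": None,     "timestamp": 20}  # from node A, 00:00:00.020

winner = resolve(update, delete)
print(winner["op"])  # prints "update": the delete loses due to clock skew
```

This makes Sylvain's point concrete: QUORUM guarantees the conflict is detected, but which version wins is decided purely by comparing timestamps, so the outcome is only as correct as the clocks that produced them.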
Correct way to set ByteOrderedPartitioner initial tokens
I have 4 nodes, and I'd like to store all keys starting with 'a' on node 1, 'b' on node 2, and so on. My keys just start with a letter with numbers following, like 'a150', 'b1', 'c32000'. I've set the initial tokens to 61ff, 62ff, 63ff, 64ff. This does not seem to be the correct way. Thanks.
Re: Hinted handoff
Check the IO utilisation using iostat. You *really* should not need to make HH run faster; if you do, there is something bad going on. I would consider dropping the hints and running repair.

> Data is ~9.5 TB
Do you have 9.5 TB on a single node? In the normal case it's best to have around 300 to 500 GB per node. With that much data it will take a week to run repair or replace a failed node.

Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com

On 6/03/2013, at 1:22 PM, Kanwar Sangha kan...@mavenir.com wrote:
[...]
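Aaron's "take a week" estimate is easy to sanity-check with rough numbers. Assuming (my assumption, not stated in the thread) an effective streaming rate of around 200 Mbit/s, moving 9.5 TB takes several days of pure streaming, before any validation or compaction overhead:

```python
def transfer_days(data_tb, mbit_per_s):
    """Rough time to stream data_tb terabytes at mbit_per_s megabits per second."""
    bits = data_tb * 1e12 * 8            # TB -> bits (decimal units)
    seconds = bits / (mbit_per_s * 1e6)  # Mbit/s -> bit/s
    return seconds / 86400               # seconds -> days

print(round(transfer_days(9.5, 200), 1))  # → 4.4
```

That is raw transfer time only; with Merkle tree validation, compaction behind the stream, and a node that is also serving traffic, a week for repairing or replacing a 9.5 TB node is a plausible order of magnitude.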
Re: should I file a bug report on this or is this normal?
> but based on how the rows are spread through the sstable files?
It's per sstable.

Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com

On 6/03/2013, at 8:51 AM, Hiller, Dean dean.hil...@nrel.gov wrote:
Thanks for the great info, I will give it a go. One question though: my false positive rate and number of rows is not changing, so why is the bloom filter bigger? Or do you mean the bloom filter is not based on the number of rows in the table but based on how the rows are spread through the sstable files? I.e. I have the same number of rows before and after in that specific column family. Thanks, Dean

From: aaron morton aa...@thelastpickle.com
Reply-To: user@cassandra.apache.org
Date: Wednesday, March 6, 2013 9:29 AM
To: user@cassandra.apache.org
Subject: Re: should I file a bug report on this or is this normal?

> 15. Size of nreldata is now 220K ... it has exploded in size!!
This may be explained by fragmentation in the sstables, which compaction would eventually resolve. During repair the data came from multiple nodes and created multiple sstables for each CF. Streaming copies part of an SSTable on the source and creates an SSTable on the destination. This pattern is different to all writes for a CF going to the same sstable when flushed. To compare apples to apples, run a major compaction after the initial data load and after the repair.

> 1. Why is the bloomfilter for level 5 a total of 3856 bytes for 29118 (large to small) bytes of data while in the initial data it was 2192 bytes for 43038 (small to large) bytes of data?
The size of the BF depends on the number of rows and the false positive rate, not on the size of the -Data.db component on disk.

> 2. Why is there 3 levels?
> With such a small set of data, I would think it would flush one data file like the original data, but instead there are 3 files.
See above.

Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com

On 6/03/2013, at 6:40 AM, Hiller, Dean dean.hil...@nrel.gov wrote:
I ran a pretty solid QA test (cleaned data from scratch) on version 1.2.2. My test was as so:
1. Start up a 4 node cassandra cluster
2. Populate with initial test data (no other data is added to the system after this point!!!)
3. Run nodetool drain on every node (move stuff from the commit log to sstables)
4. Stop and start the cassandra cluster to have it running again
5. Size of the nreldata CF folder is 128 kB
6. Go to node 3, run snapshot and mv the snapshots directory OUT of nreldata
7. Size of the nreldata CF folder is 128 kB
8. On node 3, run nodetool drain
9. Size of the nreldata CF folder is still 128 kB
10. Stop the cassandra node
11. rm keyspace/nreldata/*.db
12. Size of the nreldata CF is 8 kB (odd for an empty folder but ok)
13. Start cassandra
14. nodetool repair databus5 nreldata
15. Size of nreldata is now 220K ... it has exploded in size!!

I ran this QA test as we see data size explosion in production as well (I can't be 100% sure this is the same thing though, as the above is such a small data set). Would leveled compaction be a bit more stable in terms of size ratios and such?

QUESTIONS
1. Why is the bloomfilter for level 5 a total of 3856 bytes for 29118 (large to small) bytes of data, while in the initial data it was 2192 bytes for 43038 (small to large) bytes of data?
2. Why is there 3 levels? With such a small set of data, I would think it would flush one data file like the original data, but instead there are 3 files. My files after repair have levels 5, 6, and 7. My files before deletion of the CF have just level 1.

After repair the files are:
-rw-rw-r--. 1 cassandra cassandra    54 Mar 6 07:18 databus5-nreldata-ib-5-CompressionInfo.db
-rw-rw-r--. 1 cassandra cassandra 29118 Mar 6 07:18 databus5-nreldata-ib-5-Data.db
-rw-rw-r--. 1 cassandra cassandra  3856 Mar 6 07:18 databus5-nreldata-ib-5-Filter.db
-rw-rw-r--. 1 cassandra cassandra 37000 Mar 6 07:18 databus5-nreldata-ib-5-Index.db
-rw-rw-r--. 1 cassandra cassandra  4772 Mar 6 07:18 databus5-nreldata-ib-5-Statistics.db
-rw-rw-r--. 1 cassandra cassandra   383 Mar 6 07:18 databus5-nreldata-ib-5-Summary.db
-rw-rw-r--. 1 cassandra cassandra    79 Mar 6 07:18 databus5-nreldata-ib-5-TOC.txt
-rw-rw-r--. 1 cassandra cassandra    46 Mar 6 07:18 databus5-nreldata-ib-6-CompressionInfo.db
-rw-rw-r--. 1 cassandra cassandra 14271 Mar 6 07:18 databus5-nreldata-ib-6-Data.db
-rw-rw-r--. 1 cassandra cassandra   816 Mar 6 07:18 databus5-nreldata-ib-6-Filter.db
-rw-rw-r--. 1 cassandra cassandra 18248 Mar 6 07:18
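Aaron's point that bloom filter size tracks the row count and false-positive rate (not the on-disk data size) follows from the standard bloom filter sizing formula, m = -n * ln(p) / (ln 2)^2 bits. A quick sketch (the formula is standard; the example row counts are illustrative, not the actual per-sstable numbers from this thread):

```python
import math

def bloom_filter_bits(n_rows, fp_rate):
    """Optimal bloom filter size in bits for n_rows keys at fp_rate false positives."""
    return math.ceil(-n_rows * math.log(fp_rate) / math.log(2) ** 2)

# More rows -> bigger filter, regardless of how large each row's data is.
print(bloom_filter_bits(10_000, 0.01) // 8)  # bytes needed for 10k rows at 1% fp
print(bloom_filter_bits(1_000, 0.01) // 8)   # ~10x smaller for 1k rows
```

This is why a smaller -Data.db can carry a larger -Filter.db: after repair the same rows may be spread across more sstables, and each sstable sizes its own filter from the keys it contains.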
Cassandra OOM, many deletedColumn
Hi, my version is 1.1.7.

Our use case is: we have an index column family to record how many resources are stored for a user. The number might vary from tens to millions. We provide a feature to let the user delete resources by prefix. We found that some Cassandra nodes will OOM after some period. The cluster is a kind of cross-datacenter ring.

1. Exceptions in the cassandra log:

ERROR [Thread-5810] 2013-02-04 05:38:13,882 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[Thread-5810,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
at java.util.concurrent.ThreadPoolExecutor.ensureQueuedTaskHandled(ThreadPoolExecutor.java:758)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:655)
at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:581)
at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:155)
at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:113)

ERROR [Thread-5819] 2013-02-04 05:38:13,888 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[Thread-5819,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:581)
at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:155)
at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:113)

ERROR [Thread-36] 2013-02-04 05:38:13,898 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[Thread-36,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:581)
at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:155)
at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:113)

ERROR [Thread-3990] 2013-02-04 05:38:13,902 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[Thread-3990,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:581)
at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:155)
at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:113)

ERROR [ACCEPT-/10.139.50.62] AbstractCassandraDaemon.java (line 135) Exception in thread Thread[ACCEPT-/10.139.50.62,5,main]
java.lang.RuntimeException: java.nio.channels.ClosedChannelException
at org.apache.cassandra.net.MessagingService$SocketThread.run(MessagingService.java:710)
Caused by: java.nio.channels.ClosedChannelException
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:137)
at sun.nio.ch.ServerSocketAdaptor.accept(ServerSocketAdaptor.java:84)
at org.apache.cassandra.net.MessagingService$SocketThread.run(MessagingService.java:699)

INFO [HintedHandoff:1] 2013-02-04 05:38:24,971 HintedHandOffManager.java (line 374) Timed out replaying hints to /23.20.84.240; aborting further deliveries
INFO [HintedHandoff:1] 2013-02-04 05:38:24,971 HintedHandOffManager.java (line 392) Finished hinted handoff of 0 rows to endpoint
INFO [HintedHandoff:1] 2013-02-04 05:38:24,971 HintedHandOffManager.java (line 296) Started hinted handoff for token: 3

2. From the heap dump, there are many DeletedColumn objects found, rooted from the readStage thread.

Please help: where might the problem be?

Best Regards!
Jian Jin
Re: Correct way to set ByteOrderedPartitioner initial tokens
> I have 4 Nodes, and I'd like to store all keys starting with 'a' on node 1, 'b' on 2, and so on.
Can I ask why? In general you *really* don't want to use the ByteOrderedPartitioner. If you are starting out, you will have a happier time if you start with the Random Partitioner. If you want your code to know where the rows are, take a look at the Astyanax client https://github.com/Netflix/astyanax

> I've set the initial tokens to 61ff, 62ff, 63ff, 64ff.
I think you want to set them to the letter and then the highest number you will ever use (coded as hex).

Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com

On 6/03/2013, at 9:31 PM, Mateus Ferreira e Freitas mateus.ffrei...@hotmail.com wrote:
I have 4 nodes, and I'd like to store all keys starting with 'a' on node 1, 'b' on node 2, and so on. My keys just start with a letter with numbers following, like 'a150', 'b1', 'c32000'. I've set the initial tokens to 61ff, 62ff, 63ff, 64ff. This does not seem to be the correct way. Thanks.
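Aaron's suggestion ("the letter and then the highest number you will ever use, coded as hex") can be sketched as follows. With ByteOrderedPartitioner keys compare as raw bytes and a node's token is the upper bound of the range it owns, so node 1's token must sort >= every possible 'a...' key. This is only an illustration under that assumption (and assuming a maximum numeric suffix), not an endorsement of BOP:

```python
def initial_token_for_prefix(letter, max_suffix="99999999"):
    """Hex-encode the largest key that can start with `letter`.

    With ByteOrderedPartitioner keys are compared as raw bytes, so a node's
    token must be >= every key it should own.  max_suffix is an assumption:
    the highest numeric suffix the application will ever write.
    """
    return (letter + max_suffix).encode("ascii").hex()

# One token per node for prefixes 'a' through 'd'.
for node, letter in enumerate("abcd", start=1):
    print(node, initial_token_for_prefix(letter))
```

For example, 'a' is 0x61 and '9' is 0x39, so node 1's token starts with 6139..., and any key like 'a150' (61313530) sorts below it. The original 61ff-style tokens work on the same principle, using 0xff as a byte that sorts above any ASCII digit.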
Re: Cassandra OOM, many deletedColumn
Hmm, did you manage to take a look using nodetool tpstats? That may give you further indication.

Jason

On Thu, Mar 7, 2013 at 1:56 PM, 金剑 jinjia...@gmail.com wrote:
[...]