Re: Brisk with standard C* cluster
Yes, you can add nodes in a second DC that have Cassandra and Brisk. This will keep the analytics load off the original nodes. There is some documentation here: http://www.datastax.com/docs/0.8/brisk/index

You may have better luck with the user group http://groups.google.com/group/brisk-users or the DataStax forums http://www.datastax.com/support-forums/

Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 17/01/2012, at 9:07 AM, Mohit Anchlia wrote:

Is it possible to add Brisk-only nodes to a standard C* cluster? So if we have nodes A, B, C with standard C*, can we then add Brisk nodes D, E, F for analytics?
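A second, analytics-only data centre is normally expressed through the keyspace's replication strategy. As a hedged sketch (cassandra-cli syntax differs slightly between 0.8 and 1.0, and the keyspace and DC names here are hypothetical), the keyspace definition might look like:

```
create keyspace MyApp
  with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
  and strategy_options = {DC1 : 2, Analytics : 1};
```

With a snitch that places the Brisk nodes in the Analytics DC, analytics jobs can then read locally in their own DC without competing with DC1's online traffic.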
Re: Hector + Range query problem
Does this help? http://wiki.apache.org/cassandra/FAQ#range_rp

Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 17/01/2012, at 10:58 AM, Philippe wrote:

Hello, I've been trying to retrieve rows based on a key range, but every single time I test, Hector retrieves ALL the rows, no matter the range I give it. What can I possibly be doing wrong? Thanks.

I'm doing a test on a single-node RF=1 cluster (C* 1.0.5) with one column family (I've added and truncated the CF quite a few times during my tests). Each row has a single column whose name is the byte value 2. The keys are 0,1,2,3 (shifted by a number of bits). The values are 0,1,2,3. list in the CLI gives me:

Using default limit of 100
--- RowKey: 02 => (column=02, value=00, timestamp=1326750723079000)
--- RowKey: 010002 => (column=02, value=01, timestamp=1326750723239000)
--- RowKey: 020002 => (column=02, value=02, timestamp=1326750723329000)
--- RowKey: 030002 => (column=02, value=03, timestamp=1326750723416000)
4 Rows Returned.
Hector code:

RangeSlicesQuery<TileKey,Byte,byte[]> query = HFactory.createRangeSlicesQuery(keyspace, keySerializer, columnNameSerializer, BytesArraySerializer.get());
query.setColumnFamily(overlay).setKeys(keyStart, keyEnd).setColumnNames((byte) 2);
query.execute();

The execution log shows:

1359 [main] INFO com.sensorly.heatmap.drawing.cassandra.CassandraTileDao - Range query from TileKey [overlayName=UNSET, tilex=0, tiley=0, zoom=2] to TileKey [overlayName=UNSET, tilex=1, tiley=0, zoom=2] => morton codes = [02,010002]
getFiles() query returned TileKey [overlayName=UNSET, tilex=0, tiley=0, zoom=2] with 1 columns, morton = 02
getFiles() query returned TileKey [overlayName=UNSET, tilex=1, tiley=0, zoom=2] with 1 columns, morton = 010002
getFiles() query returned TileKey [overlayName=UNSET, tilex=0, tiley=1, zoom=2] with 1 columns, morton = 020002
getFiles() query returned TileKey [overlayName=UNSET, tilex=1, tiley=1, zoom=2] with 1 columns, morton = 030002

=> ALL rows are returned when I really expect it to only return the 1st one.
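The FAQ entry linked above comes down to partitioner ordering: with RandomPartitioner, rows are stored and returned in the order of the MD5 token of their key, not in key order, so a key-range slice covers an essentially arbitrary set of rows. A small self-contained sketch (plain Python, not Hector; the ring mapping is simplified) of why key order and token order diverge:

```python
import hashlib

def token(key: bytes) -> int:
    # Simplified RandomPartitioner: the token is derived from the MD5 hash
    # of the row key (the real partitioner maps it onto a 2**127 ring).
    return int.from_bytes(hashlib.md5(key).digest(), "big")

keys = [f"key{i:03d}".encode() for i in range(100)]

by_key = sorted(keys)               # the order a byte-ordered partitioner uses
by_token = sorted(keys, key=token)  # the order RandomPartitioner uses

# Token order is effectively a shuffle of key order, which is why a
# start/end key pair does not select a contiguous range of rows.
print(by_key == by_token)
```

(Philippe later notes he is on the ByteOrderedPartitioner, where key ranges are contiguous, so this particular pitfall does not apply to his cluster.)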
Re: specifying initial cassandra schema
Check the command line help for cassandra-cli; you can pass it a file name, e.g.

cassandra-cli --host localhost --file schema.txt

Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 18/01/2012, at 9:35 AM, Carlos Pérez Miguel wrote:

Hi Ramesh

You can use the schematool command. I am using it for the same purposes in Cassandra 0.7.9. I use the following line in my cassandra startup script:

$CASSANDRA_HOME/bin/schematool HOSTNAME 8080 import

where HOSTNAME is the hostname of your test machine. It will import the schema from your cassandra.yaml file. If you execute it and there is already a schema in the cassandra cluster, you'll get an exception from schematool but no impact on the cluster.

Bye

Carlos Pérez Miguel

2012/1/17 Ramesh Natarajan rames...@gmail.com:

I usually start cassandra and then use cassandra-cli to import a schema. Is there an automated way to load a fixed schema when cassandra starts? I have a test setup where I run cassandra on a single node. I have an OS image packaged with cassandra, and it automatically starts cassandra as part of OS boot. I saw some old references to specifying a schema in cassandra.yaml. Is this still supported in Cassandra 1.x? Are there any examples?

thanks
Ramesh
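For reference, the file passed with --file is just a cassandra-cli script. A minimal hypothetical schema.txt (the keyspace and CF names are made up, and the strategy_options syntax differs slightly between 0.8 and 1.0):

```
create keyspace Test
  with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
  and strategy_options = {replication_factor : 1};
use Test;
create column family Users with comparator = UTF8Type;
```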
Re: Incremental backups
As this option is in the cassandra.yaml file, you might need to restart your entire cluster (a rolling restart should work). Hope this helps.

Alain

2012/1/18 Michael Vaknine micha...@citypath.com

Hi,

I have configured incremental backups on all the nodes of the cluster, but it is not working. In cassandra.yaml: incremental_backups: true

When I check the data folder, some keyspaces have a backups folder, but it is empty; I suspect this folder was created in the past when I had version 0.7.6. In a newly created keyspace the folder does not exist. Does anyone know if I need to configure anything besides cassandra.yaml for this to work?

Thanks
Michael
Re: nodetool ring question
Good idea Jeremiah, are you using compression Michael?

Scanning through the CF stats this jumps out:

Column Family: Attractions
SSTable count: 3
Space used (live): 27542876685
Space used (total): 1213220387

That's about 25GB of live data but only 1.3GB total. Otherwise want to see if a restart fixes it :) Would be interesting to know if it's wrong from the start or drifts during streaming or compaction.

Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 18/01/2012, at 12:04 PM, Jeremiah Jordan wrote:

There were some nodetool ring load reporting issues with early versions of 1.0.x; I don't remember when they were fixed, but that could be your issue. Are you using compressed column families? A lot of the issues were with those. Might update to 1.0.7.

-Jeremiah

On 01/16/2012 04:04 AM, Michael Vaknine wrote:

Hi,

I have a 4 node cluster on version 1.0.3. This is what I get when I run nodetool ring:

Address       DC          Rack   Status  State   Load      Owns    Token
                                                                   127605887595351923798765477786913079296
10.8.193.87   datacenter1 rack1  Up      Normal  46.47 GB  25.00%  0
10.5.7.76     datacenter1 rack1  Up      Normal  48.01 GB  25.00%  42535295865117307932921825928971026432
10.8.189.197  datacenter1 rack1  Up      Normal  53.7 GB   25.00%  85070591730234615865843651857942052864
10.5.3.17     datacenter1 rack1  Up      Normal  43.49 GB  25.00%  127605887595351923798765477786913079296

I have finished running repair on all 4 nodes. I have less than 10 GB in the /var/lib/cassandra/data/ folders. My question is: why does nodetool report almost 50 GB on each node?

Thanks
Michael
Re: poor Memtable performance on column slices?
On Wed, Jan 18, 2012 at 2:44 AM, Josep Blanquer blanq...@rightscale.com wrote:

Hi, I've been doing some tests using wide rows recently, and I've seen some odd performance problems that I'd like to understand. In particular, I've seen that the time it takes for Cassandra to perform a column slice of a single key, solely in a Memtable, seems to be very expensive, but most importantly proportional to the ordered position where the start column of the slice lives. In other words:

1- if I start Cassandra fresh (with an empty ColumnFamily with TimeUUID comparator)
2- I create a single row with key K
3- then add 200K TimeUUID columns to key K
4- (and make sure nothing is flushed to SSTables... so it's all in the Memtable)

...I observe the following timings (seconds to perform 1000 reads) while performing multiget slices on it (pardon the pseudo-code, but you'll get the gist):

a) simply a get of the first column: GET(K, :count => 1) --> 2.351226
b) doing a slice get, starting from the first column: GET(K, :start => '144abe16-416c-11e1-9e23-2cbae9ddfe8b', :count => 1) --> 2.189224 -> so with or without a start doesn't seem to make much of a difference
c) doing a slice get, starting from the middle of the ordered columns, approx starting at item number 100K: GET(K, :start => '9c13c644-416c-11e1-81dd-4ba530dc83d0', :count => 1) --> 11.849326 -> 5 times more expensive if the start of the slice is 100K positions away
d) doing a slice get, starting from the last of the ordered columns, approx position 200K: GET(K, :start => '1c1b9b32-416d-11e1-83ff-dd2796c3abd7', :count => 1) --> 19.889741 -> almost twice as expensive as starting the slice at position 100K, and 10 times more expensive than starting from the first one

This behavior leads me to believe that there's a clear Memtable column scan for the columns of the key. If one tries a column name read at those positions (i.e., not a slice), the performance is constant. I.e., GET(K, '144abe16-416c-11e1-9e23-2cbae9ddfe8b').
Retrieving the first, middle or last TimeUUID is done in the same amount of time. Having increasingly worse performance for column slices in Memtables seems to be a bit of a problem... aren't Memtables backed by a structure that has some sort of column name indexing, so that landing on the start column can be efficient? I'm definitely observing very high CPU utilization on those scans. By the way, with wide columns like this, slicing SSTables is quite a bit faster than slicing Memtables... I'm attributing that to the sampled index of the SSTables, hence my wondering whether Memtables lack such column indexing and resort to linked lists of sorts.

Note that the actual timings shown are not important (it's on my laptop and I have a small amount of debugging enabled); what is important is the difference between them. I'm using Cassandra trunk as of Dec 1st, but I believe I've done experiments with the 0.8 series too, leading to the same issue.

You may want to retry your experiments on current trunk. We did have an inefficiency in our memtable search that was fixed by https://issues.apache.org/jira/browse/CASSANDRA-3545 (the name of the ticket doesn't make it clear that it's related, but it is). The issue was committed on December 8.

--
Sylvain

Cheers,
Josep M.
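The fix in CASSANDRA-3545 addresses exactly the effect Josep measured: landing on the slice's start column by walking the row linearly costs O(position), while a search over the sorted column names costs O(log n). A toy model in Python (integers standing in for TimeUUID column names; this illustrates the complexity argument only, not Cassandra's actual Memtable code):

```python
from bisect import bisect_left

columns = list(range(200_000))  # 200K ordered column names, as in the test

def slice_linear(start, count=1):
    """Walk to the start column one entry at a time: O(position)."""
    steps = 0
    for i, name in enumerate(columns):
        steps += 1
        if name >= start:
            return columns[i:i + count], steps
    return [], steps

def slice_indexed(start, count=1):
    """Binary-search to the start column: O(log n)."""
    i = bisect_left(columns, start)
    return columns[i:i + count], None

# Both return the same slice, but the linear version does ~100K steps to
# start at the middle, matching the 'proportional to position' timings.
assert slice_linear(100_000)[0] == slice_indexed(100_000)[0] == [100_000]
print(slice_linear(100_000)[1])
```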
Re: cassandra hit a wall: Too many open files (98567!)
On Fri, Jan 13, 2012 at 8:01 PM, Thorsten von Eicken t...@rightscale.com wrote:

I'm running a single node cassandra 1.0.6 server which hit a wall yesterday:

ERROR [CompactionExecutor:2918] 2012-01-12 20:37:06,327 AbstractCassandraDaemon.java (line 133) Fatal exception in thread Thread[CompactionExecutor:2918,1,main]
java.io.IOError: java.io.FileNotFoundException: /mnt/ebs/data/rslog_production/req_word_idx-hc-453661-Data.db (Too many open files in system)

After that it stopped working and just sat there with this error (understandable). I did an lsof and saw that it had 98567 open files, yikes! An ls in the data directory shows 234011 files. After restarting it spent about 5 hours compacting, then quieted down. About 173k files left in the data directory. I'm using leveled compaction (with compression). I looked into the json of the two large CFs and gen 0 is empty; most sstables are gen 3 and 4. I have a total of about 150GB of data (compressed). Almost all the SSTables are around 3MB in size. Aren't they supposed to get 10x bigger at higher gens?

No, with leveled compaction, the (max) size of sstables is fixed whatever the generation is (the default is 5MB, but it's 5MB of uncompressed data (we may change that though), so 3MB sounds about right). What changes between generations is the number of sstables it can contain. Gen 1 can have 10 sstables (it can have more, but only temporarily), Gen 2 can have 100, Gen 3 can have 1000, etc. So again, that most sstables are in gen 3 and 4 is expected too.

This situation can't be healthy, can it? Suggestions?

Leveled compaction uses lots of files (the number is proportional to the amount of data). It is not necessarily a big problem, as modern OSes deal with a large number of open files fairly well (as far as I know at least).
I would just up the file descriptor ulimit and not worry too much about it, unless you have reason to believe that it's an actual descriptor leak (but given the number of files you have, the number of open ones doesn't seem off, so I don't think there is one here) or that it has a performance impact.

--
Sylvain
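The numbers in the thread line up with Sylvain's explanation. As a back-of-the-envelope check (assuming ~3 MB compressed per sstable, as observed, and roughly 4-5 on-disk files per sstable: Data, Index, Filter, Statistics):

```python
# Back-of-the-envelope check of the file counts reported in this thread.
total_mb = 150 * 1024        # ~150 GB of compressed data
sstable_mb = 3               # observed compressed sstable size
sstables = total_mb // sstable_mb
print(sstables)              # 51200 sstables

# Each sstable is several files on disk, so ~200K files in the data
# directory (the reported 173K-234K) is the expected order of magnitude.
files_low, files_high = sstables * 4, sstables * 5
print(files_low, files_high)
```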
Re: nodetool ring question
I also have this problem. My data on nodes grows to roughly 30GB. After a restart only 5GB remains. Is a factor 6 common for Cassandra?

2012/1/18 aaron morton aa...@thelastpickle.com:

Good idea Jeremiah, are you using compression Michael? [...]
Re: Hector + Range query problem
Hi Aaron,

Nope: I'm using BOP... forgot to mention it in my original message. I changed it to a multiget and it works, but I think the range would be more efficient, so I'd really like to solve this.

Thanks

On 18 Jan 2012 09:18, aaron morton aa...@thelastpickle.com wrote:

Does this help? http://wiki.apache.org/cassandra/FAQ#range_rp [...]
RE: nodetool ring question
I did restart the cluster and now it is normal, 5GB.

From: R. Verlangen [mailto:ro...@us2.nl]
Sent: Wednesday, January 18, 2012 11:32 AM
To: user@cassandra.apache.org
Subject: Re: nodetool ring question

I also have this problem. My data on nodes grows to roughly 30GB. After a restart only 5GB remains. Is a factor 6 common for Cassandra? [...]
Re: cassandra hit a wall: Too many open files (98567!)
1.0.6 has a file leak problem, fixed in 1.0.7. Perhaps this is the reason? https://issues.apache.org/jira/browse/CASSANDRA-3616

/Janne

On Jan 18, 2012, at 03:52, dir dir wrote:

Very interesting. Why do you have so many open files? What kind of system have you built that opens so many files? Would you tell us? Thanks...

On Sat, Jan 14, 2012 at 2:01 AM, Thorsten von Eicken t...@rightscale.com wrote:

I'm running a single node cassandra 1.0.6 server which hit a wall yesterday: ERROR [CompactionExecutor:2918] 2012-01-12 20:37:06,327 AbstractCassandraDaemon.java (line 133) Fatal exception in thread Thread[CompactionExecutor:2918,1,main] [...]
RE: Incremental backups
Hi,

Thank you for the response. I did restart all the nodes and now I can see files in the backup folders, so it seems to be working.

During this process I noticed something very strange. In the data/City folder there are files that are not created in the snapshot folder (they look like old orphaned files). Is there any Cassandra process that will delete unneeded files? I tried to run nodetool cleanup but it did not help.

These are the files:

-rw-r--r-- 1 cassandra cassandra  230281 2011-12-06 00:57 AttractionCheckins.3039706172746974696f6e-f-157-Data.db
-rw-r--r-- 1 cassandra cassandra    1936 2011-12-06 00:57 AttractionCheckins.3039706172746974696f6e-f-157-Filter.db
-rw-r--r-- 1 cassandra cassandra      27 2011-12-06 00:57 AttractionCheckins.3039706172746974696f6e-f-157-Index.db
-rw-r--r-- 1 cassandra cassandra    4264 2011-12-06 00:57 AttractionCheckins.3039706172746974696f6e-f-157-Statistics.db
-rw-r--r-- 1 cassandra cassandra    1321 2011-12-06 00:58 AttractionCheckins.3039706172746974696f6e-f-158-Data.db
-rw-r--r-- 1 cassandra cassandra      16 2011-12-06 00:58 AttractionCheckins.3039706172746974696f6e-f-158-Filter.db
-rw-r--r-- 1 cassandra cassandra      27 2011-12-06 00:58 AttractionCheckins.3039706172746974696f6e-f-158-Index.db
-rw-r--r-- 1 cassandra cassandra    4264 2011-12-06 00:58 AttractionCheckins.3039706172746974696f6e-f-158-Statistics.db
-rw-r--r-- 1 cassandra cassandra 2627100 2011-12-06 06:55 Attractions.3039706172746974696f6e-f-1156-Data.db
-rw-r--r-- 1 cassandra cassandra    1936 2011-12-06 06:55 Attractions.3039706172746974696f6e-f-1156-Filter.db
-rw-r--r-- 1 cassandra cassandra      20 2011-12-06 06:55 Attractions.3039706172746974696f6e-f-1156-Index.db
-rw-r--r-- 1 cassandra cassandra    4264 2011-12-06 06:55 Attractions.3039706172746974696f6e-f-1156-Statistics.db
-rw-r--r-- 1 cassandra cassandra 2238358 2011-12-06 07:50 Attractions.3039706172746974696f6e-f-1157-Data.db
-rw-r--r-- 1 cassandra cassandra      16 2011-12-06 07:50 Attractions.3039706172746974696f6e-f-1157-Filter.db
-rw-r--r-- 1 cassandra cassandra      20 2011-12-06 07:50 Attractions.3039706172746974696f6e-f-1157-Index.db
-rw-r--r-- 1 cassandra cassandra    4264 2011-12-06 07:50 Attractions.3039706172746974696f6e-f-1157-Statistics.db
-rw-r--r-- 1 cassandra cassandra      92 2011-12-06 07:50 Attractions.3039706172746974696f6e-f-1158-Data.db
-rw-r--r-- 1 cassandra cassandra      16 2011-12-06 07:50 Attractions.3039706172746974696f6e-f-1158-Filter.db
-rw-r--r-- 1 cassandra cassandra      20 2011-12-06 07:50 Attractions.3039706172746974696f6e-f-1158-Index.db
-rw-r--r-- 1 cassandra cassandra    4264 2011-12-06 07:50 Attractions.3039706172746974696f6e-f-1158-Statistics.db
-rw-r--r-- 1 cassandra cassandra   44799 2011-12-06 01:25 CityResources.3039706172746974696f6e-f-365-Data.db
-rw-r--r-- 1 cassandra cassandra    1936 2011-12-06 01:25 CityResources.3039706172746974696f6e-f-365-Filter.db
-rw-r--r-- 1 cassandra cassandra     196 2011-12-06 01:25 CityResources.3039706172746974696f6e-f-365-Index.db
-rw-r--r-- 1 cassandra cassandra    4264 2011-12-06 01:25 CityResources.3039706172746974696f6e-f-365-Statistics.db
-rw-r--r-- 1 cassandra cassandra    7647 2011-12-06 07:50 CityResources.3039706172746974696f6e-f-366-Data.db
-rw-r--r-- 1 cassandra cassandra      24 2011-12-06 07:50 CityResources.3039706172746974696f6e-f-366-Filter.db
-rw-r--r-- 1 cassandra cassandra      96 2011-12-06 07:50 CityResources.3039706172746974696f6e-f-366-Index.db
-rw-r--r-- 1 cassandra cassandra    4264 2011-12-06 07:50 CityResources.3039706172746974696f6e-f-366-Statistics.db

Thanks
Michael

From: Alain RODRIGUEZ [mailto:arodr...@gmail.com]
Sent: Wednesday, January 18, 2012 10:40 AM
To: user@cassandra.apache.org
Subject: Re: Incremental backups

As this option is in the cassandra.yaml file, you might need to perform a restart of your entire cluster (a rolling restart should work). [...]
Deploying Cassandra 1.0.7 on EC2 in minutes
Hi guys,

I just want to let you know that Apache Whirr trunk (the upcoming 0.7.1 release) can deploy Cassandra 1.0.7 on AWS EC2 and Rackspace Cloud. You can give it a try by running the following commands: https://gist.github.com/1632893

One last thing: we would appreciate any suggestions for improving the deployment scripts, or Whirr itself.

Thanks,

--
Andrei Savu / andreisavu.ro
RE: JMX BulkLoad weirdness
I'm running 1.0.6 on both clusters. After running a nodetool repair on all machines, everything seems to be behaving correctly and, AFAIK, no data has been lost. If what you say is true and the exception was preventing a file from being used, then I imagine the nodetool repair corrected that data from replicas.

Unfortunately, the only steps I have are the ones I outlined below. I suspect it had something to do with that particular data set, however. When I did the exact same steps for a different data set, the error did not appear and the streaming proceeded as normal. Perhaps a particular SSTable in the set was corrupted?

Scott

From: aaron morton [aa...@thelastpickle.com]
Sent: Wednesday, January 18, 2012 1:52 AM
To: user@cassandra.apache.org
Subject: Re: JMX BulkLoad weirdness

I'd need the version number to be sure, but it looks like that error will stop the node from actually using the data that has been streamed to it. The file is received, the aux files (bloom etc.) are created, and the file is opened, but the exception stops the file from being used.

I've not looked at the JMX bulk load for a while. If you google around you may find some examples. If you have some more steps to repro we may be able to look into it.

Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 17/01/2012, at 2:42 AM, Scott Fines wrote:

Unfortunately, I'm not doing a 1-1 migration; I'm moving data from a 15-node to a 6-node cluster. In this case, that means an excessive amount of time spent repairing data put onto the wrong machines. Also, the bulkloader's requirement of having either a different IP address or a different machine is something that I don't really want to bother with if I can activate it through JMX. The JMX bulkloader seems to work perfectly fine, except for the error that I mentioned below. So I'll ask again: is that error something to be concerned about?
Thanks,

Scott

From: aaron morton [aa...@thelastpickle.com]
Sent: Sunday, January 15, 2012 12:07 PM
To: user@cassandra.apache.org
Subject: Re: JMX BulkLoad weirdness

If you are doing a straight one-to-one copy from one cluster to another, try:

1) nodetool snapshot on each prod node for the system and application keyspaces.
2) rsync the system and app keyspace snapshots.
3) Update the yaml files on the new cluster to have the correct initial_tokens. This is not strictly necessary, as they are stored in the system KS, but it limits surprises later.
4) Start the new cluster.

For bulk load you will want to use the sstableloader: http://www.datastax.com/dev/blog/bulk-loading

Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 14/01/2012, at 3:32 AM, Scott Fines wrote:

Hi all,

I'm trying to copy a column family from our production cluster to our development one for testing purposes, so I thought I would try the bulkload API. Since I'm lazy, I'm using the Cassandra bulkLoad JMX call from one of the development machines. Here are the steps I followed:

1. (on production C* node): nodetool flush keyspace CF
2. rsync SSTables from production C* node to development C* node
3.
bulkLoad SSTables through JMX

But when I do that, on one of the development C* nodes, I keep getting this exception:

java.lang.NullPointerException
    at org.apache.cassandra.io.sstable.SSTable.getMinimalKey(SSTable.java:156)
    at org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTableWriter.java:334)
    at org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTableWriter.java:302)
    at org.apache.cassandra.streaming.IncomingStreamReader.streamIn(IncomingStreamReader.java:156)
    at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:88)
    at org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:184)

After which, the node itself seems to stream data successfully (I'm in the middle of checking that right now). Is this an error that I should be concerned about?

Thanks,

Scott
Re: Deploying Cassandra 1.0.7 on EC2 in minutes
Thanks Andrei!

On Wed, Jan 18, 2012 at 8:00 AM, Andrei Savu savu.and...@gmail.com wrote:

Hi guys, I just want to let you know that Apache Whirr trunk (the upcoming 0.7.1 release) can deploy Cassandra 1.0.7 on AWS EC2 and Rackspace Cloud. [...]

--
http://twitter.com/tjake
How to store unique visitors in cassandra
I'm wondering how to model my CFs to store the number of unique visitors over a time period, in order to be able to query it fast. I thought of sharding them by day (row = 20120118, column = visitor_id, value = '') and performing a get_count. This would work to get unique visitors per day, per week or per month, but it wouldn't work if I want unique visitors between 2 specific dates, because 2 rows can share the same visitors (same columns). I can have 1500 unique visitors today and 1000 unique visitors yesterday, but only 2000 unique visitors when aggregating these days. I could get all the columns for these 2 rows and perform an intersection in my client language, but performance won't be good with big data.

Has anyone already thought about this kind of modeling?

Thanks for your help ;)

Alain
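The over-counting Alain describes can be sketched with plain sets (Python here, with made-up visitor ids reproducing the thread's 1500/1000/2000 numbers): the per-day counts can't simply be added, but the union of the per-day column sets gives the true figure, which is what the client-side merge has to compute.

```python
# Visitors per day, modelled as sets of visitor_ids (row = day, column = id).
today = {f"v{i}" for i in range(1500)}            # 1500 uniques today
yesterday = {f"v{i}" for i in range(1000, 2000)}  # 1000 uniques yesterday

naive = len(today) + len(yesterday)  # adding get_count results over-counts
unique = len(today | yesterday)      # union of the column sets is correct

print(naive, unique)                 # 2500 vs the true 2000
```

This is why the row-per-day model answers "uniques per day" cheaply but forces a column-level merge for arbitrary date ranges.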
Re: How to store unique visitors in cassandra
Why not http://www.countandra.org/ ?

Lucas de Souza Santos (ldss)

On Wed, Jan 18, 2012 at 3:23 PM, Alain RODRIGUEZ arodr...@gmail.com wrote:

I'm wondering how to model my CFs to store the number of unique visitors over a time period, in order to be able to query it fast. [...]
Max records per node for a given secondary index value
Hi All,

It is great to know that a Cassandra column family can accommodate 2 billion columns per row! I was reading about how Cassandra stores secondary index info internally. I now understand that the index data is stored in a hidden CF, and each node is responsible for storing the keys of the data that resides on that node only.

I have been using a secondary index on a low cardinality column called product. There can only be 3 possible values for this column. I have a four node cluster and process about 5000 records per second with RF 2. My question is: what happens once the number of columns in the hidden index CF exceeds 2 billion? How does Cassandra handle this situation? I guess one way to handle it is to add more nodes to the cluster. I am interested in knowing if any other solutions exist.

Thanks,
Kamal
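The cardinality concern can be made concrete with a toy model of the hidden index CF (a sketch, not Cassandra internals): the index uses the indexed value as its row key and the base-table keys as columns, so with only 3 product values every insert lands in one of 3 index rows, and those rows grow linearly with the data set.

```python
from collections import defaultdict

# Toy hidden index CF: row key = indexed value, columns = base-table row keys.
index_cf = defaultdict(list)
products = ["books", "music", "video"]  # hypothetical 3-value product column

for row_key in range(9_000):
    index_cf[products[row_key % 3]].append(row_key)

print(len(index_cf))            # only 3 index rows, however much data exists
print(len(index_cf["books"]))   # each row holds ~1/3 of all keys on a node
```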
Re: Deploying Cassandra 1.0.7 on EC2 in minutes
Hi Andrei,

As you know, we are using Whirr for ElasticInbox (https://github.com/elasticinbox/whirr-elasticinbox). While testing we encountered a few minor problems which I think could be improved. Note that we were using 0.6 (there was some strange bug in 0.7, maybe fixed already).

Although initial_token is pre-calculated to form a balanced cluster, our test cluster (4 nodes) was always unbalanced. There was no initial_token specified (just the default).

A second note is AWS specific: for performance reasons it's better to store data files on an ephemeral drive. Currently data is stored under the default location (/var/...).

Thanks for the great work!

--
Rustam.

On 18/01/2012 13:00, Andrei Savu wrote:

Hi guys, I just want to let you know that Apache Whirr trunk (the upcoming 0.7.1 release) can deploy Cassandra 1.0.7 on AWS EC2 and Rackspace Cloud. [...]
Re: specifying initial cassandra schema
Thanks and appreciate the responses. Will look into this. thanks Ramesh On Wed, Jan 18, 2012 at 2:27 AM, aaron morton aa...@thelastpickle.com wrote: check the command line help for cassandra-cli, you can pass it a file name. e.g. cassandra-cli --host localhost --file schema.txt Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 18/01/2012, at 9:35 AM, Carlos Pérez Miguel wrote: Hi Ramesh You can use the schematool command. I am using it for the same purpose in Cassandra 0.7.9. I use the following line in my cassandra startup script: $CASSANDRA_HOME/bin/schematool HOSTNAME 8080 import where HOSTNAME is the hostname of your test machine. It will import the schema from your cassandra.yaml file. If you execute it and there is already a schema in the cassandra cluster, you'll get an exception from schematool but no impact on the cluster. Bye Carlos Pérez Miguel 2012/1/17 Ramesh Natarajan rames...@gmail.com: I usually start cassandra and then use cassandra-cli to import a schema. Is there any automated way to load a fixed schema when cassandra starts automatically? I have a test setup where I run cassandra on a single node. I have an OS image packaged with cassandra and it automatically starts cassandra as part of OS boot up. I saw some old references to specifying the schema in cassandra.yaml. Is this still supported in Cassandra 1.x? Are there any examples? thanks Ramesh
Re: poor Memtable performance on column slices?
Excellent Sylvain! Yes, that seems to remove the linear scan component of slice read times. FYI, I still see some interesting differences in some aspects though. If I do a slice without a start (i.e., get me the first column)...it seems to fly. GET(K, :count = 1 ) -- 4.832877 -- very fast, and actually in this case I see the reading client being the bottleneck, not cassandra (which is at about 20% CPU only). If I do the same, but actually specify the start column with the first existing value...GET(K,:start = '144abe16-416c-11e1-9e23-2cbae9ddfe8b' , :count = 1 ) -- 11.084275 -- half as fast, and using twice the CPU...hovering around 50% or more (again Cassandra is not the bottleneck, but the significant data point is that the initial seeking seems to be doubling the time/cpu). If I do the same, starting from the middle: GET(K,:start = '9c13c644-416c-11e1-81dd-4ba530dc83d0' , :count = 1 ) -- 11.038187 -- as expensive as starting from the beginning. The same starting at the last one: GET(K,:start = '1c1b9b32-416d-11e1-83ff-dd2796c3abd7' , :count = 1 ) -- 6.489683 - Much faster than any other slice... although not quite as fast as not using a start column. I can see that not having to seek into whatever backing map/structure is obviously faster...although I'm surprised that seeking to an initial value makes reads half as fast. Wouldn't this mostly imply following some links/pointers in memory to start reading ordered columns? What is the backing store used for Memtables when column slices are performed? I am not sure why starting at the end (without reversing or anything) yields much better performance. Cheers, Josep M. On Wed, Jan 18, 2012 at 12:57 AM, Sylvain Lebresne sylv...@datastax.com wrote: On Wed, Jan 18, 2012 at 2:44 AM, Josep Blanquer blanq...@rightscale.com wrote: Hi, I've been doing some tests using wide rows recently, and I've seen some odd performance problems that I'd like to understand.
In particular, I've seen that the time it takes for Cassandra to perform a column slice of a single key, solely in a Memtable, seems to be very expensive, but most importantly proportional to the ordered position where the start column of the slice lives. In other words:
1- if I start Cassandra fresh (with an empty ColumnFamily with TimeUUID comparator)
2- I create a single Row with Key K
3- Then add 200K TimeUUID columns to key K
4- (and make sure nothing is flushed to SSTables...so it's all in the Memtable)
...I observe the following timings (seconds to perform 1000 reads) while performing multiget slices on it (pardon the pseudo-code, but you'll get the gist):
a) simply a get of the first column: GET(K,:count=1) -- 2.351226
b) doing a slice get, starting from the first column: GET(K,:start = '144abe16-416c-11e1-9e23-2cbae9ddfe8b' , :count = 1 ) -- 2.189224 - so with or without a start doesn't seem to make much of a difference
c) doing a slice get, starting from the middle of the ordered columns...approx starting at item number 100K: GET(K,:start = '9c13c644-416c-11e1-81dd-4ba530dc83d0' , :count = 1 ) -- 11.849326 - 5 times more expensive if the start of the slice is 100K positions away
d) doing a slice get, starting from the last of the ordered columns...approx position 200K: GET(K,:start = '1c1b9b32-416d-11e1-83ff-dd2796c3abd7' , :count = 1 ) -- 19.889741 - Almost twice as expensive as starting the slice at position 100K, and 10 times more expensive than starting from the first one
This behavior leads me to believe that there's a clear Memtable column scan for the columns of the key. If one tries a column name read at those positions (i.e., not a slice), the performance is constant. I.e., GET(K, '144abe16-416c-11e1-9e23-2cbae9ddfe8b'). Retrieving the first, middle or last timeUUID is done in the same amount of time.
Having increasingly worse performance for column slices in Memtables seems to be a bit of a problem...aren't Memtables backed by a structure that has some sort of column name indexing?...so that landing on the start column can be efficient? I'm definitely observing very high CPU utilization on those scans...By the way, with wide columns like this, slicing SSTables is quite a bit faster than slicing Memtables...I'm attributing that to the sampled index of the SSTables, hence that's why I'm wondering if the Memtables do not have such column indexing built in and resort to linked lists of sorts. Note that the actual timings shown are not important, it's on my laptop and I have a small amount of debugging enabled...what is important is the difference between them. I'm using Cassandra trunk as of Dec 1st, but I believe I've done experiments with the 0.8 series too, leading to the same issue. You may want to retry your experiments on current trunk. We did have an inefficiency in our memtable search that was fixed by:
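The suspicion above can be illustrated with a toy model: if the memtable locates the start column by walking the sorted columns in order, cost is linear in the start position; a binary search over the same sorted names makes the position irrelevant. This is an illustrative sketch only, not Cassandra's actual memtable code:

```python
import bisect

# 200K sorted "column names" stand in for the TimeUUID columns
# in the experiment described above.
columns = list(range(200000))

def slice_start_linear(cols, start):
    # O(n): cost grows with the position of `start`, matching the
    # timings in the thread (middle ~5x, end ~10x the first column)
    for i, name in enumerate(cols):
        if name >= start:
            return i
    return len(cols)

def slice_start_bisect(cols, start):
    # O(log n): the position of `start` no longer matters
    return bisect.bisect_left(cols, start)

for start in (0, 100000, 199999):
    assert slice_start_linear(columns, start) == slice_start_bisect(columns, start)
```

Both return the same index; only the work done to find it differs, which is the difference the trunk fix removes.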
Re: poor Memtable performance on column slices?
On Wed, Jan 18, 2012 at 12:31 PM, Josep Blanquer blanq...@rightscale.com wrote: If I do a slice without a start (i.e., get me the first column)...it seems to fly. GET(K, :count = 1 ) Yep, that's a totally different code path (SimpleSliceReader instead of IndexedSliceReader) that we've done to optimize this common case. The same starting at the last one. GET(K,:start = '1c1b9b32-416d-11e1-83ff-dd2796c3abd7' , :count = 1 ) -- 6.489683 - Much faster than any other slice ... although not quite as fast as not using a start column That's not a special code path, but I'd guess that the last column is more likely to be still in memory instead of on disk. -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
Re: Unbalanced cluster with RandomPartitioner
If you have performed any token moves the data will not be deleted until you run nodetool cleanup. To get a baseline I would run nodetool compact to do a major compaction and purge any tombstones, as others have said. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 18/01/2012, at 2:19 PM, Maki Watanabe wrote: Are there any significant differences in the number of sstables on each node? 2012/1/18 Marcel Steinbach marcel.steinb...@chors.de: We are running regular repairs, so I don't think that's the problem. And the data dir sizes match approx. the load from nodetool. Thanks for the advice, though. Our keys are digits only, and all contain a few zeros at the same offsets. I'm not that familiar with the md5 algorithm, but I doubt that it would generate 'hotspots' for those kinds of keys, right? On 17.01.2012, at 17:34, Mohit Anchlia wrote: Have you tried running repair first on each node? Also, verify using df -h on the data dirs On Tue, Jan 17, 2012 at 7:34 AM, Marcel Steinbach marcel.steinb...@chors.de wrote: Hi, we're using RP and have each node assigned the same amount of the token space. The cluster looks like that:

Address  Status  State   Load       Owns    Token
                                            205648943402372032879374446248852460236
1        Up      Normal  310.83 GB  12.50%  56775407874461455114148055497453867724
2        Up      Normal  470.24 GB  12.50%  78043055807020109080608968461939380940
3        Up      Normal  271.57 GB  12.50%  99310703739578763047069881426424894156
4        Up      Normal  282.61 GB  12.50%  120578351672137417013530794390910407372
5        Up      Normal  248.76 GB  12.50%  141845999604696070979991707355395920588
6        Up      Normal  164.12 GB  12.50%  163113647537254724946452620319881433804
7        Up      Normal  76.23 GB   12.50%  184381295469813378912913533284366947020
8        Up      Normal  19.79 GB   12.50%  205648943402372032879374446248852460236

I was under the impression the RP would distribute the load more evenly. Our row sizes are 0.5-1 KB, hence we don't store huge rows on a single node.
Should we just move the nodes so that the load is more evenly distributed, or is there something off that needs to be fixed first? Thanks Marcel
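Marcel's md5 intuition is easy to sanity-check: digit-only keys with zeros at fixed offsets still hash to tokens spread across the ring. A rough sketch of RandomPartitioner's key-to-token mapping (the modulo folding into the 2**127 space here is an approximation of the real implementation, for illustration only):

```python
import hashlib

# Approximate RandomPartitioner: the token is derived from the md5
# digest of the raw key, folded here into the 0..2**127 token space.
def token(key):
    return int.from_bytes(hashlib.md5(key.encode()).digest(), "big") % (2 ** 127)

# digit-only keys with zeros at the same offsets, like Marcel's
keys = ["00%06d00" % i for i in range(8)]
for k in keys:
    # print which eighth of the ring each key's token lands in
    print(k, token(k) * 8 // 2 ** 127)
```

Because md5 output is effectively uniform regardless of input structure, repetitive keys do not create token hotspots, so an imbalance like the one above usually points at uneven data per key, stale data needing cleanup, or compaction state rather than the partitioner.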
Re: nodetool ring question
Michael, Robin, let us know if the reported live load is increasing and diverging from the on-disk size. If it is, can you check nodetool cfstats and find an example of a particular CF where Space Used Live has diverged from the on-disk size. Then provide the schema for the CF and any other info that may be handy. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 18/01/2012, at 10:58 PM, Michael Vaknine wrote: I did restart the cluster and now it is a normal 5GB. From: R. Verlangen [mailto:ro...@us2.nl] Sent: Wednesday, January 18, 2012 11:32 AM To: user@cassandra.apache.org Subject: Re: nodetool ring question I also have this problem. My data on the nodes grows to roughly 30GB. After a restart only 5GB remains. Is a factor of 6 common for Cassandra? 2012/1/18 aaron morton aa...@thelastpickle.com Good idea Jeremiah, are you using compression Michael? Scanning through the CF stats this jumps out:

Column Family: Attractions
SSTable count: 3
Space used (live): 27542876685
Space used (total): 1213220387

That's 25GB of live data but only 1.3GB total. Otherwise we want to see if a restart fixes it :) Would be interesting to know if it's wrong from the start or drifts during streaming or compaction. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 18/01/2012, at 12:04 PM, Jeremiah Jordan wrote: There were some nodetool ring load reporting issues with early versions of 1.0.X; I don't remember when they were fixed, but that could be your issue. Are you using compressed column families? A lot of the issues were with those. Might update to 1.0.7.
-Jeremiah On 01/16/2012 04:04 AM, Michael Vaknine wrote: Hi, I have a 4 node cluster on version 1.0.3. This is what I get when I run nodetool ring:

Address       DC           Rack   Status  State   Load      Owns    Token
                                                                    127605887595351923798765477786913079296
10.8.193.87   datacenter1  rack1  Up      Normal  46.47 GB  25.00%  0
10.5.7.76     datacenter1  rack1  Up      Normal  48.01 GB  25.00%  42535295865117307932921825928971026432
10.8.189.197  datacenter1  rack1  Up      Normal  53.7 GB   25.00%  85070591730234615865843651857942052864
10.5.3.17     datacenter1  rack1  Up      Normal  43.49 GB  25.00%  127605887595351923798765477786913079296

I have finished running repair on all 4 nodes. I have less than 10 GB in the /var/lib/cassandra/data/ folders. My question is: why does nodetool report almost 50 GB on each node? Thanks Michael
Re: Incremental backups
Looks like you are on a 0.7.X release, which one exactly? It would be a really good idea to at least be on 0.8.X, preferably 1.0. Pre-1.0, compacted SSTables were removed during JVM GC, but compacted SSTables have a .Compacted file created so we know they are no longer needed. These SSTables look like secondary index files. It may be a bug if they are not included in the incremental backups. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 19/01/2012, at 12:13 AM, Michael Vaknine wrote: Hi, Thank you for the response. I did restart all the nodes and now I can see files in the backup folders, so it seems like it is working. During this process I noticed something very strange: in the data/City folder there are files that are not created in the snapshot folder (it looks like old orphaned files). Is there any process of cassandra that will delete unneeded files? I tried to run nodetool cleanup but it did not help. These are the files:

-rw-r--r-- 1 cassandra cassandra  230281 2011-12-06 00:57 AttractionCheckins.3039706172746974696f6e-f-157-Data.db
-rw-r--r-- 1 cassandra cassandra    1936 2011-12-06 00:57 AttractionCheckins.3039706172746974696f6e-f-157-Filter.db
-rw-r--r-- 1 cassandra cassandra      27 2011-12-06 00:57 AttractionCheckins.3039706172746974696f6e-f-157-Index.db
-rw-r--r-- 1 cassandra cassandra    4264 2011-12-06 00:57 AttractionCheckins.3039706172746974696f6e-f-157-Statistics.db
-rw-r--r-- 1 cassandra cassandra    1321 2011-12-06 00:58 AttractionCheckins.3039706172746974696f6e-f-158-Data.db
-rw-r--r-- 1 cassandra cassandra      16 2011-12-06 00:58 AttractionCheckins.3039706172746974696f6e-f-158-Filter.db
-rw-r--r-- 1 cassandra cassandra      27 2011-12-06 00:58 AttractionCheckins.3039706172746974696f6e-f-158-Index.db
-rw-r--r-- 1 cassandra cassandra    4264 2011-12-06 00:58 AttractionCheckins.3039706172746974696f6e-f-158-Statistics.db
-rw-r--r-- 1 cassandra cassandra 2627100 2011-12-06 06:55 Attractions.3039706172746974696f6e-f-1156-Data.db
-rw-r--r-- 1 cassandra cassandra    1936 2011-12-06 06:55 Attractions.3039706172746974696f6e-f-1156-Filter.db
-rw-r--r-- 1 cassandra cassandra      20 2011-12-06 06:55 Attractions.3039706172746974696f6e-f-1156-Index.db
-rw-r--r-- 1 cassandra cassandra    4264 2011-12-06 06:55 Attractions.3039706172746974696f6e-f-1156-Statistics.db
-rw-r--r-- 1 cassandra cassandra 2238358 2011-12-06 07:50 Attractions.3039706172746974696f6e-f-1157-Data.db
-rw-r--r-- 1 cassandra cassandra      16 2011-12-06 07:50 Attractions.3039706172746974696f6e-f-1157-Filter.db
-rw-r--r-- 1 cassandra cassandra      20 2011-12-06 07:50 Attractions.3039706172746974696f6e-f-1157-Index.db
-rw-r--r-- 1 cassandra cassandra    4264 2011-12-06 07:50 Attractions.3039706172746974696f6e-f-1157-Statistics.db
-rw-r--r-- 1 cassandra cassandra      92 2011-12-06 07:50 Attractions.3039706172746974696f6e-f-1158-Data.db
-rw-r--r-- 1 cassandra cassandra      16 2011-12-06 07:50 Attractions.3039706172746974696f6e-f-1158-Filter.db
-rw-r--r-- 1 cassandra cassandra      20 2011-12-06 07:50 Attractions.3039706172746974696f6e-f-1158-Index.db
-rw-r--r-- 1 cassandra cassandra    4264 2011-12-06 07:50 Attractions.3039706172746974696f6e-f-1158-Statistics.db
-rw-r--r-- 1 cassandra cassandra   44799 2011-12-06 01:25 CityResources.3039706172746974696f6e-f-365-Data.db
-rw-r--r-- 1 cassandra cassandra    1936 2011-12-06 01:25 CityResources.3039706172746974696f6e-f-365-Filter.db
-rw-r--r-- 1 cassandra cassandra     196 2011-12-06 01:25 CityResources.3039706172746974696f6e-f-365-Index.db
-rw-r--r-- 1 cassandra cassandra    4264 2011-12-06 01:25 CityResources.3039706172746974696f6e-f-365-Statistics.db
-rw-r--r-- 1 cassandra cassandra    7647 2011-12-06 07:50 CityResources.3039706172746974696f6e-f-366-Data.db
-rw-r--r-- 1 cassandra cassandra      24 2011-12-06 07:50 CityResources.3039706172746974696f6e-f-366-Filter.db
-rw-r--r-- 1 cassandra cassandra      96 2011-12-06 07:50 CityResources.3039706172746974696f6e-f-366-Index.db
-rw-r--r-- 1 cassandra cassandra    4264 2011-12-06 07:50 CityResources.3039706172746974696f6e-f-366-Statistics.db

Thanks Michael From: Alain RODRIGUEZ [mailto:arodr...@gmail.com] Sent: Wednesday, January 18, 2012 10:40 AM To: user@cassandra.apache.org Subject: Re: Incremental backups As this option is in the cassandra.yaml file, you might need to perform a restart of your entire cluster (a rolling restart should work). Hope this will help. Alain 2012/1/18 Michael Vaknine micha...@citypath.com Hi, I am configured to do incremental backups on all the nodes in the cluster but it is not working. In cassandra.yaml: incremental_backups: true When I check the data folder there are some keyspaces that have a backups folder, but it is empty, and I
Re: Max records per node for a given secondary index value
Anyone? On Wed, Jan 18, 2012 at 9:53 AM, Kamal Bahadur mailtoka...@gmail.com wrote: [original question quoted in full, snipped]
Re: Max records per node for a given secondary index value
You need to shard your rows. On Wed, Jan 18, 2012 at 5:46 PM, Kamal Bahadur mailtoka...@gmail.com wrote: Anyone? On Wed, Jan 18, 2012 at 9:53 AM, Kamal Bahadur mailtoka...@gmail.com wrote: [original question quoted in full, snipped]
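A sketch of what "shard your rows" can look like in practice: replace the built-in secondary index with a hand-rolled index CF whose row keys combine the product value with a shard number, so no single row has to hold billions of columns. The shard count and key format below are made-up illustration values, not anything Cassandra prescribes:

```python
import zlib

# made-up shard count; pick it so each shard stays comfortably
# under the ~2-billion-column per-row limit at your write rate
ROW_SHARDS = 16

def shard_key(product, record_id):
    # a deterministic hash of the record id picks the shard for a write
    shard = zlib.crc32(str(record_id).encode()) % ROW_SHARDS
    return "%s:%d" % (product, shard)

def shards_for(product):
    # a read for one product value fans out over all of its shards
    return ["%s:%d" % (product, s) for s in range(ROW_SHARDS)]

print(shard_key("productA", "record-42"))
```

Reads become a multiget over the shard keys instead of a single-row slice, which also spreads the three hot product values over more of the cluster.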
Re: poor Memtable performance on column slices?
On Wed, Jan 18, 2012 at 12:44 PM, Jonathan Ellis jbel...@gmail.com wrote: On Wed, Jan 18, 2012 at 12:31 PM, Josep Blanquer blanq...@rightscale.com wrote: If I do a slice without a start (i.e., get me the first column)...it seems to fly. GET(K, :count = 1 ) Yep, that's a totally different code path (SimpleSliceReader instead of IndexedSliceReader) that we've done to optimize this common case. Thanks Jonathan, yup, that makes sense. It was surprising to me that avoiding the seek was that much faster...but I guess if it's a completely different code path, there might be many other things in play. The same starting at the last one. GET(K,:start = '1c1b9b32-416d-11e1-83ff-dd2796c3abd7' , :count = 1 ) -- 6.489683 - Much faster than any other slice ... although not quite as fast as not using a start column That's not a special code path, but I'd guess that the last column is more likely to be still in memory instead of on disk. Well, no need to prolong the thread, but my tests are exclusively Memtable reads (data has not flushed)...so there's no SSTable read involved here...which is exactly why it felt a bit funny to have that case be considerably faster. I just wanted to bring it up to you guys, in case you can think of some cause and/or potential issue. Thanks for the responses! Josep M.
RE: Incremental backups
I am on the 1.0.3 release and it looks like very old files that remained from the upgrade process. How can I verify that? Michael From: aaron morton [mailto:aa...@thelastpickle.com] Sent: Thursday, January 19, 2012 2:22 AM To: user@cassandra.apache.org Subject: Re: Incremental backups Looks like you are on a 0.7.X release, which one exactly? It would be a really good idea to at least be on 0.8.X, preferably 1.0. Pre-1.0, compacted SSTables were removed during JVM GC, but compacted SSTables have a .Compacted file created so we know they are no longer needed. These SSTables look like secondary index files. It may be a bug if they are not included in the incremental backups. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 19/01/2012, at 12:13 AM, Michael Vaknine wrote: Hi, Thank you for the response. I did restart all the nodes and now I can see files in the backup folders, so it seems like it is working. During this process I noticed something very strange: in the data/City folder there are files that are not created in the snapshot folder (it looks like old orphaned files). Is there any process of cassandra that will delete unneeded files? I tried to run nodetool cleanup but it did not help.
These are the files: [file listing identical to the previous message, snipped] Thanks Michael From: Alain RODRIGUEZ [mailto:arodr...@gmail.com] Sent: Wednesday, January 18, 2012 10:40 AM To: user@cassandra.apache.org Subject: Re: Incremental backups As this option is in the cassandra.yaml file, you might need to perform a restart of your entire cluster (a rolling restart should work). Hope this will help. Alain 2012/1/18 Michael Vaknine micha...@citypath.com
Re: Using 5-6 bytes for cassandra timestamps vs 8…
I believe the timestamps *on a per column basis* are only required until compaction time; after that it may also work if the timestamp range could be specified globally on a per-SSTable basis. Thus the timestamps before compaction only need to measure the time from the initialization of the new memtable to the point the column is written to that memtable, and you can easily fit that time in 4 bytes. This I believe would save at least 4 bytes of overhead for each column. Is anything related to these overheads under consideration or planned in the roadmap? On Tue, Sep 6, 2011 at 11:44 AM, Oleg Anastastasyev olega...@gmail.com wrote: I have a patch for trunk which I just have to get time to test a bit before I submit. It is for super columns and will use the super column's timestamp as the base and only store variant encoded offsets in the underlying columns. Could you please measure how much real benefit it brings (in real RAM consumption by the JVM)? It is hard to tell whether it will give noticeable results or not. AFAIK the memory structures used for the memtable consume much more memory. And a 64-bit JVM allocates memory aligned to a 64-bit word boundary. So a 37% memory consumption reduction looks doubtful.
Re: Using 5-6 bytes for cassandra timestamps vs 8…
I must have accidentally deleted all messages in this thread save this one. At face value, we are talking about saving 2 bytes per column. I know it can add up with many columns, but relative to the size of the column -- is it THAT significant? I made an effort to minimize my CF footprint by replacing the natural column keys with integers (and translating back and forth when writing and reading). It's easy to see that in my case I achieve almost 50% storage savings in the best case, and at least 30%. But if the column in question contains more than 20 bytes -- what's the point of trying to save 2? Cheers Maxim On 1/18/2012 11:49 PM, Ertio Lew wrote: [earlier message quoted in full, snipped]
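For scale on the "how many bytes" question: the base-plus-offset scheme in Aaron's patch, combined with variable-length encoding, can save well over 2 bytes per column when columns are written close together in time. A rough sketch of varint delta encoding (illustration only, not Cassandra's actual on-disk format), using microsecond timestamps a few hundred milliseconds apart:

```python
def varint_encode(n):
    # unsigned LEB128-style varint: 7 bits per byte, high bit means "more"
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def encode_timestamps(timestamps):
    # store the base timestamp once, then varint-encoded deltas from it
    base = min(timestamps)
    return base, [varint_encode(t - base) for t in timestamps]

ts = [1326750723079000, 1326750723239000, 1326750723416000]
base, deltas = encode_timestamps(ts)
print(sum(len(d) for d in deltas))  # 7 bytes of deltas vs 24 bytes of raw int64s
```

So the win depends heavily on how tightly clustered the timestamps are: deltas that fit in a few varint bytes give the big savings, while widely spread timestamps shrink the benefit back toward Maxim's 2-byte estimate.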