RE: about the data directory
Not exactly. Do you mean one piece of data will be put on four nodes, each holding 25%? If so, what about two replicas?

From: Viktor Jevdokimov [mailto:viktor.jevdoki...@adform.com] Sent: Thursday, January 13, 2011 2:59 PM To: user@cassandra.apache.org Subject: RE: about the data directory

I have 4 nodes, then I create one keyspace (such as FOO) with replication factor = 1 and insert some data. Why can I see the directory /var/lib/cassandra/data/FOO on every node? As far as I know, I just have one replica.

So why have you installed 4 nodes and not 1? They are there for your data to be distributed between the 4 nodes, with 1 copy on one of them. It is as if you have 100% of the data and each node holds 25% of it (random partitioning).

Viktor.
Re: about the data directory
I have 4 nodes, then I create one keyspace (such as FOO) with replication factor = 1 and insert some data. Why can I see the directory /var/lib/cassandra/data/FOO on every node? As far as I know, I just have one replica. The schema (keyspaces and column families) is global across the cluster (anything else would not make a lot of sense, I think). The replication factor determines the number of replicas of the actual data, based on row key. Given replication factor one, the data should only show up on one node (assuming a single row), but all nodes will be aware of your keyspace/column family. -- / Peter Schuller
RE: about the data directory
I agree with you totally, but I want to know on which node the data is kept? I mean, what is the way to find where the actual data is kept? -Original Message- From: sc...@scode.org [mailto:sc...@scode.org] On Behalf Of Peter Schuller Sent: Thursday, January 13, 2011 4:20 PM To: user@cassandra.apache.org Subject: Re: about the data directory I have 4 nodes, then I create one keyspace (such as FOO) with replication factor = 1 and insert some data. Why can I see the directory /var/lib/cassandra/data/FOO on every node? As far as I know, I just have one replica. The schema (keyspaces and column families) is global across the cluster (anything else would not make a lot of sense, I think). The replication factor determines the number of replicas of the actual data, based on row key. Given replication factor one, the data should only show up on one node (assuming a single row), but all nodes will be aware of your keyspace/column family. -- / Peter Schuller
Re: about the data directory
I agree with you totally, but I want to know on which node the data is kept? I mean, what is the way to find where the actual data is kept? If you're just doing testing, you might 'nodetool flush' each host and then look for the sstable being written. Prior to a flush, it's going to sit in a memtable in memory (and otherwise only in the commit log), up until the configurable time period. For real use cases, you would normally not care which node has a particular row, except I suppose for debugging purposes or similar. I realize I don't know offhand of a simple way, from the perspective of the command line, to answer that question for a particular key. -- / Peter Schuller
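For testing purposes you can also compute where a key lands yourself, since placement is deterministic. Below is a rough sketch (not an official tool) of how RandomPartitioner derives a token from a key's MD5 hash and how the sorted ring picks the primary replica; the tokens and addresses in the example ring are made up.

```python
import hashlib
from bisect import bisect_left

def token_for_key(key):
    # RandomPartitioner tokens come from the MD5 hash of the key; this
    # approximates them as a non-negative integer in [0, 2**127).
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2 ** 127)

def primary_replica(key, ring):
    # ring maps token -> node address. The primary replica is the first
    # node whose token is >= the key's token, wrapping around the ring.
    tokens = sorted(ring)
    i = bisect_left(tokens, token_for_key(key)) % len(tokens)
    return ring[tokens[i]]

# Made-up three-node ring:
ring = {0: "10.1.58.1", 2 ** 125: "10.1.58.2", 2 ** 126: "10.1.58.3"}
print(primary_replica("some-row-key", ring))
```

With RF > 1 and SimpleStrategy, the additional replicas would be the next nodes walking clockwise from the primary.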
RE: about the data directory
So you mean just the replica node's sstable will be changed, right? If all the replica nodes broke down, can the users still read the data? -Original Message- From: sc...@scode.org [mailto:sc...@scode.org] On Behalf Of Peter Schuller Sent: Thursday, January 13, 2011 4:32 PM To: user@cassandra.apache.org Subject: Re: about the data directory I agree with you totally, but I want to know on which node the data is kept? I mean, what is the way to find where the actual data is kept? If you're just doing testing, you might 'nodetool flush' each host and then look for the sstable being written. Prior to a flush, it's going to sit in a memtable in memory (and otherwise only in the commit log), up until the configurable time period. For real use cases, you would normally not care which node has a particular row, except I suppose for debugging purposes or similar. I realize I don't know offhand of a simple way, from the perspective of the command line, to answer that question for a particular key. -- / Peter Schuller
Re: Usage Pattern : "unique" value of a key.
It is unlikely that both racing threads will have exactly the same microsecond timestamp at the moment of creating a new user - so if the data you read has exactly the same timestamp you used to write the data - this is your data. I think this would have to be combined with CL=QUORUM for both write and read. On Thu, Jan 13, 2011 at 9:57 AM, Oleg Anastasyev olega...@gmail.com wrote: Benoit Perroud benoit at noisette.ch writes: My idea to solve such a use case is to have both threads writing the username, but with a column like lock-RANDOM VALUE, and then read the row, and find out if the first lock column appearing belongs to the thread. If this is the case, it can continue the process; otherwise it has been preempted by another thread. This looks ok for this task. As an alternative you can avoid creating an extra 'lock-RANDOM VALUE' column and compare timestamps of the new user data you just wrote. It is unlikely that both racing threads will have exactly the same microsecond timestamp at the moment of creating a new user - so if the data you read has exactly the same timestamp you used to write the data - this is your data. Another possible way is to use some external lock coordinator, e.g. ZooKeeper. Although for this task it looks like overkill, it can become even more valuable if you have more data concurrency issues to solve and can bear an extra 5-10ms of update operation latency.
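The timestamp-compare check being discussed can be sketched with a toy last-write-wins store (purely illustrative names; this is not a Cassandra client):

```python
# Toy last-write-wins store standing in for a Cassandra row: for each
# key, the write with the highest timestamp wins.
store = {}

def write(key, value, timestamp):
    current = store.get(key)
    if current is None or timestamp > current[1]:
        store[key] = (value, timestamp)

def claim(key, owner, timestamp):
    # Write the username, read it back, and compare timestamps: if the
    # stored timestamp is exactly the one we wrote, our write won.
    write(key, owner, timestamp)
    _, ts = store[key]
    return ts == timestamp

assert claim("alice", "thread-1", 1000)      # first writer wins
assert not claim("alice", "thread-2", 999)   # earlier timestamp loses
```

Note the caveat raised in this thread: a write carrying a *later* timestamp can still steal the key after the fact, so this only holds when timestamp collisions and clock skew are acceptable risks.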
Question about fat rows
Hi everyone. I have a question about data modeling in my application. I have to store items per customer, and I can do it either in one fat row per customer, where the column name is the item id and the value a JSON-serialized object, or as one row per item with the same layout. This data is updated almost every day, sometimes several times per day. My question is, which scheme will give me better read performance? I was hoping to save on keys so I could cache all the keys in this CF, but I'm worried about read performance with frequently updated fat rows. Any help or hints would be appreciated. Thanks!
Re: Requesting data model suggestions
Hello Scott, 6 months later, but maybe you are still interested. I wrote an article on the subject of migrating EERD models to Cassandra on the Cassandra wiki. Have a look and tell me what you think of it: http://wiki.apache.org/cassandra/ThomasBoose/EERD%20model%20components%20to%20Cassandra%20Column%20family%27s?action=fullsearch&context=180&value=linkto%3A%22ThomasBoose/EERD+model+components+to+Cassandra+Column+family%27s%22 Thanks in advance, Thomas Boose
Re: Usage Pattern : "unique" value of a key.
Thanks for your answer. You're right when you say it's unlikely that 2 threads have the same timestamp, but it can happen. So it could work for user creation, but maybe not for a more write-intensive problem. Moreover, we cannot rely on fully time-synchronized nodes in the cluster (only on nodes synchronized to within a few ms), so a second node could theoretically write a smaller timestamp after the first node. An even worse case could be the one illustrated here (http://noisette.ch/cassandra/cassandra_unique_key_pattern.png): nodes are synchronized, but something goes wrong (slow) during the write, and then both nodes think the key belongs to them. So my idea of writing a lock is not really suitable... Does anyone have another idea to share regarding this topic? Thanks, Kind regards, Benoit. 2011/1/13 Oleg Anastasyev olega...@gmail.com: Benoit Perroud benoit at noisette.ch writes: My idea to solve such a use case is to have both threads writing the username, but with a column like lock-RANDOM VALUE, and then read the row, and find out if the first lock column appearing belongs to the thread. If this is the case, it can continue the process; otherwise it has been preempted by another thread. This looks ok for this task. As an alternative you can avoid creating an extra 'lock-RANDOM VALUE' column and compare timestamps of the new user data you just wrote. It is unlikely that both racing threads will have exactly the same microsecond timestamp at the moment of creating a new user - so if the data you read has exactly the same timestamp you used to write the data - this is your data. Another possible way is to use some external lock coordinator, e.g. ZooKeeper. Although for this task it looks like overkill, it can become even more valuable if you have more data concurrency issues to solve and can bear an extra 5-10ms of update operation latency.
Re: Timeout Errors while running Hadoop over Cassandra
On Jan 12, 2011, at 12:40 PM, Jairam Chandar wrote: Hi folks, We have a Cassandra 0.6.6 cluster running in production. We want to run Hadoop (version 0.20.2) jobs over this cluster in order to generate reports. I modified the word_count example in the contrib folder of the Cassandra distribution. While the program runs fine for small datasets (on the order of 100-200 MB) on small clusters (2 machines), it starts to give errors when run on a bigger cluster (5 machines) with a much larger dataset (400 GB). Here is the error that we get -

java.lang.RuntimeException: TimedOutException()
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:186)
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:236)
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:104)
	at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:135)
	at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:130)
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:98)
	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
	at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
	at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: TimedOutException()
	at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:11094)
	at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:628)
	at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:602)
	at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:164)
	... 11 more

I wonder if messing with RpcTimeoutInMillis in storage-conf.xml would help. I came across this page on the Cassandra wiki - http://wiki.apache.org/cassandra/HadoopSupport - and tried modifying the ulimit and changing batch sizes. These did not help. Though the number of successful map tasks increased, it eventually fails since the total number of map tasks is huge. Any idea on what could be causing this? The program we are running is a very slight modification of the word_count example with respect to reading from Cassandra, the only change being the specific keyspace, column family and columns. The rest of the code for reading is the same as the word_count example in the source code for Cassandra 0.6.6. Thanks and regards, Jairam Chandar
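For reference, the coordinator-side timeout in 0.6 is set in storage-conf.xml; the element below is the one mentioned above, and the 30-second value is only an illustration (the shipped default is 10000 ms, if I recall correctly):

```xml
<!-- storage-conf.xml: how long the coordinator waits for replicas to
     respond before throwing TimedOutException -->
<RpcTimeoutInMillis>30000</RpcTimeoutInMillis>
```

For Hadoop jobs specifically, shrinking the range-slice batch size (as the HadoopSupport wiki page suggests) attacks the same timeout from the client side, since each get_range_slices call then has less work to do before the deadline.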
Re: about the data directory
So you mean just the replica node's sstable will be changed, right? The data will only be written to the nodes that are part of the replica set of the row (with the exception of hinted handoff, but that's a different sstable). If all the replica nodes broke down, can the users still read the data? If *all* nodes in the replica set for a particular row are down, then you won't be able to read that row, no. -- / Peter Schuller
Re: java.net.BindException: Cannot assign requested address
Gary Dusbabek gdusbabek at gmail.com writes: On Tue, Nov 3, 2009 at 15:44, mobiledreamers at gmail.com wrote: ERROR - Exception encountered during startup. java.net.BindException: Cannot assign requested address at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52) at You will see that error if StoragePort is being used by other system processes. Try using a different port. If you're using a unixy system, you can get a good idea of whether the port is in use from netstat. E.g. see if port 7000 is being listened on: netstat -an | grep 7000 Cheers, Gary. I am using version 0.6.9. I am able to run it successfully on my Windows machine. On Ubuntu I get the Cannot assign requested address error. I checked whether the storage port (7000) and thrift port (9160) are being used by running the command you mentioned. They are not being used. I even tried using different ports but get the same error. I also tried putting 0.0.0.0 in the Thrift Address.
Welcome committer Jake Luciani
The Cassandra PMC has voted to add Jake as a committer. (Jake is also a committer on Thrift.) Welcome, Jake, and thanks for the hard work! -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: Welcome committer Jake Luciani
Thanks Jonathan and Cassandra PMC! Happy to help Cassandra take over the world! -Jake On Thu, Jan 13, 2011 at 1:41 PM, Jonathan Ellis jbel...@gmail.com wrote: The Cassandra PMC has voted to add Jake as a committer. (Jake is also a committer on Thrift.) Welcome, Jake, and thanks for the hard work! -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
cassandra row cache
Hi, I am running a 15-node cluster, version 0.6.8, Linux 64-bit OS, using mmap I/O, 6GB RAM allocated. I have the row cache set to 800,000 keys (mean row size is 2KB). I am observing some strange behaviour. I query for 1.6 million rows across the cluster and the time taken is around 40 mins. I query the same data again, and the time is now 25 mins to fetch the data (I am expecting the cache to be warm now), but I see a row cache hit rate around 30%. Now I request the same data a 3rd time; the time to fetch is under 4 mins and cache hit ratios are 99%... Does anyone have an idea why this may be happening? Thanks, Saket
Re: cassandra row cache
does the cache size change between 2nd and 3rd time? On Thu, Jan 13, 2011 at 10:47 AM, Saket Joshi sjo...@touchcommerce.com wrote: Hi, I am running a 15 node cluster ,version 0.6.8, Linux 64bit OS, using mmap I/O, 6GB ram allocated. I have row cache enabled to 8 keys (mean row size is 2KB). I am observing a strange behaviour.. I query for 1.6 Million rows across the cluster and time taken is around 40 mins , I query the same data again , the time now is 25 mins to fetch data (i am expecting the cache to be warm now) , but i see row cache hit rate around 30% . Now i request the same data 3rd time, time to fetch is under 4 mins and cache hit ratios are 99% ... Does any one have an idea why this may be happening ? Thanks, Saket -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: Welcome committer Jake Luciani
Three cheers! On Thu, Jan 13, 2011 at 1:45 PM, Jake Luciani jak...@gmail.com wrote: Thanks Jonathan and Cassandra PMC! Happy to help Cassandra take over the world! -Jake On Thu, Jan 13, 2011 at 1:41 PM, Jonathan Ellis jbel...@gmail.com wrote: The Cassandra PMC has voted to add Jake as a committer. (Jake is also a committer on Thrift.) Welcome, Jake, and thanks for the hard work! -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
RE: cassandra row cache
Yes it does change. -Original Message- From: Jonathan Ellis [mailto:jbel...@gmail.com] Sent: Thursday, January 13, 2011 11:01 AM To: user Subject: Re: cassandra row cache does the cache size change between 2nd and 3rd time? On Thu, Jan 13, 2011 at 10:47 AM, Saket Joshi sjo...@touchcommerce.com wrote: Hi, I am running a 15 node cluster ,version 0.6.8, Linux 64bit OS, using mmap I/O, 6GB ram allocated. I have row cache enabled to 8 keys (mean row size is 2KB). I am observing a strange behaviour.. I query for 1.6 Million rows across the cluster and time taken is around 40 mins , I query the same data again , the time now is 25 mins to fetch data (i am expecting the cache to be warm now) , but i see row cache hit rate around 30% . Now i request the same data 3rd time, time to fetch is under 4 mins and cache hit ratios are 99% ... Does any one have an idea why this may be happening ? Thanks, Saket -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Newbie Replication/Cluster Question
I'm just starting to play with Cassandra, so this is almost certainly a conceptual problem on my part, so apologies in advance. I was testing out how I'd do things like bring up new nodes. I've got a simple 2-node cluster with my only keyspace having replication_factor=2. This is on 32-bit Debian Squeeze. Java==Java(TM) SE Runtime Environment (build 1.6.0_22-b04). This is using the just-released 0.7.0 binaries. Configuration is pretty minimal besides using the SimpleAuthentication module. The issue is that whenever I kill a node in the cluster and wipe its datadir (i.e. rm -rf /var/lib/cassandra/*) and try to bootstrap it back into the cluster (and this occurs both when both nodes were present during the writing of data and when only a single node was up during the writing of data), it seems to join the cluster and chug along till it keels over and dies with this:

INFO [main] 2011-01-13 13:56:23,385 StorageService.java (line 399) Bootstrapping
ERROR [main] 2011-01-13 13:56:23,402 AbstractCassandraDaemon.java (line 234) Exception encountered during startup.
java.lang.IllegalStateException: replication factor (2) exceeds number of endpoints (1)
	at org.apache.cassandra.locator.SimpleStrategy.calculateNaturalEndpoints(SimpleStrategy.java:60)
	at org.apache.cassandra.locator.AbstractReplicationStrategy.getRangeAddresses(AbstractReplicationStrategy.java:204)
	at org.apache.cassandra.dht.BootStrapper.getRangesWithSources(BootStrapper.java:198)
	at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:83)
	at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:417)
	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:361)
	at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:161)
	at org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:55)
	at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:217)
	at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:134)
Exception encountered during startup.
java.lang.IllegalStateException: replication factor (2) exceeds number of endpoints (1)
	[... same stack trace repeated ...]

Seems like something of a chicken-or-the-egg problem: it doesn't like there being only 1 node, but it won't let node 2 join. Being that I've been messing with Cassandra for only a couple of days, I'm assuming I'm doing something wrong, but the only googling I can find for the above error is just a couple of 4+ month-old tickets that all sound resolved. It's probably worth mentioning that if both nodes are started when I create the keyspace, the cluster appears to work just fine and I can start/stop either node and get at any piece of data.
The nodetool ring output looks like this:

Prior to starting 10.1.58.4, and for a while after startup:
Address     Status  State    Load       Owns     Token
10.1.58.3   Up      Normal   524.99 KB  100.00%  74198390702807803312208811144092384306

10.1.58.4 seems to be joining:
Address     Status  State    Load       Owns    Token
                                                74198390702807803312208811144092384306
10.1.58.4   Up      Joining  72.06 KB   56.66%  460947270041113367229815744049079597
10.1.58.3   Up      Normal   524.99 KB  43.34%  74198390702807803312208811144092384306

Java exception, back to just 10.1.58.3:
Address     Status  State    Load       Owns     Token
10.1.58.3   Up      Normal   524.99 KB  100.00%  74198390702807803312208811144092384306
Re: Are you using Phpcassa for any application currently in production? or considering so ?
Slow. Look at the Play Framework, http://www.playframework.org/, with a Java client. On Thu, Jan 13, 2011 at 12:17 PM, Ertio Lew ertio...@gmail.com wrote: I need to choose one amongst several client options to work with Cassandra for a serious web application for production environments. I prefer to work with php but I am not sure what if phpcassa would be best choice if I am open to working with other other languages as well. Php developers normally are in huge majority everywhere but I rather found a bit difficult to see the majority here. Do you have a setup in production or are you considering so ? -- Frank LoVecchio Senior Software Engineer | Isidorey, LLC Google Voice +1.720.295.9179 isidorey.com | facebook.com/franklovecchio | franklovecchio.com
Re: cassandra row cache
I'm not sure if this is entirely true, but I *think* older version of cassandra used a version of the ConcurrentLinkedHashmap (which backs the row cache) that used the Second Chance algorithm, rather than LRU, which might explain this non-LRU-like behavior. I may be entirely wrong about this though. -ryan On Thu, Jan 13, 2011 at 11:05 AM, Saket Joshi sjo...@touchcommerce.com wrote: Yes it does change. -Original Message- From: Jonathan Ellis [mailto:jbel...@gmail.com] Sent: Thursday, January 13, 2011 11:01 AM To: user Subject: Re: cassandra row cache does the cache size change between 2nd and 3rd time? On Thu, Jan 13, 2011 at 10:47 AM, Saket Joshi sjo...@touchcommerce.com wrote: Hi, I am running a 15 node cluster ,version 0.6.8, Linux 64bit OS, using mmap I/O, 6GB ram allocated. I have row cache enabled to 8 keys (mean row size is 2KB). I am observing a strange behaviour.. I query for 1.6 Million rows across the cluster and time taken is around 40 mins , I query the same data again , the time now is 25 mins to fetch data (i am expecting the cache to be warm now) , but i see row cache hit rate around 30% . Now i request the same data 3rd time, time to fetch is under 4 mins and cache hit ratios are 99% ... Does any one have an idea why this may be happening ? Thanks, Saket -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
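For anyone curious, second-chance (clock) eviction can be illustrated with a toy cache like the one below; this sketches the algorithm only and is not the actual ConcurrentLinkedHashMap code. It shows why a recently requested row can still be evicted, unlike under strict LRU.

```python
from collections import OrderedDict

class SecondChanceCache:
    # Second-chance eviction: a hit sets an entry's reference bit; at
    # eviction time, entries with the bit set get a "second chance"
    # (bit cleared, moved to the back) instead of being dropped.
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()   # key -> [value, referenced_bit]

    def get(self, key):
        entry = self.data.get(key)
        if entry is None:
            return None
        entry[1] = True
        return entry[0]

    def put(self, key, value):
        if key in self.data:
            self.data[key] = [value, True]
            return
        while len(self.data) >= self.capacity:
            k, (v, referenced) = next(iter(self.data.items()))
            del self.data[k]
            if referenced:
                self.data[k] = [v, False]   # second chance: re-queue
        self.data[key] = [value, False]
```

With capacity 2, putting "a" and "b", touching "a", then putting "c" evicts "b" (never referenced) while "a" survives on its second chance.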
Bloom filter
All, Could someone tell me where (what classes) or what library Cassandra is using for its bloom filters? Thanks, Carlos
RE: cassandra row cache
The cache is 800,000 per node, and I have 15 nodes in the cluster. I see the cache value increased after the first run; the row cache hit rate was 0 for the first run. For the second run of the same data, the hit rate increased to 30%, but on the third it jumps to 99%. -Saket -Original Message- From: Chris Burroughs [mailto:chris.burrou...@gmail.com] Sent: Thursday, January 13, 2011 1:03 PM To: user@cassandra.apache.org Cc: Saket Joshi Subject: Re: cassandra row cache On 01/13/2011 02:05 PM, Saket Joshi wrote: Yes it does change. So the confusing part for me is why a cache of size 80,000 would not be full after 1,600,000 requests. Can you observe items cached and hit rate while making the first 1.6 million row query?
Re: Newbie Replication/Cluster Question
It is impossible to properly bootstrap a new node into a system where there are not enough nodes to satisfy the replication factor. The cluster as it stands doesn't contain all the data you are asking it to replicate on the new node. Gary. On Thu, Jan 13, 2011 at 13:13, Mark Moseley moseleym...@gmail.com wrote: I'm just starting to play with Cassandra, so this is almost certainly a conceptual problem on my part, so apologies in advance. I was testing out how I'd do things like bring up new nodes. I've got a simple 2-node cluster with my only keyspace having replication_factor=2. This is on 32-bit Debian Squeeze. Java==Java(TM) SE Runtime Environment (build 1.6.0_22-b04). This is using the just-released 0.7.0 binaries. Configuration is pretty minimal besides using SimpleAuthentication module. The issue is that whenever I kill a node in the cluster and wipe its datadir (i.e. rm -rf /var/lib/cassandra/*) and try to bootstrap it back into the cluster (and this occurs in both the scenario of both nodes being present during the writing of data as well as only a single node being up during writing of data), it seems to join the cluster and chug along till it keels over and dies with this: INFO [main] 2011-01-13 13:56:23,385 StorageService.java (line 399) Bootstrapping ERROR [main] 2011-01-13 13:56:23,402 AbstractCassandraDaemon.java (line 234) Exception encountered during startup. 
java.lang.IllegalStateException: replication factor (2) exceeds number of endpoints (1)
	at org.apache.cassandra.locator.SimpleStrategy.calculateNaturalEndpoints(SimpleStrategy.java:60)
	at org.apache.cassandra.locator.AbstractReplicationStrategy.getRangeAddresses(AbstractReplicationStrategy.java:204)
	at org.apache.cassandra.dht.BootStrapper.getRangesWithSources(BootStrapper.java:198)
	at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:83)
	at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:417)
	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:361)
	at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:161)
	at org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:55)
	at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:217)
	at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:134)
Exception encountered during startup.
java.lang.IllegalStateException: replication factor (2) exceeds number of endpoints (1)
	[... same stack trace repeated ...]

Seems like something of a chicken-or-the-egg problem of it not liking there only being 1 node but not letting node 2 join. Being that I've been messing with Cassandra for only a couple of days, I'm assuming I'm doing something wrong, but the only googling I can find for the above error is just a couple of 4+ month-old tickets that all sound resolved. It's probably worth mentioning that if both nodes are started when I create the keyspace, the cluster appears to work just fine and I can start/stop either node and get at any piece of data.

The nodetool ring output looks like this:

Prior to starting 10.1.58.4, and for a while after startup:
Address     Status  State    Load       Owns     Token
10.1.58.3   Up      Normal   524.99 KB  100.00%  74198390702807803312208811144092384306

10.1.58.4 seems to be joining:
Address     Status  State    Load       Owns    Token
                                                74198390702807803312208811144092384306
10.1.58.4   Up      Joining  72.06 KB   56.66%  460947270041113367229815744049079597
10.1.58.3   Up      Normal   524.99 KB  43.34%  74198390702807803312208811144092384306

Java exception, back to just 10.1.58.3:
Address     Status  State    Load       Owns     Token
10.1.58.3   Up      Normal   524.99 KB  100.00%  74198390702807803312208811144092384306
Re: Bloom filter
On 01/13/2011 04:07 PM, Carlos Sanchez wrote: Could someone tell me where (what classes) or what library is Cassandra using for its bloom filters? src/java/org/apache/cassandra/utils/BloomFilter.java
Re: Are you using Phpcassa for any application currently in production? or considering so ?
We use SimpleCassie in production right now. http://code.google.com/p/simpletools-php/wiki/SimpleCassie On 01/13/2011 11:17 AM, Ertio Lew wrote: I need to choose one amongst several client options to work with Cassandra for a serious web application for production environments. I prefer to work with php but I am not sure what if phpcassa would be best choice if I am open to working with other other languages as well. Php developers normally are in huge majority everywhere but I rather found a bit difficult to see the majority here. Do you have a setup in production or are you considering so ?
Re: cassandra row cache
Is it possible that you are reading at CL.ONE, and that a read at CL.ONE only warms the cache on 1 of your three replicas? The 2nd read warms another 60%, and by the third read all the replicas are warm: 99%? This would be true if digest reads were not warming caches. Edward On Thu, Jan 13, 2011 at 4:07 PM, Saket Joshi sjo...@touchcommerce.com wrote: The cache is 800,000 per node, and I have 15 nodes in the cluster. I see the cache value increased after the first run; the row cache hit rate was 0 for the first run. For the second run of the same data, the hit rate increased to 30%, but on the third it jumps to 99%. -Saket -Original Message- From: Chris Burroughs [mailto:chris.burrou...@gmail.com] Sent: Thursday, January 13, 2011 1:03 PM To: user@cassandra.apache.org Cc: Saket Joshi Subject: Re: cassandra row cache On 01/13/2011 02:05 PM, Saket Joshi wrote: Yes it does change. So the confusing part for me is why a cache of size 80,000 would not be full after 1,600,000 requests. Can you observe items cached and hit rate while making the first 1.6 million row query?
Re: Newbie Replication/Cluster Question
On Thu, Jan 13, 2011 at 1:08 PM, Gary Dusbabek gdusba...@gmail.com wrote: It is impossible to properly bootstrap a new node into a system where there are not enough nodes to satisfy the replication factor. The cluster as it stands doesn't contain all the data you are asking it to replicate on the new node.
Ok, maybe I'm thinking of replication_factor backwards. I took it to mean how many nodes would have *full* copies of the whole of the keyspace's data, in which case with my keyspace with replication_factor=2 the still-alive node would have 100% of the data to replicate to the wiped-clean node--in which case all the data would be there to bootstrap. I was assuming replication_factor=2 in a 2-node cluster == both nodes having a full replica of the data. Do I have that wrong? What's also confusing is that I did this same test on a clean node that wasn't clustered yet (which is interesting, since it doesn't complain then about replication_factor > #_of_nodes), so unless it was throwing away data as I was inserting it, it'd all be there. Is the general rule then that the max. replication factor must be #_of_nodes-1? If replication_factor == #_of_nodes, then if you lost a box, it seems like your cluster would be toast.
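To make the placement being discussed concrete, here is a toy Python sketch of SimpleStrategy-style replica placement (the ring layout, node names, and helper functions are invented for illustration; this is not Cassandra's code). Each key hashes to a token, and the first replication_factor nodes clockwise from that token hold the row, so with replication_factor=2 in a 2-node cluster every node does hold a full copy, while with replication_factor=1 each row lives on exactly one node:

```python
import hashlib
from bisect import bisect_right

def key_token(key):
    # Stand-in for RandomPartitioner's hash-based token for a key.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

def replicas(key, ring, rf):
    """ring: sorted list of (token, node_name). Returns the rf nodes that
    hold the key: the first node whose token >= the key's token, then the
    next rf-1 nodes walking clockwise (SimpleStrategy-style)."""
    tokens = [t for t, _ in ring]
    start = bisect_right(tokens, key_token(key)) % len(ring)
    return [ring[(start + i) % len(ring)][1] for i in range(min(rf, len(ring)))]

ring = sorted([(0, "A"), (2**125, "B"), (2**126, "C"), (2**127, "D")])
print(replicas("some-row", ring, 1))  # exactly one node holds the row
print(replicas("some-row", ring, 2))  # with rf=2, two adjacent nodes hold it
```

Note that the replicas are the token-owner plus its ring neighbors, which is why bootstrapping needs enough live nodes to satisfy rf.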
python client example
Guys, I just installed python-cassandra 0.6.1 and Thrift 0.5.0 on my machine, and I would like to query against and also write into a Cassandra server. I guess I am pretty weak in google-fu; there aren't any examples for me to get started with. Please help me on how to do this. Thanks, Felix
Re: java.net.BindException: Cannot assign requested address
Can you post the settings you have for:
- listen_address
- storage_port
- rpc_address
- rpc_port
Also the full error stack again; your original email has dropped off. Can you use Cassandra 0.7?
Aaron
On 14 Jan, 2011, at 05:22 AM, vikram prajapati prajapativik...@hotmail.com wrote:
ERROR 11:33:56,246 Exception encountered during startup.
java.io.IOException: Unable to create thrift socket to /10.0.0.1:9160
 at org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:73)
 at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:217)
 at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:134)
Caused by: org.apache.thrift.transport.TTransportException: Could not create ServerSocket on address /10.0.0.1:9160.
 at org.apache.thrift.transport.TServerSocket.<init>(TServerSocket.java:99)
 at org.apache.thrift.transport.TServerSocket.<init>(TServerSocket.java:85)
 at org.apache.cassandra.thrift.TCustomServerSocket.<init>(TCustomServerSocket.java:59)
 at org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:66)
Gary Dusbabek gdusbabek at gmail.com writes:
On Tue, Nov 3, 2009 at 15:44, mobiledreamers at gmail.com wrote:
ERROR - Exception encountered during startup. java.net.BindException: Cannot assign requested address
 at sun.nio.ch.Net.bind(Native Method)
 at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
 at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
 at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
You will see that error if StoragePort is being used by other system processes. Try using a different port. If you're using a unixy system, you can get a good idea if the port is in use from netstat. E.g. see if port 7000 is being listened on: netstat -an | grep 7000
Cheers, Gary.
I am using the 0.6.9 version. I am able to run it successfully on my windows machine. On ubuntu I get the Cannot assign requested address error.
I checked whether the storage port (7000) and thrift port (9160) are being used by running the command you mentioned. They are not being used. I even tried using different ports but am getting the same error. I also tried putting 0.0.0.0 in Thrift Address.
Storing big objects into columns
Dear all, In a project I would like to store big objects in columns, serialized. For example entire images (several KB to several MB), flash animations (several MB), etc. Does someone use Cassandra with those relatively big columns, and if yes, does it work well? Are there any drawbacks to using this method? Thank you, Victor K.
Re: Storing big objects into columns
On Thu, Jan 13, 2011 at 2:38 PM, Victor Kabdebon victor.kabde...@gmail.com wrote: Dear all, In a project I would like to store big objects in columns, serialized. For example entire images (several Ko to several Mo), flash animations (several Mo) etc... Does someone use Cassandra with those relatively big columns and if yes does it work well ? Is there any drawbacks using this method ? I haven't benchmarked this myself, but I think you'll want to chunk your content into multiple columns in the same row. -ryan
Re: python client example
Pycassa https://github.com/pycassa/pycassa has documentation here: http://pycassa.github.com/pycassa/
Where does python-cassandra live?
Aaron
On 14 Jan, 2011, at 11:34 AM, felix gao gre1...@gmail.com wrote: Guys, I just installed python-cassandra 0.6.1 and Thrift 0.5.0 on my machine and I would like to query against also write into a cassandra server. I guess i am pretty weak in google-fu, there isn't any examples for me get started with. Please help me on how to do this. Thanks, Felix
Re: python client example
this is where it is stored: /opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/
On Thu, Jan 13, 2011 at 2:39 PM, Aaron Morton aa...@thelastpickle.com wrote: Pycassa https://github.com/pycassa/pycassa has documentation here: http://pycassa.github.com/pycassa/
Where does python-cassandra live? Aaron
On 14 Jan, 2011, at 11:34 AM, felix gao gre1...@gmail.com wrote: Guys, I just installed python-cassandra 0.6.1 and Thrift 0.5.0 on my machine and I would like to query against also write into a cassandra server. I guess i am pretty weak in google-fu, there isn't any examples for me get started with. Please help me on how to do this. Thanks, Felix
Re: Storing big objects into columns
Is there any recommended maximum size for a column? (not the very upper limit, which is 2 GB) Why is it useful to chunk the content into multiple columns? Thank you, Victor K.
2011/1/13 Ryan King r...@twitter.com: On Thu, Jan 13, 2011 at 2:38 PM, Victor Kabdebon victor.kabde...@gmail.com wrote: Dear all, In a project I would like to store big objects in columns, serialized. For example entire images (several KB to several MB), flash animations (several MB) etc. Does someone use Cassandra with those relatively big columns and if yes does it work well? Are there any drawbacks to using this method?
I haven't benchmarked this myself, but I think you'll want to chunk your content into multiple columns in the same row. -ryan
Re: python client example
Sorry, I meant where did you get python-cassandra from on the web. Can you use Pycassa, even just as a learning experience? There is a tutorial here: http://pycassa.github.com/pycassa/tutorial.html
A
On 14 Jan, 2011, at 11:42 AM, felix gao gre1...@gmail.com wrote: this is where it is stored /opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/
On Thu, Jan 13, 2011 at 2:39 PM, Aaron Morton aa...@thelastpickle.com wrote: Pycassa https://github.com/pycassa/pycassa has documentation here: http://pycassa.github.com/pycassa/
Where does python-cassandra live? Aaron
On 14 Jan, 2011, at 11:34 AM, felix gao gre1...@gmail.com wrote: Guys, I just installed python-cassandra 0.6.1 and Thrift 0.5.0 on my machine and I would like to query against also write into a cassandra server. I guess i am pretty weak in google-fu, there isn't any examples for me get started with. Please help me on how to do this. Thanks, Felix
Re: python client example
Right, python-cassandra just provides the raw Thrift API, which is no fun at all. You should start out with pycassa. - Tyler
On Thu, Jan 13, 2011 at 4:45 PM, Aaron Morton aa...@thelastpickle.com wrote: Sorry, I meant where did you get python-cassandra from on the web. Can you use Pycassa, even just as a learning experience? There is a tutorial here: http://pycassa.github.com/pycassa/tutorial.html
A
On 14 Jan, 2011, at 11:42 AM, felix gao gre1...@gmail.com wrote: this is where it is stored /opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/
On Thu, Jan 13, 2011 at 2:39 PM, Aaron Morton aa...@thelastpickle.com wrote: Pycassa https://github.com/pycassa/pycassa has documentation here: http://pycassa.github.com/pycassa/
Where does python-cassandra live? Aaron
On 14 Jan, 2011, at 11:34 AM, felix gao gre1...@gmail.com wrote: Guys, I just installed python-cassandra 0.6.1 and Thrift 0.5.0 on my machine and I would like to query against also write into a cassandra server. I guess i am pretty weak in google-fu, there isn't any examples for me get started with. Please help me on how to do this. Thanks, Felix
Re: Storing big objects into columns
On Thu, Jan 13, 2011 at 2:44 PM, Victor Kabdebon victor.kabde...@gmail.com wrote: Is there any recommanded maximum size for a Column ? (not the very upper limit which is 2Gb) Why is it useful to chunk the content into multiple columns ? I think you're going to have to do some tests yourself. You want to chunk it so that you can pseudo-stream the content. You don't want to have to load the whole content at once. -ryan
Re: Storing big objects into columns
Ok, thank you very much for this information! If somebody has more insights on this matter I am still interested! Victor K.
2011/1/13 Ryan King r...@twitter.com: On Thu, Jan 13, 2011 at 2:44 PM, Victor Kabdebon victor.kabde...@gmail.com wrote: Is there any recommended maximum size for a column? (not the very upper limit which is 2 GB) Why is it useful to chunk the content into multiple columns?
I think you're going to have to do some tests yourself. You want to chunk it so that you can pseudo-stream the content. You don't want to have to load the whole content at once. -ryan
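To illustrate Ryan's chunking suggestion, here is a hedged Python sketch that splits a blob across numbered columns in one row and reassembles it. A plain dict stands in for the Cassandra row, and the chunk size and column-naming scheme are invented for illustration; with chunks in separate columns you can fetch a slice of the row rather than loading the whole blob at once:

```python
CHUNK_SIZE = 64 * 1024  # 64 KB per column; tune for your workload

def store_blob(row, blob, chunk_size=CHUNK_SIZE):
    """Split a blob across numbered columns of one row.
    `row` is a dict standing in for a Cassandra row (column name -> value)."""
    n = 0
    for offset in range(0, len(blob), chunk_size):
        # Zero-padded index so column names sort in chunk order.
        row["chunk:%08d" % n] = blob[offset:offset + chunk_size]
        n += 1
    row["chunk_count"] = str(n)
    return n

def load_blob(row):
    """Reassemble the blob by reading the numbered chunk columns in order."""
    count = int(row["chunk_count"])
    return b"".join(row["chunk:%08d" % i] for i in range(count))

row = {}
data = bytes(200000)          # a 200 KB stand-in for an image
print(store_blob(row, data))  # number of chunk columns written
print(load_blob(row) == data)
```

In a real column family you would issue one insert per chunk (or a batch mutation) and read chunks back with a column-slice query, which is what makes the pseudo-streaming possible.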
Re: python client example
Ah, I get it now. The python code generated from running ant gen-thrift-py.
IMHO start with Pycassa *even* if you want to go your own way later. It solves a lot of problems for you and will save you time.
A
On 14 Jan, 2011, at 11:46 AM, Tyler Hobbs ty...@riptano.com wrote: Right, python-cassandra just provides the raw Thrift API, which is no fun at all. You should start out with pycassa. - Tyler
On Thu, Jan 13, 2011 at 4:45 PM, Aaron Morton aa...@thelastpickle.com wrote: Sorry, I meant where did you get python-cassandra from on the web. Can you use Pycassa, even just as a learning experience? There is a tutorial here: http://pycassa.github.com/pycassa/tutorial.html
A
On 14 Jan, 2011, at 11:42 AM, felix gao gre1...@gmail.com wrote: this is where it is stored /opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/
On Thu, Jan 13, 2011 at 2:39 PM, Aaron Morton aa...@thelastpickle.com wrote: Pycassa https://github.com/pycassa/pycassa has documentation here: http://pycassa.github.com/pycassa/
Where does python-cassandra live? Aaron
On 14 Jan, 2011, at 11:34 AM, felix gao gre1...@gmail.com wrote: Guys, I just installed python-cassandra 0.6.1 and Thrift 0.5.0 on my machine and I would like to query against also write into a cassandra server. I guess i am pretty weak in google-fu, there isn't any examples for me get started with. Please help me on how to do this. Thanks, Felix
Re: python client example
Thanks guys, playing around with pycassa right now. Seems pretty good.
On Thu, Jan 13, 2011 at 2:56 PM, Aaron Morton aa...@thelastpickle.com wrote: Ah, I get it now. The python code generated from running ant gen-thrift-py. IMHO start with Pycassa *even* if you want to go your own way later. It solves a lot of problems for you and will save you time. A
On 14 Jan, 2011, at 11:46 AM, Tyler Hobbs ty...@riptano.com wrote: Right, python-cassandra just provides the raw Thrift API, which is no fun at all. You should start out with pycassa. - Tyler
On Thu, Jan 13, 2011 at 4:45 PM, Aaron Morton aa...@thelastpickle.com wrote: Sorry, I meant where did you get python-cassandra from on the web. Can you use Pycassa, even just as a learning experience? There is a tutorial here: http://pycassa.github.com/pycassa/tutorial.html
A
On 14 Jan, 2011, at 11:42 AM, felix gao gre1...@gmail.com wrote: this is where it is stored /opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/
On Thu, Jan 13, 2011 at 2:39 PM, Aaron Morton aa...@thelastpickle.com wrote: Pycassa https://github.com/pycassa/pycassa has documentation here: http://pycassa.github.com/pycassa/
Where does python-cassandra live? Aaron
On 14 Jan, 2011, at 11:34 AM, felix gao gre1...@gmail.com wrote: Guys, I just installed python-cassandra 0.6.1 and Thrift 0.5.0 on my machine and I would like to query against also write into a cassandra server. I guess i am pretty weak in google-fu, there isn't any examples for me get started with. Please help me on how to do this. Thanks, Felix
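For anyone landing on this thread looking for the missing getting-started example, here is a minimal pycassa sketch. It assumes a Cassandra 0.7 node on localhost with a keyspace 'Keyspace1' and column family 'Standard1' already defined (those names are illustrative, from the stock sample schema) and pycassa >= 1.0 installed; see the pycassa tutorial linked above for the authoritative API:

```python
# Sketch only: requires a running Cassandra node at localhost:9160
# with keyspace 'Keyspace1' / column family 'Standard1' defined.
import pycassa

pool = pycassa.ConnectionPool('Keyspace1', server_list=['localhost:9160'])
cf = pycassa.ColumnFamily(pool, 'Standard1')

# Write a row: key -> {column name: column value}
cf.insert('felix', {'language': 'python', 'client': 'pycassa'})

print(cf.get('felix'))                      # the whole row as a dict
print(cf.get('felix', columns=['client']))  # just one column
cf.remove('felix')                          # delete the row
```

The raw Thrift bindings force you to build ColumnPath/ColumnOrSuperColumn structures by hand for each of these calls, which is the "no fun at all" Tyler mentions.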
Re: cassandra row cache
On Thu, Jan 13, 2011 at 2:00 PM, Edward Capriolo edlinuxg...@gmail.com wrote: Is it possible that you are reading at READ.ONE and that READ.ONE only warms the cache on 1 of your three nodes = 20? 2nd read warms another 60%, and by the third read all the replicas are warm? 99%? This would be true if digest reads were not warming caches.
Digest reads do go through the cache path. -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
RE: java.net.BindException: Cannot assign requested address
It's an IP address problem: has your IP address changed? Please confirm it and restart Cassandra.
-Original Message- From: vikram prajapati [mailto:prajapativik...@hotmail.com] Sent: Friday, January 14, 2011 12:23 AM To: user@cassandra.apache.org Subject: Re: java.net.BindException: Cannot assign requested address
Gary Dusbabek gdusbabek at gmail.com writes: On Tue, Nov 3, 2009 at 15:44, mobiledreamers at gmail.com wrote:
ERROR - Exception encountered during startup. java.net.BindException: Cannot assign requested address
 at sun.nio.ch.Net.bind(Native Method)
 at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
 at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
 at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52)
You will see that error if StoragePort is being used by other system processes. Try using a different port. If you're using a unixy system, you can get a good idea if the port is in use from netstat. E.g. see if port 7000 is being listened on: netstat -an | grep 7000
Cheers, Gary.
I am using the 0.6.9 version. I am able to run it successfully on my windows machine. On ubuntu I get the Cannot assign requested address error. I checked whether the storage port (7000) and thrift port (9160) are being used by running the command you mentioned. They are not being used. I even tried using different ports but am getting the same error. I also tried putting 0.0.0.0 in Thrift Address.
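This diagnosis matches the error text: "Cannot assign requested address" (EADDRNOTAVAIL) means the configured listen address is not on any local interface (e.g. the machine's IP changed, or the config names another host's IP), which is different from "Address already in use" (EADDRINUSE), where the port is simply taken. A small Python sketch (an invented helper, not part of Cassandra) distinguishes the two cases:

```python
import errno
import socket

def try_bind(host, port):
    """Attempt a TCP bind and classify the failure mode.
    EADDRNOTAVAIL -> the address is not configured on this host
    EADDRINUSE    -> the address is fine but the port is taken."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((host, port))
        return "ok"
    except OSError as e:
        if e.errno == errno.EADDRNOTAVAIL:
            return "address not configured on this host"
        if e.errno == errno.EADDRINUSE:
            return "port already in use"
        raise
    finally:
        s.close()

print(try_bind("0.0.0.0", 9160))     # binds on all interfaces if 9160 is free
print(try_bind("203.0.113.1", 9160)) # a TEST-NET IP this host almost
                                     # certainly doesn't have
```

So if ports 7000/9160 show free in netstat but the bind still fails, check that listen_address/ThriftAddress matches an address in `ip addr` (or use 0.0.0.0 for Thrift, as tried above).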
Cassandra freezes under load when using libc6 2.11.1-0ubuntu7.5
Hey folks, We've discovered an issue on Ubuntu/Lenny with libc6 2.11.1-0ubuntu7.5 (it may also affect versions between 2.11.1-0ubuntu7.1 and 2.11.1-0ubuntu7.4). The bug affects systems when a large number of threads (or processes) are created rapidly. Once triggered, the system will become completely unresponsive for ten to fifteen minutes. We've seen this issue on our production Cassandra clusters under high load. Cassandra seems particularly susceptible to this issue because of the large thread pools that it creates. In particular, we suspect the unbounded thread pool for connection management may be pushing some systems over the edge. We're still trying to narrow down what changed in libc that is causing this issue. We also haven't tested things outside of xen, or on non-x86 architectures. But if you're seeing these symptoms, you may want to try upgrading libc6. I'll send out an update if we find anything else interesting. If anyone has any thoughts as to what the cause is, we're all ears! Hope this saves someone some heart-ache, Mike
Re: Old data not indexed
Hi all, More specifically, I added two rows of data. Row A (users['A']['state']='UT') is added before I add indexing to the column, and Row B (users['B']['state']='UT') after indexing. When I call get_indexed_slices (state='UT') to query the two rows, only Row B is returned. It's as if Cassandra cannot automatically index rows that were inserted before indexing. Thank you in advance for your help.
On Thu, 2011-01-13 at 15:33 +0800, Tan Yeh Zheng wrote: I tried to run the example on http://www.riptano.com/blog/whats-new-cassandra-07-secondary-indexes programmatically. After I index the column state, I tried get_indexed_slices (where state = 'UT') but it returned an empty list. But if I index first, then query, it'll return the correct result. Any advice is appreciated. Thanks. -- Best Regards, Tan Yeh Zheng Software Programmer ChartNexus® :: Chart Your Success ChartNexus Pte. Ltd. 15 Enggor Street #10-01 Realty Center Singapore 079716 Tel: (65) 6491 1456 Website: www.chartnexus.com
Re: about the data directory
On Thu, Jan 13, 2011 at 7:56 PM, raoyixuan (Shandy) raoyix...@huawei.com wrote: I am somewhat confused: why can users read the data from all nodes? I mean, the data is just kept on the replicas; how is that achieved?
-Original Message- From: sc...@scode.org [mailto:sc...@scode.org] On Behalf Of Peter Schuller Sent: Friday, January 14, 2011 1:19 AM To: user@cassandra.apache.org Subject: Re: about the data directory
So you mean just the replica node's sstable will be changed, right?
The data will only be written to the nodes that are part of the replica set of the row (with the exception of hinted handoff, but that's a different sstable).
If all the replica nodes broke down, can the users still read the data?
If *all* nodes in the replica set for a particular row are down, then you won't be able to read that row, no. -- / Peter Schuller
It does not matter which node you connect to. The node you connect to determines the hash of the key (or uses the key itself when using the Order Preserving Partitioner) to determine which node or nodes the data should be on. If the key is on that node, it returns it directly to the client. If the key is not on that node, Cassandra fetches it from another node and then returns that data. The client is unaware and does not need to be concerned with where the data came from.
RE: about the data directory
As an administrator, I want to know why I can read the data from any node, since the data is only kept on the replicas. Can you tell me? Thanks in advance.
-Original Message- From: Edward Capriolo [mailto:edlinuxg...@gmail.com] Sent: Friday, January 14, 2011 9:44 AM To: user@cassandra.apache.org Subject: Re: about the data directory
On Thu, Jan 13, 2011 at 7:56 PM, raoyixuan (Shandy) raoyix...@huawei.com wrote: I am somewhat confused: why can users read the data from all nodes? I mean, the data is just kept on the replicas; how is that achieved?
-Original Message- From: sc...@scode.org [mailto:sc...@scode.org] On Behalf Of Peter Schuller Sent: Friday, January 14, 2011 1:19 AM To: user@cassandra.apache.org Subject: Re: about the data directory
So you mean just the replica node's sstable will be changed, right?
The data will only be written to the nodes that are part of the replica set of the row (with the exception of hinted handoff, but that's a different sstable).
If all the replica nodes broke down, can the users still read the data?
If *all* nodes in the replica set for a particular row are down, then you won't be able to read that row, no. -- / Peter Schuller
It does not matter which node you connect to. The node you connect to determines the hash of the key (or uses the key itself when using the Order Preserving Partitioner) to determine which node or nodes the data should be on. If the key is on that node, it returns it directly to the client. If the key is not on that node, Cassandra fetches it from another node and then returns that data. The client is unaware and does not need to be concerned with where the data came from.
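Edward's point, that any node can act as coordinator and forward the request to the replica that owns the key, can be sketched with a toy model (pure illustration, nothing like Cassandra's actual code; a single owner per key stands in for the replica set):

```python
import hashlib

class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}  # local sstable stand-in

class Cluster:
    """Toy model of coordination: any node accepts a request and
    forwards it to the node that owns the key."""

    def __init__(self, names):
        self.nodes = [Node(n) for n in names]

    def owner(self, key):
        # Stand-in for token-based placement: hash the key to pick a node.
        h = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return self.nodes[h % len(self.nodes)]

    def write(self, coordinator, key, value):
        # The coordinator forwards the write to the owner.
        self.owner(key).data[key] = value

    def read(self, coordinator, key):
        # The coordinator fetches from the owner and returns to the client.
        return self.owner(key).data.get(key)

cluster = Cluster(["n1", "n2", "n3", "n4"])
cluster.write(cluster.nodes[0], "user:42", "alice")
# The client may contact any node; the answer is the same.
print(cluster.read(cluster.nodes[3], "user:42"))
```

Only one node's local store actually holds the row, yet a read through any node succeeds, which is exactly why seeing the keyspace directory on every node says nothing about where the data lives.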
Is there any way I could use keys of other rows as column names that could be sorted according to time ?
I would like to keep references to other rows as the names of super columns, and sort those super columns according to time. Is there any way I could implement that? Thanks in advance!
Re: Is there any way I could use keys of other rows as column names that could be sorted according to time ?
You could make the time a fixed-width integer and prefix your row keys with it, then set the comparator to ascii or utf.
Some issues:
- Will you have time collisions?
- Not sure what you are storing in the super columns, but there are limitations: http://wiki.apache.org/cassandra/CassandraLimitations
- If you are using cassandra 0.7, have you looked at the secondary indexes? http://www.riptano.com/blog/whats-new-cassandra-07-secondary-indexes
If you provide some more info on the problem you're trying to solve, we may be able to help some more.
Cheers
Aaron
On 14 Jan, 2011, at 04:27 PM, Aklin_81 asdk...@gmail.com wrote: I would like to keep references to other rows as the names of super columns, and sort those super columns according to time. Is there any way I could implement that? Thanks in advance!
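Aaron's fixed-width-integer suggestion can be sketched as follows (the naming scheme and helper are invented for illustration). Because the millisecond timestamp is zero-padded to a fixed width, lexicographic order under an ASCII/UTF8 comparator equals chronological order, and appending the referenced row key both disambiguates time collisions and preserves the reference:

```python
import time

def time_prefixed_name(ref_key, millis=None):
    """Build a column name '<zero-padded millis>:<referenced row key>'.
    Fixed-width padding makes ASCII/UTF8 comparator order == time order;
    the row-key suffix breaks ties between same-millisecond entries."""
    if millis is None:
        millis = int(time.time() * 1000)
    return "%020d:%s" % (millis, ref_key)

names = [time_prefixed_name("rowB", 2000),
         time_prefixed_name("rowA", 1000),
         time_prefixed_name("rowC", 1000)]
# Plain string sort == chronological order (ties broken by key).
for n in sorted(names):
    print(n)
```

On 0.7 you could also use a TimeUUID comparator, which gives chronological ordering plus built-in collision resistance, at the cost of having to store the referenced key in the column value instead of the name.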
limiting columns in a row
hi, the time-to-live feature in 0.7 is very nice and it made me want to ask about a somewhat similar feature. i have a stream of data consisting of entities and associated samples. so i create a row for each entity and the columns in each row contain the samples for that entity. when i get around to processing an entity i only care about the most recent N samples. so i read the most recent N columns and delete all the rest. what i would like would be a column family property that allows me to specify a maximum number of columns per row. then i could just keep writing and not have to do the deletes. in my case it would be fine if the limit is only 'eventually' applied (so that sometimes there might be extra columns). does this seem like a generally useful feature? if so, would it be hard to implement (maybe it could be done at compaction time like the TTL feature)? thanks, -mike
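Until such a per-row column cap exists, the read-latest-N-and-delete pattern the poster describes can be sketched like this (a dict stands in for the row, and the helper name is invented; in practice the delete would be a batch mutation against the column family):

```python
def trim_row(columns, keep):
    """Keep only the most recent `keep` columns of a row and return the
    names that were deleted. Assumes column names sort ascending by time
    (e.g. fixed-width timestamp names). `columns` is a dict standing in
    for a Cassandra row."""
    names = sorted(columns)
    to_delete = names[:-keep] if len(names) > keep else []
    for name in to_delete:
        del columns[name]
    return to_delete

# 12 samples named by zero-padded timestamp; keep the newest 5.
row = {"%05d" % t: "sample" for t in range(12)}
deleted = trim_row(row, keep=5)
print(len(row), len(deleted))
```

The proposed feature would in effect run this trim lazily at compaction time, the way expired TTL columns are purged, so writers could stop issuing the explicit deletes.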
Re: Cassandra freezes under load when using libc6 2.11.1-0ubuntu7.5
May or may not be related, but I thought I'd recount a similar experience we had in EC2 in hopes it helps someone else. As background, we had been running several servers in a 0.6.8 ring with no Cassandra issues (some EC2 issues, but none related to Cassandra) on multiple EC2 XL instances in a single availability zone. We decided to add several other nodes to a second AZ for reasons beyond the scope of this email. As we reached steady operational state in the new AZ, we noticed that the new nodes in the new AZ were repeatedly getting dropped from the ring. At first we attributed the drops to phi and expected cross-AZ latency. As we tried to pinpoint the issue, we found something very similar to what you describe: the EC2 VMs in the new AZ would become completely unresponsive. Not just the Java process hosting Cassandra, but the entire host. Shell commands would not execute for existing sessions, we could not establish new SSH sessions, and tails we had on active files wouldn't show any progress. It appeared as if the machines in the new AZ would seize for several minutes, then come back to life with little rhyme or reason as to why. Tickets opened with AMZN resulted in responses of "the physical server looks normal". After digging deeper, here's what we found.
To confirm, all nodes in both AZs were identical at the following levels:
* Kernel (2.6.32-305-ec2 #9-Ubuntu SMP), distro (Ubuntu 10.04.1 LTS) and glibc, all on x86_64
* All nodes were running identical Java distributions that we deployed ourselves, Sun 1.6.0_22-b04
* Same amount of virtualized RAM visible to the guest, same RAID stripe configuration across the same size/number of ephemeral drives
We noticed two things that were different across the VMs in the two AZs:
* The class of CPU exposed to the guest OSes across the two AZs (and presumably the same physical server above that guest).
** On hosts in the AZ not having issues, we see from the guest older Harpertown class Intel CPUs: model name : Intel(R) Xeon(R) CPU E5430 @ 2.66GHz
** On hosts in the AZ having issues, we see from the guest newer Nehalem class Intel CPUs: model name : Intel(R) Xeon(R) CPU E5507 @ 2.27GHz
* Percent steal was consistently higher on the new nodes, on average 25%, whereas the older (stable) VMs were around 9% at peak load
Consistently in our case, we only saw this seizing behavior on guests running on the newer Nehalem architecture CPUs. In digging a bit deeper on the problem machines, we also noticed the following:
* Most of the time, ParNew GC on the problematic hosts was fine, averaging around .04 real seconds. After spending time tuning the generations and heap size for our workload, we rarely have CMS collections and almost never have full GCs, even during full or anti-compactions.
* Rarely, and at the same time as the problematic machines would seize, a long-running ParNew collection would be recorded after the guest came back to life. Consistently this was between 180 and 220 seconds regardless of host, plenty of time for that host to be shunned from the ring.
The long ParNew GCs were a mystery. They *never* happened on the hosts in the other AZ (the Harpertown class) and rarely happened on the new guests, but we did observe the behavior within three hours of normal operation on each host in the new AZ. After lots of trial and error, we decided to remove ParNew collections from the equation and tried running a host in the new AZ with -XX:-UseParNewGC, and this eliminated the long ParNew problem. The flip side is, we now do serial collections on the young generation for half our ring, which means those nodes spend about 4x more time in GC than the other nodes, but they've been stable for two weeks since the change. That's what we know for sure, and we're back to operating without a hitch with the one JVM option change.
Editorial: what I think is happening is more complicated. Skip this part if you don't care about opinion; some of this reasoning is surely incorrect. In talking with multiple VMware experts (I don't have much experience in Xen, but I imagine the same is true there as well), it's generally a bad idea to virtualize too many cores (two seems to be the sweet spot). The reason is that if you have a heavily multithreaded application, and that app relies on consistent application of memory barriers across multiple cores (as Java does), the hypervisor has to wait for multiple physical cores to become available before it schedules the guest, so that each virtual core gets a consistent view of the virtual memory while scheduled. If the physical server is overcommitted, that wait time is exacerbated as the guest waits for the correct number of physical cores to become available (4 in our case). It's possible to tell this in VMware via esxtop; not sure in Xen. It would also be somewhat visible via %steal increases in the guest, which we saw, but that doesn't really explain a two-minute pause during garbage collection. My guess then, is that