Re: Key cache hit rate
3.8E-4 means 3.8 * 10^-4 = 0.00038 = 0.038%, I think. So your program must be using keys that are random enough relative to the key cache size.

maki
From iPhone

On 2011/04/16, at 15:17, mcasandra mohitanch...@gmail.com wrote:
You mean read it like .00038880248833592535? I didn't quite follow why. If it is 3.8880248833592535E then does it mean I got only a 3% hit, or .0003?

-- View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Key-cache-hit-rate-tp6277236p6278397.html Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.
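For anyone double-checking the arithmetic, a quick sketch in Python (using the hit-rate figure quoted in this thread):

```python
# Convert a key cache hit rate reported in scientific notation to a percentage.
rate = 3.8880248833592535e-4    # value reported in this thread

print('%.6f' % rate)            # the rate as a plain fraction
print('%.4f%%' % (rate * 100))  # as a percentage: about 0.04%, not 3.8%
```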
Re: cluster IP question and Jconsole?
8081 is your mx4j port, isn't it? You need to connect jconsole to the JMX_PORT specified in cassandra-env.sh.

maki
From iPhone

On 2011/04/16, at 13:56, tinhuty he tinh...@hotmail.com wrote:
Maki, thanks for your reply. For the second question, I wasn't using the loopback address, I was using the actual IP address for that server. I am able to telnet to that IP on port 8081, but using jconsole failed.

-----Original Message----- From: Maki Watanabe Sent: Friday, April 15, 2011 9:43 PM To: user@cassandra.apache.org Cc: tinhuty he Subject: Re: cluster IP question and Jconsole?

127.0.0.2 to 127.0.0.5 are valid IP addresses. They are just alias addresses on your loopback interface. Verify with:

    % ifconfig -a

127.0.0.0/8 is for loopback, so you can't connect to these addresses from remote machines. You may be able to configure SSH port forwarding from your monitoring host to the Cassandra node, though I haven't tried it.

maki

2011/4/16 tinhuty he tinh...@hotmail.com:
I have followed the description here http://www.edwardcapriolo.com/roller/edwardcapriolo/entry/lauching_5_node_cassandra_clusters to create 5 instances of Cassandra on one CentOS 5.5 machine. Using nodetool shows the 5 nodes are all running fine. Note the 5 nodes are using IPs 127.0.0.1 to 127.0.0.5. I understand 127.0.0.1 points to the local server, but what about 127.0.0.2 to 127.0.0.5? They look like invalid IPs to me; how come all 5 nodes work OK?

Another question: I have installed MX4J in instance 127.0.0.1 on port 8081. I am able to connect to http://server:8081/ from the browser. However, how do I connect using jconsole from another Windows machine? (My CentOS 5.5 doesn't have X installed; only SSH is allowed.) Thanks.
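For the jconsole-over-SSH question, a sketch of port forwarding (the hostname is a placeholder, and the port should be the JMX_PORT from your own cassandra-env.sh; note that JMX also opens a second, dynamically chosen RMI port, so a single tunnel is not always enough — see the JmxGotchas wiki page):

```shell
# From the monitoring machine: forward local port 8080 to the node's JMX port.
# "centos-host" is a placeholder for the CentOS machine running Cassandra.
ssh -N -L 8080:127.0.0.1:8080 user@centos-host

# Then point jconsole at the forwarded local port:
jconsole localhost:8080
```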
Re: Starting the Cassandra server from Java (without command line)
@Naren Thanks for the reply. I noticed that EmbeddedCassandraService is now part of the Cassandra distribution. The source code has changed compared to what Ran had posted at http://prettyprint.me/2010/02/14/running-cassandra-as-an-embedded-service/

I tried to use EmbeddedCassandraService with this code:

    @BeforeClass
    public static void startServer() {
        // C:\cassandra is where Cassandra is installed
        System.setProperty("storage-config", "C:\\cassandra\\conf\\");
        cassandra = new EmbeddedCassandraService();
        cassandra.start();
    }

But I got the following error:

    java.lang.ExceptionInInitializerError
        at org.apache.cassandra.service.EmbeddedCassandraService.start(EmbeddedCassandraService.java:58)
        ...
    Caused by: java.lang.RuntimeException: Couldn't figure out log4j configuration.
        at org.apache.cassandra.service.AbstractCassandraDaemon.<clinit>(AbstractCassandraDaemon.java:74)
        ...

I guess I am not setting the right properties? Any help is appreciated.

Thanks,
Sam

-- View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Starting-the-Cassandra-server-from-Java-without-command-line-tp6273826p6278552.html
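For what it's worth, a guessed fix: `storage-config` was the 0.6 property, while the 0.7 daemon looks for the `cassandra.config` and `log4j.configuration` system properties instead, so something along these lines may get past the log4j error (the Windows paths are assumptions matching the layout above):

```java
// Sketch only: point the 0.7 daemon at its YAML and log4j files before start().
// Both properties take URLs, not bare directory paths.
System.setProperty("cassandra.config",
        "file:///C:/cassandra/conf/cassandra.yaml");
System.setProperty("log4j.configuration",
        "file:///C:/cassandra/conf/log4j-server.properties");

EmbeddedCassandraService cassandra = new EmbeddedCassandraService();
cassandra.start();
```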
Re: Consistency model
If you are reading and writing at quorum, then what you are seeing shouldn't happen. You shouldn't be able to read N+1 until N+1 has been committed to a quorum of servers. At that point you should not be able to read N anymore, since there is no quorum that contains N.

Dan - I think you are right, except that quorum reads should be consistent even during a quorum write. You are not guaranteed to read N+1 until *after* a successful quorum write of N+1, but once you see N+1, you should never see N again, even if the write failed.

Sean

On Fri, Apr 15, 2011 at 1:29 PM, Dan Hendry dan.hendry.j...@gmail.com wrote:
So Cassandra does not use an atomic commit protocol at the cluster level. Strong consistency on a quorum read is only guaranteed *after* a successful quorum write. The behaviour you are seeing is possible if you are reading in the middle of a write, or if the write failed (which should be reported to your code via an exception).

Dan

-----Original Message----- From: James Cipar [mailto:jci...@cmu.edu] Sent: April-15-11 14:15 To: user@cassandra.apache.org Subject: Consistency model

I've been experimenting with the consistency model of Cassandra, and I found something that seems a bit unexpected. In my experiment, I have 2 processes, a reader and a writer, each accessing a Cassandra cluster with a replication factor greater than 1. In addition, I sometimes generate background traffic to simulate a busy cluster by uploading a large data file to another table. The writer executes a loop where it writes a single row that contains just a sequentially increasing sequence number and a timestamp.
In python this looks something like:

    while time.time() < start_time + duration:
        target_server = random.sample(servers, 1)[0]
        target_server = '%s:9160' % target_server
        row = {'seqnum': str(seqnum), 'timestamp': str(time.time())}
        seqnum += 1
        # print 'uploading to server %s, %s' % (target_server, row)
        pool = pycassa.connect('Keyspace1', [target_server])
        cf = pycassa.ColumnFamily(pool, 'Standard1')
        cf.insert('foo', row, write_consistency_level=consistency_level)
        pool.dispose()
        if sleeptime > 0.0:
            time.sleep(sleeptime)

The reader simply executes a loop reading this row and reporting whenever a sequence number is *less* than the previous sequence number. As expected, with consistency_level=ConsistencyLevel.ONE there are many inconsistencies, especially with a high replication factor. What is unexpected is that I still detect inconsistencies with ConsistencyLevel.QUORUM. This is unexpected because the documentation seems to imply that QUORUM will give consistent results. With background traffic the average difference in timestamps was 0.6s, and the maximum was 3.5s. This means that a client sees a version of the row, and can subsequently see another version of the row that is 3.5s older than the previous one.

What I imagine is happening is this, but I'd like someone who knows what they're talking about to tell me if it's actually the case: I think Cassandra is not using an atomic commit protocol to commit to the quorum of servers chosen when the write is made. This means that at some point in the middle of the write, some subset of the quorum have seen the write, while others have not. At this time, there is a quorum of servers that have not seen the update, so depending on which quorum the client reads from, it may or may not see the update. Of course, I understand that the client is not *choosing* a bad quorum to read from, it is just the first `q` servers to respond, but in this case it is effectively random and sometimes a bad quorum is chosen.
Does anyone have any other insight into what is going on here?
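The quorum guarantee discussed in this thread rests on a pigeonhole argument, which is easy to check directly (a sketch, independent of any Cassandra code):

```python
# A quorum is floor(RF/2) + 1 replicas. Two quorums together name
# 2 * (RF//2 + 1) > RF replicas, so they must share at least one node;
# that shared, newest-value node is why a completed QUORUM write is
# visible to every subsequent QUORUM read.
for rf in range(1, 12):
    quorum = rf // 2 + 1
    assert 2 * quorum > rf, rf   # any two quorums intersect

print('any two quorums intersect for RF = 1..11')
```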
Re: cassandra 0.6.3 error Connection refused to host: 127.0.0.1;
Anyone have a solution for this problem?

On 04/13/2011 05:20 PM, Ali Ahsan wrote:
I am not running any firewall; this is a physical machine, not EC2. I can telnet to port 8080:

    telnet 127.0.0.1 8080
    Trying 127.0.0.1...
    Connected to localhost.localdomain (127.0.0.1).
    Escape character is '^]'.

--
S.Ali Ahsan
Senior System Engineer
e-Business (Pvt) Ltd
49-C Jail Road, Lahore, P.O. Box 676, Lahore 54000, Pakistan
Tel: +92 (0)42 3758 7140 Ext. 128
Mobile: +92 (0)345 831 8769
Fax: +92 (0)42 3758 0027
Email: ali.ah...@panasiangroup.com
www.ebusiness-pg.com
www.panasiangroup.com

Confidentiality: This e-mail and any attachments may be confidential and/or privileged. If you are not a named recipient, please notify the sender immediately and do not disclose the contents to another person, use it for any purpose, or store or copy the information in any medium. Internet communications cannot be guaranteed to be timely, secure, error- or virus-free. We do not accept liability for any errors or omissions.
Re: What will be the steps for adding new nodes
2011/4/16 Roni r...@similarweb.com:
I have a 0.6.4 Cassandra cluster of two nodes in full replica (replication factor 2). I want to add two more nodes and balance the cluster (keeping replication factor 2). I want all of them to be seeds. What should the simple steps be:

1. Add <AutoBootstrap>true</AutoBootstrap> to all the nodes, or only the new ones?

You must add this option only on the new nodes.

2. Add <Seed>[new_node]</Seed> to the config file of the old nodes before adding the new ones?

If you do that, bootstrap will not work, and this step is not needed. I think only a few seed nodes are enough for fault tolerance.

3. Do the old nodes need to be restarted (if no change is needed in their config file)?

No, that's not needed.
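For 0.6.x, the answer above corresponds to a storage-conf.xml fragment like the following on each NEW node only (the IPs are placeholders for the two existing nodes; the existing nodes keep AutoBootstrap false and their Seed list unchanged):

```xml
<!-- New 0.6.x node only: bootstrap from the existing cluster. -->
<AutoBootstrap>true</AutoBootstrap>
<Seeds>
    <!-- Point at the existing nodes; a node listed as its own seed
         will not bootstrap. -->
    <Seed>192.168.1.10</Seed>
    <Seed>192.168.1.11</Seed>
</Seeds>
```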
Re: cassandra 0.6.3 error Connection refused to host: 127.0.0.1;
http://wiki.apache.org/cassandra/JmxGotchas

On Sat, Apr 16, 2011 at 12:20 PM, Ali Ahsan ali.ah...@panasiangroup.com wrote:
Anyone have a solution for this problem? [...]

--
Tyler Hobbs
Software Engineer, DataStax http://datastax.com/
Maintainer of the pycassa http://github.com/pycassa/pycassa Cassandra Python client library
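A common gotcha from that wiki page is RMI advertising the loopback address to remote clients; a guessed cassandra-env.sh tweak (the IP is a placeholder for the node's reachable address):

```shell
# cassandra-env.sh: make the JMX/RMI stubs advertise an address that is
# reachable from the monitoring machine instead of the loopback address.
JVM_OPTS="$JVM_OPTS -Djava.rmi.server.hostname=192.168.1.10"
```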
Re: Problems with subcolumn retrieval after upgrade from 0.6 to 0.7
Can you run the same request as a get_slice naming the column in the SlicePredicate and see what comes back? Can you reproduce the fault with logging set at DEBUG and send the logs? Also, what's the compare function like for your custom type?

Cheers
Aaron

On 16 Apr 2011, at 07:34, Abraham Sanderson wrote:

I'm having some issues with a few of my ColumnFamilies after a Cassandra upgrade/import from 0.6.1 to 0.7.4. I followed the instructions to upgrade and everything seemed to work OK...until I got into the application and noticed some weird behavior. I occasionally got the following stacktrace in Cassandra when I did get operations for a single subcolumn of some of the Super type CFs:

    ERROR 12:56:05,669 Internal error processing get
    java.lang.AssertionError
        at org.apache.cassandra.thrift.CassandraServer.get(CassandraServer.java:300)
        at org.apache.cassandra.thrift.Cassandra$Processor$get.process(Cassandra.java:2655)
        at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2555)
        at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:206)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:636)

The assertion that is failing is the check that only one column is retrieved by the get. I did some debugging with the cli and a remote debugger and found a few interesting patterns. First, the problem does not seem to be consistently reproducible. If one supercolumn is affected, though, it happens more frequently for subcolumns that sort near the beginning of the range. For columns near the end of the range it is more intermittent, and it almost never occurs when I step through the code line by line. The only factor I can think of that might cause issues is that I am using custom data types for all supercolumns and columns.
I originally thought I might be reading past the end of the ByteBuffer, but I have quadruple-checked that this is not the case.

Abe Sanderson
Re: question about performance of Cassandra 0.7.4 under a read-heavy workload.
I'm assuming you are not getting a hot spot in the ring using the OPP.

When running 0.7.4 with reduced concurrent reads: if you see the read stage backing up in nodetool tpstats while the output from iostat shows the IO system is not stressed, then you should return concurrent reads to the recommended value. Not sure why the recommendation changed, but how does 0.7.4 perform with the recommended number of concurrent readers? Out of interest, can anyone talk about IO changes in 0.7.x that resulted in the new recommendation?

Thanks
Aaron

On 16 Apr 2011, at 13:33, 魏金仙 wrote:

To make a comparison, 10 threads were run against the two workloads separately. Below is the result for Cassandra 0.7.4:

write-heavy workload (write/read: 50%/50%)
  median throughput: 5816 operations/second (i.e., 2908 writes and 2908 reads)
  update latency: 1.32ms; read latency: 1.81ms
read-heavy workload (write/read: 5%/95%)
  median throughput: 40 operations/second (i.e., 2 writes and 38 reads)
  update latency: 1.85ms; read latency: 90.43ms

And for Cassandra 0.6.6, the result is:

write-heavy workload (write/read: 50%/50%)
  median throughput: 3284 operations/second (i.e., 1642 writes and 1642 reads)
  update latency: 2.29ms; read latency: 3.51ms
read-heavy workload (write/read: 5%/95%)
  median throughput: 2759 operations/second (i.e., 138 writes and 2621 reads)
  update latency: 2.33ms; read latency: 3.53ms

All the tests were run in the same environment, and most Cassandra configurations are the defaults, except that we chose OrderPreservingPartitioner for all the tests and set concurrent_reads to 8 (which is the default value in 0.6.6; the default value in 0.7.4 is 32).

At 2011-04-16 06:53:01, Aaron Morton aa...@thelastpickle.com wrote:
We will need to know more about the number of requests, iostats, etc. There is no reason for it to run slower.
Aaron

On 16/04/2011, at 2:35 AM, 魏金仙 sei_...@126.com wrote:
I just deployed Cassandra 0.7.4 as a 6-server cluster and tested its performance via YCSB.
The result seems confusing when compared to that of Cassandra 0.6.6. Under a write-heavy workload (write/read: 50%/50%), Cassandra 0.7.4 obtains a really satisfactory latency; both the read latency and write latency are much lower than those of Cassandra 0.6.6. However, under a read-heavy workload (write/read: 5%/95%), Cassandra 0.7.4 performs far worse than Cassandra 0.6.6 does. Did I miss something?
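For reference, the setting being discussed, as a cassandra.yaml fragment (0.7.x; shown with the 0.7.4 default restored, per Aaron's suggestion):

```yaml
# cassandra.yaml (0.7.x): the test above pinned this to 8, the 0.6.6
# default. The 0.7.4 default is 32; the usual rule of thumb for
# concurrent_reads is roughly 16 * number_of_data_drives.
concurrent_reads: 32
```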
Re: Two versions of schema
I don't think I got a correct answer to my original post. Can someone please help?

-- View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Two-versions-of-schema-tp6277365p6280070.html
Re: Consistency model
Here it is. There is some setup code and global variable definitions that I left out of the previous code, but they are pretty similar to the setup code here.

    import pycassa
    import random
    import time

    consistency_level = pycassa.cassandra.ttypes.ConsistencyLevel.QUORUM
    duration = 600
    sleeptime = 0.0
    hostlist = 'worker-hostlist'

    def read_servers(fn):
        f = open(fn)
        servers = []
        for line in f:
            servers.append(line.strip())
        f.close()
        return servers

    servers = read_servers(hostlist)
    start_time = time.time()
    seqnum = -1
    timestamp = 0

    while time.time() < start_time + duration:
        target_server = random.sample(servers, 1)[0]
        target_server = '%s:9160' % target_server
        try:
            pool = pycassa.connect('Keyspace1', [target_server])
            cf = pycassa.ColumnFamily(pool, 'Standard1')
            row = cf.get('foo', read_consistency_level=consistency_level)
            pool.dispose()
        except:
            time.sleep(sleeptime)
            continue
        sq = int(row['seqnum'])
        ts = float(row['timestamp'])
        if sq < seqnum:
            print 'Row changed: %i %f -> %i %f' % (seqnum, timestamp, sq, ts)
        seqnum = sq
        timestamp = ts
        if sleeptime > 0.0:
            time.sleep(sleeptime)

On Apr 16, 2011, at 5:20 PM, Tyler Hobbs wrote:
James, Would you mind sharing your reader process code as well?

On Fri, Apr 15, 2011 at 1:14 PM, James Cipar jci...@cmu.edu wrote: [...]

--
Tyler Hobbs
Software Engineer, DataStax
Maintainer of the pycassa Cassandra Python client library
Re: Consistency model
Here's what's probably happening. I'm assuming RF=3 and QUORUM writes/reads here. I'll call the replicas A, B, and C.

1. Writer process writes sequence number 1 and everything works fine. A, B, and C all have sequence number 1.

2. Writer process writes sequence number 2. Replica A writes successfully, but B and C fail to respond in time, and a TimedOutException is returned. pycassa waits to retry the operation.

3. Reader process reads and gets a response from A and B. When the rows from A and B are merged, sequence number 2 is the newest and is returned. A read repair is pushed to B and C, but they don't yet update their data.

4. Reader process reads again and gets a response from B and C (before they've repaired). These both report sequence number 1, so that's returned to the client. This is where you get a decreasing sequence number.

5. pycassa eventually retries the write; B and C eventually repair their data. Either way, both B and C shortly have sequence number 2.

I've left out some of the details of read repair, and this scenario could happen in several slightly different ways, but it should give you an idea of what's happening.

On Sat, Apr 16, 2011 at 8:35 PM, James Cipar jci...@cmu.edu wrote:
Here it is. There is some setup code and global variable definitions that I left out of the previous code, but they are pretty similar to the setup code here. [...]
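Tyler's five steps can be sketched as a toy model (plain Python, not Cassandra code; "newest value wins" stands in for timestamp reconciliation, and the replica names match his A, B, C):

```python
# RF=3, QUORUM=2. Each replica stores the latest sequence number it has seen.
replicas = {'A': 1, 'B': 1, 'C': 1}      # step 1: write of 1 fully replicated

replicas['A'] = 2                        # step 2: write of 2 reaches only A
                                         # before the coordinator times out

def quorum_read(responders):
    # steps 3-4: a QUORUM read merges the rows from the first two replicas
    # to respond and returns the newest one.
    return max(replicas[r] for r in responders)

first = quorum_read(['A', 'B'])          # step 3: sees the new value, 2
second = quorum_read(['B', 'C'])         # step 4: before read repair lands,
                                         # B and C still say 1
print(first, second)                     # prints: 2 1 -- a "decreasing" read
```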
Re: Consistency model
Tyler, your answer seems to contradict this email by Jonathan Ellis [1]. In it Jonathan says, "The important guarantee this gives you is that once one quorum read sees the new value, all others will too. You can't see the newest version, then see an older version on a subsequent write [sic, I assume he meant read], which is the characteristic of non-strong consistency."

Jonathan also says, "{X, Y} and {X, Z} are equivalent: one node with the write, and one without. The read will recognize that X's version needs to be sent to Z, and the write will be complete. This read and all subsequent ones will see the write. (Z [sic, I assume he meant Y] will be replicated to asynchronously via read repair.)"

To me, the statement "this read and all subsequent ones will see the write" implies that the new value must be committed to Y or Z before the read can return. If not, the statement must be false.

Sean

[1]: http://mail-archives.apache.org/mod_mbox/cassandra-user/201102.mbox/%3caanlktimegp8h87mgs_bxzknck-a59whxf-xx58hca...@mail.gmail.com%3E

On Sat, Apr 16, 2011 at 7:44 PM, Tyler Hobbs ty...@datastax.com wrote: [...]