Re: Key cache hit rate

2011-04-16 Thread Watanabe Maki
3.8E-4 means 3.8 * 10^-4 = 0.00038 = 0.038%, I think.
So your program must be using keys that are random enough relative to the key cache size.

maki


From iPhone


On 2011/04/16, at 15:17, mcasandra mohitanch...@gmail.com wrote:

 You mean read it like .00038880248833592535E? I didn't quite follow why? If
 it is 3.8880248833592535E then does it mean I got only 3% hit or .0003?
 
 --
 View this message in context: 
 http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Key-cache-hit-rate-tp6277236p6278397.html
 Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
 Nabble.com.


Re: cluster IP question and Jconsole?

2011-04-16 Thread Watanabe Maki
8081 is your mx4j port, isn't it? You need to connect jconsole to the JMX_PORT 
specified in cassandra-env.sh.
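For reference, in a stock 0.7 cassandra-env.sh the setting looks like this (7199 is a common default; the actual value in your file may differ):

```
# cassandra-env.sh -- jconsole must target this port, not the mx4j HTTP port (8081)
JMX_PORT="7199"
```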

maki

From iPhone


On 2011/04/16, at 13:56, tinhuty he tinh...@hotmail.com wrote:

 Maki, thanks for your reply. For the second question, I wasn't using the 
 loopback address, I was using the actual IP address for that server. I am 
 able to telnet to that IP on port 8081, but using jconsole failed.
 
 -Original Message- From: Maki Watanabe
 Sent: Friday, April 15, 2011 9:43 PM
 To: user@cassandra.apache.org
 Cc: tinhuty he
 Subject: Re: cluster IP question and Jconsole?
 
 127.0.0.2 to 127.0.0.5 are valid IP addresses. Those are just alias
 addresses for your loopback interface.
 Verify:
 % ifconfig -a
 
 127.0.0.0/8 is for loopback, so you can't connect this address from
 remote machines.
 You may be able to configure SSH port forwarding from your monitoring
 host to the cassandra node, though I haven't tried.
 
 maki
 
 2011/4/16 tinhuty he tinh...@hotmail.com:
 I have followed the description here
 http://www.edwardcapriolo.com/roller/edwardcapriolo/entry/lauching_5_node_cassandra_clusters
 to create 5 instances of cassandra on one CentOS 5.5 machine. Using
 nodetool shows the 5 nodes are all running fine.
 
 Note the 5 nodes are using IPs 127.0.0.1 to 127.0.0.5. I understand 127.0.0.1
 points to the local server, but what about 127.0.0.2 to 127.0.0.5? They look
 like invalid IPs to me, so how come all 5 nodes are working OK?
 
 Another question. I have installed MX4J in instance 127.0.0.1 on port 8081.
 I am able to connect to http://server:8081/ from the browser. However, how do
 I connect using JConsole, which is installed on another Windows
 machine? (My CentOS 5.5 box doesn't have X installed; only SSH is allowed.)
 
 Thanks. 
 


Re: Starting the Cassandra server from Java (without command line)

2011-04-16 Thread sam_
@Naren Thanks for the reply. 
I noticed that EmbeddedCassandraService is now part of the Cassandra
distribution. The source code is changed compared to what Ran had posted at
http://prettyprint.me/2010/02/14/running-cassandra-as-an-embedded-service/

I tried to use EmbeddedCassandraService with this code:

@BeforeClass
public static void startServer() {
// C:\cassandra is where cassandra is installed
System.setProperty("storage-config", "C:\\cassandra\\conf\\");

cassandra = new EmbeddedCassandraService();
cassandra.start();
}

But I got the following error:

java.lang.ExceptionInInitializerError
at
org.apache.cassandra.service.EmbeddedCassandraService.start(EmbeddedCassandraService.java:58)
...
...
Caused by: java.lang.RuntimeException: Couldn't figure out log4j
configuration.
at
org.apache.cassandra.service.AbstractCassandraDaemon.<clinit>(AbstractCassandraDaemon.java:74)
...

I guess I am not setting the right properties?

Any help is appreciated.

Thanks,
Sam
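Worth noting: in 0.7, `storage-config` (a 0.6-era property) is no longer read; the daemon looks for `cassandra.config` (a URL to cassandra.yaml), and the "Couldn't figure out log4j configuration" error comes from the missing `log4j.configuration` property. A sketch of the JVM properties involved (paths are illustrative):

```
-Dcassandra.config=file:///C:/cassandra/conf/cassandra.yaml
-Dlog4j.configuration=file:///C:/cassandra/conf/log4j-server.properties
```

The same pair can be set with System.setProperty(...) before constructing EmbeddedCassandraService.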

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Starting-the-Cassandra-server-from-Java-without-command-line-tp6273826p6278552.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


Re: Consistency model

2011-04-16 Thread Sean Bridges
If you are reading and writing at quorum, then what you are seeing
shouldn't happen.  You shouldn't be able to read N+1 until N+1 has
been committed to a quorum of servers.  At this point you should not
be able to read N anymore, since there is no quorum that contains N.

Dan - I think you are right, except that quorum reads should be
consistent even during a quorum write.  You are not guaranteed to read
N+1 until *after* a successful quorum write of N+1, but once you see
N+1, you should never see N again, even if the write failed.

Sean

On Fri, Apr 15, 2011 at 1:29 PM, Dan Hendry dan.hendry.j...@gmail.com wrote:
 So Cassandra does not use an atomic commit protocol at the cluster level.
 Strong consistency on a quorum read is only guaranteed *after* a successful
 quorum write. The behaviour you are seeing is possible if you are reading in
 the middle of a write or the write failed (which should be reported to your
 code via an exception).

 Dan

 -Original Message-
 From: James Cipar [mailto:jci...@cmu.edu]
 Sent: April-15-11 14:15
 To: user@cassandra.apache.org
 Subject: Consistency model

 I've been experimenting with the consistency model of Cassandra, and I found
 something that seems a bit unexpected.  In my experiment, I have 2
 processes, a reader and a writer, each accessing a Cassandra cluster with a
 replication factor greater than 1.  In addition, sometimes I generate
 background traffic to simulate a busy cluster by uploading a large data file
 to another table.

 The writer executes a loop where it writes a single row that contains just
 a sequentially increasing sequence number and a timestamp.  In python this
 looks something like:

    while time.time() < start_time + duration:
        target_server = random.sample(servers, 1)[0]
        target_server = '%s:9160'%target_server

        row = {'seqnum':str(seqnum), 'timestamp':str(time.time())}
        seqnum += 1
        # print 'uploading to server %s, %s'%(target_server, row)

        pool = pycassa.connect('Keyspace1', [target_server])
        cf = pycassa.ColumnFamily(pool, 'Standard1')
        cf.insert('foo', row, write_consistency_level=consistency_level)
        pool.dispose()

        if sleeptime > 0.0:
            time.sleep(sleeptime)


 The reader simply executes a loop reading this row and reporting whenever a
 sequence number is *less* than the previous sequence number.  As expected,
 with consistency_level=ConsistencyLevel.ONE there are many inconsistencies,
 especially with a high replication factor.

 What is unexpected is that I still detect inconsistencies when it is set at
 ConsistencyLevel.QUORUM.  This is unexpected because the documentation seems
 to imply that QUORUM will give consistent results.  With background traffic
 the average difference in timestamps was 0.6s, and the maximum was 3.5s.
 This means that a client sees a version of the row, and can subsequently see
 another version of the row that is 3.5s older than the previous.

 What I imagine is happening is this, but I'd like someone who knows that
 they're talking about to tell me if it's actually the case:

 I think Cassandra is not using an atomic commit protocol to commit to the
 quorum of servers chosen when the write is made.  This means that at some
 point in the middle of the write, some subset of the quorum have seen the
 write, while others have not.  At this time, there is a quorum of servers
 that have not seen the update, so depending on which quorum the client reads
 from, it may or may not see the update.

 Of course, I understand that the client is not *choosing* a bad quorum to
 read from, it is just the first `q` servers to respond, but in this case it
 is effectively random and sometimes a bad quorum is chosen.

 Does anyone have any other insight into what is going on here?




Re: cassandra 0.6.3 error Connection refused to host: 127.0.0.1;

2011-04-16 Thread Ali Ahsan

Any one have solution for this problem ?


On 04/13/2011 05:20 PM, Ali Ahsan wrote:
I am not running any firewall; this physical machine is not EC2. I can 
telnet to port 8080:



 telnet 127.0.0.1 8080
Trying 127.0.0.1...
Connected to localhost.localdomain (127.0.0.1).
Escape character is '^]'.






--
S.Ali Ahsan

Senior System Engineer

e-Business (Pvt) Ltd

49-C Jail Road, Lahore, P.O. Box 676
Lahore 54000, Pakistan

Tel: +92 (0)42 3758 7140 Ext. 128

Mobile: +92 (0)345 831 8769

Fax: +92 (0)42 3758 0027

Email: ali.ah...@panasiangroup.com



www.ebusiness-pg.com

www.panasiangroup.com

Confidentiality: This e-mail and any attachments may be confidential
and/or privileged. If you are not a named recipient, please notify the
sender immediately and do not disclose the contents to another person
use it for any purpose or store or copy the information in any medium.
Internet communications cannot be guaranteed to be timely, secure, error
or virus-free. We do not accept liability for any errors or omissions.



Re: What will be the steps for adding new nodes

2011-04-16 Thread ruslan usifov
2011/4/16 Roni r...@similarweb.com:
 I have a 0.6.4 Cassandra cluster of two nodes in full replica (replication
 factor 2). I want to add two more nodes and balance the cluster (replication
 factor 2).

 I want all of them to be seeds.



 What should be the simple steps:

 1. add <AutoBootstrap>true</AutoBootstrap> to all the nodes or only
 the new ones?


You must add this option only on the new nodes.

 2. add <Seed>[new_node]</Seed> to the config file of the old nodes
 before adding the new ones?



If you do that, bootstrap will not work: a node listed as a seed will not
auto-bootstrap. This step is not needed anyway; a few seed nodes are
enough for fault tolerance.

 3. do the old node need to be restarted (if no change is needed in their
 config file)?


No, that is not needed.
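Pulling the answers together, a plausible 0.6 storage-conf.xml fragment for the new nodes might look like this (element names as in 0.6; host names are placeholders):

```
<!-- on the NEW nodes only -->
<AutoBootstrap>true</AutoBootstrap>

<!-- point the new nodes at the existing nodes as seeds -->
<Seeds>
    <Seed>old-node-1</Seed>
    <Seed>old-node-2</Seed>
</Seeds>
```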


Re: cassandra 0.6.3 error Connection refused to host: 127.0.0.1;

2011-04-16 Thread Tyler Hobbs
http://wiki.apache.org/cassandra/JmxGotchas

On Sat, Apr 16, 2011 at 12:20 PM, Ali Ahsan ali.ah...@panasiangroup.comwrote:

 Any one have solution for this problem ?



 On 04/13/2011 05:20 PM, Ali Ahsan wrote:

 I am not running any firewall this physical machine not EC2,I can telnet
 to port 8080


  telnet 127.0.0.1 8080
 Trying 127.0.0.1...
 Connected to localhost.localdomain (127.0.0.1).
 Escape character is '^]'.









-- 
Tyler Hobbs
Software Engineer, DataStax http://datastax.com/
Maintainer of the pycassa http://github.com/pycassa/pycassa Cassandra
Python client library


Re: Problems with subcolumn retrieval after upgrade from 0.6 to 0.7

2011-04-16 Thread aaron morton
Can you run the same request as a get_slice, naming the column in the 
SlicePredicate, and see what comes back? 

Can you reproduce the fault with logging set at DEBUG and send the logs ?

Also, what's the compare function like for your custom type?

Cheers
Aaron

 
On 16 Apr 2011, at 07:34, Abraham Sanderson wrote:

 I'm having some issues with a few of my ColumnFamilies after a cassandra 
 upgrade/import from 0.6.1 to 0.7.4.  I followed the instructions to upgrade 
 and everything seemed to work OK...until I got into the application and noticed 
 some weird behavior.  I was getting the following stacktrace in cassandra 
 occasionally when I did get operations for a single subcolumn for some of 
 the Super type CFs:
 
 ERROR 12:56:05,669 Internal error processing get
 java.lang.AssertionError
 at org.apache.cassandra.thrift.CassandraServer.get(CassandraServer.java:300)
 at 
 org.apache.cassandra.thrift.Cassandra$Processor$get.process(Cassandra.java:2655)
 at 
 org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2555)
 at 
 org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:206)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:636)
 
 The assertion that is failing is the check that only one column is retrieved 
 by the get.  I did some debugging with the cli and a remote  debugger and 
 found a few interesting patterns.  First, the problem does not seem 
 consistently duplicatable.  If one supercolumn is affected though, it will 
 happen more frequently for subcolumns that when sorted appear at the 
 beginning of the range.  For columns near the end of the range, it seems to 
 be more intermittent, and almost never occurs when I step through the code 
 line by line.  The only factor I can think of that might cause issues is that 
 I am using custom data types for all supercolumns and columns.  I originally 
 thought I might be reading past the end of the ByteBuffer, but I have 
 quadruple-checked that this is not the case.
 
 Abe Sanderson



Re: question about performance of Cassandra 0.7.4 under a read-heavy workload.

2011-04-16 Thread aaron morton
I'm assuming you are not getting a hot spot in the ring using the OPP. 

When running 0.7.4 with reduced concurrent reads, if you see the read stage 
backing up (via nodetool tpstats) while the output from iostat shows that the IO 
system is not stressed, then you should return the concurrent reads to the 
recommended value. 

Not sure why the recommendation changed, but how does 0.7.4 perform with the 
recommended number of concurrent readers?

Out of interest can anyone talk about IO changes in 0.7.X that resulted in the 
new recommendation?

Thanks
Aaron


On 16 Apr 2011, at 13:33, 魏金仙 wrote:

 To make a comparison, 10 threads were run against the two workloads 
 separately. Below is the result for Cassandra 0.7.4.
 write heavy workload(i.e., write/read: 50%/50%)
 median throughput:  5816 operations/second(i.e., 2908 writes and 2908 reads) 
 update latency:1.32ms read latency:1.81ms
 read heavy workload(i.e., write/read: 5%/95%)
 median throughput:  40 operations/second(i.e., 2 writes and 38 reads) update 
 latency:1.85ms read latency:90.43ms
 
 and for cassandra0.6.6, the result is:
 write heavy workload(i.e., write/read: 50%/50%)
 median throughput:  3284 operations/second(i.e., 1642 writes and 1642 reads) 
 update latency:2.29ms read latency:3.51ms
 read heavy workload(i.e., write/read: 5%/95%)
 median throughput:  2759 operations/second(i.e., 2621 writes and 138 reads) 
 update latency:2.33ms read latency:3.53ms
 
 All the tests were run in one environment, and most configurations of 
 cassandra are left at the defaults, except that we chose 
 OrderPreservingPartitioner for all the tests and set concurrent_reads to 8 
 (the default value in 0.6.6, whereas the default in 0.7.4 is 32). 
 
 
 
 
 At 2011-04-16 06:53:01,Aaron Morton aa...@thelastpickle.com wrote:
 Will need to know more about the number of requests, iostats etc. There is no 
 reason for it to run slower.
 
 Aaron
 On 16/04/2011, at 2:35 AM, 魏金仙 sei_...@126.com wrote:
 
 I just deployed cassandra 0.7.4 as a 6-server cluster and tested its 
 performance via YCSB.
 The result seems confusing when compared to that of Cassandra0.6.6. Under a 
 write heavy workload(i.e., write/read: 50%/50%), Cassandra0.7.4 obtains a 
 really satisfactory latency. I mean both the read latency and write latency 
 is much lower than those of Cassandra0.6.6.
 However, under a read heavy workload(i.e., write/read:5%/95%), 
 Cassandra0.7.4 performs far worse than Cassandra0.6.6 does.
 
 Did I miss something?
 
 
 
 



Re: Two versions of schema

2011-04-16 Thread mcasandra
I don't think I got correct answer to my original post. Can someone please
help?

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Two-versions-of-schema-tp6277365p6280070.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


Re: Consistency model

2011-04-16 Thread James Cipar
Here it is.  There is some setup code and global variable definitions that I 
left out of the previous code, but they are pretty similar to the setup code 
here.

import pycassa
import random
import time

consistency_level = pycassa.cassandra.ttypes.ConsistencyLevel.QUORUM
duration = 600
sleeptime = 0.0
hostlist = 'worker-hostlist'

def read_servers(fn):
f = open(fn)
servers = []
for line in f:
servers.append(line.strip())
f.close()
return servers

servers = read_servers(hostlist)
start_time = time.time()
seqnum = -1
timestamp = 0

while time.time() < start_time + duration:
    target_server = random.sample(servers, 1)[0]
    target_server = '%s:9160'%target_server

    try:
        pool = pycassa.connect('Keyspace1', [target_server])
        cf = pycassa.ColumnFamily(pool, 'Standard1')
        row = cf.get('foo', read_consistency_level=consistency_level)
        pool.dispose()
    except:
        time.sleep(sleeptime)
        continue

    sq = int(row['seqnum'])
    ts = float(row['timestamp'])

    if sq < seqnum:
        print 'Row changed: %i %f -> %i %f'%(seqnum, timestamp, sq, ts)
    seqnum = sq
    timestamp = ts

    if sleeptime > 0.0:
        time.sleep(sleeptime)




On Apr 16, 2011, at 5:20 PM, Tyler Hobbs wrote:

 James,
 
 Would you mind sharing your reader process code as well?
 
 On Fri, Apr 15, 2011 at 1:14 PM, James Cipar jci...@cmu.edu wrote:
 I've been experimenting with the consistency model of Cassandra, and I found 
 something that seems a bit unexpected.  In my experiment, I have 2 processes, 
 a reader and a writer, each accessing a Cassandra cluster with a replication 
 factor greater than 1.  In addition, sometimes I generate background traffic 
 to simulate a busy cluster by uploading a large data file to another table.
 
 The writer executes a loop where it writes a single row that contains just an 
 sequentially increasing sequence number and a timestamp.  In python this 
 looks something like:
 
    while time.time() < start_time + duration:
target_server = random.sample(servers, 1)[0]
target_server = '%s:9160'%target_server
 
row = {'seqnum':str(seqnum), 'timestamp':str(time.time())}
seqnum += 1
# print 'uploading to server %s, %s'%(target_server, row)
 
pool = pycassa.connect('Keyspace1', [target_server])
cf = pycassa.ColumnFamily(pool, 'Standard1')
cf.insert('foo', row, write_consistency_level=consistency_level)
pool.dispose()
 
    if sleeptime > 0.0:
time.sleep(sleeptime)
 
 
 The reader simply executes a loop reading this row and reporting whenever a 
 sequence number is *less* than the previous sequence number.  As expected, 
 with consistency_level=ConsistencyLevel.ONE there are many inconsistencies, 
 especially with a high replication factor.
 
 What is unexpected is that I still detect inconsistencies when it is set at 
 ConsistencyLevel.QUORUM.  This is unexpected because the documentation seems 
 to imply that QUORUM will give consistent results.  With background traffic 
 the average difference in timestamps was 0.6s, and the maximum was 3.5s.  
 This means that a client sees a version of the row, and can subsequently see 
 another version of the row that is 3.5s older than the previous.
 
 What I imagine is happening is this, but I'd like someone who knows that 
 they're talking about to tell me if it's actually the case:
 
 I think Cassandra is not using an atomic commit protocol to commit to the 
 quorum of servers chosen when the write is made.  This means that at some 
 point in the middle of the write, some subset of the quorum have seen the 
 write, while others have not.  At this time, there is a quorum of servers 
 that have not seen the update, so depending on which quorum the client reads 
 from, it may or may not see the update.
 
 Of course, I understand that the client is not *choosing* a bad quorum to 
 read from, it is just the first `q` servers to respond, but in this case it 
 is effectively random and sometimes an bad quorum is chosen.
 
 Does anyone have any other insight into what is going on here?
 
 
 
 -- 
 Tyler Hobbs
 Software Engineer, DataStax
 Maintainer of the pycassa Cassandra Python client library
 



Re: Consistency model

2011-04-16 Thread Tyler Hobbs
Here's what's probably happening:

I'm assuming RF=3 and QUORUM writes/reads here.  I'll call the replicas A,
B, and C.

1.  Writer process writes sequence number 1 and everything works fine.  A,
B, and C all have sequence number 1.
2.  Writer process writes sequence number 2.  Replica A writes successfully,
B and C fail to respond in time, and a TimedOutException is returned.
pycassa waits to retry the operation.
3.  Reader process reads, gets a response from A and B.  When the row from A
and B is merged, sequence number 2 is the newest and is returned.  A read
repair is pushed to B and C, but they don't yet update their data.
4.  Reader process reads again, gets a response from B and C (before they've
repaired).  These both report sequence number 1, so that's returned to the
 client.  This is where you get a decreasing sequence number.
5.  pycassa eventually retries the write; B and C eventually repair their
data.  Either way, both B and C shortly have sequence number 2.

I've left out some of the details of read repair, and this scenario could
happen in several slightly different ways, but it should give you an idea of
what's happening.

On Sat, Apr 16, 2011 at 8:35 PM, James Cipar jci...@cmu.edu wrote:

 Here it is.  There is some setup code and global variable definitions that
 I left out of the previous code, but they are pretty similar to the setup
 code here.

 import pycassa
 import random
 import time

 consistency_level = pycassa.cassandra.ttypes.ConsistencyLevel.QUORUM
 duration = 600
 sleeptime = 0.0
 hostlist = 'worker-hostlist'

 def read_servers(fn):
 f = open(fn)
 servers = []
 for line in f:
 servers.append(line.strip())
 f.close()
 return servers

 servers = read_servers(hostlist)
 start_time = time.time()
 seqnum = -1
 timestamp = 0

 while time.time() < start_time + duration:
     target_server = random.sample(servers, 1)[0]
     target_server = '%s:9160'%target_server

     try:
         pool = pycassa.connect('Keyspace1', [target_server])
         cf = pycassa.ColumnFamily(pool, 'Standard1')
         row = cf.get('foo', read_consistency_level=consistency_level)
         pool.dispose()
     except:
         time.sleep(sleeptime)
         continue

     sq = int(row['seqnum'])
     ts = float(row['timestamp'])

     if sq < seqnum:
         print 'Row changed: %i %f -> %i %f'%(seqnum, timestamp, sq, ts)
     seqnum = sq
     timestamp = ts

     if sleeptime > 0.0:
         time.sleep(sleeptime)




 On Apr 16, 2011, at 5:20 PM, Tyler Hobbs wrote:

 James,

 Would you mind sharing your reader process code as well?

 On Fri, Apr 15, 2011 at 1:14 PM, James Cipar jci...@cmu.edu wrote:

 I've been experimenting with the consistency model of Cassandra, and I
 found something that seems a bit unexpected.  In my experiment, I have 2
 processes, a reader and a writer, each accessing a Cassandra cluster with a
 replication factor greater than 1.  In addition, sometimes I generate
 background traffic to simulate a busy cluster by uploading a large data file
 to another table.

 The writer executes a loop where it writes a single row that contains just
 an sequentially increasing sequence number and a timestamp.  In python this
 looks something like:

    while time.time() < start_time + duration:
        target_server = random.sample(servers, 1)[0]
        target_server = '%s:9160'%target_server

        row = {'seqnum':str(seqnum), 'timestamp':str(time.time())}
        seqnum += 1
        # print 'uploading to server %s, %s'%(target_server, row)

        pool = pycassa.connect('Keyspace1', [target_server])
        cf = pycassa.ColumnFamily(pool, 'Standard1')
        cf.insert('foo', row, write_consistency_level=consistency_level)
        pool.dispose()

        if sleeptime > 0.0:
            time.sleep(sleeptime)


 The reader simply executes a loop reading this row and reporting whenever
 a sequence number is *less* than the previous sequence number.  As expected,
 with consistency_level=ConsistencyLevel.ONE there are many inconsistencies,
 especially with a high replication factor.

 What is unexpected is that I still detect inconsistencies when it is set
 at ConsistencyLevel.QUORUM.  This is unexpected because the documentation
 seems to imply that QUORUM will give consistent results.  With background
 traffic the average difference in timestamps was 0.6s, and the maximum was
 3.5s.  This means that a client sees a version of the row, and can
 subsequently see another version of the row that is 3.5s older than the
 previous.

 What I imagine is happening is this, but I'd like someone who knows that
 they're talking about to tell me if it's actually the case:

 I think Cassandra is not using an atomic commit protocol to commit to the
 quorum of servers chosen when the write is made.  This means that at some
 point in the 

unsubscribe

2011-04-16 Thread subrahmanya harve




Re: Consistency model

2011-04-16 Thread Sean Bridges
Tyler, your answer seems to contradict this email by Jonathan Ellis
[1].  In it Jonathan says,

"The important guarantee this gives you is that once one quorum read
sees the new value, all others will too. You can't see the newest
version, then see an older version on a subsequent write [sic, I
assume he meant read], which is the characteristic of non-strong
consistency."

Jonathan also says,

"{X, Y} and {X, Z} are equivalent: one node with the write, and one
without. The read will recognize that X's version needs to be sent to
Z, and the write will be complete.  This read and all subsequent ones
will see the write.  (Z [sic, I assume he meant Y] will be replicated
to asynchronously via read repair.)"

To me, the statement "this read and all subsequent ones will see the
write" implies that the new value must be committed to Y or Z before
the read can return.  If not, the statement must be false.

Sean


[1] : 
http://mail-archives.apache.org/mod_mbox/cassandra-user/201102.mbox/%3caanlktimegp8h87mgs_bxzknck-a59whxf-xx58hca...@mail.gmail.com%3E


On Sat, Apr 16, 2011 at 7:44 PM, Tyler Hobbs ty...@datastax.com wrote:
 Here's what's probably happening:

 I'm assuming RF=3 and QUORUM writes/reads here.  I'll call the replicas A,
 B, and C.

 1.  Writer process writes sequence number 1 and everything works fine.  A,
 B, and C all have sequence number 1.
 2.  Writer process writes sequence number 2.  Replica A writes successfully,
 B and C fail to respond in time, and a TimedOutException is returned.
 pycassa waits to retry the operation.
 3.  Reader process reads, gets a response from A and B.  When the row from A
 and B is merged, sequence number 2 is the newest and is returned.  A read
 repair is pushed to B and C, but they don't yet update their data.
 4.  Reader process reads again, gets a response from B and C (before they've
 repaired).  These both report sequence number 1, so that's returned to the
 client.  This is where you get a decreasing sequence number.
 5.  pycassa eventually retries the write; B and C eventually repair their
 data.  Either way, both B and C shortly have sequence number 2.

 I've left out some of the details of read repair, and this scenario could
 happen in several slightly different ways, but it should give you an idea of
 what's happening.

 On Sat, Apr 16, 2011 at 8:35 PM, James Cipar jci...@cmu.edu wrote:

 Here it is.  There is some setup code and global variable definitions that
 I left out of the previous code, but they are pretty similar to the setup
 code here.
     import pycassa
     import random
     import time
     consistency_level = pycassa.cassandra.ttypes.ConsistencyLevel.QUORUM
     duration = 600
     sleeptime = 0.0
     hostlist = 'worker-hostlist'
     def read_servers(fn):
         f = open(fn)
         servers = []
         for line in f:
             servers.append(line.strip())
         f.close()
         return servers
     servers = read_servers(hostlist)
     start_time = time.time()
     seqnum = -1
     timestamp = 0
     while time.time() < start_time + duration:
         target_server = random.sample(servers, 1)[0]
         target_server = '%s:9160'%target_server
         try:
             pool = pycassa.connect('Keyspace1', [target_server])
             cf = pycassa.ColumnFamily(pool, 'Standard1')
             row = cf.get('foo', read_consistency_level=consistency_level)
             pool.dispose()
         except:
             time.sleep(sleeptime)
             continue
         sq = int(row['seqnum'])
         ts = float(row['timestamp'])
         if sq < seqnum:
             print 'Row changed: %i %f -> %i %f'%(seqnum, timestamp, sq, ts)
         seqnum = sq
         timestamp = ts
         if sleeptime > 0.0:
             time.sleep(sleeptime)



 On Apr 16, 2011, at 5:20 PM, Tyler Hobbs wrote:

 James,

 Would you mind sharing your reader process code as well?

 On Fri, Apr 15, 2011 at 1:14 PM, James Cipar jci...@cmu.edu wrote:

 I've been experimenting with the consistency model of Cassandra, and I
 found something that seems a bit unexpected.  In my experiment, I have 2
 processes, a reader and a writer, each accessing a Cassandra cluster with a
 replication factor greater than 1.  In addition, sometimes I generate
 background traffic to simulate a busy cluster by uploading a large data file
 to another table.

 The writer executes a loop where it writes a single row that contains
 just an sequentially increasing sequence number and a timestamp.  In python
 this looks something like:

    while time.time() < start_time + duration:
        target_server = random.sample(servers, 1)[0]
        target_server = '%s:9160'%target_server

        row = {'seqnum':str(seqnum), 'timestamp':str(time.time())}
        seqnum += 1
        # print 'uploading to server %s, %s'%(target_server, row)

        pool = pycassa.connect('Keyspace1', [target_server])
        cf = pycassa.ColumnFamily(pool, 'Standard1')
        cf.insert('foo', row,