In your step 5, which node are you connected to? It is possible that while node A was going down, mutations that had already been started were completed on node B. In that case the client would have received a TimedOutException, meaning the operation did not complete at the required CL but was started and probably written to node B.
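As a rough sketch of that sequence, here is a toy Python model. It is not Cassandra code: the Replica class and the quorum, write and read helpers are names made up for illustration, and the only thing taken from Cassandra is the quorum size, floor(RF / 2) + 1. It assumes node A stops responding while a write is in flight, so the mutation lands on node B only.

```python
# Toy model only: Replica, quorum, write and read are hypothetical names,
# not part of any Cassandra client API.

class Replica:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}


def quorum(rf):
    # Quorum size: floor(RF / 2) + 1
    return rf // 2 + 1


def write(replicas, key, value, required):
    """Apply the mutation on every live replica; return True if enough acked."""
    acks = 0
    for r in replicas:
        if r.up:
            r.data[key] = value
            acks += 1
    # False stands in for a TimedOutException: the write was not acknowledged
    # at the requested level, but the live replicas still stored it.
    return acks >= required


def read(replicas, key, required):
    """Ask the first `required` live replicas; return the value if any has it."""
    live = [r for r in replicas if r.up][:required]
    if len(live) < required:
        raise RuntimeError("not enough live replicas")
    for r in live:
        if key in r.data:
            return r.data[key]
    return None


a, b = Replica("A"), Replica("B")
rf = 2
q = quorum(rf)                         # with RF 2, QUORUM needs both nodes

a.up = False                           # node A stops responding mid-write
acked = write([a, b], "k1", "v1", q)   # not acknowledged, yet stored on B
a.up = True                            # node A restarts without k1

read_one = read([a, b], "k1", 1)       # asks node A only
read_quorum = read([a, b], "k1", q)    # asks node A and node B
```

In this model the write is reported as failed, the read at ONE that lands on node A finds nothing, and the read at QUORUM sees the copy on node B.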
In the scenario you describe it's possible for a key to be on node B and not node A. What happens if you do the read in step 5 at QUORUM?

AFAIK configuring the commit log as you have will result in every write blocking on the commit log syncing.

Hope that helps.

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 31 May 2011, at 20:46, Preston Chang wrote:

> My RF is 2.
>
> When the node A is down, the commit log should be fsynced to disk in my test
> scene, so there should be no NOTFOUND key, but there are some NOTFOUND keys.
> I am puzzled.
>
> 2011/5/31 Maki Watanabe <watanabe.m...@gmail.com>
> How much replication factor did you set for the keyspace?
> If the RF is 2, your data should be replicated to both of nodes. If
> the RF is 1, you will lose the half of data when the node A is down.
>
> maki
>
>
> 2011/5/31 Preston Chang <zhangyf2...@gmail.com>:
> > Hi,
> > I have a cluster with two nodes (node A and node B) and make a test as
> > follows:
> > 1). set commitlog sync in batch mode and the sync batch window in 0 ms
> > 2). one client wrote random keys in infinite loop with consistency level
> > QUORUM and record the keys in file after the insert() method return normally
> > 3). unplug one server (node A) power cord
> > 4). restart the server and cassandra service
> > 5). read the key list generated in step 2) with consistency level ONE
> > I thought the result of test is all the key in list can be read normally,
> > but actually there are some NotFound keys.
> > My question is why there are NotFound keys. In my opinion server would not
> > ack the client before finishing syncing the commitlog if I set commitlog
> > sync in batch mode and the sync batch window in 0 ms. So if the insert()
> > method return normally it means the mutation had been written in commitlog
> > and the commitlog had been synced to the disk. Am I right?
> > My Cassandra version is 0.7.3.
> > Thanks for your help very much.
> >
> > --
> > by Preston Chang
>
> --
> w3m
>
> --
> by Preston Chang