Re: Cassandra 2.0: Paxos Prepare response always false

2013-07-29 Thread aaron morton
Thanks for looking into this. 

If you have a way to reproduce this, the best thing to do is create a ticket at 
https://issues.apache.org/jira/browse/CASSANDRA, as 2.0 is still under 
development. 

Cheers


-
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 26/07/2013, at 11:29 AM, Soumava Ghosh soum...@cs.utexas.edu wrote:


Cassandra 2.0: Paxos Prepare response always false

2013-07-25 Thread Soumava Ghosh
Hi,

I have a test setup where clients randomly make a controlled number of cas()
requests (among other requests) against a cluster of Cassandra 2.0 servers.
After a certain point, all requests are pending and my clients' throughput
has dropped to 0.0 for all kinds of requests. In this specific case I had 10
clients, each making around 30 cas() requests per second against a cluster of
72 Cassandra instances.

Clients are set up to register a request as a success when the cas() call
returns with CASResult.success = true; otherwise an exception is thrown. I
see that no client requests were actually registered and no exceptions were
thrown, which indicates that the cas() call itself is hung.
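
To make a hung call distinguishable from a rejected one on the client side, the success/exception rule above could be wrapped in a deadline. This is only a rough sketch in Python, not the actual test client: `register_cas`, `CasRejected`, `CasTimedOut` and the `cas_call` argument are hypothetical names, and `cas_call` is assumed to be any zero-argument callable that blocks until the server replies and returns an object with a boolean `success` attribute (mirroring CASResult.success).

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


class CasRejected(Exception):
    """The cas() call returned, but with success == False."""


class CasTimedOut(Exception):
    """The cas() call did not return within the deadline."""


def register_cas(cas_call, executor: ThreadPoolExecutor, timeout_s: float) -> bool:
    """Run one cas() attempt; return True only if it reports success.

    cas_call is a hypothetical stand-in for the real client call.
    """
    future = executor.submit(cas_call)
    try:
        result = future.result(timeout=timeout_s)
    except FutureTimeout:
        # The worker thread may still be blocked; we only stop waiting for it.
        raise CasTimedOut(f"no reply within {timeout_s}s") from None
    if not result.success:
        raise CasRejected("CASResult.success was false")
    return True
```

With something like this, a stuck request would show up as CasTimedOut rather than as a silently missing success. Note that the timeout does not cancel the underlying call, it only stops the caller from waiting on it.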

On the server side, I see Paxos logs as follows - they go on for 50 log
files for each of the servers involved, and they span at least an hour. I
have marked a particular instance where the prepare response is true but
the propose response is false from all the involved servers:

*At the Paxos Initiator:* None of the 50 system log files contains the
phrase 'Propose response true'; the logs just go on and on like this:

DEBUG [RequestResponseStage:110] 2013-07-25 15:09:05,332
PrepareCallback.java (line 58) Prepare response PrepareResponse(true,
Commit(d145fe46f5d02a54b5ea95852f94c402,
1a0c4220-f561-11e2-a409-019f62d610d7, ColumnFamily(P
[81d271b2125c59cb0003:false:27@1374780817218000,])),
Commit(d145fe46f5d02a54b5ea95852f94c402,
d2093120-f576-11e2-a57e-a154d605509d, ColumnFamily(P
[81d271b2125c59cb0003:false:27@1374780817218000,]))) from
/17.163.7.195

DEBUG [RequestResponseStage:92] 2013-07-25 15:09:05,346
PrepareCallback.java (line 58) Prepare response PrepareResponse(true,
Commit(d145fe46f5d02a54b5ea95852f94c402,
1a0c4220-f561-11e2-a409-019f62d610d7, ColumnFamily(P
[81d271b2125c59cb0003:false:27@1374780817218000,])),
Commit(d145fe46f5d02a54b5ea95852f94c402,
d2093120-f576-11e2-a57e-a154d605509d, ColumnFamily(P
[81d271b2125c59cb0003:false:27@1374780817218000,]))) from
/17.163.7.184

DEBUG [RequestResponseStage:98] 2013-07-25 15:09:05,347
PrepareCallback.java (line 58) Prepare response PrepareResponse(true,
Commit(d145fe46f5d02a54b5ea95852f94c402,
1a0c4220-f561-11e2-a409-019f62d610d7, ColumnFamily(P
[81d271b2125c59cb0003:false:27@1374780817218000,])),
Commit(d145fe46f5d02a54b5ea95852f94c402,
d2093120-f576-11e2-a57e-a154d605509d, ColumnFamily(P
[81d271b2125c59cb0003:false:27@1374780817218000,]))) from
/17.163.7.20

DEBUG [RequestResponseStage:93] 2013-07-25 15:09:05,350
ProposeCallback.java (line 44) Propose response false from /17.163.7.20
DEBUG [RequestResponseStage:100] 2013-07-25 15:09:05,350
ProposeCallback.java (line 44) Propose response false from /17.163.7.184
DEBUG [RequestResponseStage:111] 2013-07-25 15:09:05,350
ProposeCallback.java (line 44) Propose response false from /17.163.7.195

DEBUG [RequestResponseStage:102] 2013-07-25 15:09:05,351
PrepareCallback.java (line 58) Prepare response PrepareResponse(true,
Commit(d145fe46f5d02a54b5ea95852f94c402,
1a0c4220-f561-11e2-a409-019f62d610d7, ColumnFamily(P
[81d271b2125c59cb0003:false:27@1374780817218000,])),
Commit(d145fe46f5d02a54b5ea95852f94c402,
d20c3e60-f576-11e2-9bbe-bf2ad4fe6707, ColumnFamily(P
[81d271b2125c59cb0003:false:27@1374780817218000,]))) from
/17.163.7.195

DEBUG [RequestResponseStage:107] 2013-07-25 15:09:05,352
PrepareCallback.java (line 58) Prepare response PrepareResponse(true,
Commit(d145fe46f5d02a54b5ea95852f94c402,
1a0c4220-f561-11e2-a409-019f62d610d7, ColumnFamily(P
[81d271b2125c59cb0003:false:27@1374780817218000,])),
Commit(d145fe46f5d02a54b5ea95852f94c402,
d20c3e60-f576-11e2-9bbe-bf2ad4fe6707, ColumnFamily(P
[81d271b2125c59cb0003:false:27@1374780817218000,]))) from
/17.163.7.20

DEBUG [RequestResponseStage:108] 2013-07-25 15:09:05,352
PrepareCallback.java (line 58) Prepare response PrepareResponse(true,
Commit(d145fe46f5d02a54b5ea95852f94c402,
1a0c4220-f561-11e2-a409-019f62d610d7, ColumnFamily(P
[81d271b2125c59cb0003:false:27@1374780817218000,])),
Commit(d145fe46f5d02a54b5ea95852f94c402,
d20c3e60-f576-11e2-9bbe-bf2ad4fe6707, ColumnFamily(P
[81d271b2125c59cb0003:false:27@1374780817218000,]))) from
/17.163.7.184

DEBUG [RequestResponseStage:104] 2013-07-25 15:09:05,352
ProposeCallback.java (line 44) Propose response false from /17.163.7.20
DEBUG [RequestResponseStage:99] 2013-07-25 15:09:05,353
ProposeCallback.java (line 44) Propose response false from /17.163.7.195
DEBUG [RequestResponseStage:105] 2013-07-25 15:09:05,353
ProposeCallback.java (line 44) Propose response false from /17.163.7.184
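
For anyone who wants to check the same thing across their own logs, the "no 'Propose response true' anywhere" observation can be tallied with a small script. This is a minimal sketch, assuming the DEBUG message wording shown above; the function name `summarize_paxos_responses` and the two regexes are my own, not anything shipped with Cassandra.

```python
import re
from collections import Counter

# Patterns follow the message text in the excerpts above. Whitespace around
# the "/" is optional so that entries wrapped by a mail client still match.
PROPOSE_RE = re.compile(
    r"Propose response (true|false) from\s*/\s*(\d+(?:\.\d+){3})"
)
PREPARE_RE = re.compile(r"Prepare response PrepareResponse\((true|false),")


def summarize_paxos_responses(lines):
    """Count prepare and propose outcomes in an iterable of log lines."""
    # Join everything into one string so entries split over several
    # physical lines are matched as a single entry.
    text = " ".join(line.strip() for line in lines)
    prepare = Counter(PREPARE_RE.findall(text))   # keys: "true" / "false"
    propose = Counter(PROPOSE_RE.findall(text))   # keys: (outcome, replica_ip)
    return prepare, propose
```

Fed the lines of each system log in turn, a propose counter that only ever contains "false" keys would confirm the pattern described above, and the per-replica keys show whether any single node ever accepted.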

*At 17.163.7.20:*

DEBUG [MutationStage:58] 2013-07-25 15:09:05,347 PaxosState.java (line 100)
accept requested for Commit(d145fe46f5d02a54b5ea95852f94c402,
d20b05e0-f576-11e2-9bbe-bf2ad4fe6707, ColumnFamily(P
[81d271b2125c59cb0003:false:27@1374780817218000,])) but
inProgress is