[ https://issues.apache.org/jira/browse/CASSANDRA-2081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12989100#comment-12989100 ]
Aaron Morton commented on CASSANDRA-2081:
-----------------------------------------

I had a read through the code and think I have an idea of what my problem was. I'm not sure if it applies to the previous issue, and not sure if it's a real bug.

o.a.c.service.ReadCallback.response() will only signal the o.a.c.utils.SimpleCondition if the response to the data request has been received. If the condition is not signalled within rpc_timeout, ReadCallback.get() will raise a j.u.c.TimeoutException, which comes out of the StorageProxy and is caught in CassandraServer and turned into an o.a.c.thrift.TimedOutException.

So if the node that is asked for the data fails to return, the entire request will time out even if there are enough nodes to serve the request.

I think I've seen this discussed before as being by design, with the client expected to just retry in response to the timeout. Is that correct?

> Consistency QUORUM does not work anymore (hector:Could not fullfill request on this host)
> -----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2081
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2081
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: linux, hector + cassandra
>            Reporter: Thibaut
>            Priority: Blocker
>             Fix For: 0.7.1
>
>
> I'm using apache-cassandra-2011-01-28_20-06-01.jar and hector 7.0.25.
> Using consistency level Quorum won't work anymore (tested it on read).
> Consistency level ONE still works though.
> I have tried this with one dead node in my cluster.
> If I restart cassandra with an older svn revision (apache-cassandra-2011-01-28_20-06-01.jar), I can access the cluster with consistency level QUORUM again, while still using apache-cassandra-2011-01-28_20-06-01.jar and hector 7.0.25 in my application.
> 11/01/31 19:54:38 ERROR connection.CassandraHostRetryService: Downed intr1n18(192.168.0.18):9160 host still appears to be down: Unable to open transport to intr1n18(192.168.0.18):9160 , java.net.NoRouteToHostException: No route to host
> 11/01/31 19:54:38 INFO connection.CassandraHostRetryService: Downed Host retry status false with host: intr1n18(192.168.0.18):9160
> 11/01/31 19:54:45 ERROR connection.HConnectionManager: Could not fullfill request on this host CassandraClient<intr1n11:9160-483>
>
> intr1n11 is marked as up however and I can also access the node through the cassandra cli.
>
> 192.168.0.1     Up     Normal  8.02 GB  5.00%  0cc
> 192.168.0.2     Up     Normal  7.96 GB  5.00%  199
> 192.168.0.3     Up     Normal  8.24 GB  5.00%  266
> 192.168.0.4     Up     Normal  4.94 GB  5.00%  333
> 192.168.0.5     Up     Normal  5.02 GB  5.00%  400
> 192.168.0.6     Up     Normal  5 GB     5.00%  4cc
> 192.168.0.7     Up     Normal  5.1 GB   5.00%  599
> 192.168.0.8     Up     Normal  5.07 GB  5.00%  666
> 192.168.0.9     Up     Normal  4.78 GB  5.00%  733
> 192.168.0.10    Up     Normal  4.34 GB  5.00%  7ff
> 192.168.0.11    Up     Normal  5.01 GB  5.00%  8cc
> 192.168.0.12    Up     Normal  5.31 GB  5.00%  999
> 192.168.0.13    Up     Normal  5.56 GB  5.00%  a66
> 192.168.0.14    Up     Normal  5.82 GB  5.00%  b33
> 192.168.0.15    Up     Normal  5.57 GB  5.00%  c00
> 192.168.0.16    Up     Normal  5.03 GB  5.00%  ccc
> 192.168.0.17    Up     Normal  4.77 GB  5.00%  d99
> 192.168.0.18    Down   Normal  ?        5.00%  e66
> 192.168.0.19    Up     Normal  4.78 GB  5.00%  f33
> 192.168.0.20    Up     Normal  4.83 GB  5.00%  ffffffffffffffff
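
For illustration only, the behaviour described in the comment (the waiting thread is released only by the data response, so digest responses alone end in a timeout) can be sketched as below. This is not the Cassandra source: the class name ReadCallbackSketch, the isDigest flag, the String payload and the use of a CountDownLatch in place of o.a.c.utils.SimpleCondition are all assumptions made to keep the example self-contained.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the callback described in the comment.
// It is a simplified model, not the real o.a.c.service.ReadCallback.
public class ReadCallbackSketch {
    // Plays the role of the condition that get() waits on (assumption:
    // a one-shot latch is close enough to SimpleCondition for this sketch).
    private final CountDownLatch dataReceived = new CountDownLatch(1);
    private final AtomicInteger digestResponses = new AtomicInteger();
    private final long timeoutMillis; // stands in for rpc_timeout
    private volatile String data;

    public ReadCallbackSketch(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Called once per replica response. Only a data response releases the waiter;
    // digest responses are counted but never signal.
    public void response(boolean isDigest, String payload) {
        if (isDigest) {
            digestResponses.incrementAndGet();
            return;
        }
        data = payload;
        dataReceived.countDown();
    }

    // Blocks until the data response arrives or the timeout elapses.
    public String get() throws TimeoutException, InterruptedException {
        if (!dataReceived.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
            throw new TimeoutException("no data response within " + timeoutMillis
                    + " ms (digests received: " + digestResponses.get() + ")");
        }
        return data;
    }

    public static void main(String[] args) throws Exception {
        // Two replicas answer with digests, but the replica asked for data never replies.
        ReadCallbackSketch digestsOnly = new ReadCallbackSketch(50);
        digestsOnly.response(true, "digest-a");
        digestsOnly.response(true, "digest-b");
        try {
            digestsOnly.get();
            System.out.println("unexpected: got a result");
        } catch (TimeoutException e) {
            System.out.println("timed out: " + e.getMessage());
        }

        // The data response arrives, so get() returns straight away.
        ReadCallbackSketch withData = new ReadCallbackSketch(50);
        withData.response(false, "row");
        System.out.println("read: " + withData.get());
    }
}
```

In this model the number of digest responses has no effect on whether get() returns, which is the situation the comment asks about: enough replicas may have answered, yet the request still times out if the one node asked for the data stays silent.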