[ 
https://issues.apache.org/jira/browse/CASSANDRA-2081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12989375#comment-12989375
 ] 

Aaron Morton commented on CASSANDRA-2081:
-----------------------------------------

My understanding here is the 0.19 node is sending read requests to the 0.1, 0.2 
and 0.3 nodes and only getting a reply from the 0.1 node before timing out. The 
0.1 node is the first node the request is sent to, so this is the data request 
the others are digest. 

The timeout is the rpc_timeout, and can be seen here...

DEBUG [pool-1-thread-1] 2011-02-01 11:48:28,949 ReadCallback.java (line 58) 
ReadCallback blocking for 2 responses
...10 seconds... 
DEBUG [pool-1-thread-1] 2011-02-01 11:48:38,950 CassandraServer.java (line 483) 
... timed out

Whats happening on the 0.2 and 0.3 nodes at this point? Are they logging errors 
or WARN messages about dropped messages ? Can you see any logs about processing 
messages from the 0.19 node? I'm not sure the down 0.18 node is a factor here.

The client should be retrying when it gets a timeout, which I think you said 
Hector was doing. 

 

> Consistency QUORUM does not work anymore (hector:Could not fullfill request 
> on this host)
> -----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2081
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2081
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: linux, hector + cassandra
>            Reporter: Thibaut
>            Priority: Blocker
>             Fix For: 0.7.1
>
>
> I'm using apache-cassandra-2011-01-28_20-06-01.jar and hector 7.0.25.
> Using consistency level Quorum won't work anymore (tested it on read). 
> Consisteny level ONE still works though
> I have tried this with one dead node in my cluster.
> If I restart cassandra with an older svn revision 
> (apache-cassandra-2011-01-28_20-06-01.jar), I can access the cluster with 
> consistency level QUORUM again, while still using 
> apache-cassandra-2011-01-28_20-06-01.jar and hector 7.0.25 in my application.
> 11/01/31 19:54:38 ERROR connection.CassandraHostRetryService: Downed 
> intr1n18(192.168.0.18):9160 host still appears to be down: Unable to open 
> transport to intr1n18(192.168.0.18):9160 , java.net.NoRouteToHostException: 
> No route to host
> 11/01/31 19:54:38 INFO connection.CassandraHostRetryService: Downed Host 
> retry status false with host: intr1n18(192.168.0.18):9160
> 11/01/31 19:54:45 ERROR connection.HConnectionManager: Could not fullfill 
> request on this host CassandraClient<intr1n11:9160-483>
> intr1n11 is marked as up however and I can also access the node through the 
> cassandra cli.
> 192.168.0.1     Up     Normal  8.02 GB         5.00%   0cc
> 192.168.0.2     Up     Normal  7.96 GB         5.00%   199
> 192.168.0.3     Up     Normal  8.24 GB         5.00%   266
> 192.168.0.4     Up     Normal  4.94 GB         5.00%   333
> 192.168.0.5     Up     Normal  5.02 GB         5.00%   400
> 192.168.0.6     Up     Normal  5 GB            5.00%   4cc
> 192.168.0.7     Up     Normal  5.1 GB          5.00%   599
> 192.168.0.8     Up     Normal  5.07 GB         5.00%   666
> 192.168.0.9     Up     Normal  4.78 GB         5.00%   733
> 192.168.0.10    Up     Normal  4.34 GB         5.00%   7ff
> 192.168.0.11    Up     Normal  5.01 GB         5.00%   8cc
> 192.168.0.12    Up     Normal  5.31 GB         5.00%   999
> 192.168.0.13    Up     Normal  5.56 GB         5.00%   a66
> 192.168.0.14    Up     Normal  5.82 GB         5.00%   b33
> 192.168.0.15    Up     Normal  5.57 GB         5.00%   c00
> 192.168.0.16    Up     Normal  5.03 GB         5.00%   ccc
> 192.168.0.17    Up     Normal  4.77 GB         5.00%   d99
> 192.168.0.18    Down   Normal  ?               5.00%   e66
> 192.168.0.19    Up     Normal  4.78 GB         5.00%   f33
> 192.168.0.20    Up     Normal  4.83 GB         5.00%   ffffffffffffffff

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to