[jira] [Commented] (CASSANDRA-2768) AntiEntropyService excluding nodes that are on version 0.7 or sooner

Sasha Dolgy (JIRA) Tue, 14 Jun 2011 08:54:55 -0700

    [ https://issues.apache.org/jira/browse/CASSANDRA-2768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049243#comment-13049243 ]


Sasha Dolgy commented on CASSANDRA-2768:
----------------------------------------

Hi ... able to give more information now:


cassandra:~$ nodetool ring
Address         Status State   Load            Owns    Token
                                                       170141183460469231731687303715884105726
10.128.103.148  Up     Normal  961.38 KB       11.22%  19095547144942516281182777765338228798
10.128.94.227   Up     Normal  667.56 KB       22.11%  56713727820156410577229101238628035242
10.128.34.18    Up     Normal  688.1 KB        33.33%  113427455640312821154458202477256070484
10.128.90.109   Up     Normal  965.76 KB       33.33%  170141183460469231731687303715884105726
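
As a cross-check on the Owns column, the percentages can be recomputed from the tokens. The sketch below assumes the cluster uses RandomPartitioner, so that the token space has size 2^127 and each node owns the range from the previous node's token (exclusive) up to its own token (inclusive), wrapping around at the top of the ring. The class and method names are hypothetical and are not part of nodetool.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.RoundingMode;

public class RingOwnership {
    // Assumption: RandomPartitioner, whose token space is [0, 2^127).
    static final BigInteger RING = BigInteger.ONE.shiftLeft(127);

    // Percentage of the ring between prevToken (exclusive) and token (inclusive),
    // rounded to two decimals. mod() keeps the width non-negative on wrap-around.
    static String ownedPercent(String prevToken, String token) {
        BigInteger width = new BigInteger(token)
                .subtract(new BigInteger(prevToken))
                .mod(RING);
        return new BigDecimal(width)
                .multiply(BigDecimal.valueOf(100))
                .divide(new BigDecimal(RING), 2, RoundingMode.HALF_UP)
                .toPlainString();
    }

    public static void main(String[] args) {
        // Tokens in ring order, copied from the nodetool ring output above.
        String[] tokens = {
            "19095547144942516281182777765338228798",
            "56713727820156410577229101238628035242",
            "113427455640312821154458202477256070484",
            "170141183460469231731687303715884105726"
        };
        for (int i = 0; i < tokens.length; i++) {
            // The first node's predecessor is the last token (ring wrap-around).
            String prev = tokens[(i + tokens.length - 1) % tokens.length];
            System.out.println(tokens[i] + " owns " + ownedPercent(prev, tokens[i]) + "%");
        }
    }
}
```

Under that assumption the recomputed values agree with the Owns column (11.22%, 22.11%, 33.33%, 33.33%), so the ring itself looks consistent.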

Not a lot of data.  I created a new keyspace (RF=2) and dropped the old one.  
After running repair on the nodes, I no longer get the error on some of 
them. 

I can confirm again that all systems report ReleaseVersion: 0.8.0 from 
'nodetool version'.

I am seeing this error on two of the nodes:   

ERROR [pool-2-thread-14] 2011-06-14 23:33:40,544 CustomTThreadPoolServer.java (line 199) Thrift error occurred during processing of message.
org.apache.thrift.protocol.TProtocolException: Missing version in readMessageBegin, old client?
        at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:213)
        at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2877)
        at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
ERROR [pool-2-thread-16] 2011-06-14 23:33:42,024 CustomTThreadPoolServer.java (line 199) Thrift error occurred during processing of message.
org.apache.thrift.protocol.TProtocolException: Missing version in readMessageBegin, old client?
        at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:213)
        at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2877)
        at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
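
For context on the message text: in Thrift's strict binary protocol, a call is expected to begin with a 32-bit big-endian word whose high bit is set and whose upper 16 bits carry the protocol version. The standalone sketch below approximates that check. It is not the Thrift source; the class and method names are made up for illustration, and the two constants are the values TBinaryProtocol is understood to use.

```java
import java.nio.ByteBuffer;

public class StrictReadSketch {
    // Assumed to match TBinaryProtocol's VERSION_MASK and VERSION_1.
    static final int VERSION_MASK = 0xffff0000;
    static final int VERSION_1 = 0x80010000;

    // Classifies the first four bytes received on a connection.
    static String checkHeader(byte[] firstFourBytes) {
        int word = ByteBuffer.wrap(firstFourBytes).getInt(); // big-endian by default
        if (word < 0) {
            // High bit set: a versioned header, so compare the version bits.
            return (word & VERSION_MASK) == VERSION_1 ? "ok" : "bad version";
        }
        // High bit clear: a strict reader reports a missing version here.
        return "missing version";
    }

    public static void main(String[] args) {
        byte[] strictCall = {(byte) 0x80, 0x01, 0x00, 0x01};
        byte[] otherBytes = {'G', 'E', 'T', ' '};
        System.out.println(checkHeader(strictCall));
        System.out.println(checkHeader(otherBytes));
    }
}
```

If the check works roughly this way, any peer whose first four bytes do not form a versioned header would trigger the error, for example a client using a different transport or a connection that is not speaking Thrift at all. Which of these applies here is not established by the log.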

109 and 148 look to be communicating fine.

18 --> 109 (version error)
18 --> 227 (version error)
227 --> 18 (version error)
227 --> 148 (version error)

For my sanity, I checked and confirmed that all four instances are part of the 
same security group and that there are firewall rules allowing communication 
between all four nodes on ports 7000 and 9090.

Configuration on all nodes is standard with the following exceptions:


#listen_address: localhost
endpoint_snitch: org.apache.cassandra.locator.Ec2Snitch


> AntiEntropyService excluding nodes that are on version 0.7 or sooner
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-2768
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2768
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.8.0
>         Environment: 4 node environment -- 
> Originally 0.7.6-2 with a Keyspace defined with RF=3
> Upgraded all nodes ( 1 at a time ) to version 0.8.0:  For each node, the node 
> was shut down, new version was turned on, using the existing data files / 
> directories and a nodetool repair was run.  
>            Reporter: Sasha Dolgy
>            Assignee: Brandon Williams
>
> When I run nodetool repair on any of the nodes, the 
> /var/log/cassandra/system.log reports errors similar to:
> INFO [manual-repair-1c6b33bc-ef14-4ec8-94f6-f1464ec8bdec] 2011-06-13 21:28:39,877 AntiEntropyService.java (line 177) Excluding /10.128.34.18 from repair because it is on version 0.7 or sooner. You should consider updating this node before running repair again.
> ERROR [manual-repair-1c6b33bc-ef14-4ec8-94f6-f1464ec8bdec] 2011-06-13 21:28:39,877 AbstractCassandraDaemon.java (line 113) Fatal exception in thread Thread[manual-repair-1c6b33bc-ef14-4ec8-94f6-f1464ec8bdec,5,RMI Runtime]
> java.util.ConcurrentModificationException
>       at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793)
>       at java.util.HashMap$KeyIterator.next(HashMap.java:828)
>       at org.apache.cassandra.service.AntiEntropyService.getNeighbors(AntiEntropyService.java:173)
>       at org.apache.cassandra.service.AntiEntropyService$RepairSession.run(AntiEntropyService.java:776)
> The INFO message and subsequent ERROR message are logged for 2 nodes .. I 
> suspect that this is because RF=3.  
> nodetool ring shows that all nodes are up.  
> Client connections (read / write) are not having issues..  
> nodetool version on all nodes shows that each node is 0.8.0
> At suggestion of some contributors, I have restarted each node and tried to 
> run a nodetool repair again ... the result is the same with the messages 
> being logged.
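
The trace in the quoted report shows HashMap's key iterator throwing inside getNeighbors at the same timestamp as the "Excluding" message. One common way to get such a trace is removing an element from a HashMap-backed set while a for-each loop is still iterating over it. Whether AntiEntropyService actually does this is an assumption, not something the log confirms; the sketch below only illustrates the general pattern and the usual fix, with hypothetical class and method names.

```java
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

public class NeighborFilter {
    // Unsafe pattern: calls Set.remove() while a for-each loop iterates the same set.
    // Returns true if the iterator detected the modification and threw.
    static boolean unsafeExclude(Set<String> neighbors, Set<String> oldVersion) {
        try {
            for (String endpoint : neighbors) {
                if (oldVersion.contains(endpoint)) {
                    neighbors.remove(endpoint); // structural change mid-iteration
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Safe pattern: remove through the iterator, which keeps its own state consistent.
    static void safeExclude(Set<String> neighbors, Set<String> oldVersion) {
        Iterator<String> it = neighbors.iterator();
        while (it.hasNext()) {
            if (oldVersion.contains(it.next())) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        Set<String> neighbors = new HashSet<>();
        neighbors.add("10.128.34.18");
        neighbors.add("10.128.94.227");
        neighbors.add("10.128.90.109");
        Set<String> old = new HashSet<>();
        old.add("10.128.34.18");
        safeExclude(neighbors, old);
        System.out.println(neighbors.size() + " neighbors remain");
    }
}
```

Note that the unsafe version does not always throw: if the removed element happens to be the last one the iterator visits, the loop ends quietly, which could explain an error that shows up only on some nodes or runs.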

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
