[ 
https://issues.apache.org/jira/browse/CASSANDRA-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Morton updated CASSANDRA-3519:
------------------------------------

    Attachment: 3519.patch

Use a CopyOnWriteArrayList in the FailureDetector to track listeners, like the 
Gossiper does. 
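
For illustration, here is a minimal sketch of the failure mode and of the 
proposed fix. It is not the Cassandra code: the class ListenerListSketch, the 
Listener interface, and the helper methods are hypothetical names, and it 
assumes the list changes because a listener unregisters itself while being 
notified. In the real cluster the change may come from another thread.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class ListenerListSketch {
    /** Hypothetical stand-in for a failure-detection listener. */
    interface Listener {
        void convict(String endpoint);
    }

    /** Registers listeners that remove themselves from the list when convicted. */
    static void registerSelfRemoving(final List<Listener> listeners, int count) {
        for (int i = 0; i < count; i++) {
            listeners.add(new Listener() {
                @Override
                public void convict(String endpoint) {
                    listeners.remove(this);
                }
            });
        }
    }

    /** Notifies every listener and returns how many were notified. */
    static int convictAll(List<Listener> listeners, String endpoint) {
        int notified = 0;
        for (Listener listener : listeners) {
            listener.convict(endpoint);
            notified++;
        }
        return notified;
    }

    public static void main(String[] args) {
        // Plain ArrayList: the iterator detects the removal and fails fast.
        List<Listener> plain = new ArrayList<>();
        registerSelfRemoving(plain, 3);
        try {
            convictAll(plain, "/10.6.130.70");
        } catch (ConcurrentModificationException e) {
            System.out.println("ArrayList: " + e.getClass().getSimpleName());
        }

        // CopyOnWriteArrayList: iteration runs over a snapshot, so removals
        // made during the loop do not disturb it.
        List<Listener> snapshotting = new CopyOnWriteArrayList<>();
        registerSelfRemoving(snapshotting, 3);
        System.out.println("CopyOnWriteArrayList notified: "
                + convictAll(snapshotting, "/10.6.130.70"));
    }
}
```

CopyOnWriteArrayList copies its backing array on every add or remove, so it 
suits listener lists that are iterated often and changed rarely.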
                
> ConcurrentModificationException in FailureDetector
> --------------------------------------------------
>
>                 Key: CASSANDRA-3519
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3519
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.1
>         Environment: FreeBSD 8.2 
> /java -version
> java version "1.6.0_07"
> Diablo Java(TM) SE Runtime Environment (build 1.6.0_07-b02)
> Diablo Java HotSpot(TM) 64-Bit Server VM (build 10.0-b23, mixed mode)
>            Reporter: Aaron Morton
>            Assignee: Aaron Morton
>            Priority: Minor
>         Attachments: 3519.patch
>
>
> Noticed in a two-DC cluster: the error occurred on a node in DC 2 that was 
> streaming to a node in DC 1. 
> {code:java}
> INFO [GossipTasks:1] 2011-11-20 18:36:05,153 Gossiper.java (line 759) 
> InetAddress /10.6.130.70 is now dead.
> ERROR [GossipTasks:1] 2011-11-20 18:36:25,252 StreamOutSession.java (line 
> 232) StreamOutSession /10.6.130.70 failed because {} died or was 
> restarted/removed
> ERROR [AntiEntropySessions:21] 2011-11-20 18:36:25,252 
> AntiEntropyService.java (line 688) [repair 
> #7fb5b1b0-11f1-11e1-0000-baed0a2090fe] session completed with the following 
> error
> java.io.IOException: Endpoint /10.6.130.70 died
>         at 
> org.apache.cassandra.service.AntiEntropyService$RepairSession.failedNode(AntiEntropyService.java:725)
>         at 
> org.apache.cassandra.service.AntiEntropyService$RepairSession.convict(AntiEntropyService.java:762)
>         at 
> org.apache.cassandra.gms.FailureDetector.interpret(FailureDetector.java:192)
>         at org.apache.cassandra.gms.Gossiper.doStatusCheck(Gossiper.java:559)
>         at org.apache.cassandra.gms.Gossiper.access$700(Gossiper.java:62)
>         at org.apache.cassandra.gms.Gossiper$GossipTask.run(Gossiper.java:167)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>         at 
> java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> ERROR [GossipTasks:1] 2011-11-20 18:36:25,256 Gossiper.java (line 172) Gossip 
> error
> java.util.ConcurrentModificationException
>         at 
> java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
>         at java.util.AbstractList$Itr.next(AbstractList.java:343)
>         at 
> org.apache.cassandra.gms.FailureDetector.interpret(FailureDetector.java:190)
>         at org.apache.cassandra.gms.Gossiper.doStatusCheck(Gossiper.java:559)
>         at org.apache.cassandra.gms.Gossiper.access$700(Gossiper.java:62)
>         at org.apache.cassandra.gms.Gossiper$GossipTask.run(Gossiper.java:167)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>         at 
> java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> ERROR [AntiEntropySessions:21] 2011-11-20 18:36:25,256 
> AbstractCassandraDaemon.java (line 133) Fatal exception in thread 
> Thread[AntiEntropySessions:21,5,RMI Runtime]
> java.lang.RuntimeException: java.io.IOException: Endpoint /10.6.130.70 died
>         at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> Caused by: java.io.IOException: Endpoint /10.6.130.70 died
>         at 
> org.apache.cassandra.service.AntiEntropyService$RepairSession.failedNode(AntiEntropyService.java:725)
>         at 
> org.apache.cassandra.service.AntiEntropyService$RepairSession.convict(AntiEntropyService.java:762)
>         at 
> org.apache.cassandra.gms.FailureDetector.interpret(FailureDetector.java:192)
>         at org.apache.cassandra.gms.Gossiper.doStatusCheck(Gossiper.java:559)
>         at org.apache.cassandra.gms.Gossiper.access$700(Gossiper.java:62)
>         at org.apache.cassandra.gms.Gossiper$GossipTask.run(Gossiper.java:167)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>         at 
> java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
>         ... 3 more
> ERROR [RMI TCP Connection(3634)-10.29.60.10] 2011-11-20 18:36:25,256 
> StorageService.java (line 1712) Repair session 
> 7fb5b1b0-11f1-11e1-0000-baed0a2090fe failed.
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> java.io.IOException: Endpoint /10.6.130.70 died
>         at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>         at 
> org.apache.cassandra.service.StorageService.forceTableRepair(StorageService.java:1708)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
>         at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
>         at 
> com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
>         at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
>         at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
>         at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
>         at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
>         at 
> javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1426)
>         at 
> javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
>         at 
> javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1264)
>         at 
> javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1359)
>         at 
> javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
>         at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
>         at sun.rmi.transport.Transport$1.run(Transport.java:159)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
>         at 
> sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
>         at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
>         at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>         at java.lang.Thread.run(Thread.java:619)
> Caused by: java.lang.RuntimeException: java.io.IOException: Endpoint 
> /10.6.130.70 died
>         at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         ... 3 more
> Caused by: java.io.IOException: Endpoint /10.6.130.70 died
>         at 
> org.apache.cassandra.service.AntiEntropyService$RepairSession.failedNode(AntiEntropyService.java:725)
>         at 
> org.apache.cassandra.service.AntiEntropyService$RepairSession.convict(AntiEntropyService.java:762)
>         at 
> org.apache.cassandra.gms.FailureDetector.interpret(FailureDetector.java:192)
>         at org.apache.cassandra.gms.Gossiper.doStatusCheck(Gossiper.java:559)
>         at org.apache.cassandra.gms.Gossiper.access$700(Gossiper.java:62)
>         at org.apache.cassandra.gms.Gossiper$GossipTask.run(Gossiper.java:167)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>         at 
> java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
>         ... 3 more
>  INFO [GossipStage:1] 2011-11-20 18:36:28,173 Gossiper.java (line 777) Node 
> /10.6.130.70 has restarted, now UP
>  INFO [GossipStage:1] 2011-11-20 18:36:28,175 Gossiper.java (line 745) 
> InetAddress /10.6.130.70 is now UP
>  INFO [GossipStage:1] 2011-11-20 18:36:28,175 StorageService.java (line 885) 
> Node /10.6.130.70 state jump to normal
> {code}
> FailureDetector uses a plain ArrayList for its listeners, so the list can be 
> modified while interpret() is iterating over it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
