[jira] [Commented] (CASSANDRA-11345) Assertion Errors "Memory was freed" during streaming
[ https://issues.apache.org/jira/browse/CASSANDRA-11345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15256291#comment-15256291 ] Jean-Francois Gosselin commented on CASSANDRA-11345: During a sequential repair. > Assertion Errors "Memory was freed" during streaming > > > Key: CASSANDRA-11345 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11345 > Project: Cassandra > Issue Type: Bug > Components: Streaming and Messaging >Reporter: Jean-Francois Gosselin >Assignee: Paulo Motta > > We encountered the following AssertionError (twice on the same node) during a > repair : > On node /172.16.63.41 > {noformat} > INFO [STREAM-IN-/10.174.216.160] 2016-03-09 02:38:13,900 > StreamResultFuture.java:180 - [Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] > Session with /10.174.216.160 is complete > > WARN [STREAM-IN-/10.174.216.160] 2016-03-09 02:38:13,900 > StreamResultFuture.java:207 - [Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] > Stream failed > ERROR [STREAM-OUT-/10.174.216.160] 2016-03-09 02:38:13,906 > StreamSession.java:505 - [Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] > Streaming error occurred > java.lang.AssertionError: Memory was freed > > > at > org.apache.cassandra.io.util.SafeMemory.checkBounds(SafeMemory.java:97) > ~[apache-cassandra-2.1.13.jar:2.1.13] > > at org.apache.cassandra.io.util.Memory.getLong(Memory.java:249) > ~[apache-cassandra-2.1.13.jar:2.1.13] > > at > org.apache.cassandra.io.compress.CompressionMetadata.getTotalSizeForSections(CompressionMetadata.java:247) > ~[apache-cassandra-2.1.13.jar:2.1.13] > at > org.apache.cassandra.streaming.messages.FileMessageHeader.size(FileMessageHeader.java:112) > ~[apache-cassandra-2.1.13.jar:2.1.13] > at > org.apache.cassandra.streaming.StreamSession.fileSent(StreamSession.java:546) > ~[apache-cassandra-2.1.13.jar:2.1.13] > > at > org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:50) > 
~[apache-cassandra-2.1.13.jar:2.1.13] > at > org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:41) > ~[apache-cassandra-2.1.13.jar:2.1.13] > at > org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45) > ~[apache-cassandra-2.1.13.jar:2.1.13] > at > org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351) > ~[apache-cassandra-2.1.13.jar:2.1.13] > at > org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331) > ~[apache-cassandra-2.1.13.jar:2.1.13] > at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65] > > > {noformat} > On node /10.174.216.160 > > {noformat} > ERROR [STREAM-OUT-/172.16.63.41] 2016-03-09 02:38:14,140 > StreamSession.java:505 - [Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] > Streaming error occurred > java.io.IOException: Connection reset by peer > > > at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.7.0_65] > > > at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) > ~[na:1.7.0_65] > > at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) > ~[na:1.7.0_65] > > at sun.nio.ch.IOUtil.write(IOUtil.java:65) ~[na:1.7.0_65] > > > at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487) > ~[na:1.7.0_65]
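Both nodes log the same stream session ID, so the two excerpts can be lined up by that ID. The following is a minimal sketch, assuming only the log layout visible in the excerpts above (level, bracketed thread name, and a `[Stream #<uuid>]` tag); the function name `group_by_stream` and the regular expressions are illustrative and are not part of Cassandra.

```python
import re
from collections import defaultdict

# Matches the "[Stream #<uuid>]" tag seen in the quoted streaming log lines.
STREAM_ID = re.compile(r"\[Stream #([0-9a-f-]{36})\]")
# Matches the leading level and thread name, e.g. "ERROR [STREAM-OUT-/10.174.216.160]".
LEVEL_THREAD = re.compile(r"^\s*(INFO|WARN|ERROR)\s+\[([^\]]+)\]")


def group_by_stream(lines):
    """Group (level, thread) pairs by stream session ID (hypothetical helper)."""
    sessions = defaultdict(list)
    for line in lines:
        sid = STREAM_ID.search(line)
        head = LEVEL_THREAD.match(line)
        if sid and head:
            sessions[sid.group(1)].append((head.group(1), head.group(2)))
    return dict(sessions)


# Log lines copied from the two node excerpts in the report.
sample = [
    "INFO [STREAM-IN-/10.174.216.160] 2016-03-09 02:38:13,900 StreamResultFuture.java:180 - "
    "[Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Session with /10.174.216.160 is complete",
    "WARN [STREAM-IN-/10.174.216.160] 2016-03-09 02:38:13,900 StreamResultFuture.java:207 - "
    "[Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Stream failed",
    "ERROR [STREAM-OUT-/10.174.216.160] 2016-03-09 02:38:13,906 StreamSession.java:505 - "
    "[Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Streaming error occurred",
    "ERROR [STREAM-OUT-/172.16.63.41] 2016-03-09 02:38:14,140 StreamSession.java:505 - "
    "[Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Streaming error occurred",
]

for stream_id, events in group_by_stream(sample).items():
    print(stream_id, events)
```

Grouped this way, the "Memory was freed" error on /172.16.63.41 (02:38:13,906) precedes the "Connection reset by peer" error on /10.174.216.160 (02:38:14,140) within the same session.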
[jira] [Commented] (CASSANDRA-10769) "received out of order wrt DecoratedKey" after scrub
[ https://issues.apache.org/jira/browse/CASSANDRA-10769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15208597#comment-15208597 ] Jean-Francois Gosselin commented on CASSANDRA-10769: Based on the comments in CASSANDRA-9935 the "AssertionError: row DecoratedKey" is still present in 2.1.13. > "received out of order wrt DecoratedKey" after scrub > > > Key: CASSANDRA-10769 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10769 > Project: Cassandra > Issue Type: Bug > Environment: C* 2.1.11, Debian Wheezy >Reporter: mlowicki > > After running scrub and cleanup on all nodes in single data center I'm > getting: > {code} > ERROR [ValidationExecutor:103] 2015-11-25 06:28:21,530 Validator.java:245 - > Failed creating a merkle tree for [repair > #89fa2b70-933d-11e5-b036-75bb514ae072 on sync/entity_by_id2, > (-5867793819051725444,-5865919628027816979]], /10.210.3.221 (see log for > details) > ERROR [ValidationExecutor:103] 2015-11-25 06:28:21,531 > CassandraDaemon.java:227 - Exception in thread > Thread[ValidationExecutor:103,1,main] > java.lang.AssertionError: row DecoratedKey(-5867787467868737053, > 000932373633313036313204808800) received out of order wrt > DecoratedKey(-5865937851627253360, 000933313230313737333204c3c700) > at org.apache.cassandra.repair.Validator.add(Validator.java:127) > ~[apache-cassandra-2.1.11.jar:2.1.11] > at > org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1010) > ~[apache-cassandra-2.1.11.jar:2.1.11] > at > org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:94) > ~[apache-cassandra-2.1.11.jar:2.1.11] > at > org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:622) > ~[apache-cassandra-2.1.11.jar:2.1.11] > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > ~[na:1.7.0_80] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > ~[na:1.7.0_80] > 
at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > [na:1.7.0_80] > at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80] > {code} > What I did is to run repair on other node: > {code} > time nodetool repair --in-local-dc > {code} > Corresponding log on the node where repair has been started: > {code} > ERROR [AntiEntropySessions:414] 2015-11-25 06:28:21,533 > RepairSession.java:303 - [repair #89fa2b70-933d-11e5-b036-75bb514ae072] > session completed with the following error > org.apache.cassandra.exceptions.RepairException: [repair > #89fa2b70-933d-11e5-b036-75bb514ae072 on sync/entity_by_id2, > (-5867793819051725444,-5865919628027816979]] Validation failed in > /10.210.3.117 > at > org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166) > ~[apache-cassandra-2.1.11.jar:2.1.11] > at > org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:406) > ~[apache-cassandra-2.1.11.jar:2.1.11] > at > org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:134) > ~[apache-cassandra-2.1.11.jar:2.1.11] > at > org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) > ~[apache-cassandra-2.1.11.jar:2.1.11] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > [na:1.7.0_80] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > [na:1.7.0_80] > at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80] > INFO [AntiEntropySessions:415] 2015-11-25 06:28:21,533 > RepairSession.java:260 - [repair #b9458fa0-933d-11e5-b036-75bb514ae072] new > session: will sync /10.210.3.221, /10.210.3.118, /10.210.3.117 on range > (7119703141488009983,7129744584776466802] for sync.[device_token, entity2, > user_stats, user_device, user_quota, user_store, user_device_progress, > entity_by_id2] > ERROR [AntiEntropySessions:414] 2015-11-25 06:28:21,533 > CassandraDaemon.java:227 - Exception in 
thread > Thread[AntiEntropySessions:414,5,RMI Runtime] > java.lang.RuntimeException: org.apache.cassandra.exceptions.RepairException: > [repair #89fa2b70-933d-11e5-b036-75bb514ae072 on sync/entity_by_id2, > (-5867793819051725444,-5865919628027816979]] Validation failed in > /10.210.3.117 > at com.google.common.base.Throwables.propagate(Throwables.java:160) > ~[guava-16.0.jar:na] > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) > ~[apache-cassandra-2.1.11.jar:2.1.11] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > ~[na:1.7.0_80]
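The assertion message reads "row <new key> received out of order wrt <previous key>", which suggests that each incoming row is compared with the one before it and must have a strictly larger token. The following is a sketch of that check under this assumption; it is not Cassandra's code, and `first_out_of_order` is a hypothetical helper name.

```python
def first_out_of_order(tokens):
    """Return (index, previous, current) for the first token that does not
    strictly increase over its predecessor, or None if the sequence is sorted."""
    previous = None
    for index, current in enumerate(tokens):
        if previous is not None and current <= previous:
            return index, previous, current
        previous = current
    return None


# Tokens taken from the AssertionError quoted above: the row with token
# -5867787467868737053 arrived after the row with token -5865937851627253360.
observed = [-5865937851627253360, -5867787467868737053]
print(first_out_of_order(observed))
# prints (1, -5865937851627253360, -5867787467868737053)
```

Both tokens lie inside the repaired range (-5867793819051725444,-5865919628027816979], so the rows belong to the range but arrived in descending order.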
[jira] [Commented] (CASSANDRA-11374) LEAK DETECTED during repair
[ https://issues.apache.org/jira/browse/CASSANDRA-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199874#comment-15199874 ] Jean-Francois Gosselin commented on CASSANDRA-11374: Same issue as CASSANDRA-9117 but not fixed in 2.1.x ? > LEAK DETECTED during repair > --- > > Key: CASSANDRA-11374 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11374 > Project: Cassandra > Issue Type: Bug >Reporter: Jean-Francois Gosselin > > When running a range repair we are seeing the following LEAK DETECTED errors: > {noformat} > ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,261 Ref.java:179 - LEAK > DETECTED: a reference > (org.apache.cassandra.utils.concurrent.Ref$State@5ee90b43) to class > org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@367168611:[[OffHeapBitSet]] > was not released before the reference was garbage collected > ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,262 Ref.java:179 - LEAK > DETECTED: a reference > (org.apache.cassandra.utils.concurrent.Ref$State@4ea9d4a7) to class > org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1875396681:Memory@[7f34b905fd10..7f34b9060b7a) > was not released before the reference was garbage collected > ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,262 Ref.java:179 - LEAK > DETECTED: a reference > (org.apache.cassandra.utils.concurrent.Ref$State@27a6b614) to class > org.apache.cassandra.io.util.SafeMemory$MemoryTidy@838594402:Memory@[7f34bae11ce0..7f34bae11d84) > was not released before the reference was garbage collected > ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,263 Ref.java:179 - LEAK > DETECTED: a reference > (org.apache.cassandra.utils.concurrent.Ref$State@64e7b566) to class > org.apache.cassandra.io.util.SafeMemory$MemoryTidy@674656075:Memory@[7f342deab4e0..7f342deb7ce0) > was not released before the reference was garbage collected > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (CASSANDRA-11374) LEAK DETECTED during repair
[ https://issues.apache.org/jira/browse/CASSANDRA-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Francois Gosselin updated CASSANDRA-11374: --- Comment: was deleted (was: We are using the Reaper https://nuance.webex.com/join/sylvain_boily, so a subrange repair (we are not using incremental repair).) > LEAK DETECTED during repair > --- > > Key: CASSANDRA-11374 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11374 > Project: Cassandra > Issue Type: Bug >Reporter: Jean-Francois Gosselin >Assignee: Marcus Eriksson > > When running a range repair we are seeing the following LEAK DETECTED errors: > {noformat} > ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,261 Ref.java:179 - LEAK > DETECTED: a reference > (org.apache.cassandra.utils.concurrent.Ref$State@5ee90b43) to class > org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@367168611:[[OffHeapBitSet]] > was not released before the reference was garbage collected > ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,262 Ref.java:179 - LEAK > DETECTED: a reference > (org.apache.cassandra.utils.concurrent.Ref$State@4ea9d4a7) to class > org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1875396681:Memory@[7f34b905fd10..7f34b9060b7a) > was not released before the reference was garbage collected > ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,262 Ref.java:179 - LEAK > DETECTED: a reference > (org.apache.cassandra.utils.concurrent.Ref$State@27a6b614) to class > org.apache.cassandra.io.util.SafeMemory$MemoryTidy@838594402:Memory@[7f34bae11ce0..7f34bae11d84) > was not released before the reference was garbage collected > ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,263 Ref.java:179 - LEAK > DETECTED: a reference > (org.apache.cassandra.utils.concurrent.Ref$State@64e7b566) to class > org.apache.cassandra.io.util.SafeMemory$MemoryTidy@674656075:Memory@[7f342deab4e0..7f342deb7ce0) > was not released before the reference was garbage collected > {noformat}
[jira] [Created] (CASSANDRA-11374) LEAK DETECTED during repair
Jean-Francois Gosselin created CASSANDRA-11374:
--

Summary: LEAK DETECTED during repair
Key: CASSANDRA-11374
URL: https://issues.apache.org/jira/browse/CASSANDRA-11374
Project: Cassandra
Issue Type: Bug
Reporter: Jean-Francois Gosselin

When running a range repair we are seeing the following LEAK DETECTED errors:
{noformat}
ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,261 Ref.java:179 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@5ee90b43) to class org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@367168611:[[OffHeapBitSet]] was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,262 Ref.java:179 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@4ea9d4a7) to class org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1875396681:Memory@[7f34b905fd10..7f34b9060b7a) was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,262 Ref.java:179 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@27a6b614) to class org.apache.cassandra.io.util.SafeMemory$MemoryTidy@838594402:Memory@[7f34bae11ce0..7f34bae11d84) was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,263 Ref.java:179 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@64e7b566) to class org.apache.cassandra.io.util.SafeMemory$MemoryTidy@674656075:Memory@[7f342deab4e0..7f342deb7ce0) was not released before the reference was garbage collected
{noformat}
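To get a sense of how much off-heap memory each report refers to, the `Memory@[start..end)` ranges can be turned into byte counts. This is a rough sketch that assumes the two bracketed values are hexadecimal start and end addresses of a half-open range, as the notation suggests; `leaked_sizes` is a hypothetical helper, and entries without such a range (the `OffHeapBitSet` one) are skipped.

```python
import re

# Assumption: "Memory@[start..end)" holds hexadecimal start and end addresses.
MEMORY_RANGE = re.compile(r"Memory@\[([0-9a-f]+)\.\.([0-9a-f]+)\)")


def leaked_sizes(lines):
    """Return the size in bytes of every Memory@[start..end) range found."""
    sizes = []
    for line in lines:
        match = MEMORY_RANGE.search(line)
        if match:
            start, end = (int(group, 16) for group in match.groups())
            sizes.append(end - start)
    return sizes


# Object descriptions copied from the four LEAK DETECTED lines in the report.
sample = [
    "org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@367168611:[[OffHeapBitSet]]",
    "org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1875396681:Memory@[7f34b905fd10..7f34b9060b7a)",
    "org.apache.cassandra.io.util.SafeMemory$MemoryTidy@838594402:Memory@[7f34bae11ce0..7f34bae11d84)",
    "org.apache.cassandra.io.util.SafeMemory$MemoryTidy@674656075:Memory@[7f342deab4e0..7f342deb7ce0)",
]

print(leaked_sizes(sample))
# prints [3690, 164, 51200]
```

Under that reading, the three `SafeMemory` regions in this excerpt are small (about 54 KB in total).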
[jira] [Commented] (CASSANDRA-11374) LEAK DETECTED during repair
[ https://issues.apache.org/jira/browse/CASSANDRA-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201920#comment-15201920 ] Jean-Francois Gosselin commented on CASSANDRA-11374: We are using the Reaper from Spotify https://github.com/spotify/cassandra-reaper, so subrange repair . We are not using incremental repair. > LEAK DETECTED during repair > --- > > Key: CASSANDRA-11374 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11374 > Project: Cassandra > Issue Type: Bug >Reporter: Jean-Francois Gosselin >Assignee: Marcus Eriksson > > When running a range repair we are seeing the following LEAK DETECTED errors: > {noformat} > ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,261 Ref.java:179 - LEAK > DETECTED: a reference > (org.apache.cassandra.utils.concurrent.Ref$State@5ee90b43) to class > org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@367168611:[[OffHeapBitSet]] > was not released before the reference was garbage collected > ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,262 Ref.java:179 - LEAK > DETECTED: a reference > (org.apache.cassandra.utils.concurrent.Ref$State@4ea9d4a7) to class > org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1875396681:Memory@[7f34b905fd10..7f34b9060b7a) > was not released before the reference was garbage collected > ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,262 Ref.java:179 - LEAK > DETECTED: a reference > (org.apache.cassandra.utils.concurrent.Ref$State@27a6b614) to class > org.apache.cassandra.io.util.SafeMemory$MemoryTidy@838594402:Memory@[7f34bae11ce0..7f34bae11d84) > was not released before the reference was garbage collected > ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,263 Ref.java:179 - LEAK > DETECTED: a reference > (org.apache.cassandra.utils.concurrent.Ref$State@64e7b566) to class > org.apache.cassandra.io.util.SafeMemory$MemoryTidy@674656075:Memory@[7f342deab4e0..7f342deb7ce0) > was not released before the reference was garbage collected > 
{noformat}
[jira] [Commented] (CASSANDRA-11374) LEAK DETECTED during repair
[ https://issues.apache.org/jira/browse/CASSANDRA-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201915#comment-15201915 ] Jean-Francois Gosselin commented on CASSANDRA-11374: We are using the Reaper https://nuance.webex.com/join/sylvain_boily, so a subrange repair (we are not using incremental repair). > LEAK DETECTED during repair > --- > > Key: CASSANDRA-11374 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11374 > Project: Cassandra > Issue Type: Bug >Reporter: Jean-Francois Gosselin >Assignee: Marcus Eriksson > > When running a range repair we are seeing the following LEAK DETECTED errors: > {noformat} > ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,261 Ref.java:179 - LEAK > DETECTED: a reference > (org.apache.cassandra.utils.concurrent.Ref$State@5ee90b43) to class > org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@367168611:[[OffHeapBitSet]] > was not released before the reference was garbage collected > ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,262 Ref.java:179 - LEAK > DETECTED: a reference > (org.apache.cassandra.utils.concurrent.Ref$State@4ea9d4a7) to class > org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1875396681:Memory@[7f34b905fd10..7f34b9060b7a) > was not released before the reference was garbage collected > ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,262 Ref.java:179 - LEAK > DETECTED: a reference > (org.apache.cassandra.utils.concurrent.Ref$State@27a6b614) to class > org.apache.cassandra.io.util.SafeMemory$MemoryTidy@838594402:Memory@[7f34bae11ce0..7f34bae11d84) > was not released before the reference was garbage collected > ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,263 Ref.java:179 - LEAK > DETECTED: a reference > (org.apache.cassandra.utils.concurrent.Ref$State@64e7b566) to class > org.apache.cassandra.io.util.SafeMemory$MemoryTidy@674656075:Memory@[7f342deab4e0..7f342deb7ce0) > was not released before the reference was garbage collected > {noformat}
[jira] [Created] (CASSANDRA-11345) Assertion Errors "Memory was freed" during streaming
Jean-Francois Gosselin created CASSANDRA-11345: -- Summary: Assertion Errors "Memory was freed" during streaming Key: CASSANDRA-11345 URL: https://issues.apache.org/jira/browse/CASSANDRA-11345 Project: Cassandra Issue Type: Bug Components: Streaming and Messaging Reporter: Jean-Francois Gosselin We encountered the following AssertionError (twice on the same node) during a repair : On node /172.16.63.41 {noformat} INFO [STREAM-IN-/10.174.216.160] 2016-03-09 02:38:13,900 StreamResultFuture.java:180 - [Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Session with /10.174.216.160 is complete WARN [STREAM-IN-/10.174.216.160] 2016-03-09 02:38:13,900 StreamResultFuture.java:207 - [Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Stream failed ERROR [STREAM-OUT-/10.174.216.160] 2016-03-09 02:38:13,906 StreamSession.java:505 - [Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Streaming error occurred java.lang.AssertionError: Memory was freed at org.apache.cassandra.io.util.SafeMemory.checkBounds(SafeMemory.java:97) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.io.util.Memory.getLong(Memory.java:249) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.io.compress.CompressionMetadata.getTotalSizeForSections(CompressionMetadata.java:247) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.streaming.messages.FileMessageHeader.size(FileMessageHeader.java:112) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.streaming.StreamSession.fileSent(StreamSession.java:546) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:50) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:41) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45) ~[apache-cassandra-2.1.13.jar:2.1.13] at 
org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331) ~[apache-cassandra-2.1.13.jar:2.1.13] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65] {noformat} On node /10.174.216.160 {noformat} ERROR [STREAM-OUT-/172.16.63.41] 2016-03-09 02:38:14,140 StreamSession.java:505 - [Stream #f6980580-e55f-11e5-8f08-ef9e099ce99e] Streaming error occurred java.io.IOException: Connection reset by peer at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.7.0_65] at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) ~[na:1.7.0_65] at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) ~[na:1.7.0_65] at sun.nio.ch.IOUtil.write(IOUtil.java:65) ~[na:1.7.0_65] at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487) ~[na:1.7.0_65] at org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44) ~[apache-cassandra-2.1.13.jar:2.1.13] at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMes
[jira] [Commented] (CASSANDRA-9625) GraphiteReporter not reporting
[ https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15157028#comment-15157028 ] Jean-Francois Gosselin commented on CASSANDRA-9625: --- I will give it a try. > GraphiteReporter not reporting > -- > > Key: CASSANDRA-9625 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9625 > Project: Cassandra > Issue Type: Bug > Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3 >Reporter: Eric Evans >Assignee: T Jake Luciani > Attachments: metrics.yaml, thread-dump.log > > > When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops > working. The usual startup is logged, and one batch of samples is sent, but > the reporting interval comes and goes, and no other samples are ever sent. > The logs are free from errors. > Frustratingly, metrics reporting works in our smaller (staging) environment > on 2.1.6; we are able to reproduce this on all 6 of our production nodes, but not > on a 3-node (otherwise identical) staging cluster (maybe it takes a certain > level of concurrency?). > Attached are a thread dump and our metrics.yaml.
[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException
[ https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15154789#comment-15154789 ] Jean-Francois Gosselin commented on CASSANDRA-9935: --- We are doing range repair with https://github.com/spotify/cassandra-reaper . We don't use incremental repair . We also see the issue with : nodetool repair -pr > Repair fails with RuntimeException > -- > > Key: CASSANDRA-9935 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9935 > Project: Cassandra > Issue Type: Bug > Environment: C* 2.1.8, Debian Wheezy >Reporter: mlowicki >Assignee: Yuki Morishita > Fix For: 2.1.x > > Attachments: db1.sync.lati.osa.cassandra.log, > db5.sync.lati.osa.cassandra.log, system.log.10.210.3.117, > system.log.10.210.3.221, system.log.10.210.3.230 > > > We had problems with slow repair in 2.1.7 (CASSANDRA-9702) but after upgrade > to 2.1.8 it started to work faster but now it fails with: > {code} > ... > [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde > for range (-5474076923322749342,-5468600594078911162] finished > [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde > for range (-8631877858109464676,-8624040066373718932] finished > [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde > for range (-5372806541854279315,-5369354119480076785] finished > [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde > for range (8166489034383821955,8168408930184216281] finished > [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde > for range (6084602890817326921,6088328703025510057] finished > [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde > for range (-781874602493000830,-781745173070807746] finished > [2015-07-29 20:44:03,957] Repair command #4 finished > error: nodetool failed, check server logs > -- StackTrace -- > java.lang.RuntimeException: nodetool 
failed, check server logs > at > org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290) > at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202) > {code} > After running: > {code} > nodetool repair --partitioner-range --parallel --in-local-dc sync > {code} > Last records in logs regarding repair are: > {code} > INFO [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - > Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range > (-7695808664784761779,-7693529816291585568] finished > INFO [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - > Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range > (806371695398849,8065203836608925992] finished > INFO [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - > Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range > (-5474076923322749342,-5468600594078911162] finished > INFO [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - > Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range > (-8631877858109464676,-8624040066373718932] finished > INFO [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - > Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range > (-5372806541854279315,-5369354119480076785] finished > INFO [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - > Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range > (8166489034383821955,8168408930184216281] finished > INFO [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - > Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range > (6084602890817326921,6088328703025510057] finished > INFO [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - > Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range > (-781874602493000830,-781745173070807746] finished > {code} > but a bit above I see (at least two times in attached log): > {code} > ERROR [Thread-173887] 2015-07-29 20:44:03,853 
StorageService.java:2959 - > Repair session 1b07ea50-3608-11e5-a93e-4963524a8bde for range > (5765414319217852786,5781018794516851576] failed with error > org.apache.cassandra.exceptions.RepairException: [repair > #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, > (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162 > java.util.concurrent.ExecutionException: java.lang.RuntimeException: > org.apache.cassandra.exceptions.RepairException: [repair > #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, > (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162 > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > [na:1.7.0_80] > at java.util.concur
[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException
[ https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15145336#comment-15145336 ] Jean-Francois Gosselin commented on CASSANDRA-9935: --- [~yukim] What's the next step to troubleshoot this issue ? Any specific log we could enable at DEBUG ? > Repair fails with RuntimeException > -- > > Key: CASSANDRA-9935 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9935 > Project: Cassandra > Issue Type: Bug > Environment: C* 2.1.8, Debian Wheezy >Reporter: mlowicki >Assignee: Yuki Morishita > Fix For: 2.1.x > > Attachments: db1.sync.lati.osa.cassandra.log, > db5.sync.lati.osa.cassandra.log, system.log.10.210.3.117, > system.log.10.210.3.221, system.log.10.210.3.230 > > > We had problems with slow repair in 2.1.7 (CASSANDRA-9702) but after upgrade > to 2.1.8 it started to work faster but now it fails with: > {code} > ... > [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde > for range (-5474076923322749342,-5468600594078911162] finished > [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde > for range (-8631877858109464676,-8624040066373718932] finished > [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde > for range (-5372806541854279315,-5369354119480076785] finished > [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde > for range (8166489034383821955,8168408930184216281] finished > [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde > for range (6084602890817326921,6088328703025510057] finished > [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde > for range (-781874602493000830,-781745173070807746] finished > [2015-07-29 20:44:03,957] Repair command #4 finished > error: nodetool failed, check server logs > -- StackTrace -- > java.lang.RuntimeException: nodetool failed, check server logs > at > 
org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290) > at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202) > {code} > After running: > {code} > nodetool repair --partitioner-range --parallel --in-local-dc sync > {code} > Last records in logs regarding repair are: > {code} > INFO [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - > Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range > (-7695808664784761779,-7693529816291585568] finished > INFO [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - > Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range > (806371695398849,8065203836608925992] finished > INFO [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - > Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range > (-5474076923322749342,-5468600594078911162] finished > INFO [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - > Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range > (-8631877858109464676,-8624040066373718932] finished > INFO [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - > Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range > (-5372806541854279315,-5369354119480076785] finished > INFO [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - > Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range > (8166489034383821955,8168408930184216281] finished > INFO [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - > Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range > (6084602890817326921,6088328703025510057] finished > INFO [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - > Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range > (-781874602493000830,-781745173070807746] finished > {code} > but a bit above I see (at least two times in attached log): > {code} > ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 - > Repair 
session 1b07ea50-3608-11e5-a93e-4963524a8bde for range > (5765414319217852786,5781018794516851576] failed with error > org.apache.cassandra.exceptions.RepairException: [repair > #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, > (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162 > java.util.concurrent.ExecutionException: java.lang.RuntimeException: > org.apache.cassandra.exceptions.RepairException: [repair > #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, > (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162 > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > [na:1.7.0_80] > at java.util.concurrent.FutureTask.get(FutureTask.java:188) > [na:1.7.0_80] >
[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException
[ https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15143400#comment-15143400 ]

Jean-Francois Gosselin commented on CASSANDRA-9935:
---
Ok from 172.16.63.39, same error "received out of order wrt DecoratedKey":
{noformat}
ERROR [ValidationExecutor:118] 2016-02-11 17:21:27,512 Validator.java:245 - Failed creating a merkle tree for [repair #d78e02b0-d0e3-11e5-a04a-4ffa10ef584b on foo/bar, (-5525881226490706160,-5525442713957813067]], /10.174.216.158 (see log for details)
ERROR [ValidationExecutor:118] 2016-02-11 17:21:27,516 CassandraDaemon.java:223 - Exception in thread Thread[ValidationExecutor:118,1,main]
java.lang.AssertionError: row DecoratedKey(-5525725068665570338, 0010e3a74bf82717394598e2b7421c89382e250265336137346266382d323731372d333934352d393865322d62373432316338393338326510f64b1c2b7d1c3ff893b70c24c5dbdc6b00) received out of order wrt DecoratedKey(-5525444669477674618, 0010581499f0b99337e1bf468611fd0233e4250235383134393966302d623939332d333765312d626634362d3836313166643032653410f64b1c2b7d1c3ff893b70c24c5dbdc6b00)
    at org.apache.cassandra.repair.Validator.add(Validator.java:126) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1003) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:94) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:615) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_65]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_65]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_65]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
{noformat}

> Repair fails with
RuntimeException > -- > > Key: CASSANDRA-9935 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9935 > Project: Cassandra > Issue Type: Bug > Environment: C* 2.1.8, Debian Wheezy >Reporter: mlowicki >Assignee: Yuki Morishita > Fix For: 2.1.x > > Attachments: db1.sync.lati.osa.cassandra.log, > db5.sync.lati.osa.cassandra.log, system.log.10.210.3.117, > system.log.10.210.3.221, system.log.10.210.3.230 > > > We had problems with slow repair in 2.1.7 (CASSANDRA-9702) but after upgrade > to 2.1.8 it started to work faster but now it fails with: > {code} > ... > [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde > for range (-5474076923322749342,-5468600594078911162] finished > [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde > for range (-8631877858109464676,-8624040066373718932] finished > [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde > for range (-5372806541854279315,-5369354119480076785] finished > [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde > for range (8166489034383821955,8168408930184216281] finished > [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde > for range (6084602890817326921,6088328703025510057] finished > [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde > for range (-781874602493000830,-781745173070807746] finished > [2015-07-29 20:44:03,957] Repair command #4 finished > error: nodetool failed, check server logs > -- StackTrace -- > java.lang.RuntimeException: nodetool failed, check server logs > at > org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290) > at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202) > {code} > After running: > {code} > nodetool repair --partitioner-range --parallel --in-local-dc sync > {code} > Last records in logs regarding repair are: > {code} > INFO [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - > 
Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range > (-7695808664784761779,-7693529816291585568] finished > INFO [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - > Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range > (806371695398849,8065203836608925992] finished > INFO [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - > Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range > (-5474076923322749342,-5468600594078911162] finished > INFO [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - > Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range > (-8631877858109464676,-86240400663737
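The repair logs above print each session's range as "(start,end]", which reads as exclusive on the left and inclusive on the right. A minimal sketch of that membership test follows, so the two tokens named in the AssertionError can be checked against the session's range. TokenRangeSketch is a hypothetical name used only for illustration, it is not Cassandra's own range class, and it assumes a range that does not wrap around the ring.

```java
// Illustrative only: a non-wrapping token range with an exclusive start and
// an inclusive end, matching the "(start,end]" notation printed in the logs.
final class TokenRangeSketch {
    private final long left;   // exclusive lower bound
    private final long right;  // inclusive upper bound

    TokenRangeSketch(long left, long right) {
        if (left >= right) {
            // Assumption of this sketch: wrap-around ranges are out of scope.
            throw new IllegalArgumentException("wrapping ranges are not handled in this sketch");
        }
        this.left = left;
        this.right = right;
    }

    // True when the token lies strictly above the start and at or below the end.
    boolean contains(long token) {
        return token > left && token <= right;
    }
}
```

With the range (-5525881226490706160,-5525442713957813067] from the comment, both tokens in the assertion message (-5525725068665570338 and -5525444669477674618) satisfy this test, so the complaint in that log is about the order in which the keys arrived rather than about keys falling outside the range.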
[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException
[ https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15143307#comment-15143307 ]

Jean-Francois Gosselin commented on CASSANDRA-9935:
---
Here's a new one with no clear message from the exception:
{noformat}
INFO [AntiEntropyStage:1] 2016-02-11 17:21:20,947 RepairSession.java:171 - [repair #d78e02b0-d0e3-11e5-a04a-4ffa10ef584b] Received merkle tree for bar from /10.53.10.30
ERROR [AntiEntropySessions:28] 2016-02-11 17:21:21,033 RepairSession.java:303 - [repair #d78e02b0-d0e3-11e5-a04a-4ffa10ef584b] session completed with the following error
org.apache.cassandra.exceptions.RepairException: [repair #d78e02b0-d0e3-11e5-a04a-4ffa10ef584b on foo/bar, (-5525881226490706160,-5525442713957813067]] Validation failed in /172.16.63.39
    at org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:406) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:134) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_65]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_65]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
ERROR [AntiEntropySessions:28] 2016-02-11 17:21:21,034 CassandraDaemon.java:223 - Exception in thread Thread[AntiEntropySessions:28,5,RMI Runtime]
java.lang.RuntimeException: org.apache.cassandra.exceptions.RepairException: [repair #d78e02b0-d0e3-11e5-a04a-4ffa10ef584b on foo/bar, (-5525881226490706160,-5525442713957813067]] Validation failed in /172.16.63.39
    at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.jar:na]
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_65]
    at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_65]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_65]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_65]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
Caused by: org.apache.cassandra.exceptions.RepairException: [repair #d78e02b0-d0e3-11e5-a04a-4ffa10ef584b on foo/bar, (-5525881226490706160,-5525442713957813067]] Validation failed in /172.16.63.39
    at org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:406) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:134) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) ~[apache-cassandra-2.1.9.jar:2.1.9]
    ... 3 common frames omitted
ERROR [Thread-20728] 2016-02-11 17:21:21,034 StorageService.java:2966 - Repair session d78e02b0-d0e3-11e5-a04a-4ffa10ef584b for range (-5525881226490706160,-5525442713957813067] failed with error org.apache.cassandra.exceptions.RepairException: [repair #d78e02b0-d0e3-11e5-a04a-4ffa10ef584b on foo/bar, (-5525881226490706160,-5525442713957813067]] Validation failed in /172.16.63.39
java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.cassandra.exceptions.RepairException: [repair #d78e02b0-d0e3-11e5-a04a-4ffa10ef584b on foo/bar, (-5525881226490706160,-5525442713957813067]] Validation failed in /172.16.63.39
    at java.util.concurrent.FutureTask.report(FutureTask.java:122) [na:1.7.0_65]
    at java.util.concurrent.FutureTask.get(FutureTask.java:188) [na:1.7.0_65]
    at org.apache.cassandra.service.StorageService$4.runMayThrow(StorageService.java:2957) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) [apache-cassandra-2.1.9.jar:2.1.9]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_65]
    at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_65]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
Caused by: java.lang.RuntimeException: org.apache.cassandra.exceptions.RepairException: [repair #d78e02b0-d0e3-11e5-a04a-4ffa10ef584b on foo/
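The trace above nests the same RepairException inside a RuntimeException, which is in turn wrapped in an ExecutionException by FutureTask.get(). A small sketch of walking such a chain down to its innermost cause may make the log easier to read. It uses only JDK types; RootCauseSketch and the plain Exception standing in for RepairException are assumptions for illustration, not Cassandra code.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

final class RootCauseSketch {
    // Follows getCause() links until the innermost throwable is reached.
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    // Rebuilds the nesting seen in the log with stand-in types:
    // ExecutionException -> RuntimeException -> Exception (the stand-in).
    static Throwable demo() {
        FutureTask<Void> task = new FutureTask<Void>(new Callable<Void>() {
            @Override
            public Void call() {
                throw new RuntimeException(new Exception("Validation failed in /172.16.63.39"));
            }
        });
        task.run(); // the exception is captured by the task, not thrown here
        try {
            task.get(); // rethrows the captured failure wrapped in ExecutionException
            return null;
        } catch (ExecutionException e) {
            return rootCause(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return e;
        }
    }
}
```

In this sketch the innermost throwable carries only the "Validation failed in ..." text, which is the same amount of detail the log shows on the coordinating node; the underlying reason has to be looked up in the log of the node named in the message.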
[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException
[ https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15143270#comment-15143270 ]

Jean-Francois Gosselin commented on CASSANDRA-9935:
---
[~yukim] Yesterday we ran nodetool scrub on all the nodes and restarted them. No luck, we're still getting "received out of order wrt DecoratedKey". Any suggestions for the next step?
[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException
[ https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15141341#comment-15141341 ]

Jean-Francois Gosselin commented on CASSANDRA-9935:
---
No, we haven't seen this WARN. The only thing we haven't tried is a node restart (based on your comment above: "... The latter may be fixed by restarting the node."). Although I'm not sure it will fix the problem, since we've used C* 2.1.9 from the beginning.
[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException
[ https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15141245#comment-15141245 ]

Jean-Francois Gosselin commented on CASSANDRA-9935:
---
[~yukim] Should the WARN message be in the C* log or on the stdout of nodetool?
[jira] [Commented] (CASSANDRA-10769) "received out of order wrt DecoratedKey" after scrub
[ https://issues.apache.org/jira/browse/CASSANDRA-10769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15140965#comment-15140965 ]

Jean-Francois Gosselin commented on CASSANDRA-10769:

We are also seeing this issue in our multi-datacenter cluster (3 DCs), C* 2.1.9 (and using LCS). We ran nodetool scrub on all the nodes but the error keeps coming back. How can we get into this state?
{noformat}
ERROR [ValidationExecutor:5884] 2016-02-03 09:27:41,703 Validator.java:245 - Failed creating a merkle tree for [repair #a8f3f040-ca58-11e5-9dda-130298de45de on keyspace1/xyz, (5126461213031423923,5128334161692376535]], /10.174.216.163 (see log for details)
ERROR [ValidationExecutor:5884] 2016-02-03 09:27:41,704 CassandraDaemon.java:223 - Exception in thread Thread[ValidationExecutor:5884,1,main]
java.lang.AssertionError: row DecoratedKey(5126475305931285312, 00103cee13c2c0ea38328138fcad86515eef250233636565313363322d633065612d333833322d383133382d666361643836353135656566105cc950f02b6239f0bf9af60ac7dd452400) received out of order wrt DecoratedKey(5128167525973821686, 00105fe2e7db8810387a9a2955a07ecfa7d3250235666532653764622d383831302d333837612d396132392d35356130376563666137643310f64b1c2b7d1c3ff893b70c24c5dbdc6b00)
    at org.apache.cassandra.repair.Validator.add(Validator.java:126) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1003) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:94) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:615) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_65]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_65]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_65]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
{noformat}

> "received out of order wrt DecoratedKey" after scrub
>
>
>                 Key: CASSANDRA-10769
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10769
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: C* 2.1.11, Debian Wheezy
>            Reporter: mlowicki
>
> After running scrub and cleanup on all nodes in single data center I'm getting:
> {code}
> ERROR [ValidationExecutor:103] 2015-11-25 06:28:21,530 Validator.java:245 - Failed creating a merkle tree for [repair #89fa2b70-933d-11e5-b036-75bb514ae072 on sync/entity_by_id2, (-5867793819051725444,-5865919628027816979]], /10.210.3.221 (see log for details)
> ERROR [ValidationExecutor:103] 2015-11-25 06:28:21,531 CassandraDaemon.java:227 - Exception in thread Thread[ValidationExecutor:103,1,main]
> java.lang.AssertionError: row DecoratedKey(-5867787467868737053, 000932373633313036313204808800) received out of order wrt DecoratedKey(-5865937851627253360, 000933313230313737333204c3c700)
>     at org.apache.cassandra.repair.Validator.add(Validator.java:127) ~[apache-cassandra-2.1.11.jar:2.1.11]
>     at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1010) ~[apache-cassandra-2.1.11.jar:2.1.11]
>     at org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:94) ~[apache-cassandra-2.1.11.jar:2.1.11]
>     at org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:622) ~[apache-cassandra-2.1.11.jar:2.1.11]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_80]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_80]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_80]
>     at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
> {code}
> What I did is to run repair on other node:
> {code}
> time nodetool repair --in-local-dc
> {code}
> Corresponding log on the node where repair has been started:
> {code}
> ERROR [AntiEntropySessions:414] 2015-11-25 06:28:21,533 RepairSession.java:303 - [repair #89fa2b70-933d-11e5-b036-75bb514ae072] session completed with the following error
> org.apache.cassandra.exceptions.RepairException: [repair #89fa2b70-933d-11e5-b036-75bb514ae072 on sync/entity_by_id2, (-5867793819051725444,-5865919628027816979]] Validation failed in /10.210.3.117
>     at org.apache.cassandra.repair.RepairSession.validationComplete(RepairSession.java:166) ~[apac
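The "received out of order wrt" message in this thread is raised when a row reaches the validator with a key that does not sort after the previous one. The sketch below shows that kind of check in simplified form. OrderCheckSketch is a hypothetical class, and it compares bare long tokens only, which is an assumption made for brevity; it is not the code in Validator.java.

```java
// Simplified illustration: accept tokens only in strictly ascending order
// and fail with a message shaped like the one in the logs otherwise.
final class OrderCheckSketch {
    private Long lastToken; // null until the first token has been added

    void add(long token) {
        if (lastToken != null && lastToken >= token) {
            throw new AssertionError("row " + token + " received out of order wrt " + lastToken);
        }
        lastToken = token;
    }
}
```

Feeding it the two tokens from the comment above in the order the log implies (5128167525973821686 first, then 5126475305931285312) trips the check, because the second token is smaller than the one already seen.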
[jira] [Commented] (CASSANDRA-9625) GraphiteReporter not reporting
[ https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15139500#comment-15139500 ]

Jean-Francois Gosselin commented on CASSANDRA-9625:
---
It's not fixed. I ended up adding a catch for the AssertionError in the GraphiteReporter as a workaround.

> GraphiteReporter not reporting
> --
>
>                 Key: CASSANDRA-9625
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9625
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3
>            Reporter: Eric Evans
>         Attachments: metrics.yaml, thread-dump.log
>
>
> When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops working. The usual startup is logged, and one batch of samples is sent, but the reporting interval comes and goes, and no other samples are ever sent. The logs are free from errors.
> Frustratingly, metrics reporting works in our smaller (staging) environment on 2.1.6; We are able to reproduce this on all 6 of production nodes, but not on a 3 node (otherwise identical) staging cluster (maybe it takes a certain level of concurrency?).
> Attached is a thread dump, and our metrics.yaml.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
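The workaround mentioned in the comment is not shown in the thread. A minimal sketch of the general idea follows, assuming the reporter runs as a periodic task. In the JDK, a task scheduled with ScheduledExecutorService.scheduleAtFixedRate is not run again once one execution throws, which would be consistent with the "one batch, then silence" symptom in the description, though the thread does not confirm that this is the cause. GuardedReportSketch, its guarded method and the failure counter are hypothetical names, not part of Cassandra or of the metrics library.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class GuardedReportSketch {
    // Wraps a reporting task so that an AssertionError from one run is
    // counted and swallowed instead of propagating to the scheduler.
    static Runnable guarded(final Runnable report, final AtomicInteger failures) {
        return new Runnable() {
            @Override
            public void run() {
                try {
                    report.run();
                } catch (AssertionError e) {
                    // Assumption: skipping one reporting cycle is acceptable.
                    failures.incrementAndGet();
                }
            }
        };
    }
}
```

A wrapper like this only hides the symptom. The counter is there so that swallowed errors remain visible somewhere, since the original report notes that the logs were free from errors.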
[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException
[ https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15134437#comment-15134437 ]

Jean-Francois Gosselin commented on CASSANDRA-9935:
---
[~yukim] We are also seeing this issue in our multi-datacenter cluster (3 DCs), C* 2.1.9 (and using LCS). We ran nodetool scrub on all the nodes but the error keeps coming back. We did have some network glitches, as [~mlowicki] was saying; can it be related to network issues?
{noformat}
ERROR [ValidationExecutor:5884] 2016-02-03 09:27:41,703 Validator.java:245 - Failed creating a merkle tree for [repair #a8f3f040-ca58-11e5-9dda-130298de45de on keyspace1/xyz, (5126461213031423923,5128334161692376535]], /10.174.216.163 (see log for details)
ERROR [ValidationExecutor:5884] 2016-02-03 09:27:41,704 CassandraDaemon.java:223 - Exception in thread Thread[ValidationExecutor:5884,1,main]
java.lang.AssertionError: row DecoratedKey(5126475305931285312, 00103cee13c2c0ea38328138fcad86515eef250233636565313363322d633065612d333833322d383133382d666361643836353135656566105cc950f02b6239f0bf9af60ac7dd452400) received out of order wrt DecoratedKey(5128167525973821686, 00105fe2e7db8810387a9a2955a07ecfa7d3250235666532653764622d383831302d333837612d396132392d35356130376563666137643310f64b1c2b7d1c3ff893b70c24c5dbdc6b00)
    at org.apache.cassandra.repair.Validator.add(Validator.java:126) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1003) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:94) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:615) ~[apache-cassandra-2.1.9.jar:2.1.9]
    at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_65]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_65]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_65]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
{noformat}
[jira] [Commented] (CASSANDRA-10502) Cassandra query degradation with high frequency updated tables
[ https://issues.apache.org/jira/browse/CASSANDRA-10502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15086025#comment-15086025 ] Jean-Francois Gosselin commented on CASSANDRA-10502: [~thobbs] Have you tried to dump the data for this key with sstable2json? > Cassandra query degradation with high frequency updated tables > -- > > Key: CASSANDRA-10502 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10502 > Project: Cassandra > Issue Type: Bug >Reporter: Dodong Juan >Priority: Minor > Labels: perfomance, query, triage > Fix For: 2.2.x > > > Hi, > So we are developing a system that computes a profile of the things that it > observes. The observations come in the form of events. Each thing that it > observes has an id, and each thing has a set of subthings, each of which has a > measurement of some kind. Roughly, there are about 500 subthings within each > thing. We receive events containing measurements of these 500 subthings every > 10 seconds or so. > So as we receive events, we read the old profile value, calculate the new > profile based on the new value and save it back. > One of the things we observe is the set of processes running on the server. > We use the following schema to hold the profile. 
> {noformat} > CREATE TABLE processinfometric_profile ( > profilecontext text, > id text, > month text, > day text, > hour text, > minute text, > command text, > cpu map, > majorfaults map, > minorfaults map, > nice map, > pagefaults map, > pid map, > ppid map, > priority map, > resident map, > rss map, > sharesize map, > size map, > starttime map, > state map, > threads map, > user map, > vsize map, > PRIMARY KEY ((profilecontext, id, month, day, hour, minute), command) > ) WITH CLUSTERING ORDER BY (command ASC) > AND bloom_filter_fp_chance = 0.1 > AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}' > AND comment = '' > AND compaction = {'class': > 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'} > AND compression = {'sstable_compression': > 'org.apache.cassandra.io.compress.LZ4Compressor'} > AND dclocal_read_repair_chance = 0.1 > AND default_time_to_live = 0 > AND gc_grace_seconds = 864000 > AND max_index_interval = 2048 > AND memtable_flush_period_in_ms = 0 > AND min_index_interval = 128 > AND read_repair_chance = 0.0 > AND speculative_retry = '99.0PERCENTILE'; > {noformat} > This profile will then be used for certain analytics, either in the > context of the ‘thing’ or in the context of a specific thing and subthing. > A profile can be defined as monthly, daily, or hourly. So in the case of monthly, the > month will be set to the current month (e.g. ‘Oct’) and the day and hour will > be set to the empty string ‘’. > The problem that we have observed is that over time (actually in just a > matter of hours) we will see a huge degradation of query response time for the > monthly profile. At the start it will be responding in 10-100 ms and after a > couple of hours it will go to 2000-3000 ms. If you leave it for a couple of > days you will start experiencing read timeouts. 
The query is basically just: > {noformat} > select * from myprofile where id='1' and month='Oct' and day='' and hour='' > and minute='' > {noformat} > This returns only about 500 rows or so. > We were using Cassandra 2.2.1, but upgraded to 2.2.2 to see if that fixed the > issue, to no avail. And since this is a test, we are running on a single node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
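The key layout described in the report (for a monthly profile, the month is set and the finer-grained components are empty strings) can be sketched in Java as follows. This is only an illustration: the class `ProfileKeySketch`, the `Granularity` enum, and the `partitionKey` helper are hypothetical names that do not come from the report, and the handling of daily and hourly profiles is an assumption extrapolated from the monthly case, which is the only one the report spells out.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not taken from the report. */
final class ProfileKeySketch {

    enum Granularity { MONTHLY, DAILY, HOURLY }

    /**
     * Returns the values for the partition key
     * (profilecontext, id, month, day, hour, minute).
     * Components finer than the requested granularity are set to the
     * empty string; minute is always empty, as in the reported query.
     * The DAILY and HOURLY behaviour is an assumption.
     */
    static List<String> partitionKey(Granularity granularity,
                                     String profileContext,
                                     String id,
                                     String month,
                                     String day,
                                     String hour) {
        String dayValue = (granularity == Granularity.MONTHLY) ? "" : day;
        String hourValue = (granularity == Granularity.HOURLY) ? hour : "";
        return Arrays.asList(profileContext, id, month, dayValue, hourValue, "");
    }

    public static void main(String[] args) {
        // Monthly profile: only profilecontext, id and month carry values.
        System.out.println(partitionKey(Granularity.MONTHLY, "ctx", "1", "Oct", "12", "07"));
    }
}
```

With this layout every update to the monthly profile of a given id lands in the same partition for the whole month, whereas hourly profiles move to a new partition each hour.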
[jira] [Comment Edited] (CASSANDRA-9625) GraphiteReporter not reporting
[ https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14943468#comment-14943468 ] Jean-Francois Gosselin edited comment on CASSANDRA-9625 at 10/5/15 2:51 PM: I ran into another assert that breaks the GraphiteReporter on 2.1.9 . When SSTableReader.getApproximateKeyCount is called, how can I get in a state where the CompactionMetadata is null ? {code:title=SSTableReader.java|borderStyle=solid} 276 try 278 { 279 CompactionMetadata metadata = (CompactionMetadata) sstable.descriptor.getMetadataSerializer().deserialize(sstable.descriptor, MetadataType.COMPACTION); 280 assert metadata != null : sstable.getFilename(); 281 if (cardinality == null) {code} {noformat} at org.apache.cassandra.io.sstable.SSTableReader.getApproximateKeyCount(SSTableReader.java:279) at org.apache.cassandra.metrics.ColumnFamilyMetrics$9.value(ColumnFamilyMetrics.java:296) at org.apache.cassandra.metrics.ColumnFamilyMetrics$9.value(ColumnFamilyMetrics.java:290) at com.yammer.metrics.reporting.GraphiteReporter.processGauge(GraphiteReporter.java:292) at com.yammer.metrics.reporting.GraphiteReporter.processGauge(GraphiteReporter.java:27) at com.yammer.metrics.core.Gauge.processWith(Gauge.java:28) at com.yammer.metrics.reporting.GraphiteReporter.printRegularMetrics(GraphiteReporter.java:235) at com.yammer.metrics.reporting.GraphiteReporter.run(GraphiteReporter.java:199) {noformat} was (Author: jfgosselin): I ran into another assert that breaks the GraphiteReporter on 2.1.9 . When SSTableReader.getApproximateKeyCount is called, how can I get in a state where the CompactionMetadata is null ? 
{code:title=SSTableReader.java|borderStyle=solid} 276try 278{ 279CompactionMetadata metadata = (CompactionMetadata) sstable.descriptor.getMetadataSerializer().deserialize(sstable.descriptor, MetadataType.COMPACTION); 280assert metadata != null : sstable.getFilename(); 281if (cardinality == null) {code} {noformat} at org.apache.cassandra.io.sstable.SSTableReader.getApproximateKeyCount(SSTableReader.java:279) at org.apache.cassandra.metrics.ColumnFamilyMetrics$9.value(ColumnFamilyMetrics.java:296) at org.apache.cassandra.metrics.ColumnFamilyMetrics$9.value(ColumnFamilyMetrics.java:290) at com.yammer.metrics.reporting.GraphiteReporter.processGauge(GraphiteReporter.java:292) at com.yammer.metrics.reporting.GraphiteReporter.processGauge(GraphiteReporter.java:27) at com.yammer.metrics.core.Gauge.processWith(Gauge.java:28) at com.yammer.metrics.reporting.GraphiteReporter.printRegularMetrics(GraphiteReporter.java:235) at com.yammer.metrics.reporting.GraphiteReporter.run(GraphiteReporter.java:199) {noformat} > GraphiteReporter not reporting > -- > > Key: CASSANDRA-9625 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9625 > Project: Cassandra > Issue Type: Bug > Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3 >Reporter: Eric Evans > Attachments: metrics.yaml, thread-dump.log > > > When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops > working. The usual startup is logged, and one batch of samples is sent, but > the reporting interval comes and goes, and no other samples are ever sent. > The logs are free from errors. > Frustratingly, metrics reporting works in our smaller (staging) environment > on 2.1.6; We are able to reproduce this on all 6 of production nodes, but not > on a 3 node (otherwise identical) staging cluster (maybe it takes a certain > level of concurrency?). > Attached is a thread dump, and our metrics.yaml.
[jira] [Commented] (CASSANDRA-9625) GraphiteReporter not reporting
[ https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14715194#comment-14715194 ] Jean-Francois Gosselin commented on CASSANDRA-9625: --- [~benedict] Can this issue be reopened ? > GraphiteReporter not reporting > -- > > Key: CASSANDRA-9625 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9625 > Project: Cassandra > Issue Type: Bug > Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3 >Reporter: Eric Evans >Assignee: T Jake Luciani > Attachments: metrics.yaml, thread-dump.log > > > When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops > working. The usual startup is logged, and one batch of samples is sent, but > the reporting interval comes and goes, and no other samples are ever sent. > The logs are free from errors. > Frustratingly, metrics reporting works in our smaller (staging) environment > on 2.1.6; We are able to reproduce this on all 6 of production nodes, but not > on a 3 node (otherwise identical) staging cluster (maybe it takes a certain > level of concurrency?). > Attached is a thread dump, and our metrics.yaml.
[jira] [Commented] (CASSANDRA-9625) GraphiteReporter not reporting
[ https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14711887#comment-14711887 ] Jean-Francois Gosselin commented on CASSANDRA-9625: --- [~tjake] I think that I've found the issue. When the Gauge metric for CompressionMetadataOffHeapMemoryUsed is read, it calls the following method in org.apache.cassandra.io.util.Memory:
{code:title=org.apache.cassandra.io.util.Memory.java|borderStyle=solid}
public long size()
{
    assert peer != 0;
    return size;
}
{code}
and for some reason peer was 0. After the AssertionError, the Graphite metrics reporter thread is no longer executed. > GraphiteReporter not reporting > -- > > Key: CASSANDRA-9625 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9625 > Project: Cassandra > Issue Type: Bug > Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3 >Reporter: Eric Evans >Assignee: T Jake Luciani > Attachments: metrics.yaml, thread-dump.log > > > When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops > working. The usual startup is logged, and one batch of samples is sent, but > the reporting interval comes and goes, and no other samples are ever sent. > The logs are free from errors. > Frustratingly, metrics reporting works in our smaller (staging) environment > on 2.1.6; We are able to reproduce this on all 6 of production nodes, but not > on a 3 node (otherwise identical) staging cluster (maybe it takes a certain > level of concurrency?). > Attached is a thread dump, and our metrics.yaml.
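The symptom in the comment above (one AssertionError, then no further reports and nothing in the logs) matches the documented behaviour of periodic tasks on a `java.util.concurrent.ScheduledExecutorService`: if any execution throws, subsequent executions are suppressed. The following is a minimal sketch of that behaviour, assuming the reporter is driven by such an executor, which this thread does not confirm. The class `SuppressedReporterSketch` and its method are hypothetical names, and the thrown error only simulates the `peer != 0` assertion.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical demo class, not part of Cassandra or the metrics library. */
final class SuppressedReporterSketch {

    /**
     * Schedules a periodic task that throws an AssertionError on its third
     * run, waits, and returns how many times the task actually ran.
     */
    static int runsBeforeSuppression() {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger runs = new AtomicInteger();
        try {
            executor.scheduleWithFixedDelay(() -> {
                if (runs.incrementAndGet() == 3) {
                    // Stands in for the failed "assert peer != 0" in a gauge.
                    throw new AssertionError("simulated gauge failure");
                }
            }, 0, 20, TimeUnit.MILLISECONDS);
            // Long enough for many more runs if the task were still scheduled.
            Thread.sleep(500);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            executor.shutdownNow();
        }
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println("runs = " + runsBeforeSuppression());
    }
}
```

The task runs three times and is never rescheduled after the third run throws. The error is captured in the task's future rather than logged, which would also explain why the logs stay free of errors. A handler that catches only `Exception` inside the task would not help here, since `AssertionError` is an `Error`.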
[jira] [Commented] (CASSANDRA-9625) GraphiteReporter not reporting
[ https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695853#comment-14695853 ] Jean-Francois Gosselin commented on CASSANDRA-9625: --- [~tjake] I can easily reproduce the issue after ~12h; logging com.yammer.metrics.reporting at DEBUG didn't provide anything. Are there any specific places where I should add traces in GraphiteReporter? > GraphiteReporter not reporting > -- > > Key: CASSANDRA-9625 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9625 > Project: Cassandra > Issue Type: Bug > Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3 >Reporter: Eric Evans >Assignee: T Jake Luciani > Attachments: metrics.yaml, thread-dump.log > > > When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops > working. The usual startup is logged, and one batch of samples is sent, but > the reporting interval comes and goes, and no other samples are ever sent. > The logs are free from errors. > Frustratingly, metrics reporting works in our smaller (staging) environment > on 2.1.6; We are able to reproduce this on all 6 of production nodes, but not > on a 3 node (otherwise identical) staging cluster (maybe it takes a certain > level of concurrency?). > Attached is a thread dump, and our metrics.yaml.
[jira] [Commented] (CASSANDRA-9625) GraphiteReporter not reporting
[ https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682117#comment-14682117 ] Jean-Francois Gosselin commented on CASSANDRA-9625: --- We are seeing this issue on 2.1.8. > GraphiteReporter not reporting > -- > > Key: CASSANDRA-9625 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9625 > Project: Cassandra > Issue Type: Bug > Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3 >Reporter: Eric Evans >Assignee: T Jake Luciani > Fix For: 2.1.x > > Attachments: metrics.yaml, thread-dump.log > > > When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops > working. The usual startup is logged, and one batch of samples is sent, but > the reporting interval comes and goes, and no other samples are ever sent. > The logs are free from errors. > Frustratingly, metrics reporting works in our smaller (staging) environment > on 2.1.6; We are able to reproduce this on all 6 of production nodes, but not > on a 3 node (otherwise identical) staging cluster (maybe it takes a certain > level of concurrency?). > Attached is a thread dump, and our metrics.yaml.